Physical AI · Funding Round · Jun 18, 2026, 9:47 PM

Robotics Startup XDOF Emerges With $70M to Solve AI's Physical Data Bottleneck

XDOF has raised $70 million to build the physical data infrastructure required to train general-purpose robots, and has released what it describes as the world's largest open-source manipulation dataset.

By Factlen Editorial Team

Frontier AI Labs 40% · Robotics Researchers 35% · Venture Capitalists 25%
Frontier AI Labs
Value outsourcing the massive operational complexity of physical data collection so they can focus entirely on software and model architecture.
Robotics Researchers
Celebrate the release of open-source resources like the ABC-130K dataset, which democratizes access to high-fidelity teleoperation data.
Venture Capitalists
View physical data infrastructure as the next highly defensible layer of the AI ecosystem, offering moats that pure software startups lack.

What's not represented

  • Labor advocates monitoring the working conditions of global teleoperators
  • Hardware manufacturers building the robots that rely on this data

Why this matters

While artificial intelligence has mastered text and images, it remains clumsy in the physical world because training data is so scarce. By industrializing how robots learn physical tasks, infrastructure like XDOF's could accelerate the timeline for capable, general-purpose robots entering warehouses, hospitals, and homes.

Key points

  • XDOF emerged from stealth with $70 million in seed funding to build physical data infrastructure for robotics.
  • The startup acts as an outsourced data factory, allowing AI labs to avoid managing warehouses and hardware fleets.
  • XDOF co-released ABC-130K, which it describes as the world's largest open-source robot manipulation dataset, featuring 130,000 trajectories.
  • The company uses a three-tier data collection strategy involving direct teleoperation, GELLO devices, and wearable sensors.
  • The funding will be used to hire a global workforce of teleoperators and develop proprietary hand-tracking sensors.
  • Key figures: $70M in seed funding raised; 130,000 trajectories in the ABC-130K dataset; roughly 20 active enterprise customers.

The artificial intelligence industry has spent the last three years conquering digital tasks, but teaching a machine to fold a t-shirt remains a monumental challenge. Now, a newly launched startup called XDOF is emerging from stealth to solve the physical data bottleneck holding back the robotics revolution. Backed by a massive $70 million seed round, the company is building the outsourced data infrastructure required to train the next generation of general-purpose robots.[1][3]

The funding round, which officially closed this week, drew participation from a roster of heavyweight venture capital firms. Thrive Capital, Spark Capital, Andreessen Horowitz (a16z), Lux Capital, and WndrCo all backed the San Mateo-based startup. The sheer size of the seed investment underscores a growing consensus in Silicon Valley: the next defensible, highly lucrative layer of the AI boom will be deeply physical.[2][5][6]

Founded in October 2024 by UC Berkeley alumni Philipp Wu, Fred Shentu, and Nemo Jin, XDOF operates on a simple premise. While frontier AI laboratories excel at developing complex software models, they lack the operational scale and desire to independently manage massive, messy physical data operations.[3][4]

XDOF's $70 million seed round drew participation from major Silicon Valley venture capital firms.

Physical manipulation data is the specific bottleneck preventing frontier AI labs from training generalist robots. Building an in-house data pipeline requires hundreds of thousands of square feet of warehouse space, fleets of expensive robotic hardware, continuous mechanical calibration, and a globally distributed workforce of trained human operators.[2][4]

Rather than forcing software companies to become heavy-industrial operators, XDOF serves as an outsourced data factory. The company provides end-to-end data pipelines, collection hardware, and annotation systems, allowing AI labs to keep warehouse-scale operational complexity off their balance sheets while still advancing their embodied intelligence programs.[2]

To demonstrate its capabilities and immediately impact the broader research community, XDOF has partnered with UC Berkeley's AI Research lab to release ABC-130K. The company describes the release as the largest and highest-quality open-source robot manipulation dataset ever made available to the public.[1][3][4]

The ABC-130K dataset includes thousands of trajectories of robots performing precise tasks like folding laundry.

The ABC-130K dataset contains 130,000 distinct physical trajectories, supplemented by 300 hours of simulation data and 100 hours of evaluation data. It captures robots performing tasks that demand fine precision and spatial awareness, such as flattening cardboard boxes, folding laundry, and placing wireless earbuds into their charging cases.[1][2][4]

Gathering this volume of high-fidelity physical data requires a multi-tiered acquisition strategy. XDOF deploys a three-pronged approach to capture human-level dexterity and translate it into machine-readable formats that foundational models can easily ingest.[2][4]

The first tier is direct teleoperation, in which human operators remotely control the same robots that will eventually run the AI models. The second tier uses "GELLO" devices: specialized, low-cost teleoperation rigs that mimic robotic joints. The third tier relies on egocentric sensors worn by people as they go about everyday tasks, capturing the subtle mechanics of human movement.[2][4]

XDOF utilizes a three-pronged approach to capture human dexterity and translate it into machine-readable training data.

The startup's infrastructure-as-a-service model is already proving highly attractive to the industry's biggest players. Despite operating under the radar for less than two years, XDOF has already secured approximately 20 active enterprise customers, including several leading frontier AI research groups.[1][2][4]

The timing of XDOF's launch aligns perfectly with a broader industry pivot toward physical AI. Just weeks ago, OpenAI announced the revival of its own robotics training program, signaling that the race to build embodied intelligence is accelerating. By standardizing data collection and cleaning, XDOF aims to eliminate the historical lag between rapid robot hardware advancements and the software required to operate them.[1][4]

With $70 million in fresh capital, XDOF plans to rapidly scale its operations. The company will use the funds to hire and train a global workforce of teleoperators and egocentric data gatherers. Additionally, XDOF is developing its own proprietary wearable sensors to ensure that its hand-tracking algorithms perfectly match the mechanical realities of the robots being trained.[1][5]

How we got here

  1. Oct 2024

    XDOF is founded by UC Berkeley alumni Philipp Wu, Fred Shentu, and Nemo Jin.

  2. Early 2026

    OpenAI revives its robotics training program, accelerating the industry's focus on physical AI.

  3. Jun 17, 2026

    XDOF emerges from stealth, announcing its $70M seed round and the open-source ABC-130K dataset.

Viewpoints in depth

Frontier AI Labs

Value outsourcing the massive operational complexity of physical data collection.

For the organizations building the world's most advanced AI models, the physical world represents a frustrating bottleneck. Developing software requires immense compute power, but it does not require leasing hundreds of thousands of square feet of warehouse space or maintaining fleets of delicate robotic hardware. By partnering with infrastructure providers like XDOF, these labs can keep heavy-industrial operations off their balance sheets. This allows them to focus their capital and engineering talent entirely on model architecture, treating physical data as a service they can simply purchase on demand.

Robotics Researchers

Celebrate the release of open-source resources that democratize access to high-fidelity data.

Academic institutions and independent researchers have long struggled to compete with corporate labs due to the sheer cost of generating physical training data. The release of the ABC-130K dataset is viewed as a watershed moment for the community. By open-sourcing 130,000 high-quality manipulation trajectories, XDOF and UC Berkeley are lowering the barrier to entry for robotics research. Researchers argue that this democratization of data will accelerate breakthroughs in embodied intelligence, allowing smaller teams to test new algorithms without needing to build their own teleoperation rigs.

Venture Capitalists

View physical data infrastructure as the next highly defensible layer of the AI ecosystem.

Investors are increasingly wary of funding pure software wrappers that can be easily replicated or rendered obsolete by the next major foundational model update. Physical infrastructure, however, offers a robust competitive moat. Venture capitalists backing XDOF recognize that building a global network of teleoperators, securing warehouse space, and developing proprietary wearable sensors requires significant capital and operational expertise. This hardware-heavy approach creates a durable business model that is difficult for new entrants to quickly disrupt, making it an attractive target for massive seed-stage investments.

What we don't know

  • How quickly XDOF can scale its global workforce of teleoperators to meet the surging demand from frontier AI labs.
  • Whether the proprietary wearable sensors currently in development will significantly outperform existing off-the-shelf hand-tracking technology.
  • Which specific frontier AI labs make up the bulk of XDOF's roughly 20 active enterprise customers.

Key terms

Teleoperation
The remote control of a robot by a human operator, used to generate training data by demonstrating exactly how a task should be performed.
Embodied Intelligence
Artificial intelligence that interacts with the physical world through a robotic body, rather than just processing digital text or images.
Egocentric Data
Information captured from a first-person perspective, often using wearable sensors or cameras, to record how humans naturally move and interact with objects.
Frontier AI Labs
The leading research organizations and companies developing the most advanced, cutting-edge artificial intelligence models.

Frequently asked

What exactly does XDOF do?

XDOF provides the physical data infrastructure needed to train robots. It uses human operators, teleoperation rigs, and wearable sensors to record exactly how physical tasks are done, then supplies that data to AI companies.

Why can't AI companies collect this data themselves?

Collecting high-quality robotics data requires massive warehouses, fleets of robots, constant hardware maintenance, and a large, globally distributed workforce of trained human operators. AI software labs prefer to outsource this heavy operational burden.

What is the ABC-130K dataset?

It is a massive, open-source collection of 130,000 robot manipulation trajectories released by XDOF and UC Berkeley to help researchers worldwide train better general-purpose robots.

Sources

  [1] SiliconANGLE (Robotics Researchers): Robotic teleoperation data startup XDOF launches with $70M in funding
  [2] AI Weekly (Frontier AI Labs): XDOF Lands $70M to Build Robot Training Data Pipelines
  [3] Pulse 2.0 (Robotics Researchers): XDOF Raises $70 Million To Build Infrastructure For Robot Foundation Models
  [4] Hyper AI (Frontier AI Labs): XDOF Raises $70M to Build Data Pipelines for Robot Training
  [5] The SaaS News (Venture Capitalists): XDOF raises $70M from Thrive Capital, a16z, and others to build data infrastructure and annotation systems for robot training
  [6] Axios (Venture Capitalists): Venture Capital Deals: XDOF