At a Glance
- Researchers from Nvidia, Carnegie Mellon, and UC Berkeley have introduced ENPIRE, a system that lets AI coding agents run the entire process of training robots on new tasks, with no human oversight required after a one-time setup.
- AI agents powered by Codex, Claude Code, and Kimi Code guided a team of eight robots to achieve a 99% success rate on complex tasks such as pin insertion, GPU seating, and zip-tie cutting.
- Expanding from a single robot to eight cut the time to master a task by more than half, though token costs grew even more steeply than the time savings.
Over recent weeks, eight dual-arm robot stations at Nvidia’s GEAR Lab autonomously learned how to insert pins, seat graphics cards, and cut zip ties. The only humans who got involved were those who later authored the research paper.
This breakthrough was made possible by ENPIRE, a framework outlined in a paper released on Tuesday by a team from Nvidia, Carnegie Mellon University, and UC Berkeley. ENPIRE gives the entire robot-training workflow over to AI coding agents—the same software capable of writing and testing its own code—and lets them execute that process directly on real hardware.
For the past year, AI coding agents like OpenAI’s Codex, Anthropic’s Claude Code, and Moonshot’s Kimi Code have been engaged in what researchers describe as “self-directed experimentation”: writing programs, running tests, and refining their work without human input. So far, this cycle has largely been confined to software simulations, where a failed attempt can be reset at virtually no cost. ENPIRE extends it into the physical world, where recovering from a failed trial means physically repositioning a real robot arm.
How ENPIRE Works
The system operates in two distinct phases. During the initial phase, a human guides the agent through constructing two reusable components: a reset procedure that restores the workspace to a standard starting state, and a reward mechanism that analyzes camera feeds to judge whether each attempt succeeded, essentially a tireless automated judge. This setup is completed once, then reused for every subsequent trial.
With those components in place, the agent operates independently. It reviews published literature to gather insights, selects among training strategies such as imitation learning, reinforcement learning, or hand-crafted rules, then modifies its own code and evaluates outcomes on the physical robot. Throughout this loop, no human supervision is necessary—which is either an exciting efficiency gain or a mildly unnerving thought if you’re imagining a robot working with scissors completely on its own.
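The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not ENPIRE’s code: the function name `evaluate_strategies`, the `ToyStation` class, the 2 mm tolerance, and every error value are hypothetical stand-ins, and a real system would drive actual hardware and score attempts from camera images.

```python
from typing import Callable, Dict, List


def evaluate_strategies(
    reset: Callable[[], None],
    attempt: Callable[[str], float],
    reward: Callable[[float], bool],
    strategies: List[str],
    trials: int = 6,
) -> Dict[str, float]:
    """Run each strategy `trials` times and return its success rate."""
    rates: Dict[str, float] = {}
    for strategy in strategies:
        successes = 0
        for _ in range(trials):
            reset()                      # restore the standard starting state
            outcome = attempt(strategy)  # run one trial on the (toy) robot
            if reward(outcome):          # automated judge replaces a human
                successes += 1
        rates[strategy] = successes / trials
    return rates


class ToyStation:
    """Hypothetical stand-in for one robot station (all numbers are made up)."""

    BASE_ERROR_MM = {
        "imitation_learning": 1.2,
        "reinforcement_learning": 0.5,
        "hand_crafted_rules": 2.0,
    }

    def __init__(self) -> None:
        self.reset_count = 0
        self.error_mm = 0.0

    def reset(self) -> None:
        self.reset_count += 1
        self.error_mm = 10.0  # block starts far from the target

    def attempt(self, strategy: str) -> float:
        # Deterministic "wobble" so repeated trials differ without randomness.
        wobble = 0.5 * (self.reset_count % 3)
        self.error_mm = self.BASE_ERROR_MM[strategy] + wobble
        return self.error_mm


def camera_reward(error_mm: float) -> bool:
    """Toy judge: success if the final error is within an assumed 2 mm."""
    return error_mm <= 2.0


station = ToyStation()
rates = evaluate_strategies(
    station.reset, station.attempt, camera_reward, list(ToyStation.BASE_ERROR_MM)
)
best = max(rates, key=rates.get)
print(rates, best)
```

The point of the structure is that `reset` and `reward` are built once and passed in, so the agent can swap strategies freely without a person resetting the table or grading the result. The toy numbers say nothing about which training method actually performed best in the paper.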
Nvidia conducted the experiment across eight dual-arm robot workstations, each equipped with its own dedicated hardware, computer, and coding agent. These stations share their progress using Git—the same version-control system developers rely on to merge code—so any successful approach discovered by one robot can spread to the entire fleet in just minutes.
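As a rough picture of what sharing progress over Git could look like, the sketch below publishes one station’s change so the others can pull it. The article does not describe the actual branch layout or commit conventions, so the function names, the `origin` remote, the `main` branch, and the commit message are all assumptions; only the Git subcommands themselves are standard.

```python
import subprocess
from typing import Callable, List, Optional


def build_share_commands(
    message: str, remote: str = "origin", branch: str = "main"
) -> List[List[str]]:
    """Return the Git commands one station would run to publish a change."""
    return [
        ["git", "add", "--all"],                     # stage the modified code
        ["git", "commit", "-m", message],            # record the improvement
        ["git", "pull", "--rebase", remote, branch],  # pick up other stations' work
        ["git", "push", remote, branch],             # make it visible to the fleet
    ]


def share_improvement(
    message: str,
    run: Optional[Callable[[List[str]], object]] = None,
    remote: str = "origin",
    branch: str = "main",
) -> None:
    """Run the publish sequence; `run` can be swapped out for a dry run."""
    if run is None:
        # Default: execute for real and raise if any Git command fails.
        run = lambda cmd: subprocess.run(cmd, check=True)
    for cmd in build_share_commands(message, remote, branch):
        run(cmd)


# Dry run: print the commands instead of executing them.
share_improvement(
    "station-3: tweak push controller gains",
    run=lambda cmd: print(" ".join(cmd)),
)
```

Other stations would then pick up the change with an ordinary `git pull`, which is how a fix found on one robot could reach the rest of the fleet quickly.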
The team measured the effect of scaling on two tasks: “Push-T,” where a robot must slide a T-shaped block into a designated area using only pushing motions, and pin insertion, which involves threading pins into tiny 4-millimeter holes. By scaling from a single robot to eight, the time needed to master Push-T dropped from about five hours to about two, while pin insertion went from over 90 minutes down to roughly 40.
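A quick back-of-the-envelope calculation puts those timings in perspective. The inputs below are the rounded figures quoted above (five hours to two, 90 minutes to 40), so the outputs are approximations; the function name `scaling_summary` is just for illustration.

```python
from typing import Dict


def scaling_summary(t_single: float, t_fleet: float, n_robots: int) -> Dict[str, float]:
    """Summarize how well a task's wall-clock time scaled with fleet size."""
    speedup = t_single / t_fleet
    return {
        "speedup": speedup,
        "time_saved_pct": 100 * (1 - t_fleet / t_single),
        # 100% would mean eight robots finish eight times faster.
        "parallel_efficiency_pct": 100 * speedup / n_robots,
        # Total robot-time used by the fleet relative to the single robot.
        "robot_time_ratio": n_robots * t_fleet / t_single,
    }


push_t = scaling_summary(t_single=5 * 60, t_fleet=2 * 60, n_robots=8)  # minutes
pin_insertion = scaling_summary(t_single=90, t_fleet=40, n_robots=8)   # minutes

for name, stats in [("Push-T", push_t), ("Pin insertion", pin_insertion)]:
    print(name, {k: round(v, 2) for k, v in stats.items()})
```

With these rounded inputs, eight robots give a speedup of roughly 2.5x on Push-T and 2.25x on pin insertion, while the fleet spends more than three times the total robot-time of a single station. That is consistent with the article’s point that costs rise faster than the time savings, although the article does not give the token figures themselves.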

According to the research paper, the agents achieved a 99% success rate across all four real-world tasks tested. For pin insertion specifically, the agents reached near-perfect reliability faster than a human-in-the-loop approach—the kind that still requires a person to be present and actively involved.
Jim Fan, co-lead of Nvidia’s GEAR Lab and head of the company’s AI research, described the project as the first-ever implementation of AutoResearch in the physical world. Fan explained that the team provided the agents with a fleet of robots, a GPU allocation, and a token budget, then stepped aside and let the agents take charge.
Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy… pic.twitter.com/zC0OQNzDBs
— Jim Fan (@DrJimFan) June 16, 2026
The difference between simulation and reality became apparent almost right away. While all three coding agents successfully solved the Push-T task in a simulator, two out of the three struggled when the same task was transferred to an actual physical robot, the paper reports.
Simulators only approximate friction. Real tables supply the real thing.
Nvidia also tested ENPIRE inside RoboCasa, a simulated kitchen environment that evaluates robots on household chores like opening cabinets or turning off stoves, measuring success rates without any actual risk of setting anything on fire. In that benchmark, ENPIRE outperformed both Nvidia’s own end-to-end model GR00T and CaP-X, a tool-using agent that bypasses the AutoResearch loop entirely.
ENPIRE builds on a concept Nvidia first introduced with Eureka, a 2023 system that used a language model to automatically generate reward functions for robots in simulation, eliminating the need for human engineers to write them manually. ENPIRE takes that self-improvement cycle out of the simulator and runs it on real hardware, with the agent now designing its own experiments rather than just its own reward functions.
The announcement comes the same week Alibaba launched its own embodied-AI initiative, the Qwen-Robot Suite—a set of three foundation models for robot navigation, manipulation, and physics simulation. Alibaba is developing software brains for robot bodies it doesn’t build itself, while Nvidia is exploring whether AI agents can manage the entire research process on hardware it fully controls from start to finish. Both efforts signal the same broader shift: physical robots are emerging as the next frontier where coding agents will prove their worth.