The Genie Envisioner 2.0 world simulator uses video data to help control robots. Source: AGIBOT
AGIBOT today announced the release of Genie Envisioner 2.0, or GE 2-Sim, which it said marked a significant step forward in the evolution of world models, from world action models to fully interactive “world simulators.”
The new system introduces what the company described as a “physical evolution engine” for embodied AI. It is a model-based environment where robots can be trained, evaluated, and optimized at scale, without relying solely on costly real-world trial and error.
From understanding the world to learning within it
In 2025, AGIBOT released what it claimed was the industry’s first action-driven world model, Genie Envisioner. The open-source platform enabled robots to understand the world through integrated modeling of vision, language, and action, said the Shanghai-based company.
With Genie Envisioner 2.0, AGIBOT said it has shifted the paradigm further, from enabling robots to understand the world to enabling them to learn within a world generated by models.
The company asserted that this transition reflects a broader shift in embodied AI, from representing the world to simulating the world itself. As world models evolve into stable, high-fidelity environments that respond to actions in physically consistent ways, they unlock the ability to train robots at scale in synthetic environments.
AGIBOT said it believes GE 2-Sim marks a critical inflection point toward achieving a true scaling law in embodied intelligence.

World action models can show state evolution. Source: AGIBOT
From world action models to world simulators
At the core of this evolution is AGIBOT’s continued development of the world action model (WAM) framework, which extends traditional world models by explicitly incorporating actions as a first-class variable.
Rather than modeling only state, WAM captures the full loop of:
- State → Action → State Evolution
This allows world models to serve as a foundational layer for both policy learning and action generation. Building on this foundation, AGIBOT has progressively developed a series of systems:
- EnerVerse: Extends embodied environments into a computable 4D world model
- Genie Envisioner Act (GE-Act): Bridges world representation and action trajectory generation
- Act2Goal: Enables long-horizon, goal-driven control
While these advances allowed world models to support policy learning, real-world deployment exposed key limitations: high reliance on physical environments, costly evaluation, and data scalability constraints.
This led to a fundamental realization. The next breakthrough lies not in stronger representation, but in transforming world models into fully functional simulators.
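To make the State → Action → State Evolution loop concrete, here is a minimal Python sketch. It is an illustration only, not AGIBOT's implementation or API: `ToyWorldActionModel`, `rollout`, and `toward_target` are hypothetical names, and simple kinematics stand in for what would be a learned network in a real world action model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, float]  # hypothetical state: (position, velocity)
Action = float               # hypothetical action: commanded acceleration


@dataclass
class ToyWorldActionModel:
    """Hypothetical stand-in for a learned world action model."""
    dt: float = 0.1  # simulated time step in seconds

    def predict_next(self, state: State, action: Action) -> State:
        # A real WAM would be a learned predictor; here the next state
        # is computed from basic kinematics so the loop stays runnable.
        position, velocity = state
        new_velocity = velocity + action * self.dt
        new_position = position + new_velocity * self.dt
        return (new_position, new_velocity)


def rollout(
    model: ToyWorldActionModel,
    policy: Callable[[State], Action],
    initial_state: State,
    steps: int,
) -> List[State]:
    """Run the State -> Action -> State Evolution loop for `steps` iterations."""
    trajectory = [initial_state]
    state = initial_state
    for _ in range(steps):
        action = policy(state)                     # State -> Action
        state = model.predict_next(state, action)  # Action -> State Evolution
        trajectory.append(state)
    return trajectory


def toward_target(state: State, target: float = 1.0) -> Action:
    """Hypothetical policy: accelerate toward the target, damped by velocity."""
    position, velocity = state
    return 2.0 * (target - position) - 1.5 * velocity


trajectory = rollout(ToyWorldActionModel(), toward_target, (0.0, 0.0), steps=50)
print(f"final position: {trajectory[-1][0]:.3f}")
```

The point of the sketch is that the action is an explicit input to the prediction step, so the same model can be queried by any policy rather than only replaying recorded futures.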
Making the world runnable: Toward interactive simulation
To enable this transition, AGIBOT introduces a set of new capabilities that push world models toward interactive simulation:
- EnerVerse-AC: Introduces action-conditioned world modeling for future prediction
- Genie Envisioner Sim (GE-Sim): A neural simulator for closed-loop policy evaluation
- EWMBench: A comprehensive benchmark evaluating simulation fidelity, action correctness, and semantic alignment
At the same time, AGIBOT establishes a new data and training paradigm:
- Real2Edit2Real: Real-world data becomes editable and extensible, significantly increasing scale and diversity
- Fidelity-Aware Data Composition: Combines real and generated data to balance realism and generalization
Together, these developments transform world models from representation systems into environment-level infrastructure.
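As a rough illustration of what closed-loop policy evaluation inside a simulator involves, the Python sketch below scores two policies entirely in simulation. Everything in it is an assumption made for illustration: `simulated_step` is a hypothetical one-dimensional stand-in for a learned neural simulator, and `evaluate_policy`, `greedy_policy`, and `idle_policy` are not part of GE-Sim or EWMBench.

```python
import random
from typing import Callable, Dict

Policy = Callable[[int, int], int]  # (position, goal) -> action


def simulated_step(position: int, action: int) -> int:
    """Hypothetical simulator: move along a track of cells 0..10."""
    return max(0, min(10, position + action))


def evaluate_policy(
    policy: Policy,
    episodes: int = 100,
    horizon: int = 20,
    goal: int = 10,
    seed: int = 0,
) -> Dict[str, float]:
    """Score a policy using only simulated rollouts, with no real robot."""
    rng = random.Random(seed)  # seeded so the evaluation is repeatable
    successes = 0
    total_steps = 0
    for _ in range(episodes):
        position = rng.randint(0, 5)  # random start in the lower half
        steps_used = horizon
        for t in range(horizon):
            action = policy(position, goal)
            # Closed loop: the policy's next input comes from the simulator.
            position = simulated_step(position, action)
            if position == goal:
                successes += 1
                steps_used = t + 1
                break
        total_steps += steps_used
    return {
        "success_rate": successes / episodes,
        "mean_steps": total_steps / episodes,
    }


def greedy_policy(position: int, goal: int) -> int:
    """Hypothetical policy: step toward the goal until it is reached."""
    return 1 if position < goal else 0


def idle_policy(position: int, goal: int) -> int:
    """Hypothetical baseline that never moves."""
    return 0


greedy_report = evaluate_policy(greedy_policy)
idle_report = evaluate_policy(idle_policy)
print(greedy_report, idle_report)
```

Because every observation comes from the simulator rather than a recorded dataset, the comparison between policies is only as trustworthy as the simulator itself, which is why a benchmark for simulation fidelity sits alongside the evaluation tooling.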

A world simulator can make simulation more interactive and productive. Source: AGIBOT
Genie Envisioner 2.0: A ‘physical evolution engine’
Genie Envisioner 2.0 represents the culmination of this evolution: a system that is no longer just generative, but operational. Key capabilities include:
Action-driven world dynamics
The system responds directly to robot actions, producing high-fidelity environmental changes that follow physical and semantic constraints. The world becomes a process shaped by interaction, rather than a static representation.
Long-horizon temporal modeling
Supports minute-level continuous simulation, enabling stable generation of complete task sequences rather than fragmented clips.
Embodied spatial consistency
Unifies multi-view perception, cross-view 3D consistency, and robot proprioception into a single representation, transforming perception from images into a fully interactive embodied world.
Built-in evaluation and reward modeling
A native general reward model enables self-evaluation and optimization based on textual feedback, supporting reinforcement learning in the world model without human-designed rewards.
Toward real-time interaction
With improved inference efficiency, GE 2-Sim approaches real-time operation, enabling:
- Eval in World Model
- RL in World Model
- Teleoperation in World Model
This marks the transition of world models from offline tools to interactive system environments.

The core simulation engine can provide data to feed AI. Source: AGIBOT
A paradigm shift: When models become worlds
As these capabilities converge, embodied AI is undergoing a fundamental transformation, from “using models to understand the world” to “learning and making decisions within model-generated worlds.”
On one side, the integration of WAM and vision-language-action (VLA) models enables a shift from reactive control to generative, predictive decision-making.
On the other, world simulators allow robots to explore, iterate, and optimize at scale, limited no longer by real-world data availability but by the fidelity of the simulation itself.
When these two trajectories converge, robots move beyond replicating human demonstrations to continuously exploring, adapting, and evolving within model-generated environments.
Toward a new foundation for embodied intelligence
AGIBOT envisions world models evolving from tools for understanding, to platforms for learning, and ultimately to infrastructure that drives continuous evolution.
When models become worlds, reality is no longer the only training ground. When worlds can be built, learning can be scaled. And when evolution happens within models, the boundaries of embodied AI can be fundamentally redefined.
Editor’s note: At the 2026 Robotics Summit & Expo on May 27 and 28 in Boston, there will be sessions on embodied and physical AI. Registration is now open.

The post AGIBOT unveils Genie Envisioner 2.0 to advance world models into scalable simulators for embodied AI appeared first on The Robot Report.



