AGIBOT Launches World Challenge 2026: Testing AI Models On Real-World Tasks

Participants in the challenge tested and debugged robots working on different tasks. | Source: AGIBOT

AGIBOT Innovation Technology Co. recently hosted the AGIBOT World Challenge 2026 alongside ICRA 2026 in Vienna. The event gathered 526 research and enterprise teams from 27 countries to compete in two embodied AI categories: “Reasoning to Action” and “World Model.”

Shanghai-based AGIBOT noted that the competition underscored a significant shift in how embodied AI is assessed. The company explained that the industry is moving past simulation-based scores toward closed-loop testing on actual robots, real-world tasks, and standardized benchmarks.

The competition used a benchmark-driven approach that merged online automated evaluation with an offline real-robot final in Vienna. Leveraging AGIBOT’s EWMBench and Genie Sim Benchmark, the unified framework allowed for automated testing, standardized metrics, and reproducible outcomes.

During the offline final, finalist teams performed tasks using the AGIBOT G2 humanoid robot. By integrating real-robot validation into the evaluation process, the competition emphasized robot stability, real-world adaptability, and long-horizon task reliability in its scoring system. The company, also known as Zhiyuan Robotics Co., stated that this approach more closely aligns technical evaluation with real-world deployment requirements.

The challenge attracted research and industry teams from prominent institutions and companies, including the Chinese Academy of Sciences, Tsinghua University, the University of Science and Technology of China, the University of California San Diego, Russia’s Sber Robotics Center, Alibaba, Amap, and vivo. Over 100 teams exceeded the official baseline.

What distinguishes the R2A and WM tracks?

The two tracks at the AGIBOT World Challenge 2026 mirrored the broader progression of embodied AI from task execution toward understanding, prediction, and decision-making, according to AGIBOT.

The Reasoning to Action (R2A) track assessed how robots interpret tasks, plan actions, and carry them out in physical environments. An evolution of the 2025 Manipulation track, it broadened the evaluation from action execution alone to the complete process of environment understanding, task planning, and physical execution.

The World Model (WM) track concentrated on how AI systems forecast physical-world changes and model interactions based on robot actions and sensor inputs.

Teams developed reasoning-and-manipulation models using the AGIBOT WORLD open-source dataset and assessed them through Genie Sim 3.0, with the benchmark encompassing language understanding, spatial reasoning, atomic skills, disturbance adaptation, and zero-shot transfer.

In the final R2A standings, PrismBot from vivo claimed the championship with 43.47 points, followed by Shanghai RoboParty’s RP-VLA with 35.66 points and Russia’s GreenVLA with 33.19 points.

AGIBOT focuses on supermarket tasks with the challenge

Alongside the competition, AGIBOT and Dexmal introduced a supermarket benchmark track centered on end-to-end decision-making and whole-body control. This track included non-ideal physical interactions, such as object drops and grasping failures, to better represent the complexity of real-world interaction and offer a more practical evaluation framework for world model research.

Set in a realistic retail environment, the track required models to complete the full mobile manipulation process, from autonomous navigation and item picking to item transport and placement, under physical constraints like shelf height limits and randomized item placement. Through API-based remote control, participants’ algorithms directly operated real robots, establishing a practical benchmark for assessing embodied intelligence in deployment-focused scenarios.

In the World Model (WM) track, NeoVerse-ABot, a joint team from the Institute of Automation of the Chinese Academy of Sciences and Amap CV Lab, secured first place. The PAI@IAII team from the Institute of Industrial Artificial Intelligence at the Chinese Academy of Sciences ranked second. The Loop team from the University of Science and Technology of China placed third.

With the World Challenge, AGIBOT hoped to contribute to a more practical and reproducible evaluation framework for embodied AI. | Source: AGIBOT

AGIBOT unveils full-stack toolchain for robot validation

Beyond the competition itself, AGIBOT released a full-stack toolchain covering real-world data, simulation evaluation, and real-robot testing. The toolchain included the AGIBOT WORLD open-source dataset, Genie Sim 3.0, and the AGIBOT G2 robot platform, assisting developers in validating models across the journey from training to simulation and physical deployment.

EWMBench and Genie Sim Benchmark provided standardized metrics, automated evaluation, and comparable results across simulation and physical testing. They tackled common challenges such as inconsistent evaluation criteria and the disparity between simulated performance and real-world deployment.

AGIBOT stated that it will merge the technical and ecosystem resources developed through the competition with its ongoing benchmark development and open-source initiatives. The company also plans to launch an online simulation leaderboard, introduce additional test tasks and diversified benchmarks, and support more comprehensive quantitative evaluation of model capabilities.

Furthermore, AGIBOT stated it will continue to refine its benchmarks and full-stack toolchain, collaborating with global research institutions, developers, and industry partners. Its stated objective is to help embodied AI transition from individual algorithmic breakthroughs toward systems that can be deployed and scaled in real-world environments.

In other benchmark news, Fraunhofer IPA recently introduced a new test benchmark for humanoid robots, and NIST proposed its own baseline performance benchmark for humanoids.
