"Microsoft Unveils Fara1.5: Revolutionary 4B/9B/27B Browser Agents That Dominate Online Tasks Ahead Of OpenAI & Gemini 2.5"

Microsoft Research’s AI Frontiers lab has introduced Fara1.5, a series of computer-use agent (CUA) models designed for web browsing. The release includes three variants: Fara1.5-4B, Fara1.5-9B, and Fara1.5-27B. The models work with MagenticLite, Microsoft’s secure browser environment built specifically for these agents.

Computer-use agents are visual-action systems that control an actual browser. They analyze screenshots and generate mouse and keyboard inputs to accomplish tasks. Modern agent tools such as OpenAI’s Operator and Google’s Gemini 2.5 Computer Use belong to this category.

Fara1.5-27B achieves a 72% task completion rate on Online-Mind2Web, a benchmark of 300 tasks across 136 well-known websites. In the same test, OpenAI’s Operator reaches 58.3% and Gemini 2.5 Computer Use reaches 57.3%. Yutori’s Navigator n1 attains 64.7%, while Fara1.5-9B scores 63.4%, nearly double the 34.1% that its predecessor Fara-7B achieved on the same benchmark.
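The gaps between these scores are easy to check. The short Python snippet below uses only the figures quoted above (the variable names are ours, not Microsoft’s) to compute each model’s ratio to Fara-7B and the 27B model’s lead over the other agents.

```python
# Online-Mind2Web task completion rates (%) as quoted in this article.
scores = {
    "Fara1.5-27B": 72.0,
    "Yutori Navigator n1": 64.7,
    "Fara1.5-9B": 63.4,
    "OpenAI Operator": 58.3,
    "Gemini 2.5 Computer Use": 57.3,
    "Fara-7B": 34.1,
}

baseline = scores["Fara-7B"]

# How many times the Fara-7B score each model reaches.
ratio_to_fara7b = {name: round(s / baseline, 2) for name, s in scores.items()}

# Percentage-point lead of the 27B model over every other entry.
lead_of_27b = {
    name: round(scores["Fara1.5-27B"] - s, 1)
    for name, s in scores.items()
    if name != "Fara1.5-27B"
}

for name, ratio in ratio_to_fara7b.items():
    print(f"{name}: {ratio}x Fara-7B")
```

On these numbers the 9B model lands at roughly 1.86 times the Fara-7B score, and the 27B model at roughly 2.11 times.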

Architecture and agent loop

The models are built on Qwen3.5 base checkpoints across 4B, 9B, and 27B sizes. They follow an observe-think-act cycle. At each stage, the model receives the previous conversation history along with the latest three browser screenshots. It then produces thoughts and determines the next action.
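To make the loop concrete, here is a minimal Python sketch of an observe-think-act cycle that keeps the full text history but only the latest three screenshots. It is an illustration under our own assumptions, not Microsoft’s implementation: `AgentContext`, `run_episode`, the `policy` callable, the `browser` object, and the `"stop"` action are all hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_SCREENSHOTS = 3  # per the article: only the latest three screenshots are passed in


@dataclass
class AgentContext:
    # Text history of (thought, action) pairs from earlier steps.
    history: list = field(default_factory=list)
    # Bounded buffer: appending a fourth screenshot drops the oldest one.
    screenshots: deque = field(default_factory=lambda: deque(maxlen=MAX_SCREENSHOTS))


def run_episode(policy, browser, max_steps=10):
    """Hypothetical loop: observe a screenshot, think, then act."""
    ctx = AgentContext()
    for _ in range(max_steps):
        ctx.screenshots.append(browser.screenshot())                  # observe
        thought, action = policy(ctx.history, list(ctx.screenshots))  # think
        ctx.history.append((thought, action))
        if action == "stop":                                          # assumed terminal action
            break
        browser.apply(action)                                         # act
    return ctx
```

Any object with `screenshot()` and `apply(action)` methods can stand in for the browser, which makes the loop easy to exercise with a stub before wiring it to a real one.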

The action set covers standard mouse and keyboard operations and browser-specific functions such as web search. It also includes meta-actions for managing context: storing facts for later reference and asking the user for clarification. These meta-actions help the agent handle long tasks and collaborate with users.
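One way to picture the split between browser actions and meta-actions is a small dispatcher in which meta-actions change the agent’s own state instead of touching the page. The sketch below is an assumption-laden illustration: the action names (`click`, `type`, `scroll`, `web_search`, `memorize`, `ask_user`) and the `AgentState` class are hypothetical and are not taken from the Fara1.5 action schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical action names, chosen only for this illustration.
BROWSER_ACTIONS = {"click", "type", "scroll", "web_search"}
META_ACTIONS = {"memorize", "ask_user"}


@dataclass
class AgentState:
    memory: list = field(default_factory=list)  # facts stored for later steps
    pending_question: Optional[str] = None      # question waiting for the user


def dispatch(state, name, argument):
    """Route an action: browser actions go to the page, meta-actions update state."""
    if name in BROWSER_ACTIONS:
        return "browser"  # a real agent would forward this to the browser driver
    if name == "memorize":
        state.memory.append(argument)
        return "meta"
    if name == "ask_user":
        state.pending_question = argument
        return "meta"
    raise ValueError(f"unknown action: {name}")
```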

Training mix

Training involves supervised fine-tuning on approximately two million samples. The dataset consists of 60% web trajectories and 12.8% synthetic environments. Form completion and user interactions make up 12.5%. Grounding accounts for 8.8% and VQA 4.9%. The remaining 1% or so covers GUI drag, instruction following, and safety. Loss computation is restricted to the three most recent turns in each trajectory.
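Two details in this paragraph can be checked or illustrated with a few lines of Python. The first part of the sketch adds up the quoted shares to see what is left for the smaller categories. The second part shows one possible reading of the loss restriction, in which every token outside the last three turns is masked out. The `loss_mask` helper is hypothetical, and it treats each turn as a flat span of tokens; a real pipeline would likely also mask non-model tokens inside those turns.

```python
# Shares of the supervised fine-tuning mix (%) as quoted in the article.
mix = {
    "web trajectories": 60.0,
    "synthetic environments": 12.8,
    "form completion and user interactions": 12.5,
    "grounding": 8.8,
    "VQA": 4.9,
}

# Share left for GUI drag, instruction following, and safety combined.
remainder = round(100.0 - sum(mix.values()), 1)


def loss_mask(turn_lengths, last_k=3):
    """Return a 0/1 mask over tokens: 1 only for tokens in the last `last_k` turns.

    `turn_lengths` holds the number of tokens in each turn, oldest first.
    """
    first_trained = max(0, len(turn_lengths) - last_k)
    mask = []
    for index, length in enumerate(turn_lengths):
        mask.extend([1 if index >= first_trained else 0] * length)
    return mask


print(remainder)
print(loss_mask([2, 1, 3, 1, 2]))
```

For a five-turn trajectory, the first two turns contribute context only, while the final three contribute to the loss.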

FaraGen1.5: The Synthetic Data Pipeline

FaraGen1.5 is the automated system used to generate training data. It includes three main components: environments, solvers, and verifiers.

Environments are divided into two types:
– Open-internet tasks run on regular websites (no login needed).
– Gated-domain tasks require user logins or involve actions that can’t be reversed, like sending an email.

For gated domains, Microsoft created six synthetic clones—called FaraEnvs—that simulate real apps: Mail, Calendar, Stream, ML, Stay, and Scheduler. Each clone has a realistic interface, working API, and pre-loaded user data to mimic actual use.

These clones were built with GitHub Copilot CLI and refined iteratively by humans. Because Microsoft controls every detail of the clones, the correct outcome of each task is known in advance.
– If a task changes the app’s state (e.g., sends a message), the system compares database snapshots taken before and after the task to verify it.
– If a task leaves the state unchanged, the system compares the agent’s result with a pre-set correct answer.
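The two verification modes described above can be sketched in Python as follows. This is a simplified illustration under our own assumptions: snapshots are modeled as plain dictionaries keyed by `(table, row_id)`, and the helper names `diff_snapshot`, `verify_state_change`, and `verify_answer` are hypothetical rather than part of FaraGen1.5.

```python
def diff_snapshot(before, after):
    """Return {key: (old_row, new_row)} for every row added, removed, or changed."""
    changes = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def verify_state_change(before, after, expected_changes):
    """State-changing task: the observed diff must match the expected diff exactly,
    so unintended side effects also count as failures."""
    return diff_snapshot(before, after) == expected_changes


def verify_answer(answer, reference):
    """Read-only task: compare the agent's answer with the pre-set correct answer."""
    return answer.strip().lower() == reference.strip().lower()


before = {("mail", 1): {"status": "draft"}}
after = {("mail", 1): {"status": "sent"}}
expected = {("mail", 1): ({"status": "draft"}, {"status": "sent"})}
print(verify_state_change(before, after, expected))
```

Requiring an exact match on the diff is a deliberately strict choice for this sketch; a looser verifier could check only that the expected changes are present.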

The solver uses OpenAI’s GPT-5.4 with custom tools matching the actions Fara1.5 can take. On the Online-Mind2Web benchmark, this solver scores 83% (up from 67% for the earlier Fara-7B solver). If the solver needs more input, it calls a user simulator.

Before training, trajectories go through three verification checks:
– **Correctness:** Uses AI-generated rules for open tasks and database checks for synthetic data.
– **Efficiency:** Penalizes unnecessary extra steps.
– **User interaction:** Ensures the agent pauses at key moments, like when waiting for a decision.
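As a rough illustration, the three checks could be combined into a single keep-or-drop decision like the one below. Everything here is an assumption made for the example: the `Trajectory` fields, the idea of a reference step count, the ratio-based efficiency score, and the 0.5 threshold are hypothetical and do not come from Microsoft’s description.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    correct: bool               # verdict of the correctness check
    steps_taken: int            # steps the solver actually used
    reference_steps: int        # hypothetical ideal step count for the task
    paused_when_required: bool  # verdict of the user-interaction check


def efficiency_score(t):
    """1.0 when no extra steps were taken, falling toward 0 as extra steps pile up."""
    return min(1.0, t.reference_steps / max(t.steps_taken, 1))


def keep_for_training(t, min_efficiency=0.5):
    """Keep a trajectory only if it passes all three checks."""
    return (
        t.correct
        and t.paused_when_required
        and efficiency_score(t) >= min_efficiency
    )
```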

—

Critical Points and Safety

Fara1.5 automatically pauses and asks for user input in three cases:
1. The task needs personal info that hasn’t been shared.
2. The task description isn’t clear or lacks necessary details.
3. An irreversible action (e.g., deleting data or sending a message) is about to happen without confirmation.
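The three cases above amount to a simple decision rule, sketched here in Python. In Fara1.5 this behavior is learned by the model rather than hard-coded, so the snippet only illustrates the logic; the `Step` fields and the `pause_reason` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    needs_personal_info: bool = False
    info_already_shared: bool = False
    task_is_unclear: bool = False
    irreversible: bool = False
    user_confirmed: bool = False


def pause_reason(step) -> Optional[str]:
    """Return why the agent should pause and ask the user, or None to proceed."""
    if step.needs_personal_info and not step.info_already_shared:
        return "missing_personal_info"
    if step.task_is_unclear:
        return "unclear_task"
    if step.irreversible and not step.user_confirmed:
        return "unconfirmed_irreversible_action"
    return None
```

For example, `pause_reason(Step(irreversible=True))` returns `"unconfirmed_irreversible_action"`, while the same step with `user_confirmed=True` returns `None` and the agent proceeds.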

For safety, Microsoft trains the agent on public safety datasets and internal data aligned with its Responsible AI standards. In MagenticLite, every action is logged for review. The sandboxed browser also keeps the agent separated from the user’s device for added security.

—

Other Benchmark Results

On the WebVoyager test:
– **Fara1.5-27B:** 88.6%
– **Fara1.5-9B:** 86.6%
– **Fara1.5-4B:** 80.8%

The 9B version also beats similar-sized models like MolmoWeb 8B and GUI-Owl-1.5 8B. To ensure consistency, all tests run on Browserbase with multiple independent repeats.

On WebTailBench v1.5 (which tests rare web tasks):
– **Fara1.5-9B:** 64.5% process success, 32.3% outcome success
– **GPT-5.4:** 79.6% process success, 57.4% outcome success

—

Key Takeaways

Here are four quick highlights:

– Microsoft Research introduced Fara1.5: browser agents in 4B, 9B, and 27B versions, built on Qwen3.5.
– Fara1.5-27B achieves 72% on Online-Mind2Web, surpassing OpenAI Operator (58.3%), Gemini 2.5 Computer Use (57.3%), and Yutori Navigator n1 (64.7%).
– The FaraGen1.5 pipeline enables training on gated-domain tasks through six synthetic app clones (FaraEnvs) built with GitHub Copilot CLI.
– Fara1.5 stops to ask the user when personal info is missing, the task is unclear, or an irreversible action is about to happen without confirmation.
