Sakana Marlin Debuts With AB-MCTS, Empowering Enterprises To Auto-Generate Comprehensive 100-Page Reports And Slide Decks

This week, Tokyo-based Sakana AI released its first commercial offering, ‘Sakana Marlin.’ The team behind it describes the tool as a Virtual CSO (Chief Strategy Officer)—an autonomous B2B research agent designed specifically for enterprise use.

Unlike a typical chatbot that responds in seconds, Marlin works differently. You provide a single research topic, and it operates independently for as long as eight hours. Once complete, it delivers a comprehensive report along with a set of presentation slides. According to Sakana, each session can involve hundreds or even thousands of queries to large language models.

What is Sakana Marlin

Marlin is not a conversational assistant—it’s an enterprise-grade research agent. Feed it a single topic or question, and it takes over from there: formulating hypotheses, scanning sources, and cross-checking findings on its own. The goal is to condense weeks of strategic analysis into just a few hours.

The output is built for executives and decision-makers. The Japanese press release mentions reports spanning dozens of pages, while the English version references reports reaching up to approximately 100 pages. During a hands-on press session, reports came in between 60 and 100 pages, citing 60 to 80 sources. Every report is structured with a main body, a reference section, and appendices. Presentation slides are automatically created using AI image-generation tools.

Sakana refined Marlin during a closed beta in April 2026, in which roughly 300 professionals used it on real-world assignments covering strategy development, market research, risk assessment, and competitive analysis. Sakana has also formed a partnership with MUFG and secured strategic investment from Citigroup.

Inside AB-MCTS: Wider or Deeper

At the core of Marlin lies AB-MCTS—Adaptive Branching Monte Carlo Tree Search. This technique stems from Sakana’s earlier research paper titled “Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search.”

AB-MCTS frames reasoning as a tree-search challenge. At every step, the algorithm faces a choice: it can go wider by producing a brand-new candidate answer, or it can go deeper by refining an answer that already looks promising. Traditional repeated sampling only goes wider in parallel and then hopes one of the answers turns out to be correct.
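The wider-or-deeper choice can be pictured as a small bandit problem at every node. The toy below is a sketch under stated assumptions, not Sakana's implementation: `toy_generate` is a hypothetical stand-in for an LLM call plus evaluation, and the choice is made by Thompson sampling over simple Beta posteriors, with one arm for "generate a new candidate here" and one arm per existing child for "descend and refine".

```python
import random


class Node:
    """One candidate answer in the search tree (the root holds no answer)."""

    def __init__(self, score, parent=None):
        self.score = score
        self.parent = parent
        self.children = []

    def subtree_scores(self):
        """Scores of this node and everything below it."""
        scores = [self.score]
        for child in self.children:
            scores.extend(child.subtree_scores())
        return scores


def sample_beta(scores, rng):
    """Thompson sample from a Beta posterior built from scores in [0, 1]."""
    alpha = 1.0 + sum(scores)
    beta = 1.0 + sum(1.0 - s for s in scores)
    return rng.betavariate(alpha, beta)


def toy_generate(parent, rng):
    """Hypothetical stand-in for an LLM call: nudges the parent's score."""
    base = 0.3 if parent.score is None else parent.score
    return min(1.0, max(0.0, base + rng.uniform(-0.1, 0.2)))


def step(root, rng):
    """Spend one unit of budget; report whether it went wider or deeper."""
    node, depth = root, 0
    while node.children:
        # "Wider" arm: evidence is the scores of this node's direct children.
        wider = sample_beta([c.score for c in node.children], rng)
        # "Deeper" arms: one per child, evidence is that child's whole subtree.
        sampled = [(sample_beta(c.subtree_scores(), rng), c) for c in node.children]
        best_value, best_child = max(sampled, key=lambda pair: pair[0])
        if wider >= best_value:
            break                      # add a new sibling at this level
        node, depth = best_child, depth + 1
    node.children.append(Node(toy_generate(node, rng), parent=node))
    # A new child of the root is a fresh answer; anything lower is a refinement.
    return "wider" if depth == 0 else "deeper"


def run_search(budget=24, seed=0):
    rng = random.Random(seed)
    root = Node(score=None)
    counts = {"wider": 0, "deeper": 0}
    for _ in range(budget):
        counts[step(root, rng)] += 1
    best = max(root.subtree_scores()[1:])  # skip the root's placeholder score
    return best, counts


best, counts = run_search()
print("best score:", round(best, 3), "| wider/deeper:", counts)
```

Repeated sampling corresponds to always taking the "wider" branch at the root; the sampling step is what lets the budget shift toward refinement once a candidate looks promising.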

A multi-LLM variation introduces another option—it can assign a step to a completely different model. In Sakana’s ARC-AGI-2 benchmark tests, this multi-model collaboration made a measurable difference. Using a combination of o4-mini, Gemini 2.5 Pro, and DeepSeek-R1, the system solved roughly 27.5% of tasks, compared to about 23% with o4-mini alone. Marlin leverages this same adaptive search approach for extended, long-horizon research tasks.
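One way to picture the extra "which model?" decision is as another bandit over models. The sketch below is illustrative only: the model names and success rates in `TOY_SUCCESS_RATE` are invented for the demo and are not benchmark figures, and the selection rule (Thompson sampling over per-model win/loss counts) is an assumption, not a description of Sakana's code.

```python
import random

# Hypothetical per-model success probabilities, invented for this demo only.
TOY_SUCCESS_RATE = {"model_a": 0.20, "model_b": 0.35, "model_c": 0.25}


def choose_model(stats, rng):
    """Draw from each model's Beta posterior and pick the largest draw."""
    draws = {
        name: rng.betavariate(1 + wins, 1 + losses)
        for name, (wins, losses) in stats.items()
    }
    return max(draws, key=draws.get)


def allocate_steps(budget=200, seed=0):
    """Spend the budget one step at a time, learning which model pays off."""
    rng = random.Random(seed)
    stats = {name: (0, 0) for name in TOY_SUCCESS_RATE}
    for _ in range(budget):
        name = choose_model(stats, rng)
        wins, losses = stats[name]
        if rng.random() < TOY_SUCCESS_RATE[name]:   # simulated outcome
            stats[name] = (wins + 1, losses)
        else:
            stats[name] = (wins, losses + 1)
    return stats


for name, (wins, losses) in allocate_steps().items():
    print(f"{name}: {wins + losses} steps, {wins} successes")
```

Models that succeed more often tend to receive more steps over time, while weaker ones are still tried occasionally in case they turn out to suit a particular task.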

The second foundational piece behind Marlin is workflow automation drawn from Sakana’s AI Scientist project, which demonstrated fully autonomous scientific discovery and was published in the journal Nature.

Interactive demo: The embeddable widget (marlin-abmcts-demo.html) visualizes the “wider or deeper” decision-making process in real time. Hit Run and watch the tree expand. Nodes shaded in green represent higher scores, and the optimal path is highlighted. Switch on the “Multi-LLM” toggle to observe how steps get distributed across different models.

[Interactive widget: AB-MCTS “Wider or Deeper?” search. A simplified visual of Sakana AI’s Adaptive Branching Monte Carlo Tree Search, in which the policy chooses at each step to widen (new candidate) or deepen (refine a promising line). The search-state panel tracks budget used (out of 24), number of nodes, best score, and the wider/deeper split. Nodes are colored from low score to high score, the best path is highlighted, and the models shown are Gemini 2.5 Pro, o4-mini, and DeepSeek-R1.]

How Marlin Compares

Marlin is built for depth, not speed. Standard deep-research tools return answers in minutes or tens of minutes. Marlin intentionally invests hours to produce higher-quality output. The competitor run times listed below are approximate and drawn from publicly reported figures, not official specifications.

Tool	Typical run time	Output	Primary user
Sakana Marlin	Up to ~8 hours	Report (dozens to ~100 pages) + slides	Enterprise strategy teams
OpenAI Deep Research	~Minutes to tens of minutes	Cited text report	General and pro users
Perplexity Deep Research	~A few minutes	Cited text answer	General users
Google Gemini Deep Research	~Minutes	Cited text report	General and workspace users

The trade-off is straightforward: you wait longer and pay per session, but in exchange you receive more thorough hypothesis testing and a polished, ready-to-use deliverable. You can stop a run at any point, though credits are consumed regardless.

Pricing

Sakana provides pay-as-you-go access alongside Pro, Team, and Enterprise plans. Pay-as-you-go starts at 100 credits per run, priced at ¥98 per credit. The Pro plan costs ¥150,000 per month and includes 2,000 credits. The Team plan is ¥400,000 per month with 6,000 credits. Enterprise pricing is customized and comes with dedicated support.
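The figures above can be turned into a quick per-run and per-credit comparison. This is a back-of-the-envelope sketch: it assumes every run consumes exactly the 100-credit minimum, whereas the source says only that runs start at 100 credits, so real runs may cost more.

```python
YEN_PER_CREDIT_PAYG = 98          # pay-as-you-go price per credit
MIN_CREDITS_PER_RUN = 100         # assumption: each run uses only the minimum

# Plan name -> (monthly fee in yen, included credits)
PLANS = {"Pro": (150_000, 2_000), "Team": (400_000, 6_000)}


def min_run_cost_payg():
    """Lowest possible pay-as-you-go cost of one run, in yen."""
    return YEN_PER_CREDIT_PAYG * MIN_CREDITS_PER_RUN


def effective_yen_per_credit(plan):
    """Monthly fee divided by included credits."""
    fee, credits = PLANS[plan]
    return fee / credits


def runs_included(plan):
    """Upper bound on monthly runs if each uses only the minimum credits."""
    _, credits = PLANS[plan]
    return credits // MIN_CREDITS_PER_RUN


print(min_run_cost_payg())                            # 9800
print(effective_yen_per_credit("Pro"))                # 75.0
print(round(effective_yen_per_credit("Team"), 2))     # 66.67
print(runs_included("Pro"), runs_included("Team"))    # 20 60
```

On these assumptions a pay-as-you-go run starts at ¥9,800, and the subscription plans lower the effective credit price to ¥75 (Pro) and about ¥66.67 (Team).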

Use Cases, With Examples

Marlin is best suited for high-stakes questions where research is the main bottleneck. Below are concrete examples aligned with its intended use cases.

Market entry: “Evaluate Japan’s stablecoin and tokenized-payments market following recent regulatory changes.” Marlin identifies key drivers, risks, and structured strategic options in a detailed report.
Risk analysis: “Model potential resolution scenarios for a Strait of Hormuz blockade.” Rather than simply summarizing information, it weighs competing hypotheses before arriving at conclusions.
Competitive analysis: “Profile three competitors and rank our positioning gaps.” It produces presentation slides ready for a strategy review meeting.

Each of these scenarios fits within a single prompt and a single unattended run. A human should still review the cited output before making any decisions based on it.

Try the Engine Yourself: TreeQuest

Marlin itself is not available for self-hosting, but you can experiment with its core algorithm right now. Sakana has open-sourced AB-MCTS as TreeQuest under the Apache 2.0 license. Install it, define a generate function, then run the search for a fixed budget.

import random
import treequest as tq

# Each node contains a state you define; score must be between 0 and 1.
def generate(parent_state):
    if parent_state is None:               # None means start from the root node
        new_state = "First draft"
    else:
        new_state = f"Improved version of: {parent_state}"
    score = random.random()                # replace this with an LLM-based evaluation
    return new_state, score

algo = tq.ABMCTSA()                         # Adaptive Branching MCTS (version A)
search_tree = algo.init_tree()

for _ in range(10):                         # allocate 10 generation cycles
    search_tree = algo.step(search_tree, {"generate": generate})

best_state, best_score = tq.top_k(search_tree, algo, k=1)[0]
print("TOP RESULT:", best_state, round(best_score, 3))

Replace the random score with an LLM evaluator to match the real-world usage pattern. TreeQuest also provides multi-LLM search capabilities and checkpoint support for extended sessions. Checkpoint support is important since prolonged runs may encounter API failures during execution.
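Because a long run can hit transient API failures, it may help to make the generate function itself resilient. The sketch below is not TreeQuest's checkpoint feature and uses no TreeQuest calls: `TransientAPIError`, `make_flaky_scorer`, and `with_retries` are hypothetical helpers written for illustration. It shows a generate function with the same `(state, score)` return shape as the listing above, with retries around the evaluator and the score clamped to the 0 to 1 range.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical error type standing in for a failed LLM API call."""


def with_retries(fn, max_attempts=4, base_delay=0.01):
    """Wrap fn so transient failures are retried with exponential backoff."""
    def wrapped(*args, **kwargs):
        for attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except TransientAPIError:
                if attempt == max_attempts - 1:
                    raise              # out of attempts: surface the error
                time.sleep(base_delay * 2 ** attempt)
    return wrapped


def make_flaky_scorer(failures_before_success, rng):
    """Hypothetical evaluator: fails a fixed number of times, then answers."""
    remaining = {"n": failures_before_success}

    def score_text(text):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TransientAPIError("simulated timeout")
        return rng.uniform(-0.2, 1.2)  # deliberately out of range sometimes

    return score_text


def make_generate(score_text):
    """Build a generate function that returns (state, score in [0, 1])."""
    safe_score = with_retries(score_text)

    def generate(parent_state):
        if parent_state is None:       # None means start from the root node
            new_state = "First draft"
        else:
            new_state = f"Improved version of: {parent_state}"
        raw = safe_score(new_state)
        return new_state, min(1.0, max(0.0, raw))

    return generate


generate = make_generate(make_flaky_scorer(2, random.Random(0)))
state, score = generate(None)
print(state, round(score, 3))
```

Retries cover short outages inside a single step; saving the search tree between steps is still what protects hours of accumulated work.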

Strengths and Weaknesses

Strengths

Research backed by peer review: AB-MCTS was featured at NeurIPS, and AI Scientist was published in Nature.
Complete outputs, including references, supplementary materials, and presentation slides.
Smart resource allocation focuses compute power on the most promising paths.
The open-source core (TreeQuest) allows AI researchers to explore the methodology.

Weaknesses

Extended processing times make rapid iteration slower compared to research tools that operate in minutes.
Automated reports may include subtle errors that require manual verification.
Pricing and features are designed for enterprise clients, not solo developers.
Marlin as a product is proprietary; only the core algorithm is publicly available.

Key Takeaways

Sakana Marlin handles autonomous research tasks lasting up to roughly eight hours each.
A single run generates a report spanning dozens of pages, along with presentation slides.
It builds on AB-MCTS (NeurIPS 2025 Spotlight) and AI Scientist workflows (Nature).
Pricing starts with a pay-per-use model: 100 credits per run at ¥98 per credit.
It serves finance departments, corporate strategy teams, consulting firms, and think-tank organizations.

Sources

Sakana AI — Sakana Marlin release:
Sakana AI — Sakana Marlin product page:
Sakana AI — AB-MCTS research and TreeQuest:
SakanaAI/treequest (GitHub, Apache 2.0):
