In brief
- Anthropic has reportedly embedded roughly half a dozen engineers at the NSA to deploy its Mythos AI model for offensive cyber operations—potentially including attacks on networks in China and Iran.
- Anthropic also warned that AI is approaching recursive self-improvement and called for a coordinated global pause mechanism.
- Both landed as Anthropic files for an IPO that could value it above $1 trillion.
Anthropic has placed about six engineers inside the National Security Agency to help deploy Mythos—its most capable AI model—for offensive cyber operations, the Financial Times reported Thursday.
The engineers are forward-deployed staff, customizing the model for specific applications. One source told the FT it could be useful for infiltrating networks in countries like China and Iran.
Whether those engineers are involved in active operations isn’t confirmed. What is: Mythos is the same model Anthropic has declined to release publicly, citing misuse risk. The company limited it to vetted partners through Project Glasswing—a restricted coalition that includes Microsoft, Apple, and Amazon.
Anthropic is also suing the Pentagon. In late February, Defense Secretary Pete Hegseth designated the company a supply-chain risk—a label historically reserved for foreign adversaries like Huawei—after a $200 million contract collapsed. The sticking point: Anthropic refused to let the DoD use Claude for fully autonomous weapons or domestic mass surveillance. The NSA contract was exempt from that ban.
A California judge blocked the blacklisting as an apparent First Amendment retaliation. A D.C. appeals court denied Anthropic’s bid to halt it while litigation plays out. The NSA kept using Mythos the whole time, according to the FT’s reporting.
How to stop AI that builds AI
On the same day the NSA story broke, Anthropic’s internal research institute published “When AI Builds Itself,” a look at how far Claude has come at automating its own development. In it, the company argues for essentially a global moratorium in the AI arms race—and even likened it to Cold War-era nuclear treaties struck between the United States and Russia.
To understand why, the company provided this bit of context:
Claude now writes more than 80% of the code merged into Anthropic’s production codebase—up from low single digits before Claude Code launched in early 2025. Engineers ship roughly eight times as much code per day as they did in 2024.
The report’s authors—Anthropic Institute lead Marina Favaro and co-founder Jack Clark—warn that this path is steering toward what they term recursive self-improvement: AI systems capable of independently designing, building, and training their own next-generation versions, with human involvement shrinking at each stage.
In a visual timeline, the researchers illustrate how AI use in the workplace begins with humans prompting a computer to obtain a result, then progresses through increasing levels of automation until AI agents are prompting sub-agents to complete tasks—with no human participation at all.

The most striking data point they reference: In April, Claude agents were given an open AI safety challenge—determining whether a weaker model can reliably oversee a stronger one—and left to tackle it independently. Over roughly a week, two human researchers managed to close 23% of the performance gap between the models. The agents closed 97%, running for over 800 cumulative compute hours. Humans posed the question. The agents planned and executed every experiment themselves. This marks the first published instance of Claude demonstrating genuine research judgment, rather than merely carrying out tasks defined by someone else.
That’s the threshold Anthropic is concerned about crossing. Once
AI doesn’t merely carry out experiments—it decides which ones are worth doing at all. That shift strips humans of the last truly meaningful function in the research pipeline. Subtle flaws in today’s models risk snowballing through successive rounds of self-improvement until they become beyond anyone’s ability to fix them.
Their suggested remedy is a globally enforceable pause—multiple leading labs stopping all frontier work at the same time, with outside observers confirming that every party genuinely halts. Anthropic has committed to taking part. They openly admit that slowing down alone only cedes the advantage to those who carry on racing ahead.
This pattern is familiar. The very labs racing to build more powerful AI are the ones sounding the alarm about its dangers. Yet AI represents the decade’s most lucrative industry, so halting progress isn’t appealing—even for those voicing the loudest concerns.
In 2023, more than a hundred leading AI researchers put their names to an open letter calling on the world to jointly address the extinction-level risks embedded in AI progress. Months earlier, a separate open letter had urged OpenAI to put the brakes on ChatGPT development given its potential hazards.
Nobody heeded those 2023 letters. OpenAI carried on. Anthropic carried on. The Pentagon’s deadline to remove Claude from military systems lands in August—roughly when Anthropic’s IPO is anticipated to open its books to the public.
Daily Debrief Newsletter
Start every day with the top news stories right now, plus original features, a podcast, videos and more.



