Last week, Anthropic introduced Project Glasswing, an AI model so effective at finding software vulnerabilities that they took the extraordinary step of suspending its public release. Instead, the company has given access to Apple, Microsoft, Google, Amazon, and a coalition of others to find and patch bugs before adversaries can.
Mythos Preview, the model that led to Project Glasswing, found vulnerabilities across every major operating system and browser. Some of these bugs had survived decades of human audits, aggressive fuzzing, and open-source scrutiny. One had been sitting for 27 years in OpenBSD, often considered one of the world's most secure operating systems.
It's tempting to file this under "AI lab says their AI is too dangerous," the same playbook OpenAI ran with GPT-2.
Not so fast; there's a material difference this time.
Mythos didn't just find individual CVEs.
- It chained four independent bugs into an exploit sequence that bypassed both the browser renderer and the OS sandboxing
- It performed local privilege escalation in Linux via race conditions
- It built a 20-gadget ROP chain targeting FreeBSD's NFS server, distributed across packets
Claude Opus 4.6, Anthropic's previous frontier model, failed almost entirely at autonomous exploit development. Mythos hit a 72.4% success rate in the Firefox JS shell.
This isn't theoretical, nor some new three-to-five-year prediction. This is about to be a real-world engineering reality.
Why Project Glasswing Exposes the Real Cybersecurity Gap
Here's the number that should keep security leaders awake at night: fewer than 1% of the vulnerabilities found by Mythos have been patched.
Let that sink in for a moment.
The most powerful vulnerability discovery engine ever built ran against the world's most critical software, and the ecosystem couldn't absorb the output.
Glasswing solved the finding problem.
Nobody solved the fixing problem.
Why Defenders Can't Keep Up: Calendar Speed vs. Machine Speed
This is the structural issue the cybersecurity industry has been circling for years. AI just made it impossible to ignore.
Defenders operate on calendar speed. They:
- Gather intelligence
- Build a campaign
- Simulate the threats
- Mitigate
- Repeat
That cycle takes about four days. Attackers, especially those now leveraging LLMs at every stage of their operation, are moving at machine speed.
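To put the gap in concrete terms, here is a back-of-the-envelope comparison between the roughly four-day defender cycle above and the roughly three-minute automated cycle this article cites later. The numbers are the article's own figures, not measurements:

```python
# Back-of-the-envelope gap between calendar speed and machine speed.
defender_cycle_min = 4 * 24 * 60   # ~4 days, expressed in minutes
machine_cycle_min = 3              # one autonomous validation run
speedup = defender_cycle_min / machine_cycle_min
print(f"An autonomous cycle runs ~{speedup:.0f}x faster")  # ~1920x
```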
For an up-to-the-minute take, David B. Cross, CISO at Atlassian, will be speaking at the Autonomous Validation Summit on May 12 about what this looks like from the inside, why periodic testing can’t keep pace with adversaries that operate autonomously, and what defenders should be doing instead.
AI-Powered Attacks Are Already Autonomous
Earlier this year, a threat actor deployed a custom MCP server hosting an LLM as part of their attack chain against FortiGate appliances.
The AI handled everything:
- Automated backdoor creation
- Internal infrastructure mapping fed directly to the model
- Autonomous vulnerability assessment, and
- AI-prioritized execution of offensive tools for domain admin access.
The result? 2,516 organizations across 106 countries were compromised in parallel. The entire chain, from initial access through credential dumping to data exfiltration, was autonomous. The only human involvement was reviewing the results afterward.
AI-based Vulnerability Discovery Is Outpacing Remediation
The gap between attacker speed and defender speed isn’t new.
What’s new is that a small but worrisome gap just became a canyon.
- Autonomous systems like AISLE discovered 13 out of 14 OpenSSL CVEs in recent coordinated releases, bugs that had survived years of human review.
- XBOW became the top-ranked hacker on HackerOne in 2025, surpassing all human participants.
- The median time from disclosure to weaponized exploit dropped from 771 days in 2018 to single-digit hours by 2024.
- By 2025, the majority of exploits were weaponized before being publicly disclosed.
Now add Mythos-class discovery to this picture.
You don’t get a safer world automatically. You get a tsunami of legitimate findings that still require human verification, organizational process, business continuity considerations, and patch cycles that haven’t fundamentally changed in a decade.
How to Build a Mythos-Ready Security Program
The instinct after Glasswing is to ask: “How do we find more bugs?”
That’s actually the wrong question.
The right one is: “When thousands of exploitable vulnerabilities land on your desk tomorrow morning, can your program actually process them?”
For most organizations, the honest answer is no. And the reason isn’t a lack of tools or talent; it’s a structural dependency on periodic, human-initiated processes that were designed for a world where vulnerabilities trickled in, not one where they arrived in a tsunami.
We can’t fix every vulnerability. We can’t apply every hardening option.
That’s not defeatism, that’s the pragmatic starting point for any security program that actually works. The question that matters isn’t “is this CVE critical?” but “is this vulnerability exploitable in my environment, right now, given what I have deployed?”
A Mythos-ready security program needs three fundamental pieces.
First: Signal-Driven Validation Over Scheduled Testing
When a new threat emerges, when an asset changes, or when a configuration drifts, defenses need to be tested against that specific change in that moment. Not during the next quarterly pentest. Not when someone can find an open calendar slot.
The entire concept of “scheduled validation” assumes a stable threat landscape, and today, that assumption is dead on arrival.
Second: Environment-Specific Context Over Generic CVSS Scores
Glasswing will produce an avalanche of CVEs.
Yet most vulnerability management programs are still prioritized by CVSS scores. This context-free metric tells you how bad a bug could be in theory, not whether it’s exploitable in your specific infrastructure, given your controls and business risk.
When the volume of findings suddenly goes from hundreds to thousands, context-free prioritization won’t just slow you down; it’ll break your process entirely.
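As a minimal sketch of the difference between CVSS-only and environment-aware prioritization, the following Python fragment ranks findings by exploitability in a specific environment. The field names and weighting are hypothetical illustrations, not any product's actual scoring model:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str
    cvss: float              # generic severity, context-free
    asset_exposed: bool      # is the affected asset reachable by attackers?
    control_blocks_it: bool  # does a deployed control stop the exploit path?
    business_critical: bool  # does the asset support a critical function?

def contextual_priority(f: Finding) -> float:
    """Score by exploitability in *this* environment, not by CVSS alone.

    Hypothetical weighting: an unreachable or mitigated bug drops to
    near zero regardless of its raw CVSS score.
    """
    if not f.asset_exposed or f.control_blocks_it:
        return 0.1 * f.cvss  # track it, but it is not today's fire
    score = f.cvss
    if f.business_critical:
        score *= 1.5
    return score

findings = [
    Finding("CVE-A", cvss=9.8, asset_exposed=False,
            control_blocks_it=False, business_critical=True),
    Finding("CVE-B", cvss=6.5, asset_exposed=True,
            control_blocks_it=False, business_critical=True),
]
ranked = sorted(findings, key=contextual_priority, reverse=True)
# A CVSS-only ranking would put CVE-A first; context surfaces CVE-B.
```

The point of the sketch is that ordering flips: the lower-CVSS but actually-exploitable bug outranks the higher-CVSS but unreachable one.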
Third: Closed-Loop Remediation Without a Manual Handoff
The current model can’t survive in a world where adversaries exploit CVEs within hours of disclosure. You know the drill:
- Scanner finds a bug
- Analyst triages it
- The ticket goes to a different team
- Someone patches it weeks later
- Nobody re-validates
That chain of manual handoffs is exactly where the system disintegrates. If the cycle from finding to fix to re-validation can’t run without humans shuttling tickets between queues, it clearly isn’t running anywhere near machine speed.
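A closed loop like the one described above can be sketched in a few lines. The function names (`apply_fix`, `validate_exploitability`) are placeholders for whatever scanner, ticketing, and validation tooling an organization actually runs, not a real API:

```python
def closed_loop(findings, apply_fix, validate_exploitability, max_rounds=3):
    """Drive each finding to a verified fix, re-validating after every attempt.

    Returns (fixed, still_open) so nothing silently falls out of the queue.
    """
    fixed, still_open = [], []
    for finding in findings:
        resolved = False
        for _ in range(max_rounds):
            if not validate_exploitability(finding):
                resolved = True        # exploit path is closed in this environment
                break
            apply_fix(finding)         # patch, config change, or compensating control
        (fixed if resolved else still_open).append(finding)
    return fixed, still_open

# Demo with stub tooling: a finding stays exploitable until "patched".
patched = set()
findings_fixed, findings_open = closed_loop(
    ["CVE-1", "CVE-2"],
    apply_fix=lambda f: patched.add(f),
    validate_exploitability=lambda f: f not in patched,
)
```

The key design point is that validation both opens and closes the loop: a finding only leaves the queue when re-validation confirms the exploit path is gone, not when a ticket is marked done.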
This isn’t about buying more tools. It’s about defenders leveraging their one asymmetric advantage: you know your organization’s topology, attackers don’t.
That’s a significant advantage, but only if you can act on it at machine speed.
How Autonomous Exposure Validation Closes the Gap — and Where Picus Comes in
This is the part where I’m going to be really transparent about who’s writing this.
At Picus Security, we build a platform for Autonomous Exposure Validation. So, full disclosure, I have a perspective here that comes with an inherent bias. Take it accordingly.
What Glasswing crystallized for us, and for a lot of the CISOs we’ve been speaking with, is that the validation step within any exposure management program just became the most critical bottleneck.
- Finding vulnerabilities is about to get radically easier and more efficient
- Patching them is going to remain painfully slow.
The only lever you can pull in between is knowing which ones actually matter to your environment. That’s validation.
From Four Days to Three Minutes: How Agentic Workflows Change the Cycle
We built Picus Swarm, the AI team powering autonomous, real-time validation, to compress the traditional four-day cycle into minutes.
It’s a set of AI agents that work together to do what used to require handoffs between four separate teams:
- A researcher agent ingests and vets threat intelligence.
- A red teamer agent maps it against your environment to generate a safety-checked attacker playbook.
- A simulator agent executes across your actual endpoints and cloud, gathering telemetry and proof data.
- A coordinator agent bridges findings to remediation, opening tickets, triggering SOAR playbooks, pushing indicators of attack to your EDR, and re-validating after fixes land.
Every action is traceable and auditable, and every agent operates within guardrails you define.
The whole chain, from a new CISA alert to validated, remediation-ready findings, runs in about three minutes.
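The four-agent handoff above can be modeled as a simple pipeline, where each stage's output is the next stage's input. This is an illustrative sketch only: the function names, TTP identifiers, and data shapes are hypothetical and are not Picus Swarm's actual API:

```python
def researcher(alert):
    # Ingest and vet threat intelligence from an incoming alert.
    return {"threat": alert["name"], "ttps": alert.get("ttps", [])}

def red_teamer(intel, environment):
    # Map vetted intel onto the environment to build an attacker playbook.
    return [{"ttp": t, "target": h} for t in intel["ttps"] for h in environment]

def simulator(playbook):
    # Execute each step and record whether the deployed controls held.
    # (Hardcoded outcome here purely for illustration.)
    return [{**step, "blocked": step["ttp"] == "T1059"} for step in playbook]

def coordinator(results):
    # Bridge findings to remediation: anything not blocked becomes a ticket.
    return [r for r in results if not r["blocked"]]

alert = {"name": "CISA-AA-0001", "ttps": ["T1059", "T1003"]}
tickets = coordinator(simulator(red_teamer(researcher(alert), ["host-a"])))
```

Chaining the stages as plain function composition is the point: no queue sits between teams, so the end-to-end latency is bounded by execution time rather than by handoffs.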
When a Mythos-class model drops thousands of findings on your organization, you need something that can immediately tell you which of these are exploitable in your environment. Which controls would hold, which would fail, and what’s the vendor-specific fix?
The Uncomfortable Truth
Project Glasswing is going to be measured by one metric: how many vulnerabilities get patched before they get exploited. Not how many are found, not how impressive the exploit chains are, but whether the ecosystem can digest what AI is about to produce.
Visibility alone has never been enough, 83% of cybersecurity programs still show no measurable results. What’s changing the equation is closing the gap between seeing and proving: knowing whether a potential vulnerability would actually compromise your environment.
That’s validation.
And in a post-Glasswing world, it’s the only thing standing between a flood of discoveries and a flood of breaches.

We’re hosting the Autonomous Validation Summit on May 12 & 14 with Frost & Sullivan, featuring practitioners from Kraft Heinz and Glow Financial Services, along with our CTO, Volkan Erturk. Together, we’ll be taking a deeper dive into this specific problem.
>> Register here.
Note: This article was written by Sıla Özeren Hacıoğlu, Security Research Engineer at Picus Security.