AI Agents Can Now Write Browser Exploits—And That Changes Everything

New benchmark reveals Claude Mythos and GPT-5.5 can autonomously develop real browser exploits, raising urgent questions about AI safety and the future of cybersecurity.

May 2026 5 min

The Carnegie Mellon ExploitBench results aren't just another leaderboard. They are a declaration that frontier AI models have crossed a line that many in the security community assumed was years away. We now have agents that can take a known browser vulnerability and, without human guidance beyond occasional nudges, craft an exploit that achieves arbitrary code execution. Claude Mythos Preview, Anthropic's flagship, reached the highest tier on 21 of 41 vulnerabilities. GPT-5.5 managed only 2. The gap in capability is wide, but the gap in cost is wider: Mythos cost about $36,400 for the full test run, while GPT-5.5 cost just $3,075. That 12x price difference matters, because it tells us the game is just beginning.
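The arithmetic behind those figures is worth a quick check. The sketch below uses only the numbers quoted above; the dictionary layout and function names are illustrative, and dividing the whole run cost by top-tier results is a simplifying assumption, since part of the spend went to attempts that stopped at lower tiers.

```python
# Figures as quoted in the article for the full ExploitBench run.
runs = {
    "Claude Mythos Preview": {"cost_usd": 36_400, "top_tier": 21},
    "GPT-5.5": {"cost_usd": 3_075, "top_tier": 2},
}
TOTAL_VULNS = 41


def cost_ratio(a: str, b: str) -> float:
    """How many times more expensive run `a` was than run `b`."""
    return runs[a]["cost_usd"] / runs[b]["cost_usd"]


def cost_per_top_tier(name: str) -> float:
    """Whole-run cost divided by top-tier results (a rough attribution)."""
    r = runs[name]
    return r["cost_usd"] / r["top_tier"]


def top_tier_rate(name: str) -> float:
    """Share of the 41 vulnerabilities taken to the highest tier."""
    return runs[name]["top_tier"] / TOTAL_VULNS


for name in runs:
    print(f"{name}: {top_tier_rate(name):.0%} top tier, "
          f"${cost_per_top_tier(name):,.0f} per top-tier exploit")
print(f"cost ratio: {cost_ratio('Claude Mythos Preview', 'GPT-5.5'):.1f}x")
```

Under that attribution, the cost per top-tier result lands in the same range for both models (roughly $1,700 versus $1,500), even though the total bills differ by about 12x.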

Let's be precise about what the benchmark measures. ExploitBench doesn't test whether models can discover new zero-day vulnerabilities—yet. It tests whether they can turn a known CVE into a working exploit, graded across five tiers culminating in full code execution. That's a fundamentally harder problem than generating a proof-of-concept crash. It requires understanding the browser's internal state, the V8 engine's JIT compilation pipeline, and the memory layout of the target process. One of the co-authors, an experienced security researcher, reviewed Mythos transcripts and described the model as a "fairly competent browser security researcher." In one case, Mythos devised an exploit technique the researcher had previously dismissed as too complex. In another, it cracked CVE-2024-0519, a vulnerability that human researchers had failed to exploit for over a year. This isn't pattern matching. This is synthesis of new attack strategies.

The implications for builders are stark. If you operate any service that relies on Chrome, Edge, Node.js, or Cloudflare Workers (and that's basically everyone), you need to assume that frontier AI models can now replicate the work of a mid-level security researcher working on known vulnerabilities. The traditional response to a CVE disclosure is to wait for a patch and apply it. That window is collapsing. When an AI can analyze a bug report and produce a working exploit in hours, the time between disclosure and weaponization shrinks from weeks to hours. The only defense is to treat every system with an unpatched vulnerability as already compromised. That means moving to aggressive sandboxing, isolated rendering, and assume-breach architectures long before the patch cycle completes.
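As a minimal sketch of what that assume-breach rule could look like in an inventory tool: the `Exposure` record, the `grace` parameter, and the component names below are all hypothetical, and the CVE identifiers are placeholders rather than real advisories. The only logic taken from the text is that an unpatched, disclosed vulnerability marks its component as compromised.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class Exposure:
    """One known CVE affecting one deployed component (hypothetical record)."""
    component: str
    cve_id: str
    disclosed_at: datetime
    patched_at: Optional[datetime] = None


def assume_breached(e: Exposure, now: datetime,
                    grace: timedelta = timedelta(0)) -> bool:
    """True if the component should be treated as already compromised.

    With the default grace of zero, any unpatched disclosed CVE qualifies.
    """
    if e.patched_at is not None:
        return False
    return now - e.disclosed_at >= grace


def triage(exposures: Iterable[Exposure], now: datetime,
           grace: timedelta = timedelta(0)) -> List[str]:
    """Sorted, de-duplicated names of components to isolate."""
    return sorted({e.component for e in exposures
                   if assume_breached(e, now, grace)})


now = datetime(2026, 5, 1, 12, 0)
exposures = [
    # Placeholder IDs, not real CVEs.
    Exposure("renderer", "CVE-0000-0001", datetime(2026, 5, 1, 9, 0)),
    Exposure("worker-runtime", "CVE-0000-0002", datetime(2026, 4, 1),
             patched_at=datetime(2026, 4, 2)),
]
print(triage(exposures, now))
```

The `grace` parameter is there only to make the policy explicit: setting it to anything above zero is a bet that weaponization takes at least that long.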

But there's a second-order effect that's more insidious. The cost differential between Mythos and GPT-5.5 suggests that near-future models at lower price points will close the capability gap. OpenAI could simply spend more compute on GPT-5.5 and likely match Mythos. When a $3,000 run can produce two full code-execution exploits, the economics of offensive AI shift dramatically. Nation-state actors and sophisticated criminal groups already have the resources to run these models at scale. The question is no longer whether AI can develop exploits. It is how to defend against a future where exploiting every known vulnerability is cheap and automated.

The benchmark authors are careful to note that ExploitBench doesn't measure discovery of new bugs or full weaponization for real-world attacks. That caveat will not comfort anyone who understands how exploit chains work. Once a model can achieve code execution on a known vulnerability, the next step is enabling it to scan source code for patterns similar to known bugs, then chain those into a full exploit. That capability is not speculative—it's a direct extension of what we're already seeing. The UK's AI Safety Institute has confirmed that Mythos performs somewhat better than GPT-5.5 but at a much higher cost. The trajectory is clear: models will get cheaper, faster, and more capable at every stage of the exploit pipeline.

What does this mean for the average software engineer? It means the attack surface you defend is about to get much larger. AI agents don't get bored, don't overlook obscure CVEs, and don't need sleep. They will probe every corner of your dependency tree the moment a CVE is published. Your security posture needs to include real-time monitoring for model-driven exploit attempts. That's a fundamentally different threat model from human attackers, because the speed and thoroughness are unmatched. We need new defensive tools: AI-powered fuzzers that can test patches before they ship, automated sandbox analysis that flags exploit attempts, and runtime integrity checks that can detect the kind of memory corruption an AI-generated exploit would leverage.
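Defenders can run the same dependency-tree sweep themselves the moment an advisory lands. The sketch below assumes a nested-dictionary tree and an advisory expressed as a set of affected versions per package. Both shapes, along with every package name and version shown, are made up for illustration and do not follow any particular advisory feed's format.

```python
from typing import Dict, Iterator, Set, Tuple

# (package name, version) -> that package's own dependencies, same shape.
DepTree = Dict[Tuple[str, str], "DepTree"]


def affected_paths(tree: DepTree, advisory: Dict[str, Set[str]],
                   path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, ...]]:
    """Yield every dependency path ending at an affected package version."""
    for (name, version), children in tree.items():
        here = path + (f"{name}@{version}",)
        if version in advisory.get(name, set()):
            yield here
        # Keep descending: a deeper copy of the package may also be affected.
        yield from affected_paths(children, advisory, here)


# Hypothetical project: the vulnerable engine is only a transitive dependency.
tree: DepTree = {
    ("web-app", "1.0.0"): {
        ("pdf-render", "2.3.1"): {("js-engine", "12.0.1"): {}},
        ("http-client", "4.1.0"): {},
    },
}
advisory = {"js-engine": {"12.0.0", "12.0.1"}}

hits = list(affected_paths(tree, advisory))
for hit in hits:
    print(" -> ".join(hit))
```

Reporting the full path rather than just the package name matters here, because the obscure transitive dependency is the one a human reviewer is most likely to miss.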

But the most important takeaway is about AI safety, not just security. We are building models that can autonomously compromise critical infrastructure. The same architecture that writes a browser exploit could, with different training, write a defense. The same reasoning chain that constructs a memory corruption payload could reconstruct a secure memory allocator. The difference is entirely in the objective. That makes it urgent to invest in AI safety research that focuses on capability control, not just alignment. We need methods to reliably constrain what these agents can do, not just hope they behave nicely. The ExploitBench results should be a wake-up call for every CISO, every platform engineer, and every AI researcher: the age of autonomous offensive AI is here, and the time to build defenses is now.

Toni Soriano

Principal AI Engineer at Cloudstudio. 18+ years building production systems. Creator of Ollama Laravel (87K+ downloads).
