AI Autonomously Develops Zero-Day Exploits

Today we’re joined by Dominic Jainy, an IT professional with deep expertise in artificial intelligence and its intersection with cybersecurity. We’ll be dissecting a recent, eye-opening study where an AI model, GPT-5.2, successfully developed functional exploits for zero-day vulnerabilities. Our conversation will explore the sophisticated reasoning these models now possess, how the low cost of generating attacks fundamentally changes the economic landscape of hacking, and what this new reality of automated threats means for the future of digital defense.

A recent experiment showed GPT-5.2 could develop functional exploits for every challenge presented. What specific reasoning capabilities do these models demonstrate that allow them to bypass multiple security protections, and how does this change our understanding of automated threats?

What we’re seeing is a shift from simple pattern matching to a genuine, albeit artificial, form of problem-solving. These models aren’t just throwing random code at a wall; they’re systematically analyzing the constraints of a sandboxed environment, understanding the purpose of protections like a shadow stack, and then reasoning their way around them. The AI demonstrated an ability to map out a complex system, identify its weakest links—like the glibc exit handler—and then logically chain together a series of actions to achieve a goal that should be impossible. This completely reframes automated threats. It’s no longer about pre-programmed scripts targeting known flaws; it’s about autonomous agents capable of discovering and exploiting brand-new vulnerabilities in real time, a capability previously reserved for highly skilled human researchers.

Developing a complex exploit was reportedly accomplished for around $50 in just over three hours. How does this low cost and time investment shift the economic calculus for attackers, and what does it mean when offensive capability is measured by token budgets rather than human expertise?

It’s a seismic shift that democratizes high-level hacking. In the past, developing a zero-day exploit for a hardened target required an elite human expert commanding a six- or seven-figure salary and weeks, if not months, of work. Now, that same level of output can be achieved for the price of a dinner out. When a complex task that consumed 50 million tokens costs only about $50, even low-resource threat actors can generate a firehose of unique, effective exploits. Offensive capability is no longer bottlenecked by the scarcity of human talent. Instead, it becomes a raw calculation of computational budget. The offensive power of an organization or a nation-state could soon be measured not by the size of its elite hacking team, but by the size of its server farm and its AI token budget.
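To make the arithmetic behind that claim concrete, here is a minimal sketch using the two figures quoted in the interview (roughly 50 million tokens for roughly $50). The function names and the $10,000 budget are hypothetical, chosen only for illustration, and the reported figures are approximate.

```python
# Figures quoted in the interview (approximate).
TOKENS_USED = 50_000_000
COST_USD = 50.0


def cost_per_million_tokens(tokens: int, cost_usd: float) -> float:
    """Return the effective price per one million tokens."""
    return cost_usd / (tokens / 1_000_000)


def runs_for_budget(budget_usd: float, cost_per_run_usd: float) -> int:
    """Return how many full runs of this size a budget would cover."""
    return int(budget_usd // cost_per_run_usd)


rate = cost_per_million_tokens(TOKENS_USED, COST_USD)

# Hypothetical budget, not a figure from the study.
runs = runs_for_budget(10_000.0, COST_USD)

print(f"Effective rate: ${rate:.2f} per million tokens")
print(f"Runs covered by a $10,000 budget: {runs}")
```

At these figures the effective rate works out to about $1 per million tokens, which is why the interview frames offensive capacity as a budget question rather than a staffing question.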

In one scenario, an AI bypassed a shadow stack and seccomp sandbox by chaining seven function calls through the glibc exit handler. Could you walk us through how this creative, multi-step approach mimics a human developer’s process and its implications for defense strategies?

This is where it gets truly fascinating and, frankly, a bit unnerving. A human exploit developer facing these defenses would think, “Okay, the front door is locked. The windows are barred. Is there a back door? A forgotten utility tunnel?” That’s exactly the logic the AI applied. It recognized that direct attacks were blocked by the shadow stack and the seccomp sandbox. So, instead of trying to break those defenses head-on, it looked for a legitimate, albeit obscure, system process it could hijack. Finding the glibc exit handler and realizing it could chain seven distinct function calls through it to eventually write a file is a display of incredible lateral thinking. It’s a creative, almost elegant, solution born from pure logic. For defenders, this is a nightmare. It means we can no longer rely on blocking a few specific, common attack vectors; we have to secure every possible pathway, because the AI is capable of finding the one we forgot.
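For readers unfamiliar with the mechanism mentioned here: glibc's exit() runs the functions registered through atexit() or on_exit() in reverse order of registration. The sketch below is a toy Python model of that ordering only. The class and step names are hypothetical, it is not glibc's actual data structure, and it does not reproduce the exploit from the study. It only shows why a list of handlers that run automatically at exit amounts to a ready-made sequence of function calls.

```python
from typing import Callable, List, Tuple


class ExitHandlerTable:
    """Toy model of a process's exit-handler list (hypothetical, not glibc code)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Callable[[str], None], str]] = []

    def register(self, func: Callable[[str], None], arg: str) -> None:
        """Record a function and its argument to be called at exit."""
        self._entries.append((func, arg))

    def run_all(self) -> None:
        """Call every registered handler, most recently registered first."""
        while self._entries:
            func, arg = self._entries.pop()
            func(arg)


# Record the order in which the handlers actually fire.
calls: List[str] = []

table = ExitHandlerTable()
for step in ("step-1", "step-2", "step-3"):
    table.register(calls.append, step)

table.run_all()
print(calls)
```

Running the model prints the steps in reverse registration order. The defensive takeaway is that this kind of legitimate bookkeeping structure deserves the same scrutiny as the return addresses a shadow stack protects.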

The exploits generated leveraged known gaps rather than creating entirely new techniques. How do you see these AI capabilities scaling to more complex targets like major web browsers, and what defensive adjustments should organizations prioritize now to prepare for this type of automated attack?

The fact that it used known gaps is actually more concerning in the short term. It means the AI doesn’t need to invent some revolutionary new hacking method to be effective; it just has to be better and faster at applying the techniques we already know exist. Scaling this to a target like Chrome or Firefox is absolutely the next logical step. While a browser is vastly more complex than QuickJS, the fundamental process is the same: analyze the system, find a chain of existing weaknesses, and exploit them. The AI’s systematic approach is perfectly suited for that kind of complexity. For organizations, the immediate priority has to be a ruthless focus on fundamentals—patching, configuration hardening, and reducing the attack surface. We have to assume that any known-but-unpatched vulnerability is not just a potential risk, but an actively exploitable one, because an AI can now automate that discovery-to-exploit pipeline at an unprecedented scale and speed.

What is your forecast for the role of AI in both offensive and defensive cybersecurity over the next five years?

Over the next five years, AI will become the central engine for both sides of the cybersecurity arms race. On the offensive side, we’ll see AI-driven platforms that not only generate exploits but also conduct entire campaigns—from reconnaissance and phishing to lateral movement and data exfiltration—with minimal human oversight. This will lead to a massive increase in the volume and sophistication of attacks. On the defensive side, we will have no choice but to respond in kind. AI-powered defense systems will become essential for real-time threat detection, automated patching, and predictive analysis, identifying potential exploits before they’re even written. The cybersecurity landscape will transform into a high-speed, machine-versus-machine conflict, where the side with the smarter, faster AI will have the decisive advantage. Human experts will transition from being the soldiers on the front lines to being the strategists and trainers for these AI systems.
