Today we’re joined by Dominic Jainy, an IT professional with deep expertise in artificial intelligence and its intersection with cybersecurity. We’ll be dissecting a recent, eye-opening study where an AI model, GPT-5.2, successfully developed functional exploits for zero-day vulnerabilities. Our conversation will explore the sophisticated reasoning these models now possess, how the low cost of generating attacks fundamentally changes the economic landscape of hacking, and what this new reality of automated threats means for the future of digital defense.
A recent experiment showed GPT-5.2 could develop functional exploits for every challenge presented. What specific reasoning capabilities do these models demonstrate that allow them to bypass multiple security protections, and how does this change our understanding of automated threats?
What we’re seeing is a shift from simple pattern matching to a genuine, albeit artificial, form of problem-solving. These models aren’t just throwing random code at a wall; they’re systematically analyzing the constraints of a sandboxed environment, understanding the purpose of protections like a shadow stack, and then reasoning their way around them. The AI demonstrated an ability to map out a complex system, identify its weakest links—like the glibc exit handler—and then logically chain together a series of actions to achieve a goal that should be impossible. This completely reframes automated threats. It’s no longer about pre-programmed scripts targeting known flaws; it’s about autonomous agents capable of discovering and exploiting brand-new vulnerabilities in real time, a capability previously reserved for highly skilled human researchers.
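To make the kind of protection he’s describing concrete, here is a minimal sketch of a seccomp allow-list sandbox in C, using libseccomp. The syscall list is illustrative, not the actual policy from the study; a real sandbox typically permits a handful of additional syscalls (memory management, fstat, and so on), and the study’s exact configuration isn’t detailed in this conversation.

```c
#include <seccomp.h>   /* libseccomp; link with -lseccomp */
#include <unistd.h>

int main(void) {
    /* Default action: kill the process on any syscall not explicitly allowed. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
    if (!ctx) return 1;

    /* A tight, illustrative allow-list: enough to read, write, and exit. */
    int rc = 0;
    rc |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 0);
    rc |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
    rc |= seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
    if (rc != 0 || seccomp_load(ctx) != 0) return 1;

    /* From here on, anything outside the allow-list kills the process:
       an open() or execve() attempt, for example, dies immediately. */
    static const char msg[] = "running inside the sandbox\n";
    write(STDOUT_FILENO, msg, sizeof msg - 1);
    return 0;
}
```

The sketch only shows the shape of the barrier; the point of the study is that the model mapped out exactly this kind of constraint and then reasoned its way around it.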
Developing a complex exploit reportedly took around $50 and just over three hours. How does this low cost and time investment shift the economic calculus for attackers, and what does it mean when offensive capability is measured by token budgets rather than human expertise?
It’s a seismic shift that democratizes high-level hacking. In the past, developing a zero-day exploit for a hardened target required an elite human expert, commanding a six- or seven-figure salary and weeks, if not months, of work. Now, that same level of output can be achieved for the price of a dinner out. When a task of that complexity, one that consumed 50 million tokens, costs only $50, even low-resource threat actors can generate a firehose of unique, effective exploits. Offensive capability is no longer bottlenecked by the scarcity of human talent. Instead, it becomes a raw calculation of computational budget. An organization’s or a nation-state’s offensive power could soon be measured not by the size of its elite hacking team, but by the size of its server farm and its AI token budget.
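To spell out the arithmetic behind that claim, here is a back-of-the-envelope sketch. The $50 and 50-million-token figures come from the experiment discussed above; the human-cost baseline (salary and time) is purely an illustrative assumption, not a number from the study.

```c
#include <stdio.h>

int main(void) {
    /* Figures from the experiment discussed above. */
    const double tokens_per_exploit = 50e6;
    const double usd_total = 50.0;
    const double usd_per_million = usd_total / (tokens_per_exploit / 1e6); /* $1/M */

    /* Hypothetical human baseline (an assumption for illustration only):
       a senior exploit developer at $250k/year spending four weeks. */
    const double human_cost = 250000.0 / 52.0 * 4.0;

    printf("AI cost per million tokens: $%.2f\n", usd_per_million);
    printf("AI cost per exploit:        $%.0f\n", usd_total);
    printf("Human cost per exploit:     ~$%.0f\n", human_cost);
    printf("Rough cost ratio:           ~%.0fx\n", human_cost / usd_total);
    return 0;
}
```

Even with generous error bars on the human side, the ratio lands in the hundreds, which is exactly the token-budget economics described here.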
In one scenario, an AI bypassed a shadow stack and a seccomp sandbox by chaining seven function calls through the glibc exit handler. Could you walk us through how this creative, multi-step approach mimics a human developer’s process, and what it implies for defense strategies?
This is where it gets truly fascinating and, frankly, a bit unnerving. A human exploit developer facing these defenses would think, “Okay, the front door is locked. The windows are barred. Is there a back door? A forgotten utility tunnel?” That’s exactly the logic the AI applied. It recognized that direct attacks were blocked by the shadow stack and the seccomp sandbox. So, instead of trying to break those defenses head-on, it looked for a legitimate, albeit obscure, system process it could hijack. Finding the glibc exit handler and realizing it could chain seven distinct function calls through it to eventually write a file is a display of incredible lateral thinking. It’s a creative, almost elegant, solution born from pure logic. For defenders, this is a nightmare. It means we can no longer rely on blocking a few specific, common attack vectors; we have to secure every possible pathway, because the AI is capable of finding the one we forgot.
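For readers who want to see the mechanism in miniature: glibc runs registered exit handlers in last-in, first-out order when a process terminates, which is what makes that machinery such an attractive place to route control flow. The benign C sketch below uses only the public atexit API to show how a chain of handlers can stage state and end in a file write. The study’s exploit hijacked the handler mechanism itself rather than registering handlers legitimately, and the file path here is made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Handlers registered with atexit() run in LIFO order after main() returns. */

static void final_step(void) {
    /* Last link in the chain: the actual file write. Path is illustrative. */
    FILE *f = fopen("/tmp/exit_chain_demo.txt", "w");
    if (f) {
        fputs("reached via a chain of exit handlers\n", f);
        fclose(f);
    }
}

static void middle_step(void) {
    /* An intermediate link: in a real exploit, each hop massages state
       (arguments, pointers) for the next call in the chain. */
    puts("middle handler: staging state for the final step");
}

int main(void) {
    atexit(final_step);   /* registered first, runs last  */
    atexit(middle_step);  /* registered last, runs first  */
    /* No explicit call to either function: returning from main hands
       control to the exit-handler chain, one hop at a time. */
    return 0;
}
```

Each hop in such a chain is a legitimate indirect call made by glibc itself, which is part of why a shadow stack, designed to catch corrupted return addresses, never fires.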
The exploits generated leveraged known gaps rather than creating entirely new techniques. How do you see these AI capabilities scaling to more complex targets like major web browsers, and what defensive adjustments should organizations prioritize now to prepare for this type of automated attack?
The fact that it used known gaps is actually more concerning in the short term. It means the AI doesn’t need to invent some revolutionary new hacking method to be effective; it just has to be better and faster at applying the techniques we already know exist. Scaling this to a target like Chrome or Firefox is absolutely the next logical step. While a browser is vastly more complex than QuickJS, the fundamental process is the same: analyze the system, find a chain of existing weaknesses, and exploit them. The AI’s systematic approach is perfectly suited for that kind of complexity. For organizations, the immediate priority has to be a ruthless focus on fundamentals—patching, configuration hardening, and reducing the attack surface. We have to assume that any known-but-unpatched vulnerability is not just a potential risk, but an actively exploitable one, because an AI can now automate that discovery-to-exploit pipeline at an unprecedented scale and speed.
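One small, concrete illustration of that verify-the-fundamentals mindset: rather than assuming a sandbox or hardening measure is in place, check it. The Linux-specific C sketch below reads a process’s own seccomp state from /proc/self/status; it is a hypothetical audit snippet, not a tool referenced in the study.

```c
#include <stdio.h>
#include <string.h>

/* Prints this process's seccomp state from /proc/self/status (Linux only).
   Seccomp: 0 = disabled, 1 = strict mode, 2 = filter mode. */
int main(void) {
    FILE *f = fopen("/proc/self/status", "r");
    if (!f) {
        perror("fopen /proc/self/status");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "Seccomp:", 8) == 0) {
            fputs(line, stdout);   /* e.g. "Seccomp:\t2" when a filter is loaded */
            break;
        }
    }
    fclose(f);
    return 0;
}
```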
What is your forecast for the role of AI in both offensive and defensive cybersecurity over the next five years?
Over the next five years, AI will become the central engine for both sides of the cybersecurity arms race. On the offensive side, we’ll see AI-driven platforms that not only generate exploits but also conduct entire campaigns—from reconnaissance and phishing to lateral movement and data exfiltration—with minimal human oversight. This will lead to a massive increase in the volume and sophistication of attacks. On the defensive side, we will have no choice but to respond in kind. AI-powered defense systems will become essential for real-time threat detection, automated patching, and predictive analysis, identifying potential exploits before they’re even written. The cybersecurity landscape will transform into a high-speed, machine-versus-machine conflict, where the side with the smarter, faster AI will have the decisive advantage. Human experts will transition from being the soldiers on the front lines to being the strategists and trainers for these AI systems.
