LMDeploy SSRF Weaponized in 12 Hours as AI Infra Targeted

Attackers did not wait for a proof-of-concept or a weekend lull. They turned a fresh advisory into a working exploit chain in roughly half a day, showing how AI-serving stacks have become fast-moving targets for SSRF-driven reconnaissance and lateral movement. The case centered on CVE-2026-33626, a high-severity server-side request forgery flaw (CVSS 7.5) in LMDeploy’s vision-language module, present in versions 0.12.0 and earlier. The load_image() function in lmdeploy/vl/utils.py fetched arbitrary URLs without filtering internal or private address spaces, and that single oversight enabled access to sensitive cloud metadata endpoints and internal services. Reported by Orca Security’s Igor Stepansky, the bug was exploited in the wild only 12 hours and 31 minutes after disclosure. That collapsed the disclosure-to-exploitation window and signaled that detailed advisories can now act as deployment playbooks for adversaries and automated exploit synthesis pipelines.

The Exploit: What Happened

SSRF Root Cause and Rapid Abuse

The vulnerability hinged on LMDeploy’s image loading pathway accepting remote URIs and trusting them by default, with no guardrails to block requests to loopback, link-local, or RFC 1918 private ranges. In practice, that turned a model-serving workflow into a generic SSRF primitive: a user-supplied image URL could quietly pivot to 169.254.169.254 for the AWS Instance Metadata Service (IMDS), to 127.0.0.1 for internal admin surfaces, or to addresses in 10.0.0.0/8 for east-west reconnaissance. Sysdig observed a session from 103.116.72[.]119 that lasted only eight minutes yet covered a broad set of targets. The actor hit IMDS endpoints, enumerated Redis and MySQL, and probed an internal HTTP admin interface. Out-of-band DNS callbacks to requestrepo[.]com confirmed that blind probes had landed, while loopback port scans mapped available footholds. No public PoC had surfaced, yet the file paths and parameter names in the advisory were enough to reconstruct reliable requests.
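To make those three destinations concrete, the short Python sketch below uses the standard-library ipaddress module to label an address by the internal range it falls in. The function name classify_destination is hypothetical and is not part of LMDeploy; it only illustrates the kind of check that a URL fetcher lacking guardrails never performs.

```python
import ipaddress


def classify_destination(host: str) -> str:
    """Label an IP literal by the internal range it falls in, or 'public'.

    Hypothetical helper for illustration; not taken from LMDeploy.
    """
    addr = ipaddress.ip_address(host)
    # Order matters: loopback and link-local addresses also count as
    # "private" in Python's ipaddress module, so test them first.
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:
        return "link-local"
    if addr.is_private:
        return "private"
    return "public"


if __name__ == "__main__":
    for target in ("169.254.169.254", "127.0.0.1", "10.0.0.5", "8.8.8.8"):
        print(target, classify_destination(target))
```

The sketch only handles IP literals. A hostname would need to be resolved first, and every resolved address checked, before any request is made.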

Tradecraft, Evasion, and the New Tempo

Beyond the primitive itself, the operator rotated between two vision-language models (VLMs), internlm-xcomposer2 and OpenGVLab/InternVL2-8B. That suggests a basic evasion tactic aimed at sidestepping platform-specific logging heuristics and reducing the chance of tripping alerts tied to a single model pipeline. The choice also indicated familiarity with how inference gateways route requests, where swapping model identifiers can change execution paths and briefly hinder correlation. The flows reflected a routine playbook: test metadata, sweep for state stores like Redis, peek at SQL, and probe the internal web consoles that often surface in AI-serving clusters. Crucially, the effort began within hours, not days, showing how commercial LLMs and scripts can turn advisory breadcrumbs such as function names, paths, or permissive URI handlers into working exploit scaffolds. This tempo left defenders reacting to reconnaissance already in motion, and it showed that advisory transparency, while valuable, now materially shapes attacker time-to-weaponization.

The Wider Campaigns

Parallel Campaigns Across Web and ICS

The same rhythm surfaced across unrelated but thematically aligned targets. WordPress sites faced pressure from two plugin flaws: Ninja Forms – File Upload (CVE-2026-0740) and Breeze Cache (CVE-2026-3844). In practice, those bugs enabled arbitrary file upload and remote code execution, culminating in full site compromise when misconfigurations or weak isolation compounded the impact. Opportunistic operators blended mass scanning with selective follow-through, staging webshells, cron-based persistence, and CDN abuse to mask payload delivery. Meanwhile, a wave of scanning struck Modbus-enabled PLCs exposed on the public internet across 70 countries, splitting into broad, automated sweeps alongside quieter, target-specific fingerprinting. Several scanners geo-located to China, while many IPs carried low reputation scores consistent with rotating infrastructure. Together, the signals pointed to a converging ecosystem: automated enumeration at scale feeding rapid, modular exploit deployment.

Implications for AI Stacks and Concrete Defenses

The LMDeploy episode demonstrated why AI-adjacent middleware now sits on the front line. SSRF endures because it stretches across trust boundaries, turning outward-facing parsers into bridges to metadata services, internal control planes, or ephemeral caches. Practical defenses demand specificity rather than slogans: restrict URL fetchers to explicit allowlists; block RFC 1918, loopback, link-local, and metadata endpoints; enforce IMDSv2 or the cloud-equivalent protection; segment Redis and MySQL behind service meshes with mTLS; and narrow egress while capturing high-fidelity DNS logs for out-of-band (OOB) detection. Fast patch cycles matter, but so does posture: strip internet exposure from admin consoles, require token-bound requests for model APIs, and treat model gateways as Tier 0 assets. Taken together, these steps shrink the SSRF blast radius, constrain lateral paths, and turn eight-minute reconnaissance windows into noisy, contained dead ends.
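As a rough illustration of the first two defenses, the Python sketch below checks a user-supplied image URL against a scheme allowlist and a host allowlist, then rejects it if any resolved address is loopback, link-local, private, or otherwise non-routable. It is a minimal sketch under stated assumptions, not LMDeploy's actual fix: the names ALLOWED_HOSTS, resolve_host, is_internal, and is_url_allowed are hypothetical, and images.example.com is a placeholder host.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_HOSTS = {"images.example.com"}  # hypothetical allowlist entry


def resolve_host(host):
    """Return every address the system resolver reports for host."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_internal(ip_text):
    """True for loopback, link-local, private, and other non-public ranges."""
    try:
        addr = ipaddress.ip_address(ip_text)
    except ValueError:
        return True  # fail closed on anything that does not parse
    return (addr.is_private or addr.is_loopback or addr.is_link_local
            or addr.is_reserved or addr.is_multicast or addr.is_unspecified)


def is_url_allowed(url, resolver=resolve_host):
    """Decide whether a user-supplied URL may be fetched.

    resolver is injectable so the check can be exercised without a network.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if host is None or host not in ALLOWED_HOSTS:
        return False
    try:
        addresses = resolver(host)
    except OSError:
        return False
    # An allowlisted name that resolves to an internal address is refused.
    return bool(addresses) and not any(is_internal(a) for a in addresses)
```

A check like this is only one layer. A real fetcher would also need to connect to the address that was validated (rather than resolving the name a second time) and to re-validate every redirect target, and it works best alongside the network-level egress restrictions described above.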
