LMDeploy SSRF Weaponized in 12 Hours as AI Infra Targeted

Article Highlights
Off On

Attackers did not wait for a proof-of-concept or a weekend lull, turning a fresh advisory into a working exploit chain in roughly half a day and demonstrating how AI-serving stacks have become fast-moving targets for SSRF-driven reconnaissance and lateral movement. The case centered on CVE-2026-33626, a high-severity flaw (CVSS 7.5) in LMDeploy’s vision-language module present in versions 0.12.0 and earlier, where the load_image() function in lmdeploy/vl/utils.py fetched arbitrary URLs without filtering internal or private address spaces. That single oversight enabled access to sensitive cloud metadata endpoints and internal services. Reported by Orca Security’s Igor Stepansky, the bug was exploited in the wild only 12 hours and 31 minutes after disclosure, collapsing the detection-to-exploitation window and signaling that detailed advisories can now act as deployment playbooks for adversaries and automated exploit synthesis pipelines.

The Exploit: What Happened

SSRF Root Cause and Rapid Abuse

The vulnerability hinged on LMDeploy’s image loading pathway accepting remote URIs and trusting them by default, with no guardrails to block requests to loopback, link-local, or RFC1918 ranges. In practice, that meant a model-serving workflow became a generic SSRF primitive: a user-supplied image URL could quietly pivot to 169.254.169.254 for AWS Instance Metadata Service, 127.0.0.1 for internal admin surfaces, or 10.0.0.0/8 for east-west peeks. Sysdig observed a session from 103.116.72[.]119 that lasted eight minutes yet packed a broad probe. The actor hit IMDS endpoints, enumerated Redis and MySQL, and reached for an internal HTTP admin interface, blending target selection with speed. Out-of-band DNS callbacks to requestrepo[.]com provided confirmation that blind probes landed, while loopback port scans mapped available footholds. No public PoC surfaced, yet the advisory’s file paths and parameter names sufficed to reconstruct reliable requests.

Tradecraft, Evasion, and the New Tempo

Beyond the primitive itself, the operator rotated between VLMs—internlm-xcomposer2 and OpenGVLab/InternVL2-8B—suggesting a basic evasion tactic to sidestep platform-specific logging heuristics and reduce the chance of runbook alerts tied to a single model pipeline. That choice also indicated familiarity with how inference gateways route requests, where swapping model identifiers can flip execution paths and briefly hinder correlation. The flows reflected a routine playbook: test metadata, sweep for state stores like Redis, peek at SQL, and probe internal web consoles that often surface in AI-serving clusters. Crucially, the effort began within hours, not days, revealing how commercial LLMs and scripts can transform advisory breadcrumbs—function names, paths, or permissive URI handlers—into working exploit scaffolds. This tempo left defenders reacting to reconnaissance already in motion, and it showed that advisory transparency, while valuable, now materially shapes attacker time-to-weaponization.

The Wider Campaigns

Parallel Campaigns Across Web and ICS

The same rhythm surfaced across unrelated but thematically aligned targets. WordPress sites faced pressure from two plugin flaws: Ninja Forms – File Upload (CVE-2026-0740) and Breeze Cache (CVE-2026-3844). In practice, those bugs enabled arbitrary file upload and remote code execution, culminating in full site compromise when misconfigurations or weak isolation compounded the impact. Opportunistic operators blended mass scanning with selective follow-through, staging webshells, cron-based persistence, and CDN abuse to mask payload delivery. Meanwhile, a wave of scanning struck Modbus-enabled PLCs exposed on the public internet across 70 countries, splitting into broad, automated sweeps alongside quieter, target-specific fingerprinting. Several scanners geo-located to China, while many IPs carried low reputation scores consistent with rotating infrastructure. Together, the signals pointed to a converging ecosystem: automated enumeration at scale feeding rapid, modular exploit deployment.

Implications for AI Stacks and Concrete Defenses

Building on this foundation, the LMDeploy episode demonstrated why AI-adjacent middleware now sits on the front line. SSRF endures because it stretches across trust boundaries, turning outward-facing parsers into bridges to metadata services, internal control planes, or ephemeral caches. Practical defenses demanded specificity rather than slogans: sanitize URL fetchers with explicit allowlists; block RFC1918, loopback, link-local, and metadata endpoints; enforce IMDSv2 or cloud-equivalent protections; segment Redis and MySQL behind service meshes with mTLS; and narrow egress while capturing high-fidelity DNS logs for OOB detection. Fast patch cycles mattered, but so did posture: strip internet exposure from admin consoles, require token-bound requests for model APIs, and treat model gateways as Tier 0 assets. Taking these steps shrank SSRF blast radius, constrained lateral paths, and turned eight-minute reconnaissance windows into noisy, contained dead ends. The path forward was clear and achievable with disciplined engineering.

Explore more

Is the Mistic Backdoor Hiding in Your Security Tools?

Introduction The emergence of the Mistic backdoor represents a sophisticated advancement in the arsenal of modern cybercriminals, specifically those operating within the niche of Initial Access Brokering (IAB). This malicious software, also identified by some security researchers as MLTBackdoor, has been actively infiltrating corporate environments throughout the first half of 2026. Its primary strength lies in its ability to camouflage

Is the Redmi 17C the New King of Budget Smartphones?

Dominic Jainy is a seasoned IT professional with a deep understanding of how hardware evolution impacts the budget mobile market. Today, he breaks down Xiaomi’s latest strategic move with the Redmi 17C, a device that surprisingly leaps over a generation to deliver high-refresh-rate displays and massive battery life to the entry-level segment. We explore the balance between essential utility features,

How Can PowerTool Speed Up Business Central Data Migrations?

Modern enterprises frequently encounter significant friction during ERP transitions because traditional data migration methods often fail to accommodate the sheer volume and complexity of contemporary datasets. In 2026, the demand for agility within Microsoft Dynamics 365 Business Central has reached a point where standard configuration packages, while functional for small tasks, often act as a bottleneck for larger implementations. The

How to Move Beyond the Portal to a True Developer Platform?

Dominic Jainy stands at the forefront of the modern cloud-native movement, possessing a deep technical mastery of artificial intelligence, machine learning, and blockchain architectures. With years of experience navigating the complexities of large-scale IT infrastructures, he has become a leading voice in the evolution of platform engineering. His perspective is shaped by the practical realities of moving beyond simple automation

Will AI Token Costs Soon Surpass Developer Salaries?

Recent financial projections indicate that the cost of maintaining high-frequency artificial intelligence interactions is rapidly approaching the median annual compensation of experienced software engineers in the global market. As the software development industry undergoes a radical transformation, the traditional overhead associated with human labor is being challenged by the sheer volume of data processed through large language models. This shift