Malicious PyPI Package hermes-px Steals AI Data and Code

The rapid democratization of artificial intelligence has led many developers to seek out open-source tools that promise to simplify complex workflows while maintaining a commitment to privacy and data security. However, this reliance on external repositories has also opened a dangerous door for sophisticated cybercriminals who exploit the trust inherent in the developer community. In a particularly alarming discovery made in April 2026, security researchers identified a malicious Python package named hermes-px that had successfully infiltrated the Python Package Index (PyPI). This package was not merely a simple script designed to steal browser cookies; it was a highly orchestrated “Trojanized” tool that masqueraded as a professional-grade AI inference proxy. By offering a seemingly functional solution for routing AI requests, the attackers managed to position themselves directly in the middle of sensitive development pipelines, intercepting every piece of data that passed through their malicious architecture.

The deceptive nature of this attack is highlighted by the professional branding used to market the software under the banner of a fictional entity known as EGen Labs. To convince engineers of its legitimacy, the package was accompanied by high-quality documentation, a fully functional Retrieval-Augmented Generation (RAG) pipeline, and a comprehensive migration guide designed to facilitate a seamless transition from the official OpenAI Python SDK. This level of polish gave the package an air of authority, making it appear as a privacy-focused alternative for developers looking to anonymize their traffic through the Tor network. By mimicking the API surface of industry-standard libraries, hermes-px lowered the barrier to entry for unsuspecting users, who believed they were adopting a secure, free tool while inadvertently handing over their most sensitive intellectual property to a silent adversary.

The Architecture of Deception and Silent Data Harvesting

Behind the facade of a privacy-enhancing proxy, hermes-px functioned as a meticulously engineered data-harvesting machine that completely bypassed its advertised security features. While the package claimed to route all traffic through the Tor network to ensure user anonymity, researchers discovered that it actually utilized the victim’s direct internet connection when exfiltrating stolen information. This blatant contradiction not only exposed the real IP addresses of the users but also allowed the attackers to monitor every interaction in real time. The stolen data, which included proprietary code snippets, development prompts, and sensitive credentials, was funneled directly into an attacker-controlled Supabase database. This infrastructure allowed the threat actors to maintain a centralized repository of intercepted intelligence, effectively turning a “privacy tool” into a massive surveillance network targeting the AI development community.

The impact of this breach extended far beyond individual developers, as the malicious package was found to have hijacked the private AI infrastructure of Universite Centrale in Tunisia. By piggybacking on the university’s computational resources, the attackers were able to provide high-quality AI responses to their victims without incurring any operational costs. This parasitism allowed the hermes-px package to maintain the illusion of a high-performance AI service, as users received accurate and fast responses to their queries. However, every prompt sent to this hijacked system was logged and analyzed by the attackers. This strategy demonstrates a growing trend where cybercriminals do not just steal data but also co-opt legitimate enterprise and academic resources to power their malicious operations, making the attack both financially sustainable and difficult to detect through traditional traffic analysis.

Exploitation of Intellectual Property and Stolen AI Assets

A critical component of the hermes-px payload was a compressed file named base_prompt.pz, which contained a 246,000-character system prompt. Upon closer inspection, researchers realized that this asset was not original work but a stolen system prompt from Anthropic’s proprietary Claude Code. The attackers attempted to rebrand this stolen intellectual property as “AXIOM-1” to bolster the perceived value of their fake service. Despite these efforts to hide the source, the rebranding was executed poorly; the prompt still contained numerous references to Anthropic-specific function names, internal sandbox filesystem paths, and explicit mentions of the “Claude” model. This stolen prompt was injected into every API call made through the proxy, allowing the attackers to mimic the behavior of world-class AI models while simultaneously violating Anthropic’s intellectual property rights.

This reuse of high-value AI assets represents a sophisticated shift in how malware authors operate within the modern tech ecosystem. By leveraging the advanced capabilities of a stolen system prompt, the creators of hermes-px ensured that their victims would remain satisfied with the tool’s performance, thereby extending the duration of the infection. The inclusion of such a large and complex prompt also served to obscure the malicious intent of the package, as the sheer volume of text made manual auditing more difficult for busy developers. This tactic illustrates the intersection of traditional cybercrime and the emerging field of prompt engineering, where the value of a specific AI configuration is high enough to warrant its own dedicated theft and redistribution through malicious channels.

Advanced Obfuscation Techniques and Remote Code Execution

To evade detection by automated security scanners and manual code reviews, the developers of hermes-px employed a triple-layer obfuscation chain. All sensitive strings, including the URLs for the exfiltration endpoints, were encrypted using an XOR operation with a 210-byte rotating key. This encrypted data was then further hidden through zlib compression and base64 encoding, ensuring that static analysis tools would see only a jumble of nonsensical characters. The decoding, decompression, and decryption occurred entirely in memory at runtime, so the plaintext strings never appeared on disk. This layering reflects a deliberate effort to defeat static analysis and highlights the increasing danger of software supply chain attacks for Python developers.
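To make the three layers concrete, the following is a minimal Python sketch of such a chain, not the code recovered from hermes-px. The layer order (XOR first, then zlib, then base64) is an assumption based on the description above, and the key, the function names, and the example URL are hypothetical placeholders.

```python
import base64
import zlib
from itertools import cycle


def xor_rotating(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key, restarting the key when it runs out."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def obfuscate(plaintext: str, key: bytes) -> str:
    """Assumed order: XOR, then zlib compression, then base64 encoding."""
    xored = xor_rotating(plaintext.encode("utf-8"), key)
    return base64.b64encode(zlib.compress(xored)).decode("ascii")


def deobfuscate(blob: str, key: bytes) -> str:
    """Reverse the chain in memory: base64 decode, decompress, XOR."""
    xored = zlib.decompress(base64.b64decode(blob))
    return xor_rotating(xored, key).decode("utf-8")


# Hypothetical 210-byte key for demonstration only; the real key is not
# reproduced in this article.
DEMO_KEY = bytes((i * 7 + 13) % 256 for i in range(210))

if __name__ == "__main__":
    hidden = obfuscate("https://example.invalid/collect", DEMO_KEY)
    print(hidden)                         # opaque base64 text
    print(deobfuscate(hidden, DEMO_KEY))  # original string, recovered in memory
```

An analyst who recovers the key from the package can run the same reversal offline to extract the hidden strings without executing the malware.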

Beyond simple data theft, the package included a particularly dangerous feature called the “Interactive Learning CLI,” which encouraged users to execute Python scripts directly from a remote GitHub URL. This mechanism provided the attackers with a permanent remote code execution (RCE) channel into the victim’s development environment. By persuading developers to run external scripts under the guise of an educational tool, the threat actors could push updated malicious payloads or install secondary malware without ever needing to update the original package on PyPI. This dynamic execution capability allowed the attackers to adapt their tactics in real time, moving from simple data exfiltration to more intrusive activities like lateral movement within corporate networks or the installation of persistent backdoors on high-value developer workstations.
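As a rough review aid for this class of risk, the sketch below uses Python's standard ast module to flag direct calls to exec, eval, or compile in a source file. It is a heuristic for manual triage, not a description of how the hermes-px CLI was implemented, and the function name and sample snippet are hypothetical.

```python
import ast
from typing import List, Tuple

# Built-ins that run or prepare dynamically supplied code.
DYNAMIC_EXEC_NAMES = {"exec", "eval", "compile"}


def find_dynamic_exec_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, function name) for each direct call found.

    The source is only parsed, never executed.
    """
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_EXEC_NAMES:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import urllib.request\n"
        "code = urllib.request.urlopen(url).read()\n"
        "exec(code)\n"
    )
    print(find_dynamic_exec_calls(sample))
```

A hit is not proof of malice, and an attacker can reach the same built-ins indirectly, so the result is a starting point for reading the code rather than a verdict.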

Immediate Remediation Strategies and Future Considerations

For any developer who has interacted with or installed the hermes-px package, immediate action is required to mitigate the risk of ongoing data loss and system compromise. The first and most critical step is to uninstall the package using standard package management tools and to audit the local environment for any secondary scripts that may have been downloaded through the malicious CLI. However, simply removing the software is insufficient given the nature of the data intercepted. Because the package was designed to capture prompts and code in transit, any API keys, environment variables, or proprietary internal documentation that passed through the proxy must be considered fully compromised. Organizations must rotate all affected credentials and treat any code submitted to the proxy as if it had been leaked publicly.
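A quick first check is whether the distribution is present in a given Python environment. The sketch below uses the standard importlib.metadata module; the helper name is hypothetical, and the check covers only the interpreter it runs under, so it would need to be repeated in each virtual environment.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("hermes-px")
    if version is None:
        print("hermes-px is not installed in this environment")
    else:
        print(f"hermes-px {version} is installed: uninstall it and rotate credentials")
```

A negative result only shows that the package is absent now. It says nothing about earlier installs, so credential rotation still applies to any environment where the package was ever used.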

Moving forward, this incident serves as a stark reminder of the vulnerabilities inherent in the open-source AI ecosystem and the need for more rigorous verification of third-party libraries. Network administrators and security teams should proactively block the known exfiltration endpoint at urlvoelpilswwxkiosey.supabase.co to prevent any remaining infected systems from communicating with the attacker’s infrastructure. Furthermore, developers should prioritize the use of official SDKs and implement strict egress filtering to ensure that sensitive data cannot be sent to unauthorized third-party databases. As AI continues to become a central pillar of software development, the community must move toward a model of “zero trust” for external dependencies, emphasizing the importance of code signing, integrity checks, and the use of sandboxed environments for testing new and unverified AI tools.
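Alongside blocking the endpoint, teams may want to search existing proxy or DNS logs for past contact with it. The following is a minimal sketch that scans text lines for the hostname named above; the function name and the sample log lines are hypothetical, and real log formats will vary.

```python
from typing import Iterable, List, Tuple

# Exfiltration hostname reported for hermes-px.
IOC_HOST = "urlvoelpilswwxkiosey.supabase.co"


def lines_mentioning_host(
    lines: Iterable[str], host: str = IOC_HOST
) -> List[Tuple[int, str]]:
    """Return (line number, line) for each line containing the hostname."""
    needle = host.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


if __name__ == "__main__":
    # Hypothetical log excerpt for illustration.
    sample_log = [
        "10:01:02 GET https://pypi.org/simple/requests/\n",
        "10:01:05 POST https://urlvoelpilswwxkiosey.supabase.co/rest/v1/logs\n",
    ]
    for number, line in lines_mentioning_host(sample_log):
        print(f"line {number}: {line}")
```

Any match identifies a host that should be treated as compromised and handled under the remediation steps described earlier.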
