Cybersecurity researchers have uncovered a serious security flaw in Ollama that could let a remote, unauthenticated attacker read the entire memory of its process.
This out-of-bounds read vulnerability, likely affecting more than 300,000 servers worldwide, is identified as **CVE-2026-7482** (CVSS score: 9.1). Cyera has given it the name **Bleeding Llama**.
Ollama is a widely used open-source platform that enables large language models (LLMs) to run on local machines rather than in the cloud. On GitHub, the project boasts over 171,000 stars and has been forked more than 16,100 times.
“Ollama versions prior to 0.17.1 have a heap out-of-bounds read issue in the GGUF model loader,” states the CVE.org description. “The /api/create endpoint takes a GGUF file provided by an attacker where the specified tensor offset and size go beyond the file’s real length; during quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), the server reads beyond the allocated heap buffer.”
GGUF, which stands for GPT-Generated Unified Format, is a file format designed to store large language models for easy local loading and execution.
At its heart, the issue comes from Ollama’s use of the unsafe package when building a model from a GGUF file, particularly in a function called “WriteTo(),” allowing operations that sidestep Go’s built-in memory safety protections.
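The class of bug is straightforward to illustrate outside of Go. The sketch below is not Ollama’s actual code (the function and parameter names are invented for illustration); it shows the guard a GGUF-style tensor read needs: the attacker-supplied offset and size must be validated against the file’s real length before the bytes are accessed. Go’s normal slice indexing enforces this automatically, which is precisely what pointer arithmetic through the unsafe package bypasses.

```python
def read_tensor(blob: bytes, offset: int, nbytes: int) -> bytes:
    """Return the tensor bytes at (offset, nbytes), refusing out-of-range reads.

    Hypothetical sketch of the missing bounds check: a loader that computes a
    raw pointer from attacker-controlled offset/size values without this guard
    can be made to read heap memory beyond the buffer holding the file.
    """
    if offset < 0 or nbytes < 0:
        raise ValueError("negative tensor offset or size")
    # The guard absent from the vulnerable path: declared span vs. actual length.
    if offset + nbytes > len(blob):
        raise ValueError(
            f"tensor spans [{offset}, {offset + nbytes}) but file is only "
            f"{len(blob)} bytes"
        )
    return blob[offset : offset + nbytes]
```

Note that the explicit length comparison matters even in memory-safe languages: Python slicing, for instance, silently clamps out-of-range indices, which would hide the malformed input rather than reject it.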
In a potential attack, a malicious actor could send a specially designed GGUF file to an exposed Ollama server with the tensor’s shape set to an extremely large value, triggering the out-of-bounds heap read during model creation via the /api/create endpoint. If exploited successfully, this could expose sensitive information from the Ollama process memory.
This might include environment variables, API keys, system prompts, and conversation data from other active users. The stolen data can then be sent out by uploading the resulting model artifact through the /api/push endpoint to a registry controlled by the attacker.
The attack process happens in three stages:
– Send a manipulated GGUF file with an inflated tensor shape to an internet-accessible Ollama server using an HTTP POST request.
– Trigger the /api/create endpoint to start model creation, activating the out-of-bounds read flaw.
– Use the /api/push endpoint to extract heap memory data to an external server.
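The first stage hinges on a GGUF file whose declared tensor shape vastly exceeds the bytes actually present. The sketch below builds only such a header, following the field layout of the public GGUF specification (magic, version, tensor count, metadata KV count, then tensor-info records); it is a structural illustration, not a working exploit, and deliberately omits the HTTP delivery to /api/create.

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3  # current version per the public GGUF spec

def build_oversized_gguf() -> bytes:
    """Build a minimal GGUF header declaring one tensor whose shape implies
    far more data than the file contains.

    Illustrative only: a real loader would also expect metadata key/value
    pairs, which are omitted here (metadata_kv_count = 0).
    """
    name = b"tensor_0"
    parts = [
        GGUF_MAGIC,
        struct.pack("<I", GGUF_VERSION),
        struct.pack("<Q", 1),               # tensor_count = 1
        struct.pack("<Q", 0),               # metadata_kv_count = 0
        struct.pack("<Q", len(name)), name, # tensor name (length-prefixed)
        struct.pack("<I", 1),               # n_dimensions = 1
        struct.pack("<Q", 2**40),           # dim[0]: ~10^12 elements, oversized
        struct.pack("<I", 0),               # tensor dtype (0 = F32 in the spec)
        struct.pack("<Q", 0),               # data offset within the data section
    ]
    return b"".join(parts)
```

The resulting file is only a few dozen bytes long, yet it declares a tensor of roughly four terabytes of F32 data, which is the mismatch the quantization path failed to validate.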
“An attacker can essentially learn everything about the organization from your AI inference — API keys, proprietary code, customer contracts, and much more,” said Cyera security researcher Dor Attias.
“Additionally, engineers frequently connect Ollama to tools like Claude Code. In such cases, the consequences are even greater — all tool outputs are sent to the Ollama server, stored in the heap, and could end up in an attacker’s possession.”
Users are urged to apply the latest patches, restrict network access, check running instances for internet exposure, and isolate and protect them behind a firewall. It is also advised to set up an authentication proxy or API gateway in front of all Ollama instances, since the REST API does not include built-in authentication.
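Since Ollama’s REST API ships without authentication, even a minimal token-checking reverse proxy in front of it meaningfully raises the bar. The sketch below is a deliberately simplified illustration, not production guidance; the upstream address, port, and environment variable name are deployment assumptions, and a hardened gateway (nginx, Caddy, or a cloud API gateway) would be the usual choice.

```python
import hmac
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Hypothetical deployment values -- adjust for your environment.
OLLAMA_UPSTREAM = "http://127.0.0.1:11434"
API_TOKEN = os.environ.get("PROXY_TOKEN", "change-me")

def authorized(auth_header):
    """Constant-time check of a 'Bearer <token>' Authorization header."""
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth_header[len("Bearer "):], API_TOKEN)

class AuthProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        if not authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.end_headers()
            return
        # Forward the authenticated request body to the local Ollama instance.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        upstream = urlopen(Request(OLLAMA_UPSTREAM + self.path, data=body,
                                   method="POST"))
        self.send_response(upstream.status)
        self.end_headers()
        self.wfile.write(upstream.read())

# To run: HTTPServer(("0.0.0.0", 8080), AuthProxy).serve_forever()
```

With Ollama itself bound only to 127.0.0.1, every request to /api/create, /api/push, and the rest of the API must then carry a valid bearer token before it reaches the server.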
### Two Unpatched Vulnerabilities in Ollama Allow Persistent Code Execution
This development follows research by Striga, which outlined two vulnerabilities in Ollama’s Windows update mechanism that can be combined to achieve persistent code execution. These flaws remain unpatched after being disclosed on January 27, 2026, and have now been made public following the completion of a 90-day disclosure period.
According to Bartłomiej “Bartek” Dmitruk, co-founder of Striga, the Windows desktop client automatically starts on login from the Windows Startup folder, listens on 127.0.0.1:11434, and regularly checks for updates in the background via the /api/update endpoint to apply any pending updates on the next application launch.
The discovered vulnerabilities involve a path traversal issue and a missing signature check that, when paired with the on-login routine, could allow an attacker who can influence update responses to run arbitrary code at every login. The flaws are as follows:
– **CVE-2026-42248** (CVSS score: 7.7) — A missing signature verification flaw that fails to validate the update binary before installation, unlike the macOS version.
– **CVE-2026-42249** (CVSS score: 7.7) — A path traversal flaw caused by the Windows updater creating the local path for the installer’s staging directory directly from HTTP response headers without proper sanitization.
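The fix for the second flaw is a classic containment check: any file name derived from an untrusted HTTP header must be resolved and verified to stay inside the staging directory. The sketch below is illustrative (the staging directory name is invented), not Ollama’s code, but it shows the validation whose absence CVE-2026-42249 describes.

```python
from pathlib import Path

# Hypothetical staging location; a Windows updater would use a real
# per-user directory such as one under %LOCALAPPDATA%.
STAGING_ROOT = Path("ollama-update-staging")

def safe_staging_path(header_value: str) -> Path:
    """Resolve a file name taken from an HTTP response header against the
    staging directory, rejecting any value that would escape it.

    Joining an attacker-controlled string such as
    '../../../Start Menu/Programs/Startup/evil.exe' without this check is
    exactly the traversal pattern described in CVE-2026-42249. On Windows,
    backslash separators would need the same normalization and check.
    """
    candidate = (STAGING_ROOT / header_value).resolve()
    if STAGING_ROOT.resolve() not in candidate.parents:
        raise ValueError(f"path escapes staging directory: {header_value!r}")
    return candidate
```

Resolving the joined path first, then comparing it against the resolved root, catches both literal `..` segments and trickier encodings that normalize to a parent directory.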
To exploit these flaws, the attacker must control an update server accessible to the victim’s Ollama client. In that scenario, an arbitrary executable can be delivered as part of the update process and placed in the Windows Startup folder without triggering any signature verification alerts.
To control the update response, one method involves changing the OLLAMA_UPDATE_URL to direct the client to a local server over plain HTTP. The attack chain also assumes AutoUpdateEnabled is turned on, which is the default configuration.
Furthermore, the missing integrity check alone can lead to code execution without needing to exploit the path traversal flaw. In this case, the installer is placed in the expected staging directory. During the next launch from the Startup folder, the update process runs without re-checking the signature, causing the attacker’s code to execute instead.
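The principle behind the first flaw is that integrity must be re-verified at launch time, not only at download time. On Windows the real check would be an Authenticode signature validation; the sketch below substitutes a pinned digest from a trusted manifest to illustrate the same idea, so a binary swapped into the staging directory is rejected before it ever runs.

```python
import hashlib
import hmac
from pathlib import Path

def verify_staged_update(installer: Path, expected_sha256: str) -> bool:
    """Re-verify a staged installer immediately before executing it.

    Illustrative substitute for a signature check: compare the file's SHA-256
    against a digest obtained from a trusted, authenticated source. Verifying
    at launch time closes the window in which an attacker can replace the
    staged binary after the original download-time check has passed.
    """
    digest = hashlib.sha256(installer.read_bytes()).hexdigest()
    return hmac.compare_digest(digest, expected_sha256)
```

Only if this returns True should the updater hand the staged file to the OS for execution; a False result should quarantine the file and abort the update.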
However, the remote code execution is not permanent, as the next legitimate update overwrites the staged file. By incorporating the path traversal, a malicious actor can redirect the executable to be written outside the normal path and achieve persistent code execution.
According to CERT Polska, which managed the coordinated disclosure process, Ollama for Windows versions 0.12.10 through 0.17.5 are affected by these two vulnerabilities. In the meantime, users are advised to disable automatic updates and remove any existing Ollama shortcut from the Startup folder (“%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup”) to block the silent on-login execution pathway.
“Any Ollama for Windows installation running version 0.12.10 through 0.22.0 is vulnerable,” Dmitruk said. “The path traversal writes attacker-chosen executables into the Windows Startup folder. The missing signature verification keeps them there — the post-write cleanup that would remove unsigned files on a working updater does nothing on Windows. On the next login, Windows runs whatever was left behind.”
“The chain results in persistent, silent code execution at the privilege level of the user running Ollama. Realistic payloads include reverse shells, info-stealers that exfiltrate browser secrets and SSH keys, or droppers that move to additional persistence mechanisms. Anything that runs as the current user. Removing the dropped binary from the Startup folder ends the persistence, but the underlying flaws remain.”