"Memory OS: An Open-Source Framework Powering Intelligent Memory Management With Hermes Agent Integration"

Hermes Agent already supports persistent memory out of the box. The Nous Research open-source agent includes curated memory embeddings and full-text retrieval across sessions. But one community contributor argues that the native memory capabilities fall short for demanding workloads. Enter Memory OS, a new MIT-licensed library built by ClaudioDrews that organizes agent memory into six layers on top of Hermes. It introduces a vector database, structured fact storage, and a self-maintaining knowledge wiki. The project is still early but shows promise, and its architecture offers a glimpse of how future agent memory systems might be designed.

Memory OS

Memory OS isn’t a simple add-on for Hermes. It’s a full supplementary framework that runs in parallel with Hermes Agent’s built-in memory. Hermes natively uses workspace directories and a session log. Memory OS preserves those two and introduces four additional layers on top, for six in total. Everything runs locally on Docker, Qdrant, Redis, and Python 3.11+. It integrates with any LLM provider Hermes supports, including OpenRouter, OpenAI, Anthropic, and Ollama. The README describes it as a memory operating system, not just a single enhancement.

The Six Layers, From Files to Vectors

Layer 1 (Workspace): Stores MEMORY.md, USER.md, and CREATIVE.md, injecting their contents directly into the system prompt each turn.
Layer 2 (Sessions): Leverages state.db, a SQLite database with FTS5 full-text search spanning previous conversations.
Layer 3 (Structured Facts): Captures lasting knowledge in memory_store.db using SQLite, HRR, FTS5, and trust scores. A feedback loop adjusts those trust scores over time, and the layer also performs entity resolution.
Layer 4 (Fabric): A heavily modified fork of the Icarus Plugin. This adaptation adds LLM-driven session extraction not present in the upstream esaradev/icarus-plugin. Cross-session recall is managed through 16 tools, including fabric_recall, fabric_write, and fabric_brief.
Layer 5 (Vector Database): Powered by Qdrant. Uses 4096-dimensional dense vectors compared by cosine similarity, combined with BM25 sparse search for keyword-style matching.
Layer 6 (LLM Wiki): A continuously updated vault of topics, entities, and comparisons. The wiki is ingested back into Qdrant on an ongoing basis through a process named wiki-continuous-ingest.
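
As a rough illustration of the kind of lookup Layer 2 describes, the sketch below builds an in-memory SQLite FTS5 index and queries it. The table name messages, its columns, and both helper functions are hypothetical; the actual schema of state.db is not described here. It also assumes the local SQLite build was compiled with FTS5.

```python
import sqlite3


def build_session_index(rows):
    """Create an in-memory FTS5 table from (session_id, content) pairs.

    The table layout is an assumption for illustration, not the real
    state.db schema.
    """
    conn = sqlite3.connect(":memory:")
    # session_id is stored but excluded from the full-text index.
    conn.execute(
        "CREATE VIRTUAL TABLE messages USING fts5(session_id UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO messages (session_id, content) VALUES (?, ?)", rows
    )
    return conn


def search_sessions(conn, query, limit=5):
    """Return (session_id, content) rows matching the query, best match first."""
    cur = conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_session_index([
        ("s1", "We switched the vector store to Qdrant"),
        ("s2", "Lunch plans for Friday"),
        ("s3", "The Qdrant collection uses cosine distance"),
    ])
    for session_id, content in search_sessions(conn, "qdrant"):
        print(session_id, content)
```

Raw user text can contain characters that FTS5 treats as query syntax, so a real integration would likely sanitize or quote the query before passing it to MATCH.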

How the Retrieval Flow Works

The pipeline governs when memory is read and when it is stored. During pre_llm_call, Memory OS performs what it calls surgical recall: it queries four sources simultaneously (Fabric, Qdrant, Sessions, and Facts). Each source’s results must pass a relevance threshold before they reach the model. Per-session deduplication keeps the same context from surfacing twice. A social-noise filter skips trivial exchanges like a simple “thanks.” During post_llm_call and on_session_end, the system automatically extracts and records new insights. The primary goal is to keep token usage lean rather than flood the context window.
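
The gating steps described above can be sketched in a few lines. This is a minimal illustration, not the project's code: the names Hit, is_social_noise, and surgical_recall, the 0.5 threshold, and the noise phrase list are all assumptions, and the sources are queried one after another here rather than simultaneously.

```python
from dataclasses import dataclass

# Hypothetical list of messages that should not trigger any recall.
SOCIAL_NOISE = {"thanks", "thank you", "ok", "okay", "hi", "hello", "bye"}


@dataclass(frozen=True)
class Hit:
    source: str
    text: str
    score: float


def is_social_noise(message):
    """True for trivial exchanges such as a bare 'thanks'."""
    return message.strip().lower().rstrip("!.") in SOCIAL_NOISE


def surgical_recall(message, sources, seen, threshold=0.5):
    """Query each source, keep hits above the threshold, skip repeats.

    sources maps a source name to a callable returning (text, score) pairs.
    seen is a per-session set of texts already injected; it is updated in place.
    """
    if is_social_noise(message):
        return []
    kept = []
    for name, search in sources.items():
        for text, score in search(message):
            if score < threshold or text in seen:
                continue  # below the relevance gate, or already shown
            seen.add(text)
            kept.append(Hit(name, text, score))
    # Highest-scoring context first.
    return sorted(kept, key=lambda hit: hit.score, reverse=True)
```

Passing the same seen set on every turn of a session is what makes the deduplication per-session: a memory that was injected once is not injected again, even if another source returns it later.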

The Fallback Cascade and Cleanup

Layer 5’s retrieval follows a four-stage fallback chain. It attempts hybrid search first, then dense vectors, then lexical search, then SQLite. If one stage fails, the next takes over, so retrieval keeps working even when the vector database is degraded. Memory OS also runs a weekly decay scanner to phase out outdated entries, and semantic dedup merges nearly identical memories when their cosine similarity exceeds 0.92. These maintenance routines aim to prevent memory bloat over months of continuous use.
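
Both ideas in this section are small enough to sketch. The code below is an illustration under assumptions, not the project's implementation: retrieve_with_fallback and dedup_memories are hypothetical names, a stage counts as failed if it raises or returns nothing, and the dedup simply keeps the first memory of each near-duplicate group instead of merging their contents.

```python
import math


def retrieve_with_fallback(query, strategies):
    """Try each (name, search) stage in order and return the first result set.

    Assumption: a stage "fails" if it raises or returns an empty list.
    Returns (stage_name, results), or (None, []) if every stage fails.
    """
    for name, search in strategies:
        try:
            results = search(query)
        except Exception:
            continue  # e.g. the vector database is unreachable
        if results:
            return name, results
    return None, []


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedup_memories(memories, threshold=0.92):
    """Drop memories whose vector is nearly identical to one already kept.

    memories is a list of (text, vector) pairs. The 0.92 threshold comes
    from the article; keeping the first item is a simplification of "merge".
    """
    kept = []
    for text, vector in memories:
        if any(cosine(vector, kept_vec) > threshold for _, kept_vec in kept):
            continue
        kept.append((text, vector))
    return kept
```

The pairwise comparison is quadratic in the number of memories, which is fine for a sketch; a store of any real size would more plausibly ask the vector index for nearest neighbours instead.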

Local-First, And Deliberately So

Memory OS positions itself in contrast to cloud memory providers such as mem0, Zep, and Letta. Its pitch is that memory infrastructure should run entirely on your machine. Memory data stays local, and no subscription is required. API calls still go to whichever LLM provider you select. Hermes already integrates eight external memory partners, including mem0 and Honcho. Memory OS is not one of those official integrations. It’s an independent, community-built framework layered directly on Hermes. For teams subject to data-residency requirements, a local-first memory store can make a real difference.

Strengths and Limitations

Strengths:

Well-defined layered architecture that cleanly separates files, sessions, facts, vectors, and a wiki
Entirely local infrastructure with no cloud memory service dependency
Works with any LLM provider, matching Hermes Agent’s flexibility
Designed for token efficiency through gated retrieval and per-session deduplication

Limitations:

Very early stage, with limited commit history
A forked Icarus Plugin that the author confirms is not compatible with upstream
Heavier setup: Docker, Qdrant, Redis, and an ARQ Worker required
No public benchmarks for recall accuracy, response latency, or token savings

Key Takeaways

Memory OS is a community-built, MIT-licensed framework that organizes memory into six layers on top of Hermes Agent: the two Hermes already has, plus four new ones.
It integrates workspace files, FTS5 session search, trust-scored facts, a forked Icarus fabric, Qdrant vectors, and a self-maintaining LLM wiki.
Retrieval triggers on pre_llm_call with gated, deduplicated pulls from four sources; capture fires on post_llm_call and on_session_end.
The memory layer runs fully local and is provider-agnostic, though LLM API calls still route to your chosen provider.


The post Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent appeared first on MarkTechPost.
