TencentDB Agent Memory is now available as an open-source memory management framework for AI agents. Released under the permissive MIT license, it tackles a common pain point for developers building agents that operate over extended periods: the tendency for context to inflate uncontrollably and for agents to lose track of previously seen information.
The framework pairs a lightweight symbolic short-term memory component with a structured, multi-level long-term memory pipeline. It plugs directly into OpenClaw as a native plugin and connects to Hermes Agent via a Gateway adapter. Out of the box, it uses a local SQLite database enhanced with the sqlite-vec extension, meaning no third-party cloud APIs are needed to get started.
Why giving agents reliable memory remains a tough challenge
Conventional memory systems typically chop conversations into small pieces and toss them into a single undifferentiated vector database. Retrieval then amounts to a crude similarity search spanning thousands of disconnected snippets, with no higher-level organizational framework to guide the agent. TencentDB Agent Memory responds with two core ideas: memory tiering and symbolic representation.
A four-level semantic pyramid for structured recall
Rather than storing everything in one undifferentiated log, TencentDB Agent Memory arranges long-term personalization data into four distinct tiers: L0 Conversation, L1 Atom, L2 Scenario, and L3 Persona. These map to raw dialogue exchanges, individual facts, situational context blocks, and a consolidated user profile, respectively.
The Persona tier holds everyday user preferences and is always checked first during retrieval. The system only descends to the Atom or raw Conversation layers when more granular detail is required. This design keeps lower tiers as an evidence reservoir while upper tiers maintain an organized, high-level summary.
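The top-down lookup described above can be illustrated with a minimal Python sketch. Only the four tier names come from the article; the `TieredMemory` class, its keyword matching, and the `need_detail` switch are hypothetical stand-ins for illustration, not the framework's actual API.

```python
from dataclasses import dataclass, field

# Tier names follow the article; everything else is a hypothetical illustration.
TIERS_TOP_DOWN = ["L3 Persona", "L2 Scenario", "L1 Atom", "L0 Conversation"]


@dataclass
class TieredMemory:
    # Maps a tier name to the text entries stored at that tier.
    tiers: dict[str, list[str]] = field(
        default_factory=lambda: {name: [] for name in TIERS_TOP_DOWN}
    )

    def add(self, tier: str, text: str) -> None:
        self.tiers[tier].append(text)

    def recall(self, keyword: str, need_detail: bool = False) -> list[tuple[str, str]]:
        """Search tiers from the top down and stop at the first tier with a hit.

        With need_detail=True the search keeps descending, so lower tiers
        can supply supporting evidence for the summary found above them.
        """
        hits: list[tuple[str, str]] = []
        for tier in TIERS_TOP_DOWN:
            matches = [t for t in self.tiers[tier] if keyword.lower() in t.lower()]
            hits.extend((tier, t) for t in matches)
            if matches and not need_detail:
                break
        return hits


memory = TieredMemory()
memory.add("L3 Persona", "Prefers concise answers in Python")
memory.add("L1 Atom", "Uses Python 3.12 on macOS")
memory.add("L0 Conversation", "user: my python version is 3.12, on a Mac")

summary_only = memory.recall("python")                     # stops at the Persona tier
with_evidence = memory.recall("python", need_detail=True)  # descends through all tiers
```

A real implementation would presumably rank matches by relevance rather than by substring, but the control flow shows the idea: summaries answer first, and raw evidence is consulted only on demand.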
Data is stored using multiple formats suited to each type. Facts, logs, and execution traces reside in databases optimized for full-text search. Personas, scenarios, and task canvases are saved as readable Markdown files. All layered memory artifacts are kept under ~/.openclaw/memory-tdai/.
Compressing short-term context with Mermaid-based symbolic memory
Extended agent workflows burn through tokens quickly due to lengthy tool outputs, search results, source code, and error logs. TencentDB Agent Memory counters this by offloading and symbolically encoding intermediate state.
Detailed tool logs are written to external files stored under refs/*.md. Meanwhile, transitions between task states are captured using Mermaid diagram syntax within a compact task canvas. The agent then operates over this symbolic graph directly inside its active context window.
When the actual raw text becomes necessary, the agent looks up a node_id and fetches the matching file. Tencent’s engineering team characterizes this as a predictable top-down traversal — starting from a high-level symbol, navigating through a mid-level index, and finally reaching the underlying raw content.
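A rough Python sketch of this offload-and-resolve pattern follows. The article only states that logs live under refs/*.md and are looked up by node_id; the one-file-per-node naming scheme (`refs/<node_id>.md`), the function names, and the canvas layout are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def offload(refs_dir: Path, node_id: str, raw_output: str) -> Path:
    """Write a verbose tool output to refs/<node_id>.md (assumed naming scheme)."""
    refs_dir.mkdir(parents=True, exist_ok=True)
    path = refs_dir / f"{node_id}.md"
    path.write_text(raw_output, encoding="utf-8")
    return path


def render_canvas(nodes: dict[str, str], edges: list[tuple[str, str]]) -> str:
    """Render task states and their transitions as a Mermaid flowchart."""
    lines = ["graph TD"]
    lines += [f'    {node_id}["{label}"]' for node_id, label in nodes.items()]
    lines += [f"    {src} --> {dst}" for src, dst in edges]
    return "\n".join(lines)


def fetch(refs_dir: Path, node_id: str) -> str:
    """Resolve a node_id from the canvas back to the raw text it stands for."""
    return (refs_dir / f"{node_id}.md").read_text(encoding="utf-8")


refs = Path(tempfile.mkdtemp()) / "refs"
offload(refs, "n2", "FAILED test_api.py::test_login\n" * 200)  # long log stays on disk

canvas = render_canvas(
    {"n1": "Read issue", "n2": "Run tests (failed)", "n3": "Patch auth.py"},
    [("n1", "n2"), ("n2", "n3")],
)
raw_log = fetch(refs, "n2")  # fetched only when the agent needs the details
```

Only `canvas`, a few short lines, would sit in the context window; the multi-kilobyte log is read back solely when node `n2` needs inspection.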
Performance benchmarks
All benchmarks were conducted across sustained multi-task sessions rather than single isolated queries. In the SWE-bench evaluation, for instance, each session consists of 50 back-to-back tasks designed to stress-test how well the system handles growing context.
On WideSearch, adding the OpenClaw plugin boosted the pass rate from 33% to 50%, marking a 51.52% relative gain. Token consumption simultaneously dropped from 221.31M to 85.64M — a 61.38% reduction.
On SWE-bench, task success improved from 58.4% to 64.2%, while token usage fell from 3474.1M to 2375.4M, amounting to a 33.09% reduction. On AA-LCR, the success rate climbed from 44.0% to 47.5%, with tokens decreasing from 112.0M to 77.3M — a 30.98% cut.
For long-term memory accuracy, PersonaMem scores jumped from 48% to 76%. Keep in mind that all figures come from Tencent’s internal evaluations.
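The relative changes quoted in this section follow the usual formula, (after − before) / before. The short Python sketch below applies it to the figures as printed above; values recomputed from these rounded numbers may not match the quoted percentages exactly.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100


# (before, after) pairs copied from the figures reported in the text.
reported = {
    "WideSearch pass rate (%)": (33.0, 50.0),
    "WideSearch tokens (M)": (221.31, 85.64),
    "SWE-bench success (%)": (58.4, 64.2),
    "SWE-bench tokens (M)": (3474.1, 2375.4),
    "AA-LCR success (%)": (44.0, 47.5),
    "AA-LCR tokens (M)": (112.0, 77.3),
    "PersonaMem (%)": (48.0, 76.0),
}

for name, (before, after) in reported.items():
    print(f"{name}: {relative_change(before, after):+.2f}%")
```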
How recall and retrieval work
The default retrieval approach is a hybrid model. BM25 keyword matching is combined with vector embedding similarity, and results are merged using Reciprocal Rank Fusion (RRF). Developers can override this and select either pure keyword or embedding mode via a configuration setting. The BM25 tokenizer handles both Chinese (via jieba segmentation) and English text.
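Reciprocal Rank Fusion itself is simple to express: every ranked list contributes 1 / (k + rank) to a document's score, and documents are sorted by the summed score. The Python sketch below is a generic version of the technique, not the plugin's code; the constant k = 60 is the value commonly used in the RRF literature and is an assumption here, as are the memory ids.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_ranking = ["m3", "m1", "m7"]    # hypothetical keyword (BM25) results
vector_ranking = ["m1", "m9", "m3"]  # hypothetical embedding results

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
top_ids = [doc_id for doc_id, _ in fused[:5]]
```

Because RRF uses only rank positions, it sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales. Documents that appear in both lists, such as `m1` and `m3` here, rise above those found by only one retriever.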
By default, the system extracts L1 memories every five conversation turns and refreshes the user persona profile after every 50 new memories are created. Recall queries return up to five results with a 5-second timeout. If the timeout is exceeded, the system simply skips memory injection rather than holding up the conversation.
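These defaults amount to two counters and a guarded search call. The following Python sketch shows one way such behavior could be wired up with `asyncio.wait_for`; the numeric defaults are taken from the article, while the function names and the `search` callback are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

EXTRACT_EVERY_TURNS = 5       # from the article
PERSONA_EVERY_MEMORIES = 50   # from the article
MAX_RESULTS = 5               # from the article
RECALL_TIMEOUT_S = 5.0        # from the article


def should_extract(turn_count: int) -> bool:
    """True on every fifth conversation turn."""
    return turn_count > 0 and turn_count % EXTRACT_EVERY_TURNS == 0


def should_refresh_persona(new_memories: int) -> bool:
    """True once 50 new memories have accumulated since the last refresh."""
    return new_memories >= PERSONA_EVERY_MEMORIES


async def recall_or_skip(
    search: Callable[[str], Awaitable[list[str]]],
    query: str,
    timeout_s: float = RECALL_TIMEOUT_S,
) -> list[str]:
    """Return up to MAX_RESULTS memories, or [] if the search times out."""
    try:
        results = await asyncio.wait_for(search(query), timeout=timeout_s)
    except asyncio.TimeoutError:
        return []  # skip memory injection instead of blocking the reply
    return results[:MAX_RESULTS]


async def slow_search(query: str) -> list[str]:
    # Hypothetical backend that takes 0.2 s and finds eight candidates.
    await asyncio.sleep(0.2)
    return [f"memory {i} for {query}" for i in range(8)]


in_time = asyncio.run(recall_or_skip(slow_search, "deploy", timeout_s=1.0))
skipped = asyncio.run(recall_or_skip(slow_search, "deploy", timeout_s=0.01))
```

Returning an empty list on timeout means the conversation proceeds without injected memories, which trades recall quality for predictable latency.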
Getting started and developer tools
OpenClaw integration is delivered through a single npm package: @tencentdb-agent-memory/memory-tencentdb. The project requires Node.js 22.16 or above. A single configuration toggle activates everything, after which the plugin automatically manages conversation recording, memory extraction, scene aggregation, persona construction, and retrieval.
For Hermes users, a pre-built Docker image packages the agent, the plugin, and the TDAI Memory Gateway together. The default model configured is Tencent Cloud’s DeepSeek-V3.2, though any OpenAI-compatible API endpoint can be used by setting the MODEL_PROVIDER=custom flag.
Two tools are available to agents during an active session: tdai_memory_search and tdai_conversation_search. Each returns results containing node_id and result_ref fields to enable precise traceability. Tencent Cloud Vector Database (TCVDB) is also offered as an alternative backend to the default local SQLite setup.
Key Takeaways
- TencentDB Agent Memory is Tencent’s open-source (MIT-licensed) memory framework for AI agents, built around symbolic short-term memory and a layered long-term memory pipeline with zero external API dependencies.
- Long-term memory follows a 4-tier semantic pyramid (L0 Conversation → L1 Atom → L2 Scenario → L3 Persona), with drill-down navigation via node_id and result_ref rather than flat vector retrieval.
- Short-term memory offloads verbose tool logs to refs/*.md and retains only a compact Mermaid task canvas in context, significantly reducing token usage while maintaining full traceability.
- Reported improvements when integrated with OpenClaw: WideSearch pass rate climbs from 33% to 50% with a 61.38% reduction in tokens, SWE-bench improves from 58.4% to 64.2%, AA-LCR from 44.0% to 47.5%, and PersonaMem accuracy jumps from 48% to 76%.
- Distributed as an npm plugin for OpenClaw and a Docker image for Hermes, defaulting to local SQLite + sqlite-vec, hybrid BM25 + vector + RRF retrieval, with an optional Tencent Cloud Vector Database (TCVDB) backend.
Michal Sutter is a data science expert who holds a Master of Science in Data Science from the University of Padova. Equipped with a robust background in statistical analysis, machine learning, and data engineering, Michal specializes in transforming intricate datasets into meaningful, actionable insights.



