Tencent Launches TencentDB Agent Memory: 4-Tier Local Memory Pipeline For AI Agents Unveiled

TencentDB Agent Memory is now available as an open-source memory management framework for AI agents. Released under the permissive MIT license, it tackles a common pain point for developers building agents that operate over extended periods: the tendency for context to inflate uncontrollably and for agents to lose track of previously seen information.

The framework pairs a lightweight symbolic short-term memory component with a structured, multi-level long-term memory pipeline. It plugs directly into OpenClaw as a native plugin and connects to Hermes Agent via a Gateway adapter. Out of the box, it uses a local SQLite database enhanced with the sqlite-vec extension, meaning no third-party cloud APIs are needed to get started.

Why giving agents reliable memory remains a tough challenge

Conventional memory systems typically chop conversations into small pieces and toss them into a single undifferentiated vector database. Retrieval then amounts to a crude similarity search spanning thousands of disconnected snippets, with no higher-level organizational framework to guide the agent. TencentDB Agent Memory instead builds on two ideas: memory tiering and symbolic representation.

A four-level semantic pyramid for structured recall

Rather than storing everything in one undifferentiated log, TencentDB Agent Memory arranges long-term personalization data into four distinct tiers: L0 Conversation, L1 Atom, L2 Scenario, and L3 Persona. These map to raw dialogue exchanges, individual facts, situational context blocks, and a consolidated user profile, respectively.

The Persona tier holds everyday user preferences and is always checked first during retrieval. The system only descends to the Atom or raw Conversation layers when more granular detail is required. This design keeps lower tiers as an evidence reservoir while upper tiers maintain an organized, high-level summary.
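The top-down lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation: the `TieredMemory` class, the keyword match, and the `min_hits` threshold are all hypothetical, and only the tier names come from the article.

```python
from dataclasses import dataclass, field

# Tier names follow the article; the data model itself is hypothetical.
TIER_ORDER = ["L3 Persona", "L2 Scenario", "L1 Atom", "L0 Conversation"]


@dataclass
class TieredMemory:
    # Maps tier name -> list of text entries (illustrative in-memory store).
    tiers: dict = field(default_factory=dict)

    def recall(self, keyword: str, min_hits: int = 1) -> list:
        """Walk tiers from Persona downward and stop once enough matches exist."""
        hits = []
        for tier in TIER_ORDER:
            for entry in self.tiers.get(tier, []):
                if keyword.lower() in entry.lower():
                    hits.append((tier, entry))
            if len(hits) >= min_hits:
                break  # an upper tier answered the query; lower tiers are skipped
        return hits


memory = TieredMemory(tiers={
    "L3 Persona": ["Prefers concise answers", "Works mostly in TypeScript"],
    "L1 Atom": ["Deployed service to staging on Friday"],
    "L0 Conversation": ["user: can you deploy the service to staging?"],
})

persona_hits = memory.recall("typescript")  # answered by the Persona tier
detail_hits = memory.recall("staging")      # falls through to the Atom tier
```

In this sketch the raw Conversation entry about staging is never read, because the Atom tier already supplied a match. That mirrors the article's point that lower tiers act as an evidence reservoir.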

Data is stored using multiple formats suited to each type. Facts, logs, and execution traces reside in databases optimized for full-text search. Personas, scenarios, and task canvases are saved as readable Markdown files. All layered memory artifacts are kept under ~/.openclaw/memory-tdai/.

Compressing short-term context with Mermaid-based symbolic memory

Extended agent workflows burn through tokens quickly due to lengthy tool outputs, search results, source code, and error logs. TencentDB Agent Memory counters this by offloading and symbolically encoding intermediate state.

Detailed tool logs are written to external files stored under refs/*.md. Meanwhile, transitions between task states are captured using Mermaid diagram syntax within a compact task canvas. The agent then operates over this symbolic graph directly inside its active context window.

When the actual raw text becomes necessary, the agent looks up a node_id and fetches the matching file. Tencent’s engineering team characterizes this as a predictable top-down traversal — starting from a high-level symbol, navigating through a mid-level index, and finally reaching the underlying raw content.
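As a rough illustration of that symbol-to-file lookup, the Python sketch below keeps a small Mermaid canvas in memory and resolves a node_id to an offloaded log. The canvas contents, the node ids, the `node_id:` header line, and the file name `0002.md` are all invented for the example; the article only states that logs live under refs/*.md and are found by node_id.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical task canvas: the node ids (n1, n2, n3) are invented for illustration.
TASK_CANVAS = """\
graph TD
    n1[Clone repo] --> n2[Run tests]
    n2 -->|2 failures| n3[Patch parser]
"""


def find_ref(data_dir: Path, node_id: str) -> Optional[str]:
    """Grep-style lookup: return the first refs/*.md file that mentions node_id."""
    for path in sorted((data_dir / "refs").glob("*.md")):
        text = path.read_text(encoding="utf-8")
        # Assumed convention: each offloaded log starts with a "node_id: <id>" line.
        if f"node_id: {node_id}" in text:
            return text
    return None


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "refs").mkdir()
    (data_dir / "refs" / "0002.md").write_text(
        "node_id: n2\n"
        "FAILED test_parser.py::test_nested\n"
        "FAILED test_parser.py::test_empty\n",
        encoding="utf-8",
    )
    raw_log = find_ref(data_dir, "n2")  # full log, fetched only when needed
    missing = find_ref(data_dir, "n9")  # unknown node: nothing to fetch
```

Only the short canvas would sit in the context window; the verbose test output stays on disk until the agent asks for node `n2`.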

Performance benchmarks

All benchmarks were conducted across sustained multi-task sessions rather than single isolated queries. In the SWE-bench evaluation, for instance, each session consists of 50 back-to-back tasks designed to stress-test how well the system handles growing context.

On WideSearch, adding the OpenClaw plugin boosted the pass rate from 33% to 50%, marking a 51.52% relative gain. Token consumption simultaneously dropped from 221.31M to 85.64M — a 61.38% reduction.
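For readers checking the arithmetic, the relative gain is the change divided by the baseline, which is why a 17-point absolute improvement reads as roughly 51.5%. The helper below is a generic calculation, not part of Tencent's tooling.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative change from baseline to improved, expressed in percent."""
    return (improved - baseline) / baseline * 100


# WideSearch pass rate: 33% baseline, 50% with the plugin.
widesearch_gain = round(relative_gain(33, 50), 2)
print(widesearch_gain)  # 51.52
```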

On SWE-bench, task success improved from 58.4% to 64.2%, while token usage fell from 3474.1M to 2375.4M, amounting to a 33.09% reduction. On AA-LCR, the success rate climbed from 44.0% to 47.5%, with tokens decreasing from 112.0M to 77.3M — a 30.98% cut.

For long-term memory accuracy, PersonaMem scores jumped from 48% to 76%. Keep in mind that all figures come from Tencent’s internal evaluations.

How recall and retrieval work

The default retrieval approach is a hybrid model. BM25 keyword matching is combined with vector embedding similarity, and results are merged using Reciprocal Rank Fusion (RRF). Developers can override this and select either pure keyword or embedding mode via a configuration setting. The BM25 tokenizer handles both Chinese (via jieba segmentation) and English text.
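Reciprocal Rank Fusion itself is simple enough to sketch. The Python version below shows the standard formula, where each document scores 1 / (k + rank) in every list it appears in. The constant k = 60 is the value commonly used in the RRF literature and is an assumption here, as are the memory ids; the article does not say what the plugin uses.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list using RRF scores."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by id to keep the order stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical ranked ids from a keyword (BM25) pass and an embedding pass.
bm25_ranking = ["m1", "m2", "m3"]
vector_ranking = ["m3", "m1", "m4"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
top_results = fused[:5]  # the article's default cap of five results
```

Items that rank well in both lists (`m1`, `m3`) rise above items that appear in only one, which is the main reason to fuse by rank rather than by raw scores that live on different scales.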

By default, the system extracts L1 memories every five conversation turns and refreshes the user persona profile after every 50 new memories are created. Recall queries return up to five results with a 5-second timeout. If the timeout is exceeded, the system simply skips memory injection rather than holding up the conversation.
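The skip-on-timeout behaviour can be illustrated with a small asyncio sketch. The function names and the stand-in search coroutines are hypothetical; only the defaults (five results, 5000 ms, extraction every five turns) come from the article.

```python
import asyncio


def should_extract_l1(turn_count: int, every_n: int = 5) -> bool:
    """True on every Nth conversation turn (N = 5 by default, per the article)."""
    return turn_count > 0 and turn_count % every_n == 0


async def recall_with_timeout(search, query, timeout_ms=5000, max_results=5):
    """Run a recall query, returning no memories if it exceeds the timeout."""
    try:
        results = await asyncio.wait_for(search(query), timeout=timeout_ms / 1000)
    except asyncio.TimeoutError:
        return []  # skip memory injection instead of stalling the conversation
    return results[:max_results]


async def fast_search(query):
    # Stand-in for a backend that answers immediately with eight candidates.
    return [f"memory about {query} #{i}" for i in range(8)]


async def slow_search(query):
    # Stand-in for a backend that is slower than the allowed timeout.
    await asyncio.sleep(0.2)
    return ["too late"]


async def demo():
    fast = await recall_with_timeout(fast_search, "deploys")
    slow = await recall_with_timeout(slow_search, "deploys", timeout_ms=20)
    return fast, slow


fast_results, slow_results = asyncio.run(demo())
```

The design choice here is to treat memory as best-effort: a late recall is dropped entirely, so the agent answers without injected context rather than waiting.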

Getting started and developer tools

OpenClaw integration is delivered through a single npm package: @tencentdb-agent-memory/memory-tencentdb. The project requires Node.js 22.16 or above. A single configuration toggle activates everything, after which the plugin automatically manages conversation recording, memory extraction, scene aggregation, persona construction, and retrieval.

For Hermes users, a pre-built Docker image packages the agent, the plugin, and the TDAI Memory Gateway together. The default model configured is Tencent Cloud’s DeepSeek-V3.2, though any OpenAI-compatible API endpoint can be used by setting the MODEL_PROVIDER=custom environment variable.

Two tools are available to agents during an active session: tdai_memory_search and tdai_conversation_search. Each returns results containing node_id and result_ref fields to enable precise traceability. Tencent Cloud Vector Database (TCVDB) is also offered as an alternative backend to the default local SQLite setup.

Marktechpost’s Visual Overview

TencentDB Agent Memory — Preview

01 / OVERVIEW

What is TencentDB Agent Memory?

An MIT-licensed memory framework for AI agents that pairs symbolic short-term memory with a four-stage long-term memory pipeline. Runs entirely locally with no reliance on external APIs.

Short-term memory

Shifts verbose tool logs to external files and maintains a compact Mermaid-based task canvas within the active context window.

Long-term memory

Distills conversations into a layered semantic pyramid spanning four tiers: L0 through L3.

Local backend

Ships with SQLite and sqlite-vec by default. The Tencent Cloud Vector Database (TCVDB) is available as an optional alternative.

Integrations

Available as an OpenClaw plugin and as a standalone Hermes Agent Docker image.

02 / ARCHITECTURE

The Four-Tier Semantic Pyramid

Long-term memory follows a layered design rather than a flat store. Higher tiers capture organized summaries; lower tiers retain granular evidence.

L3 · Persona: User profile stored as persona.md

L2 · Scenario: Situational blocks saved in Markdown format

L1 · Atom: Individual facts recorded as JSONL

L0 · Conversation: Unprocessed dialogue history

Drill-down path runs from Persona through Scenario and Atom to raw Conversation. Every reference carries a node_id and result_ref for reliable, deterministic traceback.

03 / SYMBOLIC SHORT-TERM

Mermaid task canvas + context offloading

Detailed intermediate logs are the biggest token drain during extended tasks. To fix this, the plugin saves those logs to disk and retains only a concise symbol graph within the active context.

How it works

Complete tool logs are saved to refs/*.md in the data directory.
State transitions are represented using Mermaid syntax within a lightweight task canvas.
The agent processes the symbol graph, then uses grep on a node_id to retrieve the original text when needed.

Files are stored on disk at ~/.openclaw/memory-tdai/. All artifacts are human-readable, enabling transparent, white-box debugging.

04 / INSTALL

Install the OpenClaw plugin

You will need Node.js 22.16 or later along with an existing OpenClaw setup.


openclaw plugins install @tencentdb-agent-memory/memory-tencentdb
openclaw gateway restart

Zero-config enable

To activate the plugin using the default SQLite + sqlite-vec backend, simply add the following snippet to ~/.openclaw/openclaw.json.

{
  "memory-tencentdb": {
    "enabled": true
  }
}

05 / CONFIGURATION

Daily-tuning parameters

Each field ships with a well-chosen default. The most frequently adjusted options are listed below.

Field	Default	Description
`storeBackend`	sqlite	Storage backend
`recall.strategy`	hybrid	keyword / embedding / hybrid (RRF)
`recall.maxResults`	5	Items returned per recall
`recall.timeoutMs`	5000	Skip injection on timeout
`pipeline.everyNConversations`	5	L1 extraction every N turns
`persona.triggerEveryN`	50	Generate persona every N memories
`offload.enabled`	false	Short-term compression toggle
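
To see how these defaults might look as a single config entry, the Python sketch below expands the dotted field names into nested objects and prints the resulting JSON. It assumes, based on the offload snippet elsewhere in this overview, that dotted names map to nested keys under a `config` object; the project's documentation should be checked before relying on that layout.

```python
import json

# Defaults copied from the table above, keyed by their dotted field names.
DEFAULTS = {
    "storeBackend": "sqlite",
    "recall.strategy": "hybrid",
    "recall.maxResults": 5,
    "recall.timeoutMs": 5000,
    "pipeline.everyNConversations": 5,
    "persona.triggerEveryN": 50,
    "offload.enabled": False,
}


def expand(flat: dict) -> dict:
    """Turn {"a.b": 1} into {"a": {"b": 1}} (assumed mapping, see lead-in)."""
    nested = {}
    for dotted, value in flat.items():
        node = nested
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Assumed layout: plugin entry with an "enabled" toggle and a "config" object.
plugin_entry = {"memory-tencentdb": {"enabled": True, "config": expand(DEFAULTS)}}
config_json = json.dumps(plugin_entry, indent=2)
print(config_json)
```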

06 / SHORT-TERM COMPRESSION

Enable Mermaid offloading (v0.3.4+)

Three straightforward steps to activate context offloading for long-horizon tasks.

Step 1 · Turn on offload in the plugin config

{
  "memory-tencentdb": {
    "config": {
      "offload": { "enabled": true }
    }
  }
}

Step 2 · Register the slot so OpenClaw routes offload requests correctly

{
  "plugins": {
    "slots": {
      "contextEngine": "openclaw-context-offload"
    }
  }
}

Step 3 · Run the runtime patch once per OpenClaw installation

bash scripts/openclaw-after-tool-call-messages.patch.sh

07 / HERMES DOCKER

Run memory-enabled Hermes in a single container

One Docker image includes Hermes Agent, the memory_tencentdb plugin, and the TDAI Memory Gateway.


docker build -f Dockerfile.hermes -t hermes-memory .


docker run -d \
  --name hermes-memory \
  --restart unless-stopped \
  -p 8420:8420 \
  -e MODEL_API_KEY="your-api-key" \
  -e MODEL_BASE_URL="your-openai-compatible-base-url" \
  -e MODEL_NAME="deepseek-v3.2" \
  -e MODEL_PROVIDER="custom" \
  -v hermes_data:/opt/data \
  hermes-memory


curl

Any OpenAI-compatible endpoint can be used by setting MODEL_PROVIDER=custom. Memory data is persisted in the hermes_data volume.

08 / AGENT TOOLS & RECALL

What the agent sees

The agent has access to two tools during a session. By default, retrieval combines BM25, vector search, and RRF fusion.

tdai_memory_search

Searches across L1 Atoms, L2 Scenarios, and L3 Persona.

tdai_conversation_search

Searches through raw L0 Conversation history.

Retrieval defaults

Hybrid strategy combining BM25 keyword matching and vector embeddings, merged via Reciprocal Rank Fusion.
The BM25 tokenizer handles both Chinese (jieba) and English text.
Returns up to 5 results per recall with a 5000 ms timeout; on timeout, injection is skipped.
Each result includes a node_id and result_ref for full traceback capability.

09 / BENCHMARKS

Reported gains with OpenClaw

Results are measured across continuous long-horizon sessions, not isolated single turns. SWE-bench evaluates 50 consecutive tasks per session.

Benchmark	Baseline	With Plugin	Δ Pass	Δ Tokens
WideSearch	33%	50%	+51.52%	−61.38%
SWE-bench	58.4%	64.2%	+9.93%	−33.09%
AA-LCR	44.0%	47.5%	+7.95%	−30.98%
PersonaMem	48%	76%	+59%	—

These figures come from Tencent’s own evaluations and reflect integration with OpenClaw.

10 / RESOURCES

Where to go next

Links to documentation, source code, and community channels.

Source code

github.com/Tencent/TencentDB-Agent-Memory

npm package

@tencentdb-agent-memory/memory-tencentdb

Roadmap

Portable memory, automatic Skill generation, visual debugging dashboard.


Key Takeaways

TencentDB Agent Memory is Tencent’s open-source (MIT-licensed) memory framework for AI agents, built around symbolic short-term memory and a layered long-term memory pipeline with zero external API dependencies.
Long-term memory follows a 4-tier semantic pyramid (L0 Conversation → L1 Atom → L2 Scenario → L3 Persona), with drill-down navigation via node_id and result_ref rather than flat vector retrieval.
Short-term memory offloads verbose tool logs to refs/*.md and retains only a compact Mermaid task canvas in context, significantly reducing token usage while maintaining full traceability.
Reported improvements when integrated with OpenClaw: WideSearch pass rate climbs from 33% to 50% with a 61.38% reduction in tokens, SWE-bench improves from 58.4% to 64.2%, AA-LCR from 44.0% to 47.5%, and PersonaMem accuracy jumps from 48% to 76%.
Distributed as an npm plugin for OpenClaw and a Docker image for Hermes, defaulting to local SQLite + sqlite-vec, hybrid BM25 + vector + RRF retrieval, with an optional Tencent Cloud Vector Database (TCVDB) backend.


Michal Sutter is a data science expert who holds a Master of Science in Data Science from the University of Padova. Equipped with a robust background in statistical analysis, machine learning, and data engineering, Michal specializes in transforming intricate datasets into meaningful, actionable insights.
