So, I picked up vibe coding again in early 2025 when I was trying to learn to make listed chatbots and fine-tuned Discord bots that mimic my friend's mannerisms. I discovered agentic coding when Claude Code launched and pretty much became an addict. It's all I did at night. Then I got into agents, and when ClawBot came out it was game over for me (or at least for my time). So I built one and started using it to code pretty much exclusively, using Discord to talk to it. I'm looking for a way out of my current job and I'm hoping this opens up some pathways.

Well, the night/early morning after Valentine's Day, when I was finally able to sneak away to my computer and build, I came back to a zombified agent and ended up losing more progress from the night before than I'd like to admit. (Turns out, when you use Discord as your sole means of communication, exporting your whole chat history, or even just telling the agent to read back to a certain timestamp, works very well for recovering lost memory.)

Anyway, I decided to look into ways to improve its memory, and stumbled across some Reddit posts and articles that seemed like a good place to start. I swapped my approach from a standard markdown file, stored every 4 hours and on command, to indexing memories, with the idea of building in a decay system for the memories plus recall and search functions. (Nothing new in the space, but it was fun to learn myself.)

That's how my first project was born: Antaris-Memory. It indexes memories by priority and uses local sharded JSONL storage. When it needs to recall something, it uses BM25 and decay-weighted search, and narrows down to the top 5-10 memories based on the context of the conversation. That was my first module.
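For the curious, the core idea (BM25 relevance, scaled by a priority weight and an exponential age decay, returning the top 5-10 hits) can be sketched in a few lines. This is a minimal illustration, not Antaris-Memory's actual API; the field names (`text`, `ts`, `priority`), the half-life, and the helper names are all my assumptions here.

```python
import math
import time
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Classic Okapi BM25 over tokenized docs (each doc is a list of terms)."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def recall(query, memories, now=None, half_life_days=7.0, top_k=5):
    """Rank memories by BM25 relevance, weighted by priority and age decay."""
    now = now or time.time()
    docs = [m["text"].lower().split() for m in memories]
    rel = bm25_scores(query.lower().split(), docs)
    ranked = []
    for m, r in zip(memories, rel):
        age_days = (now - m["ts"]) / 86400
        decay = 0.5 ** (age_days / half_life_days)  # exponential half-life decay
        ranked.append((r * m.get("priority", 1.0) * decay, m))
    ranked.sort(key=lambda x: x[0], reverse=True)
    return [m for score, m in ranked[:top_k] if score > 0]
```

The decay factor means a week-old memory counts half as much as a fresh one at equal relevance, so stale context naturally falls out of the top-k without ever being deleted.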
No RAG, no vector DB, just persistent file-based memory. Now I'm on v3.0 of antaris-suite: six Python packages that handle the infrastructure layer of an agent, covering memory, safety, routing, and context, using pipeline coordination and shared contracts. Zero external dependencies in the core packages. No pulling memories from the cloud, no using other LLMs to sort through them, no API keys, nothing. Which, it turns out, makes it insanely fast.

If you use OpenClaw: there's a local plugin.

**What each package actually does:**

**Antaris-Memory**
**Antaris-Guard**
**Antaris-Router**
**Antaris-Context**
**Antaris-Pipeline**
**Antaris-Contract**
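The "pipeline coordination and shared contracts" idea can be sketched like this: every stage reads and writes one shared turn object, and the guard can short-circuit the rest. A toy illustration under my own assumptions; the `Turn` fields, stage names, and blocking behavior are made up, not antaris-suite's real contract.

```python
from dataclasses import dataclass, field

# Shared "contract": every stage receives and returns the same turn object.
@dataclass
class Turn:
    user_msg: str
    recalled: list = field(default_factory=list)
    context: str = ""
    route: str = "default"
    blocked: bool = False
    log: list = field(default_factory=list)

def guard(turn: Turn) -> Turn:
    # Toy safety check: flag obviously dangerous requests.
    if "rm -rf" in turn.user_msg:
        turn.blocked = True
    turn.log.append("guard")
    return turn

def recall_stage(turn: Turn) -> Turn:
    turn.recalled = ["(top memories would be fetched here)"]
    turn.log.append("recall")
    return turn

def context_stage(turn: Turn) -> Turn:
    turn.context = "\n".join(turn.recalled) + "\n" + turn.user_msg
    turn.log.append("context")
    return turn

def route_stage(turn: Turn) -> Turn:
    turn.route = "code" if "python" in turn.user_msg.lower() else "chat"
    turn.log.append("route")
    return turn

def run_pipeline(turn: Turn, stages) -> Turn:
    for stage in stages:
        turn = stage(turn)
        if turn.blocked:
            break  # guard short-circuits the rest of the turn
    return turn
```

Because every stage speaks the same contract, stages can be reordered, swapped, or tested in isolation, which is the point of splitting the suite into separate packages.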
**Benchmarks (Mac Mini M4, 10-core, 32GB):**

The Antaris vs. mem0 numbers are a direct head-to-head on the same machine with a live OpenAI API key: 50 synthetic entries, varying corpus sizes (50, 100, 100,000, 500,000, 1,000,000), 10 runs averaged. Letta and Zep were measured separately (different methodology; see footnotes). Even with a full pipeline turn of guard + recall + context + routing + ingest, antaris was measured at the 1,000-memory corpus. mem0 figure = measured search p50 (193ms) + measured ingest per entry (312ms).

- LangChain ConversationBufferMemory: it's fast because it's a list append + recency retrieval, not semantic search. At 1,000+ memories it dumps everything into context. Not equivalent functionality.
- Zep Cloud: measured via cloud API from a DigitalOcean droplet (US-West region). Network-inclusive latency.
- Letta self-hosted: Docker + Ollama (qwen2.5:1.5b + nomic-embed-text) on the same DigitalOcean droplet. Each ingest generates an embedding via Ollama. Not a local in-process comparison.

Benchmark scripts are in the repo. For the antaris vs. mem0 numbers specifically, you can reproduce them yourself in about 60 seconds.

**Engineering decisions worth noting:**

- Storage is plain JSONL shards + a WAL. Readable, portable, no lock-in. At 1M entries, bulk ingest runs at ~11,600 items/sec with near-flat scaling (after the bulk_ingest fix).

GitHub:
Website:

Original README and the original idea for the architecture. At the time we believed this to be a novel solution to the Agent Amnesia problem; since then we've found that a lot of these ideas have been discussed before, though a good number haven't, like our Dream State Processing. Happy to answer questions about the architecture, the benchmark methodology, or anything that looks wrong. <3 Antaris

submitted by /u/fourbeersthepirates
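As a footnote on the storage design: "plain JSONL shards + a WAL" can be sketched as below. This is a toy illustration of the pattern, not the actual antaris implementation; the file names, shard layout, and class are all assumptions of mine.

```python
import json
import os

class JsonlStore:
    """Toy sharded JSONL store with a write-ahead log (WAL).
    Illustrative only; file layout and names are assumptions."""

    def __init__(self, root, shard_size=1000):
        self.root = root
        self.shard_size = shard_size  # records per shard file
        os.makedirs(root, exist_ok=True)
        self.wal_path = os.path.join(root, "wal.jsonl")
        self.count = 0

    def _shard_path(self, i):
        # e.g. shard-00000.jsonl holds records 0..shard_size-1
        return os.path.join(self.root, f"shard-{i // self.shard_size:05d}.jsonl")

    def append(self, record):
        # 1) durably log the record to the WAL first
        with open(self.wal_path, "a") as wal:
            wal.write(json.dumps(record) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        # 2) then apply it to the current shard
        with open(self._shard_path(self.count), "a") as shard:
            shard.write(json.dumps(record) + "\n")
        self.count += 1

    def scan(self):
        """Yield every record across shards, in insertion order."""
        for name in sorted(os.listdir(self.root)):
            if name.startswith("shard-"):
                with open(os.path.join(self.root, name)) as f:
                    for line in f:
                        yield json.loads(line)
```

The appeal of this layout is exactly what the post claims: every file is human-readable line-delimited JSON, a crash can be replayed from the WAL, and there's no database engine to depend on.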
![[D] antaris-suite 3.0 (open source, free) — zero-dependency agent memory, guard, routing, and context management (benchmarks + 3-model code review inside)](https://technologiesdigest.com/wp-content/uploads/2026/02/D-antaris-suite-30-open-source-free-—-zero-dependency-agent-memory.png)