Prepared: April 04, 2026 · For: Hermes Agent (Nous Research / OpenClaw) · Providers Analyzed: 7
1. Executive Summary & Rankings
AI agents are fundamentally stateless — they don’t remember anything between sessions unless given that capacity. Memory providers solve this by extracting, storing, and retrieving relevant context from past interactions. They are one of the highest-leverage upgrades you can give an agent.
Below is the ranking by popularity (GitHub stars, downloads, community activity, and market presence as of April 2026), followed by a deep dive into each.
#1 Mem0 · 51.9k GitHub stars
#2 OpenViking · 20.9k GitHub stars
#3 ByteRover · ~3.7k GitHub stars
#4 Hindsight · Vectorize.io backed
#5 Honcho · Plastic Labs · $5.4M raised
#6 RetainDB · SOTA on LongMemEval
#7 Holographic · Academic / niche
Key Takeaway
The gap between Mem0 and everything else is massive — over 50k stars versus 20k for the next contender. However, raw popularity does not mean best fit. Your use case (self-hosted, multi-platform agent) aligns most naturally with OpenViking (already running), while Mem0 and Hindsight offer the richest features if you’re willing to adopt a cloud or managed service.
2. Side-by-Side Comparison
| Provider | Type | Storage | License | Cost | Key Differentiator |
|---|---|---|---|---|---|
| Mem0 | Cloud · Open-core | Pluggable (your own vector DB) | Apache 2.0 | Free tier; paid scaling | Largest ecosystem, framework-agnostic |
| OpenViking | Self-hosted | Local filesystem + SQLite | AGPL-3.0 | Free | Filesystem paradigm, tiered L0/L1/L2 |
| ByteRover | Local · SaaS | Local knowledge tree | Proprietary | Free CLI; paid cloud | Pre-compression extraction, coding focus |
| Hindsight | Local · Cloud | PostgreSQL + pgvector | Proprietary | Free tier; paid plans | Biomimetic memory, 3-layer model, knowledge graph |
| Honcho | Self-hosted + Cloud | PostgreSQL + pgvector | AGPL-3.0 | $100 free credits; paid after | User/agent modeling (Peer Paradigm) |
| RetainDB | Managed cloud | Proprietary cloud storage | Apache 2.0 (SDK) | Free 10k ops/mo; paid beyond | SOTA on LongMemEval benchmark |
| Holographic | Local only | SQLite + numpy | Built into Hermes | Free | HRR algebraic vectors, trust scoring |
3. Mem0 (mem0ai/mem0)
Mem0
51.9k stars · 5.8k forks · 296 contributors
What Is It?
Mem0 (pronounced “mem-zero”) is the most popular AI memory layer in the industry. It acts as a universal, self-improving memory layer that sits between your application and the LLM. When a user sends a message, Mem0 automatically extracts facts, stores them, and retrieves only the most relevant ones at the next turn. It is offered in open-source and managed tiers, with SDKs in Python and TypeScript.
Add: After each conversation turn, Mem0 passes the message history through an LLM (default: gpt-4.1-nano) to extract atomic facts from the conversation.
Store: These facts are embedded and stored in your chosen vector database. Mem0 de-duplicates, merges, and updates conflicting facts automatically.
Search: At the start of each new turn, Mem0 searches the vector DB for relevant memories based on the current query. Results are optionally reranked.
Inject: Retrieved memories are injected into the system prompt, giving the LLM cross-session context without blowing the context window.
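The four-step loop above can be sketched end to end. This is a toy illustration, not Mem0's actual API: the LLM extractor is replaced by hard-coded facts, and the embeddings by bag-of-words vectors.

```python
# Toy sketch of the add -> store -> search -> inject loop described above.
# All names here are illustrative, not Mem0's real classes or methods.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyMemory:
    def __init__(self):
        self.store = []  # list of (fact, vector) pairs

    def add(self, facts):
        """Store: embed each extracted fact, skipping exact duplicates."""
        for f in facts:
            if all(f != existing for existing, _ in self.store):
                self.store.append((f, embed(f)))

    def search(self, query, k=2):
        """Search: return the k stored facts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.store, key=lambda fv: cosine(q, fv[1]), reverse=True)
        return [f for f, _ in ranked[:k]]

    def inject(self, query, system_prompt):
        """Inject: prepend retrieved memories to the system prompt."""
        mems = self.search(query)
        return system_prompt + "\nRelevant memories:\n" + "\n".join(f"- {m}" for m in mems)

mem = ToyMemory()
mem.add(["user prefers dark mode", "user lives in Berlin"])  # "extracted" facts
prompt = mem.inject("what theme does the user like", "You are a helpful agent.")
```

In the real system the `add` step is driven by an LLM extraction pass and the vectors live in your chosen vector database; the control flow, however, is the same.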
Hermes Agent Integration
pip install mem0ai
# Set env: MEM0_API_KEY=sk-... (for managed service)
# Or: configure your own vector store for self-hosted mode
Pros
Largest community and ecosystem (51.9k stars)
Pluggable storage — use your own vector DB
Python SDK, TypeScript SDK, CLI, and MCP server
Native integrations with LangGraph, CrewAI, Vercel AI SDK, Claude Code, Cursor, Codex
Multi-level memory: User, Session, Agent scopes
AWS-backed credibility
Benchmarked: +26% accuracy vs. OpenAI Memory, 90% lower token usage
Cons
Managed Cloud API is a black box — you don’t control extraction logic
Free tier has usage limits; costs scale with conversations
Self-hosted mode requires you to manage a vector DB
Fact extraction quality depends on the underlying LLM
Can lose nuance when compressing complex user preferences into flat facts
Verdict: The safe default choice. If you want memory that just works with the largest community backing it, pick Mem0. Best for production teams that need integrations across multiple frameworks.
4. OpenViking (volcengine/OpenViking)
OpenViking
20.9k stars · 1.5k forks
What Is It?
OpenViking is an open-source context database built by ByteDance’s Volcengine team (the TikTok parent company’s cloud division). It’s the only provider that uses a filesystem paradigm — everything is organized into a virtual viking:// directory tree that agents can browse, search, and navigate hierarchically. You’re already running it on this machine.
Architecture
Agent
|
| viking://resources/memories/profile/
| viking://resources/memories/preferences/
| viking://resources/memories/entities/
| viking://resources/memories/events/
| viking://resources/memories/cases/
| viking://resources/memories/patterns/
v
+---------------------+
| OpenViking Server   |
|   viking:// tree    |
+---------------------+
     |          |
     v          v
 Local FS   SQLite + embeddings
How It Works
Ingest: Resources (URLs, documents, conversation turns) are added via ov add-resource or programmatically. They are automatically parsed, embedded, and placed into a directory structure.
Tiered Context: each resource is exposed at multiple levels of detail (the L0/L1/L2 tiers), so agents load only as much as the task needs:
L1 (Overview) — ~2,000 tokens, core info for planning
L2 (Full) — complete data, loaded on demand
Directory-Guided Search: Instead of flat vector search, OpenViking first identifies the most relevant directory, then drills down progressively. This significantly reduces token waste.
Session End Extraction: After conversations end, the system asynchronously extracts long-term memories (preferences, entities, events) into the appropriate category.
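The directory-then-drill-down idea behind directory-guided search can be sketched with a tiny in-memory tree. Word-overlap scoring stands in for real embeddings; the directory names and function are illustrative, not OpenViking's API.

```python
# Sketch of directory-guided search: score directories first, then run a
# ranked search only inside the winning directory. This avoids scoring
# every document in the store, which is where the token savings come from.
def overlap(query: str, text: str) -> int:
    """Crude relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def directory_guided_search(tree: dict, query: str, k: int = 1):
    # Step 1: pick the directory whose best entry matches the query most.
    best_dir = max(tree, key=lambda d: max((overlap(query, e) for e in tree[d]), default=0))
    # Step 2: drill down and rank entries within that directory only.
    ranked = sorted(tree[best_dir], key=lambda e: overlap(query, e), reverse=True)
    return best_dir, ranked[:k]

tree = {
    "memories/preferences": ["user prefers concise answers", "user likes dark mode"],
    "memories/events": ["user booked a flight to Tokyo on March 3"],
}
where, hits = directory_guided_search(tree, "does the user prefer concise answers")
```

A flat vector search would have embedded and compared every entry; here only the winning branch is examined in detail, mirroring the L0/L1/L2 progressive-loading idea.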
Benchmark Results (LoCoMo10 Dataset)
| Setup | Task Completion Rate | Token Cost (Input) |
|---|---|---|
| OpenClaw (Native Memory) | 35.65% | 24.6M |
| OpenClaw + LanceDB | 44.55% | 51.6M |
| OpenClaw + OpenViking | 52.08% | 4.3M |
Hermes Agent Integration
pip install openviking
# Set memory.provider: openviking in config.yaml
# Server: openviking-server
# Already running at http://localhost:1933
Pros
Completely free and fully self-hosted
83-91% reduction in token costs vs. naive approaches (benchmark-backed)
Filesystem organization is intuitive for human debugging
Backed by ByteDance — serious engineering resources
Already running in your environment
Automatic session-end memory extraction
Cons
Must manage your own server infrastructure
Requires a VLM and embedding provider setup (config-heavy)
Browser tools are broken (GitHub issue #4740)
Smaller community than Mem0 (though still substantial)
AGPL-3.0 license can be restrictive for commercial use
Primarily designed for OpenClaw — Hermes integration is newer
Verdict: The best token-efficiency choice with excellent structured organization. Backed by ByteDance, fully free, and already running for you. Downsides are infrastructure overhead and a broken browser interface.
5. ByteRover (byterover-cli/brv)
ByteRover
~3.7k stars · 368 forks
What Is It?
ByteRover is a CLI-based knowledge tree purpose-built for AI coding agents. Its unique claim is pre-compression extraction — capturing key insights from long conversations before the LLM’s context window compresses or discards them. It also recently launched Cipher, an open-source memory layer specifically for coding IDEs.
How It Works
Extract: During a long coding session, ByteRover runs parallel to the conversation, extracting project-specific knowledge, user preferences, architectural decisions, and patterns.
Structure: These facts are organized into a hierarchical knowledge tree rather than a flat vector store.
Store: The tree is persisted locally in a portable format.
Retrieve: When starting a new session, the agent can load relevant branches from the knowledge tree at context window start.
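The pre-compression idea can be sketched as a hook that fires just before the window would be compacted. The token budget, threshold, and keyword-based extractor below are stand-ins for ByteRover's proprietary internals.

```python
# Sketch of pre-compression extraction: watch the running token count and
# pull durable facts out of the history BEFORE compaction discards them.
TOKEN_BUDGET = 100   # pretend context window (tokens approximated as words)
COMPRESS_AT = 0.8    # extract once we reach 80% of the budget

def tokens(messages):
    return sum(len(m.split()) for m in messages)

def extract_knowledge(messages):
    """Stand-in extractor: keep lines that look like durable decisions."""
    markers = ("decided", "prefers", "always", "never")
    return [m for m in messages if any(w in m.lower() for w in markers)]

knowledge_tree = {"decisions": []}
history = []

def on_message(msg):
    history.append(msg)
    if tokens(history) >= COMPRESS_AT * TOKEN_BUDGET:
        # Capture insights before the window is compacted/truncated.
        knowledge_tree["decisions"].extend(extract_knowledge(history))
        del history[:-1]  # simulate compaction: keep only the tail message

for m in ["we decided to use PostgreSQL for storage",
          "here is a long stack trace " + "x " * 70,
          "the user prefers tabs over spaces"]:
    on_message(m)
```

The key property is ordering: extraction runs on the full history while it still exists, so the architectural decision survives even though the noisy stack trace forces a compaction.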
Hermes Agent Integration
npm install -g byterover-cli # CLI: brv
# or use the cloud service
Pros
Pre-compression extraction is a genuinely novel idea — capture before the window drops it
Portable between IDEs (VS Code, Cursor, Claude Code, neovim)
CLI-first design fits naturally into terminal-based workflows
Cipher extends it to a shared team memory concept
Local storage option available — no cloud dependency
Cons
Narrowly focused on coding agents — less applicable to general-purpose assistants
Smaller community than major alternatives
Proprietary license for the core product
Newer and less battle-tested than Mem0/OpenViking
Hermes Agent integration is thin compared to other providers
Verdict: Niche but interesting. The pre-compression extraction concept is clever and valuable for long coding sessions. Best for developers who switch between IDEs and want persistent coding context. Not a great fit for general-purpose agents.
6. Hindsight (vectorize-io/hindsight)
Hindsight
Vectorize.io backed · SOTA on LongMemEval
What Is It?
Hindsight is an agent memory system built by Vectorize.io that structures memory around how human memory actually works. Instead of just storing facts, it uses a three-layer biomimetic model (World, Experiences, Mental Models) and a knowledge graph to create connections between disparate pieces of information. It has a published arXiv paper and scored state-of-the-art on the LongMemEval benchmark.
Architecture — The Biomimetic Model
┌──────────────────────────────────────┐
│ MENTAL MODELS │ ← Abstracted insights formed by
│ (How things work) │ reflecting on raw data
├──────────────────────────────────────┤
│ EXPERIENCES │ ← The agent's own past actions
│ (What I did, what happened) │ and their outcomes
├──────────────────────────────────────┤
│ WORLD │ ← General facts and knowledge
│ (Static facts) │ about the external world
└──────────────────────────────────────┘
How It Works
Retain (Store): An LLM extracts facts from conversations and stores them with context, timestamps, and entity labels into PostgreSQL with PgVector.
Recall (Retrieve): Runs four retrieval strategies in parallel: Semantic (Vector), Keyword (BM25), Graph (Entity/Causal), and Temporal (Time-range). Results are combined via reciprocal rank fusion with cross-encoder reranking.
Reflect (Learn): A unique capability — Hindsight can analyze its stored memories to form new insights, discover patterns, and update its own mental models without new external input.
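The rank-merging in the Recall step can be sketched with reciprocal rank fusion (RRF): each strategy returns a ranked list, and a document's fused score is the sum of 1/(k + rank) over every list that contains it. The constant k=60 is the conventional RRF default, not necessarily Hindsight's setting, and the cross-encoder reranking stage is omitted.

```python
# Sketch of fusing the four parallel retrieval strategies with RRF.
def reciprocal_rank_fusion(rankings, k=60):
    """Combine ranked lists: score(doc) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]   # vector search
keyword  = ["doc_b", "doc_a"]            # BM25
graph    = ["doc_b", "doc_d"]            # entity/causal hops
temporal = ["doc_c", "doc_b"]            # time-range filter

fused = reciprocal_rank_fusion([semantic, keyword, graph, temporal])
```

A document that appears high in several lists (doc_b here) outranks one that tops a single list, which is why RRF is a robust way to merge heterogeneous retrievers without score calibration.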
Pros
SOTA on the LongMemEval benchmark
Available as Docker (fully local) and cloud managed
Published academic paper adds credibility
Built-in UI (Web dashboard)
Integrations for Claude Code, Telegram, Paperclip
Cons
Complex setup — requires heavy ML dependencies (Torch, Transformers) for full features
Proprietary license (not open source in the traditional sense)
Heavier resource requirements than simpler providers
Knowledge graph can be overkill for simple memory needs
Paid managed tier beyond free limits
Confusing repo situation (two different GitHub repos exist)
Verdict: The most intellectually ambitious option. If you want an AI that learns rather than just retrieves, Hindsight is the pick. The reflect() capability and biomimetic model are genuinely ahead of the curve. But it's complex to run and proprietary.
7. Honcho (plastic-labs/honcho)
Honcho
Plastic Labs · $5.4M pre-seed
What Is It?
Honcho is an open-source memory library and managed service designed around the Peer Paradigm — both humans and AI agents are treated as “Peers,” each with their own evolving identity profile. Rather than just storing facts, Honcho builds psychological models of users and agents, tracking learning styles, communication preferences, and behavioral patterns over time. It is built by Plastic Labs, which has raised $5.4M in pre-seed funding.
How It Works
Ingest: Messages are logged to sessions between Peers (user and agent).
Derive: A background worker (“Deriver”) asynchronously analyzes the conversation to update:
Representations — evolving psychological profiles of each Peer
Summaries — compressed summaries of sessions
Conclusions — dialectic reasoning about what the user actually wants
Retrieve: Natural language queries to a “Peer Oracle” — e.g., alice.chat("What learning styles does the user respond to best?") — to hydrate prompts with deeply personalized context.
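The Deriver loop above can be sketched as a pure fold over incoming messages. The attribute names and pattern rules below are illustrative only; Honcho's real representations are LLM-derived, not keyword-matched.

```python
# Toy sketch of a background "Deriver": fold each message into an
# evolving per-Peer representation without mutating the previous state.
def derive(representation: dict, message: str) -> dict:
    """Fold one message into a Peer's evolving representation."""
    rep = dict(representation)  # pure update: never mutate the input
    rep["message_count"] = rep.get("message_count", 0) + 1
    low = message.lower()
    if "prefer" in low or "like" in low:
        # Candidate preference statement; a real Deriver would use an LLM.
        rep.setdefault("preferences", []).append(message)
    if low.rstrip().endswith("?"):
        rep["questions_asked"] = rep.get("questions_asked", 0) + 1
    return rep

rep = {}
for msg in ["I prefer short answers", "Can you show an example?"]:
    rep = derive(rep, msg)
```

Because derivation is asynchronous in Honcho, a loop like this runs off the hot path: the agent replies immediately, and the representation catches up in the background.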
Pros
Open-source core with a managed service option
Psychological modeling of both users and agents (the Peer Paradigm)
Natural-language retrieval via the Peer Oracle
Asynchronous background derivation keeps the response path fast
Cons
User modeling may feel intrusive or unnecessary for simple tasks
AGPL-3.0 license restricts closed-source commercial use unless you adopt the managed service
Smaller community footprint than Mem0/OpenViking
Heavily opinionated approach — not a generic memory store
$100 free credits run out; managed service pricing kicks in
Verdict: The most opinionated provider. If you want deeply personalized agents that understand how you think, not just what you said, Honcho is unmatched. But it requires significant setup infrastructure and is overkill if you don't need personality modeling.
8. RetainDB
RetainDB
SOTA on LongMemEval · Managed Cloud
What Is It?
RetainDB is a managed cloud memory service that aims to be the memory layer for AI agents, requiring only two API calls. It scored state-of-the-art on the LongMemEval benchmark (88% preference recall, tied with Hindsight), and its SDK, which includes a Vercel AI SDK wrapper, enables memory integration in under 30 seconds.
How It Works
Context Query: POST /v1/context/query retrieves relevant memories and injects them into the system prompt.
LLM Generation: Your LLM generates a response with the enriched context.
Learn: POST /v1/learn stores the interaction for future sessions.
RetainDB uses hybrid search (Vector + BM25) and delta compression for storage efficiency. It supports 7 memory types and works with any LLM provider.
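The two-call loop can be sketched as plain request payloads. The endpoint paths come from the steps above, but the JSON field names are assumptions for illustration, not RetainDB's published schema.

```python
# Sketch of the query -> generate -> learn loop as request payloads.
# Field names ("user_id", "query", "messages") are assumed, not documented.
def build_context_query(user_id: str, message: str) -> tuple[str, dict]:
    """Call 1: fetch relevant memories before generation."""
    return "/v1/context/query", {"user_id": user_id, "query": message}

def build_learn(user_id: str, message: str, reply: str) -> tuple[str, dict]:
    """Call 2: store the finished interaction for future sessions."""
    return "/v1/learn", {
        "user_id": user_id,
        "messages": [{"role": "user", "content": message},
                     {"role": "assistant", "content": reply}],
    }

path, body = build_context_query("hermes", "what city am I in?")
# ...POST body, inject the returned memories into the system prompt,
# generate a reply with your LLM of choice, then:
path2, body2 = build_learn("hermes", "what city am I in?", "Berlin.")
```

Everything between the two calls (retrieval injection and generation) stays in your application, which is what makes the service LLM- and framework-agnostic.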
Pricing
| Tier | Price | Limits |
|---|---|---|
| Free | $0/mo | 10,000 operations/month |
| Paid | $20/mo | Higher limits, priority access |
Pros
Extremely easy to integrate — 2 API calls, 30-second setup
Free tier covers 10k operations/month — generous for personal use
Works with any LLM and framework
Starter templates for Next.js, Express, Python, LangChain
Cons
Cloud-only — no self-hosted option at all
$20/mo paid tier is relatively expensive compared to self-hosted alternatives
Smallest community and public footprint of any provider
Less control over data — privacy implications for sensitive use cases
SDK is primarily TypeScript-focused; Python support is thinner
Managed service = vendor lock-in risk
Verdict: The quickest to integrate but the least flexible. If you want a managed solution that just works with great benchmarks and don't mind the cloud dependency, RetainDB is straightforward. But it's the most vendor-locked option.
9. Holographic (Built into Hermes Agent)
Holographic Memory
Academic/niche · Zero external stars
What Is It?
Holographic memory is a fully local, zero-dependency SQLite-based memory plugin built directly into the Hermes Agent source code. It uses Holographic Reduced Representations (HRR), an academic approach from cognitive science (Tony Plate, 1995) that performs symbolic AI on top of real-valued vectors. Memories are stored as compressed holographic vectors that can be combined (superposition) and extracted (unbinding) algebraically.
How It Works
HRR is a mathematical framework for storing composite memories in a single vector:
Binding: Two vectors A and B are combined via circular convolution to produce C = A ⊛ B (analogous to "key-value" association).
Superposition: Multiple bound pairs are added together to compound information: V = A₁⊛B₁ + A₂⊛B₂ + ... + Aₙ⊛Bₙ
Unbinding: To retrieve B given A, we unbind: B ≈ A* ⊛ V (where A* is the inverse/approximate inverse).
Trust Scoring: Each memory is scored based on user feedback (+0.05 for helpful, -0.10 for unhelpful), influencing retrieval ranking.
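The binding, superposition, and unbinding operations can be demonstrated directly with numpy: binding is a circular convolution (computed in the Fourier domain), and A* is the index-reversed approximate inverse. The dimension and similarity thresholds here are illustrative, not the plugin's actual settings.

```python
# Worked sketch of the HRR algebra described above.
import numpy as np

rng = np.random.default_rng(0)
D = 1024  # vector dimension (illustrative)

def rand_vec():
    """Random roughly-unit vector; HRR items are dense real vectors."""
    v = rng.normal(0, 1 / np.sqrt(D), D)
    return v / np.linalg.norm(v)

def bind(a, b):
    """C = A circ-conv B, computed via FFT for O(D log D) cost."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=D)

def inverse(a):
    """Approximate inverse A*: keep a[0], index-reverse the rest."""
    return np.concatenate(([a[0]], a[:0:-1]))

def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

name, city = rand_vec(), rand_vec()
likes, food = rand_vec(), rand_vec()

# Superposition: two key-value bindings added into one memory trace.
trace = bind(name, city) + bind(likes, food)

# Unbinding: convolving with the key's approximate inverse recovers a
# noisy copy of the bound value; the other binding contributes noise.
retrieved = bind(inverse(name), trace)
```

`retrieved` has high cosine similarity to `city` and near-zero similarity to `food`, which is exactly the interference trade-off noted in the Cons below: each additional superposed pair adds noise, so recall degrades as the trace fills up.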
Hermes Agent Integration
# No external package needed — it's in the Hermes source tree.
# Enable via: hermes memory setup # select "holographic"
Pros
Minimal dependencies: only numpy on top of the bundled SQLite store
100% local — no API keys, no cloud, no network calls
Unique algebraic query capabilities:
probe — entity-specific recall
reason — compositional AND queries across entities
contradict — automated conflict detection
Trust scoring with feedback-driven weights is novel
Privacy-maximal — data never leaves your machine
Already shipped with Hermes Agent
Cons
Smallest community — essentially just one plugin maintainer
HRR is an academic approach with a steep learning curve
Scaling to thousands of memories degrades recall quality (interference)
Not well-documented outside the Hermes codebase
No managed service or cloud backup option
May require significant tuning to get good results for non-trivial use cases
Limited adoption means limited bug fixes and community support
Verdict: The most technically novel but least practical. HRR is fascinating math — but it's like using a neural-symbolic microscope when most people just need a filing cabinet. Good for privacy-focused, local-only setups with moderate memory volumes.
10. Bonus: /sethome Command
/sethome (and its alias /set-home) is not a memory provider — it’s a Hermes Agent gateway command that sets the current chat channel as the “home channel.”
| Property | Detail |
|---|---|
| Type | Gateway command (not a tool or MCP) |
| Scope | Session / platform-level |
| Platforms | Telegram, Discord, Slack, WhatsApp, Signal, Matrix, etc. |
| Purpose | Sets this chat as the primary delivery destination for notifications, cron jobs, scheduled tasks, and system alerts |
| CLI equivalent | Not available — gateway chats only (cli-only=True) |
Use /sethome in the chat where you want to receive Hermes’s automated outputs (morning briefs, scheduled tasks, health alerts, etc.). Without setting a home channel, scheduled outputs may have nowhere to go.
11. Recommendations by Use Case
| If you want... | Choose... | Why |
|---|---|---|
| The default best overall | Mem0 | 51.9k stars, widest ecosystem, works with everything, proven at scale. The safe pick. |
| Lowest token cost + self-hosted | OpenViking | 83-91% token reduction backed by benchmarks. Already running on your machine. Free. |
| Agent that actually learns | Hindsight | SOTA on LongMemEval, unique reflect() capability, biomimetic memory model. Docker deployable. |
| Deep user personalization | Honcho | Psychological profiling, Peer Oracle, dialectic reasoning. Best for deeply personal assistants. |
| Zero effort, managed service | RetainDB | 2 API calls, 30-second setup. SOTA benchmark scores. But cloud-only. |
| Persistent coding context | ByteRover | Pre-compression extraction is perfect for long coding sessions. Switch IDEs with memory intact. |
| Maximum privacy, zero deps | Holographic | No network calls, no API keys, just SQLite and numpy. Data never leaves your box. |
Recommendation for This Machine
OpenViking is the best fit for your current setup: it’s already configured and running, it’s free and self-hosted, and the benchmark results are genuinely impressive. If you want to explore alternatives:
Hindsight is the most interesting upgrade candidate — the reflect() capability and knowledge graph would give your agent genuine learning ability.
Mem0 is the safest production bet if you need maximum ecosystem compatibility.
Honcho is worth trying if you care about user modeling (e.g., your agent learns your communication style, preferences, and behavioral patterns).