anima v0.1
Long-term memory for AI agents.
This guide covers the Python memory library, FastAPI backend, MCP integration, and the Next.js website (Firebase auth + dashboard).
Overview
Anima extracts structured facts from agent conversations, deduplicates them, and supersedes outdated facts instead of deleting history. Retrieval combines FAISS semantic search with BM25, fused with confidence and recency. User profiles and an optional entity graph stay updated on every write.
Three ways to use the same memory engine:
- Python API: embed directly in your agent (Memory class)
- MCP server: six tools for Cursor, Claude Desktop, or any MCP client
- HTTP API: FastAPI service in api/ for dashboards and BFF patterns
The website (this Next.js app) provides the marketing pages, Firebase auth, and a dashboard whose memory chat calls the HTTP API through a server-side BFF (/api/messages, /api/memories, etc.).
The Python package, MCP server, and HTTP API all use the same Memory class, so production embeddings, dedup, and supersession behave identically. The HTTP API differs only in when it responds: it returns once facts are stored and runs profile and graph enrichment in the background.
Getting started
Requirements
- Python 3.12+ with uv
- Optional: OpenRouter API key for LLM extraction
- Website: Bun and a Firebase project
Clone and install (library + API)
git clone https://github.com/themillenniumfalcon/anima.git
cd anima
uv sync
cp .env.example .env
# OPENROUTER_API_KEY — extraction, dedup, profile, graph
# EMBEDDER=st (default) or openai + OPENAI_API_KEY
cp api/.env.example api/.env
# INTERNAL_API_KEY (match website ANIMA_INTERNAL_API_KEY)
Run tests (no API key required)
uv run pytest -v
Tests pass an explicit constant embed_fn for fast offline vectors. Without OPENROUTER_API_KEY, extraction falls back to storing the raw message as a single fact.
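As a rough illustration of what a constant embed_fn for offline tests could look like, the sketch below returns the same unit vector for every input. It assumes embed_fn takes a list of strings and returns one float vector per string; that signature and the helper name make_constant_embed_fn are assumptions for illustration, so check the test suite for the shape the library actually expects.

```python
from typing import Callable, List

EMBEDDING_DIM = 384  # dimension of the default all-MiniLM-L6-v2 model


def make_constant_embed_fn(dim: int = EMBEDDING_DIM) -> Callable[[List[str]], List[List[float]]]:
    """Return a hypothetical embed_fn that maps every text to the same unit vector."""
    vector = [0.0] * dim
    vector[0] = 1.0  # unit length, so cosine similarity stays well defined

    def embed(texts: List[str]) -> List[List[float]]:
        # One fresh copy per input so callers can mutate results independently.
        return [list(vector) for _ in texts]

    return embed


embed_fn = make_constant_embed_fn()
vectors = embed_fn(["I moved to Berlin.", "I work at Acme."])
print(len(vectors), len(vectors[0]))
```

Because every text maps to the same vector, a function like this is only useful for exercising storage and plumbing offline, not for checking retrieval quality.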
Start the HTTP API
uv run --directory api python src/cli.py
# → http://127.0.0.1:8000/docs (Swagger UI)
Run the website
cd website
cp .env.example .env.local
# Fill NEXT_PUBLIC_FIREBASE_* and FIREBASE_* admin vars
bun install
bun dev
# → http://localhost:3000
Python API
Import Memory and Config from the anima package. All storage is scoped by user_id.
import asyncio

from anima import Config, Memory


async def main():
    mem = Memory(config=Config.from_env())
    await mem.add(
        "User: I moved to Berlin last week.",
        user_id="alice",
    )
    result = mem.search("Where does Alice live?", user_id="alice")
    print(result.injected_prompt)
    page = mem.list("alice", limit=20)
    print(f"{len(page.items)} of {page.total} memories")


asyncio.run(main())
Core methods
| Method | Description |
|---|---|
| mem.add() | Extract and store facts from a conversation turn |
| mem.add(..., enrich=False) | Skip profile/graph on the hot path; call mem.enrich() after |
| mem.enrich() | Update user profile + entity graph for saved memories |
| mem.reindex_vectors() | Re-embed SQL rows into FAISS after an embedder change |
| mem.search() | Hybrid recall; returns injected_prompt for your LLM |
| mem.list() | Paginated memory listing |
| mem.pin() | Mark permanent — exempt from decay/TTL |
| mem.export() | JSON snapshot of memories + profile |
| mem.import_memories() | Restore from export |
| mem.ingest_file() | Chunk and index Markdown documents |
| mem.watch() | Signal-based auto-save on keyword triggers |
| mem.decay_memories() | Apply confidence decay lifecycle |
Embeddings (default)
Memory(config=Config.from_env()) picks a production embedder from anima.embeddings: local sentence-transformers (all-MiniLM-L6-v2, 384-dim) unless OPENAI_API_KEY is set (then OpenAI text-embedding-3-small). Stale FAISS indexes with the wrong dimension are removed and current SQL memories are re-embedded on startup.
from anima import Config, Memory, build_embedder
# Optional explicit wiring:
embed_fn, dim = build_embedder()
cfg = Config.from_env()
cfg.embedding_dim = dim
mem = Memory(config=cfg, embed_fn=embed_fn)
# Override entirely:
mem = Memory(config=cfg, embed_fn=my_embed_fn)
MCP server
Stdio transport — add to Cursor or Claude Desktop MCP config. Run from the repo root so uv resolves the workspace.
{
  "mcpServers": {
    "anima": {
      "command": "uv",
      "args": ["run", "anima-mcp"],
      "cwd": "/path/to/anima"
    }
  }
}
After pip install anima-mem, use command: "anima-mcp". MCP uses the same Memory() defaults as the Python API (production embedder + dedup). Set OPENROUTER_API_KEY in the environment where the MCP process runs.
Tools
| Tool | Description |
|---|---|
| memory_add | Store a conversation turn |
| memory_search | Hybrid semantic + BM25 search |
| memory_profile | Get or update user profile |
| memory_list | Paginated memory listing |
| memory_delete | Remove a memory |
| memory_ingest_file | Chunk and index a Markdown file |
HTTP API
FastAPI service in api/. Firebase is not used here — the Next.js server verifies users, then calls anima with a shared secret and the Firebase UID.
Authentication
Production: Next.js sends X-Api-Key (or Authorization: Bearer) and X-User-Id. The browser must never call anima directly with X-User-Id.
Local dev: leave INTERNAL_API_KEY empty, set ALLOW_DEV_USER_HEADER=true, use X-Dev-User-Id.
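The two header schemes above can be sketched as a small Python helper. This is only an illustration of which headers go with which mode; the function names build_headers and build_message_request are hypothetical, and the request is built but not sent so the sketch runs without a server.

```python
import json
import urllib.request
from typing import Dict, Optional

BASE_URL = "http://127.0.0.1:8000"


def build_headers(user_id: str, api_key: Optional[str] = None) -> Dict[str, str]:
    """Pick the header scheme: shared secret in production, dev header locally."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Production: server-to-server call from the Next.js BFF.
        headers["X-Api-Key"] = api_key
        headers["X-User-Id"] = user_id
    else:
        # Local dev: needs ALLOW_DEV_USER_HEADER=true and an empty INTERNAL_API_KEY.
        headers["X-Dev-User-Id"] = user_id
    return headers


def build_message_request(
    content: str, user_id: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/messages request."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/messages",
        data=body,
        headers=build_headers(user_id, api_key),
        method="POST",
    )


request = build_message_request("I moved to Berlin last week.", user_id="alice")
# urllib.request.urlopen(request) would send it against a running API.
```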
curl -X POST http://127.0.0.1:8000/api/messages \
-H "Content-Type: application/json" \
-H "X-Dev-User-Id: alice" \
-d '{"content": "I moved to Berlin last week."}'
curl "http://127.0.0.1:8000/api/search?q=Where+does+Alice+live" \
  -H "X-Dev-User-Id: alice"
Endpoints
| Endpoint | Description |
|---|---|
| GET /health | Liveness and version |
| GET /api/quota | Message count and limit per user |
| POST /api/messages | Extract and store from content body |
| GET /api/memories | Paginated memory list |
| GET /api/memories/{id} | Single memory (owner only) |
| DELETE /api/memories/{id} | Delete memory |
| POST /api/memories/{id}/pin | Pin memory |
| GET /api/search?q= | Hybrid search + injected_prompt |
| GET /api/profile | User profile and prompt |
| GET /api/stats | Counts and timeline for charts |
Per-user message quota defaults to 10 (MESSAGE_LIMIT in api/.env), stored in api/data/.
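A per-user quota of this kind can be sketched as a simple counter. The version below is an in-memory stand-in for illustration only: the real service persists its counts under api/data/, and the class name QuotaTracker and the used/limit/remaining field names are assumptions, not the API's actual response shape.

```python
from dataclasses import dataclass, field
from typing import Dict

MESSAGE_LIMIT = 10  # documented default from api/.env


@dataclass
class QuotaTracker:
    """Hypothetical in-memory quota; the real API stores counts on disk."""

    limit: int = MESSAGE_LIMIT
    counts: Dict[str, int] = field(default_factory=dict)

    def status(self, user_id: str) -> Dict[str, int]:
        used = self.counts.get(user_id, 0)
        return {"used": used, "limit": self.limit, "remaining": max(self.limit - used, 0)}

    def consume(self, user_id: str) -> bool:
        """Count one message; return False when the user is already at the limit."""
        used = self.counts.get(user_id, 0)
        if used >= self.limit:
            return False
        self.counts[user_id] = used + 1
        return True


quota = QuotaTracker(limit=2)
results = [quota.consume("alice") for _ in range(3)]
print(results, quota.status("alice"))
```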
Latency
POST /api/messages returns after extract, embed, dedup, and store (mem.add(..., enrich=False)). Profile and entity-graph LLM calls run in a background task so the dashboard shows "Stored N fact(s)" sooner. Pip/MCP mem.add() runs enrichment inline unless you opt out.
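The respond-first, enrich-later pattern can be sketched with plain asyncio. The functions store_facts and enrich below are stand-ins invented for this example, not the library's calls; the point is only the ordering of "stored", "responded", and "enriched".

```python
import asyncio
from typing import List, Set

events: List[str] = []


async def store_facts(content: str) -> int:
    """Stand-in for the hot path: extract, embed, dedup, store."""
    events.append("stored")
    return 1  # number of facts stored


async def enrich(user_id: str) -> None:
    """Stand-in for the slower profile and entity-graph LLM calls."""
    await asyncio.sleep(0.01)
    events.append("enriched")


async def handle_message(content: str, user_id: str, background: Set[asyncio.Task]) -> str:
    stored = await store_facts(content)
    task = asyncio.create_task(enrich(user_id))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    events.append("responded")
    return f"Stored {stored} fact(s)"


async def main() -> str:
    background: Set[asyncio.Task] = set()
    reply = await handle_message("I moved to Berlin last week.", "alice", background)
    # A long-running server keeps serving here; the sketch just waits for the task.
    await asyncio.gather(*background)
    return reply


reply = asyncio.run(main())
print(reply, events)
```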
Memory pipeline
Hot path (every mem.add() and POST /api/messages):
- LLM extraction: OpenRouter turns the message into atomic facts (and an optional episode summary).
- Embed: production vectors via sentence-transformers or OpenAI.
- Dedup / supersession: FAISS neighbors for the same user_id; cosine similarity plus employment/location slot rules trigger LLM review; old facts get valid_until / superseded_by (never hard-deleted). Raw message text is passed into dedup so markers like "now work at" survive extraction normalization.
- Persist: SQLite + FAISS index save.
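The dedup and supersession step above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the two thresholds are hypothetical, the slot rules and the LLM review are left out, and the Fact class is invented for the example (only the valid_until and superseded_by field names come from the description above).

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical thresholds; the library's actual values may differ.
DUPLICATE_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.75


@dataclass
class Fact:
    id: int
    text: str
    vector: List[float]
    valid_until: Optional[datetime] = None
    superseded_by: Optional[int] = None


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(similarity: float) -> str:
    """Map a neighbor's similarity to an action band."""
    if similarity >= DUPLICATE_THRESHOLD:
        return "duplicate"
    if similarity >= REVIEW_THRESHOLD:
        return "review"  # where an LLM would decide: keep both or supersede
    return "new"


def supersede(old: Fact, new: Fact) -> None:
    """Close out the old fact instead of deleting it."""
    old.valid_until = datetime.now(timezone.utc)
    old.superseded_by = new.id


old = Fact(1, "Lives in Munich", [1.0, 0.0])
new = Fact(2, "Lives in Berlin", [0.8, 0.6])
band = classify(cosine(old.vector, new.vector))
if band == "review":
    supersede(old, new)  # assume the review chose supersession
```

The old fact stays in storage with its validity window closed, which is what keeps the full history queryable.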
Enrichment (pip/MCP inline; HTTP API in background after 201):
- Profile — incremental JSON patch per new fact.
- Entity graph — optional NetworkX edges from each fact.
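An incremental profile update can be pictured as merging a small patch into the stored profile. The sketch below assumes merge-style semantics (nested dicts merge, None removes a key); the function name and the profile keys are hypothetical, and the library's real patch format may differ.

```python
from typing import Any, Dict


def apply_merge_patch(profile: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new profile with the patch merged in; a None value removes the key."""
    merged = dict(profile)
    for key, value in patch.items():
        if value is None:
            merged.pop(key, None)
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_merge_patch(merged[key], value)
        else:
            merged[key] = value
    return merged


profile = {"location": {"city": "Munich", "country": "Germany"}, "employer": "Acme"}
patch = {"location": {"city": "Berlin"}}  # derived from "I moved to Berlin last week."
updated = apply_merge_patch(profile, patch)
print(updated)
```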
Search — mem.search() fuses FAISS + BM25 with confidence and recency, returning injected_prompt for your agent LLM.
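One way to picture the fusion is a weighted sum of four signals. The sketch below is illustrative only: the weights, the 30-day half-life, and the assumption that the semantic and BM25 scores are already scaled to [0, 1] are all hypothetical, and the library's actual formula may differ.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical weights and half-life, chosen only for illustration.
WEIGHTS = {"semantic": 0.5, "bm25": 0.3, "confidence": 0.1, "recency": 0.1}
HALF_LIFE_DAYS = 30.0


def recency_score(created_at: datetime, now: datetime) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def fused_score(candidate: dict, now: datetime) -> float:
    """Weighted sum of signals that are each assumed to lie in [0, 1]."""
    signals = {
        "semantic": candidate["semantic"],
        "bm25": candidate["bm25"],
        "confidence": candidate["confidence"],
        "recency": recency_score(candidate["created_at"], now),
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
candidates = [
    {"id": "old-city", "semantic": 0.6, "bm25": 0.9, "confidence": 0.5,
     "created_at": now - timedelta(days=30)},
    {"id": "new-city", "semantic": 0.9, "bm25": 0.2, "confidence": 0.9,
     "created_at": now},
    {"id": "unrelated", "semantic": 0.2, "bm25": 0.1, "confidence": 0.9,
     "created_at": now},
]
ranked = sorted(candidates, key=lambda c: fused_score(c, now), reverse=True)
print([c["id"] for c in ranked])
```

With these made-up numbers, the recent high-confidence fact outranks the older one even though the older one has the stronger keyword match.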
Capabilities
| Feature | Details |
|---|---|
| Fact extraction | LLM-backed; OpenRouter by default |
| Embeddings | ST or OpenAI by default; required for dedup/search |
| Temporal validity | Supersede, don’t delete — full history |
| Dedup | Similarity + slot rules + LLM review band |
| Hybrid search | FAISS + BM25 + confidence + recency |
| User profiles | Updated on enrich (every add() by default) |
| Entity graph | Optional NetworkX relationships |
| Lifecycle | Decay, TTL, watch() keyword auto-save |
| Ingestion | Markdown chunking via ingester registry |
Configuration
Python library (.env at repo root)
| Variable | Purpose |
|---|---|
| OPENROUTER_API_KEY | LLM extraction, dedup, profiles, graph |
| OPENROUTER_MODEL | Default model (e.g. claude-3.5-haiku) |
| DATABASE_URL | SQLAlchemy URL (default SQLite) |
| VECTOR_STORE_PATH | FAISS index directory |
| GRAPH_STORE_PATH | Entity graph JSON directory |
| EMBEDDER | st (default) or openai |
| ST_MODEL | Local model (default all-MiniLM-L6-v2, 384-dim) |
| OPENAI_API_KEY | Use OpenAI embeddings when set |
| EMBED_MODEL | OpenAI embedding model (default text-embedding-3-small) |
| EMBEDDING_DIM | 1536 for OpenAI; ST ignores this and uses the model's dimension |
| DECAY_RATE_PER_WEEK | Confidence decay (default 0.05) |
| DECAY_MIN_CONFIDENCE | Threshold before deletion |
| WATCH_KEYWORDS | Triggers for watch() auto-save |
| LOG_LEVEL | Logging verbosity |
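To make the two decay settings concrete, here is a minimal sketch of how a weekly rate and a minimum-confidence threshold could interact. It assumes multiplicative decay and a threshold of 0.2; neither the formula nor that threshold value is stated in this guide, and MemoryRecord is a hypothetical class. The pinned exemption follows the description of mem.pin().

```python
from dataclasses import dataclass

DECAY_RATE_PER_WEEK = 0.05  # documented default
DECAY_MIN_CONFIDENCE = 0.2  # hypothetical value; the default is not stated here


@dataclass
class MemoryRecord:
    text: str
    confidence: float
    pinned: bool = False


def decayed_confidence(record: MemoryRecord, weeks: float) -> float:
    """Assumed multiplicative decay; pinned memories are exempt."""
    if record.pinned:
        return record.confidence
    return record.confidence * (1.0 - DECAY_RATE_PER_WEEK) ** weeks


def is_expired(record: MemoryRecord, weeks: float) -> bool:
    """True once the decayed confidence drops below the threshold."""
    return decayed_confidence(record, weeks) < DECAY_MIN_CONFIDENCE


fresh = MemoryRecord("Lives in Berlin", confidence=1.0)
pinned = MemoryRecord("Allergic to peanuts", confidence=1.0, pinned=True)
print(decayed_confidence(fresh, weeks=1), decayed_confidence(pinned, weeks=52))
```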
HTTP API (api/.env)
| Variable | Purpose |
|---|---|
| HOST / PORT | Server bind (default 127.0.0.1:8000) |
| CORS_ORIGINS | Allowed origins (Next.js dev URL) |
| INTERNAL_API_KEY | Shared secret with website server |
| ALLOW_DEV_USER_HEADER | Local X-Dev-User-Id bypass |
| MESSAGE_LIMIT | Per-user message quota (default 10) |
| EMBEDDER / ST_MODEL | Same embedding options as library |
| LOG_LEVEL | INFO or DEBUG for [dedup] / [memory] traces |
Website (website/.env.local)
| Variable | Purpose |
|---|---|
| NEXT_PUBLIC_FIREBASE_* | Client Firebase config (6 vars) |
| FIREBASE_PROJECT_ID | Admin SDK project ID |
| FIREBASE_CLIENT_EMAIL | Service account email |
| FIREBASE_PRIVATE_KEY | Service account key (use \n for newlines) |
| ANIMA_API_URL | anima FastAPI base URL (default http://127.0.0.1:8000) |
| ANIMA_INTERNAL_API_KEY | Server-only; same as api INTERNAL_API_KEY |
Website & authentication
Routes: / (landing), /login, /signup, /dashboard (protected).
Sign-in flow
- Client signs in with Firebase Auth (email/password or Google).
- Client POSTs the Firebase ID token to /api/auth/session.
- Server creates an httpOnly session cookie (anima-session, 5 days) via firebase-admin.
- proxy.ts checks cookie presence on protected routes; dashboard layout verifies the cookie with Admin SDK.
- Sign out: DELETE session API + Firebase signOut.
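The cookie attributes from the flow above can be illustrated with Python's standard library. The website itself is a Next.js app, so this is not its code, only a sketch of what an httpOnly, 5-day anima-session cookie header looks like; the Secure, SameSite, and Path attributes are assumptions added for the example.

```python
from http.cookies import SimpleCookie

SESSION_COOKIE_NAME = "anima-session"
SESSION_MAX_AGE_SECONDS = 5 * 24 * 60 * 60  # 5 days


def build_session_cookie(session_value: str) -> str:
    """Return the Set-Cookie value for a session cookie hidden from page scripts."""
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE_NAME] = session_value
    morsel = cookie[SESSION_COOKIE_NAME]
    morsel["httponly"] = True  # not readable from browser JavaScript
    morsel["max-age"] = SESSION_MAX_AGE_SECONDS
    morsel["secure"] = True  # assumption: HTTPS only
    morsel["samesite"] = "Lax"  # assumption
    morsel["path"] = "/"  # assumption
    return morsel.OutputString()


header_value = build_session_cookie("opaque-session-token")
print(header_value)
```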
Forms use Zod validation (lib/validations/auth.ts). Firebase errors map to readable messages in lib/auth/errors.ts.
Website stack
| Layer | Technology |
|---|---|
| Framework | Next.js 16, React 19, TypeScript |
| Styling | Tailwind CSS v4, shadcn/radix-ui |
| Auth | Firebase client + firebase-admin session cookies |
| Validation | Zod |
| Package manager | Bun |
Architecture
End-to-end data flow:
Browser
  → Firebase Auth (client SDK)
  → POST /api/auth/session (ID token → httpOnly cookie)
  → Next.js dashboard (memory chat → /api/messages BFF)
Next.js server
  → anima FastAPI with X-Api-Key + X-User-Id (Firebase UID)
anima API
  → mem.add(enrich=False) → 201 response
  → background: profile + entity graph
  → anima.Memory (same library as pip / MCP)
  → SQLite + FAISS + optional graph store
Repository layout
anima/
├── src/anima/ # Memory library (core product)
├── api/ # FastAPI dashboard backend
├── website/ # Next.js marketing + auth (this app)
├── benchmarks/ # LoCoMo evaluation
└── tests/ # pytest suite
Integration status
| Area | Status |
|---|---|
| Landing + product sections | Done |
| Firebase auth + session cookies | Done |
| Protected /dashboard shell | Done |
| Next.js BFF → anima API routes | Done |
| Dashboard memory chat + quota | Done |
| Memory graph / stats charts | In progress |
Development
Python
uv sync --extra dev
uv run pytest -v
uv run ruff check src tests
# LoCoMo benchmarks (optional)
uv sync --extra benchmark
uv run python benchmarks/run_locomo.py
Website
bun run dev # development
bun run build # production build
bun run lint # ESLint
Makefile (repo root)
| Target | Action |
|---|---|
| make install | uv sync |
| make test | pytest |
| make lint | ruff |
| make benchmark | LoCoMo evaluation |
Publish
uv build
uv publish # requires PyPI credentials
License: MIT. Package includes py.typed for type checkers.