Documentation

anima v0.1

Long-term memory for AI agents.

This guide covers the Python memory library, FastAPI backend, MCP integration, and the Next.js website (Firebase auth + dashboard).

Overview

Anima extracts structured facts from agent conversations, deduplicates them, and supersedes outdated facts instead of deleting history. Retrieval combines FAISS semantic search with BM25 keyword scoring, then fuses both with each fact's confidence and recency. User profiles and an optional entity graph are updated on every write by default.

Three ways to use the same memory engine:

Python API — embed directly in your agent (Memory class)
MCP server — six tools for Cursor, Claude Desktop, or any MCP client
HTTP API — FastAPI service in api/ for dashboards and BFF patterns

The website (this Next.js app) provides the marketing pages, Firebase auth, and a dashboard whose memory chat reaches the HTTP API through a server-side BFF (/api/messages, /api/memories, etc.).

The Python package, MCP server, and HTTP API all use the same Memory class, so production embeddings, dedup, and supersession behave identically. The HTTP API differs only in when it responds: it returns before profile and graph enrichment, which run in the background.

Getting started

Requirements

Python 3.12+ with uv
Optional: OpenRouter API key for LLM extraction
Website: Bun and a Firebase project

Clone and install (library + API)

Repository root

git clone https://github.com/themillenniumfalcon/anima.git
cd anima
uv sync
cp .env.example .env
# OPENROUTER_API_KEY — extraction, dedup, profile, graph
# EMBEDDER=st (default) or openai + OPENAI_API_KEY

cp api/.env.example api/.env
# INTERNAL_API_KEY (match website ANIMA_INTERNAL_API_KEY)

Run tests (no API key required)

uv run pytest -v

Tests pass an explicit constant embed_fn for fast offline vectors. Without OPENROUTER_API_KEY, extraction falls back to storing the raw message as a single fact.
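
The offline fallback can be pictured with a small sketch. The names extract_facts and llm_extract are hypothetical and not part of anima's API; only the behavior (the raw message stored as a single fact when no key is set) comes from this guide.

```python
import os
from typing import Callable, Optional


def extract_facts(
    message: str,
    llm_extract: Optional[Callable[[str], list[str]]] = None,
    api_key: Optional[str] = None,
) -> list[str]:
    """Return atomic facts, or the raw message as one fact when no key is set."""
    # Hypothetical helper: fall back to the environment when no key is passed.
    key = api_key if api_key is not None else os.environ.get("OPENROUTER_API_KEY", "")
    if not key or llm_extract is None:
        # Offline fallback: the whole message becomes a single fact.
        return [message.strip()]
    return llm_extract(message)
```

This is why the test suite can run without network access: with no key, nothing calls an LLM.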

Start the HTTP API

uv run --directory api python src/cli.py
# → http://127.0.0.1:8000/docs (Swagger UI)

Run the website

website/

cd website
cp .env.example .env.local
# Fill NEXT_PUBLIC_FIREBASE_* and FIREBASE_* admin vars
bun install
bun dev
# → http://localhost:3000

Python API

Import Memory and Config from the anima package. All storage is scoped by user_id.

Minimal example

import asyncio
from anima import Config, Memory

async def main():
    mem = Memory(config=Config.from_env())

    await mem.add(
        "User: I moved to Berlin last week.",
        user_id="alice",
    )
    result = mem.search("Where does Alice live?", user_id="alice")
    print(result.injected_prompt)

    page = mem.list("alice", limit=20)
    print(f"{len(page.items)} of {page.total} memories")

asyncio.run(main())

Core methods

Method	Description
mem.add()	Extract and store facts from a conversation turn
mem.add(..., enrich=False)	Skip profile/graph on the hot path; call mem.enrich() after
mem.enrich()	Update user profile + entity graph for saved memories
mem.reindex_vectors()	Re-embed SQL rows into FAISS after an embedder change
mem.search()	Hybrid recall; returns injected_prompt for your LLM
mem.list()	Paginated memory listing
mem.pin()	Mark permanent — exempt from decay/TTL
mem.export()	JSON snapshot of memories + profile
mem.import_memories()	Restore from export
mem.ingest_file()	Chunk and index Markdown documents
mem.watch()	Signal-based auto-save on keyword triggers
mem.decay_memories()	Apply confidence decay lifecycle
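
To illustrate what a keyword trigger for mem.watch() might look like, here is a minimal sketch. It assumes keywords arrive as a comma-separated string and are matched as case-insensitive substrings; the helper names are hypothetical, and the library's actual matching rules are not described in this guide.

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated keyword string (assumed format) into lowercase terms."""
    return [k.strip().lower() for k in raw.split(",") if k.strip()]


def should_auto_save(message: str, keywords: list[str]) -> bool:
    """True when any keyword appears in the message, ignoring case."""
    text = message.lower()
    return any(k in text for k in keywords)
```

A message that matches would then be passed to mem.add(); one that does not is skipped.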

Embeddings (default)

Memory(config=Config.from_env()) picks a production embedder from anima.embeddings: local sentence-transformers (all-MiniLM-L6-v2, 384-dim) unless OPENAI_API_KEY is set (then OpenAI text-embedding-3-small). Stale FAISS indexes with the wrong dimension are removed and current SQL memories are re-embedded on startup.

from anima import Config, Memory, build_embedder

# Optional explicit wiring:
embed_fn, dim = build_embedder()
cfg = Config.from_env()
cfg.embedding_dim = dim
mem = Memory(config=cfg, embed_fn=embed_fn)

# Override entirely:
mem = Memory(config=cfg, embed_fn=my_embed_fn)
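
For offline experiments, a custom my_embed_fn could be as simple as the hashing embedder sketched here. It assumes embed_fn takes one string and returns a fixed-length list of floats; the signature anima actually expects (batch input, NumPy arrays) is not stated in this guide, so check the library before wiring it in. The vectors carry no semantic meaning and are only useful for fast, deterministic tests.

```python
import hashlib
import math

DIM = 384  # same size as the default all-MiniLM-L6-v2 embeddings


def my_embed_fn(text: str) -> list[float]:
    """Hash each token into one of DIM buckets, then L2-normalize the counts."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty input stays a zero vector instead of dividing by zero.
    return [v / norm for v in vec] if norm else vec
```

If you use a different dimension, set cfg.embedding_dim to match so the FAISS index is built with the right size.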

MCP server

Stdio transport — add to Cursor or Claude Desktop MCP config. Run from the repo root so uv resolves the workspace.

mcp.json

{
  "mcpServers": {
    "anima": {
      "command": "uv",
      "args": ["run", "anima-mcp"],
      "cwd": "/path/to/anima"
    }
  }
}

If you installed with pip install anima-mem, use "command": "anima-mcp" instead of the uv invocation. MCP uses the same Memory() defaults as the Python API (production embedder + dedup). Set OPENROUTER_API_KEY in the environment where the MCP process runs.

Tools

Tool	Description
memory_add	Store a conversation turn
memory_search	Hybrid semantic + BM25 search
memory_profile	Get or update user profile
memory_list	Paginated memory listing
memory_delete	Remove a memory
memory_ingest_file	Chunk and index a Markdown file

HTTP API

FastAPI service in api/. Firebase is not used here — the Next.js server verifies users, then calls anima with a shared secret and the Firebase UID.

Authentication

Production: Next.js sends X-Api-Key (or Authorization: Bearer) and X-User-Id. The browser must never call anima directly with X-User-Id.

Local dev: leave INTERNAL_API_KEY empty, set ALLOW_DEV_USER_HEADER=true, use X-Dev-User-Id.
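
The two modes can be summarized as one header-resolution function. This is a sketch of the rules described above, not the service's actual code; resolve_user_id and its parameters are hypothetical names.

```python
import hmac
from typing import Mapping, Optional


def resolve_user_id(
    headers: Mapping[str, str],
    internal_api_key: str,
    allow_dev_user_header: bool,
) -> Optional[str]:
    """Return the caller's user ID, or None when the request is not authorized."""
    # Header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}

    if internal_api_key:
        # Production: accept X-Api-Key or "Authorization: Bearer <key>".
        presented = h.get("x-api-key", "")
        auth = h.get("authorization", "")
        if not presented and auth.lower().startswith("bearer "):
            presented = auth[7:].strip()
        # Constant-time comparison avoids leaking the key through timing.
        if hmac.compare_digest(presented.encode(), internal_api_key.encode()):
            return h.get("x-user-id") or None
        return None

    # Local dev: no shared secret configured, and the bypass must be enabled.
    if allow_dev_user_header:
        return h.get("x-dev-user-id") or None
    return None
```

In this sketch the dev header is ignored whenever a shared secret is configured, so enabling ALLOW_DEV_USER_HEADER by mistake in production does not open a bypass. Whether the real service makes the same choice is an assumption.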

Local curl

curl -X POST http://127.0.0.1:8000/api/messages \
  -H "Content-Type: application/json" \
  -H "X-Dev-User-Id: alice" \
  -d '{"content": "I moved to Berlin last week."}'

curl "http://127.0.0.1:8000/api/search?q=Where+does+Alice+live" \
  -H "X-Dev-User-Id: alice"
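
The same two calls can be built from Python with only the standard library. The sketch constructs the requests without sending them; the helper names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8000"
DEV_HEADERS = {"X-Dev-User-Id": "alice"}  # local dev only


def build_add_request(content: str) -> Request:
    """POST /api/messages with a JSON body."""
    body = json.dumps({"content": content}).encode("utf-8")
    headers = {"Content-Type": "application/json", **DEV_HEADERS}
    return Request(f"{BASE_URL}/api/messages", data=body, headers=headers, method="POST")


def build_search_request(query: str) -> Request:
    """GET /api/search with the query URL-encoded."""
    url = f"{BASE_URL}/api/search?{urlencode({'q': query})}"
    return Request(url, headers=DEV_HEADERS, method="GET")
```

With the API running, pass either request to urllib.request.urlopen to send it.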

Endpoints

Endpoint	Description
GET /health	Liveness and version
GET /api/quota	Message count and limit per user
POST /api/messages	Extract and store from content body
GET /api/memories	Paginated memory list
GET /api/memories/{id}	Single memory (owner only)
DELETE /api/memories/{id}	Delete memory
POST /api/memories/{id}/pin	Pin memory
GET /api/search?q=	Hybrid search + injected_prompt
GET /api/profile	User profile and prompt
GET /api/stats	Counts and timeline for charts

Per-user message quota defaults to 10 (MESSAGE_LIMIT in api/.env), stored in api/data/.
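
A per-user quota of this kind is essentially a counter with a ceiling. The in-memory sketch below shows the idea; the class name is hypothetical, and it does not reproduce how the service persists counts under api/data/.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MessageQuota:
    limit: int = 10  # mirrors the documented MESSAGE_LIMIT default
    used: Counter = field(default_factory=Counter)

    def remaining(self, user_id: str) -> int:
        return max(self.limit - self.used[user_id], 0)

    def consume(self, user_id: str) -> bool:
        """Count one message; return False once the user has reached the limit."""
        if self.used[user_id] >= self.limit:
            return False
        self.used[user_id] += 1
        return True
```

A handler would call consume() before storing a message and reject the request when it returns False.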

Latency

POST /api/messages returns after extract, embed, dedup, and store (mem.add(..., enrich=False)). Profile and entity-graph LLM calls run in a background task so the dashboard shows "Stored N fact(s)" sooner. With the pip package or MCP, mem.add() runs enrichment inline unless you opt out with enrich=False.
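
The respond-first, enrich-later pattern can be sketched with plain asyncio. The function names are stand-ins, and the real service may schedule the work differently (for example with FastAPI's background tasks); the point is only the ordering of events.

```python
import asyncio

events: list[str] = []


async def store_facts(content: str) -> int:
    """Stand-in for extract, embed, dedup, and store."""
    events.append("stored")
    return 1


async def enrich(user_id: str) -> None:
    """Stand-in for the slower profile and entity-graph LLM calls."""
    await asyncio.sleep(0.01)
    events.append("enriched")


async def handle_message(user_id: str, content: str, background: set) -> dict:
    stored = await store_facts(content)
    # Schedule enrichment without awaiting it; keep a reference so the task
    # is not garbage-collected before it finishes.
    task = asyncio.create_task(enrich(user_id))
    background.add(task)
    task.add_done_callback(background.discard)
    events.append("responded")
    return {"status": 201, "stored": stored}


async def demo() -> dict:
    background: set = set()
    response = await handle_message("alice", "I moved to Berlin last week.", background)
    await asyncio.gather(*background)  # a real server keeps running instead
    return response
```

After asyncio.run(demo()), the events list shows the response being produced before enrichment completes.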

Memory pipeline

Hot path (every mem.add() and POST /api/messages):

LLM extraction — OpenRouter turns the message into atomic facts (and an optional episode summary).
Embed — production vectors via sentence-transformers or OpenAI.
Dedup / supersession — FAISS neighbors for the same user_id; cosine similarity plus employment/location slot rules trigger LLM review; old facts get valid_until / superseded_by set (never hard-deleted). Raw message text is passed into dedup so markers like "now work at" survive extraction normalization.
Persist — SQLite + FAISS index save.
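
Supersession without deletion can be sketched as follows. This simplified version uses a single cosine threshold (0.8 is an arbitrary illustrative value) and omits the slot rules and LLM review described above; the Fact fields mirror valid_until and superseded_by, but the class itself is hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    id: int
    text: str
    vector: list[float]
    valid_until: Optional[datetime] = None
    superseded_by: Optional[int] = None


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def supersede(store: list[Fact], new: Fact, threshold: float = 0.8) -> list[Fact]:
    """Close out active facts similar to `new`, then append `new`.

    Old facts stay in the store with valid_until / superseded_by set.
    """
    now = datetime.now(timezone.utc)
    replaced = []
    for old in store:
        if old.valid_until is None and cosine(old.vector, new.vector) >= threshold:
            old.valid_until = now
            old.superseded_by = new.id
            replaced.append(old)
    store.append(new)
    return replaced
```

Because superseded rows remain in the store, a query can still reconstruct what was believed at an earlier time.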

Enrichment (pip/MCP inline; HTTP API in background after 201):

Profile — incremental JSON patch per new fact.
Entity graph — optional NetworkX edges from each fact.
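
An incremental profile update could look like the merge-style patch below, in the spirit of JSON Merge Patch. Whether anima uses this form, RFC 6902 operations, or something else is not stated in this guide, so treat the function as an illustration of the idea only.

```python
from typing import Any


def apply_merge_patch(profile: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Return a new profile with `patch` merged in; None values remove keys."""
    result = dict(profile)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        elif isinstance(value, dict):
            # Merge into the existing object, or start fresh if there is none.
            base = result.get(key) if isinstance(result.get(key), dict) else {}
            result[key] = apply_merge_patch(base, value)
        else:
            result[key] = value
    return result
```

Each new fact yields a small patch, so the profile is revised in place instead of being regenerated from the full history.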

Search — mem.search() fuses FAISS + BM25 with confidence and recency, returning injected_prompt for your agent LLM.
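
One plausible way to fuse the four signals is a weighted sum over normalized scores, sketched here. The weights, the 30-day half-life, and the max-normalization of BM25 are all assumptions chosen for illustration; the guide does not specify anima's actual formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    semantic: float    # cosine similarity from the vector index, in [0, 1]
    bm25: float        # raw BM25 score, unbounded
    confidence: float  # stored confidence, in [0, 1]
    age_days: float


def fuse(
    cands: list[Candidate],
    w_sem: float = 0.5,
    w_bm25: float = 0.3,
    w_conf: float = 0.1,
    w_rec: float = 0.1,
    half_life_days: float = 30.0,
) -> list[Candidate]:
    """Rank candidates by a weighted sum of normalized signals."""
    # Scale BM25 into [0, 1] by the best score in this candidate set.
    max_bm25 = max((c.bm25 for c in cands), default=0.0) or 1.0
    scored = []
    for c in cands:
        recency = 0.5 ** (c.age_days / half_life_days)  # halves every half-life
        score = (
            w_sem * c.semantic
            + w_bm25 * (c.bm25 / max_bm25)
            + w_conf * c.confidence
            + w_rec * recency
        )
        scored.append((score, c))
    return [c for _, c in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

The recency term is what lets a newer fact outrank an older one that scores similarly on text match alone.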

Capabilities

Feature	Details
Fact extraction	LLM-backed; OpenRouter by default
Embeddings	ST or OpenAI by default; required for dedup/search
Temporal validity	Supersede, don’t delete — full history
Dedup	Similarity + slot rules + LLM review band
Hybrid search	FAISS + BM25 + confidence + recency
User profiles	Updated on enrich (every add() by default)
Entity graph	Optional NetworkX relationships
Lifecycle	Decay, TTL, watch() keyword auto-save
Ingestion	Markdown chunking via ingester registry
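
The phrase "ingester registry" suggests a mapping from file type to chunking function. The sketch below shows one way such a registry and a heading-based Markdown chunker could fit together; every name in it is hypothetical, and anima's chunking strategy may differ.

```python
import re
from pathlib import Path
from typing import Callable

# Hypothetical registry: file extension -> function that splits text into chunks.
INGESTERS: dict[str, Callable[[str], list[str]]] = {}


def register(*extensions: str):
    """Decorator that registers a chunker for one or more extensions."""
    def decorator(fn):
        for ext in extensions:
            INGESTERS[ext.lower()] = fn
        return fn
    return decorator


@register(".md", ".markdown")
def chunk_markdown(text: str) -> list[str]:
    """Split before each ATX heading so a chunk keeps its heading and body."""
    parts = re.split(r"(?m)^(?=#{1,6}\s)", text)
    return [p.strip() for p in parts if p.strip()]


def chunks_for(path: str, text: str) -> list[str]:
    ext = Path(path).suffix.lower()
    if ext not in INGESTERS:
        raise ValueError(f"no ingester registered for {ext!r}")
    return INGESTERS[ext](text)
```

Supporting another format then means registering one more function, with no change to the caller.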

Configuration

Python library (.env at repo root)

Variable	Purpose
OPENROUTER_API_KEY	LLM extraction, dedup, profiles, graph
OPENROUTER_MODEL	Default model (e.g. claude-3.5-haiku)
DATABASE_URL	SQLAlchemy URL (default SQLite)
VECTOR_STORE_PATH	FAISS index directory
GRAPH_STORE_PATH	Entity graph JSON directory
EMBEDDER	st (default) or openai
ST_MODEL	Local model (default all-MiniLM-L6-v2, 384-dim)
OPENAI_API_KEY	Use OpenAI embeddings when set
EMBED_MODEL	OpenAI embedding model (default text-embedding-3-small)
EMBEDDING_DIM	1536 for OpenAI; ignored with ST, which uses the model's own dimension
DECAY_RATE_PER_WEEK	Confidence decay (default 0.05)
DECAY_MIN_CONFIDENCE	Threshold before deletion
WATCH_KEYWORDS	Triggers for watch() auto-save
LOG_LEVEL	Logging verbosity
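
To make the decay settings concrete, the sketch below applies a compounding weekly decay and flags rows that fall under a threshold. Whether anima decays linearly or by compounding is not stated in this guide, and the 0.2 threshold in the usage note is an arbitrary example, so both are assumptions. Pinned memories are left untouched, as described for mem.pin().

```python
from dataclasses import dataclass


@dataclass
class MemoryRow:
    text: str
    confidence: float
    age_weeks: float
    pinned: bool = False


def decayed_confidence(row: MemoryRow, rate_per_week: float = 0.05) -> float:
    """Compound decay: lose `rate_per_week` of the remaining confidence each week."""
    if row.pinned:
        return row.confidence  # pinned memories are exempt from decay
    return row.confidence * (1.0 - rate_per_week) ** row.age_weeks


def expired(rows: list[MemoryRow], min_confidence: float, rate_per_week: float = 0.05) -> list[MemoryRow]:
    """Rows whose decayed confidence has dropped below the threshold."""
    return [r for r in rows if decayed_confidence(r, rate_per_week) < min_confidence]
```

Under these assumptions, expired(rows, 0.2) would return an unpinned year-old memory that started at 0.9 confidence, while a week-old one stays well above the threshold.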

HTTP API (api/.env)

Variable	Purpose
HOST / PORT	Server bind (default 127.0.0.1:8000)
CORS_ORIGINS	Allowed origins (Next.js dev URL)
INTERNAL_API_KEY	Shared secret with website server
ALLOW_DEV_USER_HEADER	Local X-Dev-User-Id bypass
MESSAGE_LIMIT	Per-user message quota (default 10)
EMBEDDER / ST_MODEL	Same embedding options as library
LOG_LEVEL	INFO or DEBUG for [dedup] / [memory] traces

Website (website/.env.local)

Variable	Purpose
NEXT_PUBLIC_FIREBASE_*	Client Firebase config (6 vars)
FIREBASE_PROJECT_ID	Admin SDK project ID
FIREBASE_CLIENT_EMAIL	Service account email
FIREBASE_PRIVATE_KEY	Service account key (use \n for newlines)
ANIMA_API_URL	anima FastAPI base URL (default http://127.0.0.1:8000)
ANIMA_INTERNAL_API_KEY	Server-only; same as api INTERNAL_API_KEY

Website & authentication

Routes: / (landing), /login, /signup, /dashboard (protected).

Sign-in flow

Client signs in with Firebase Auth (email/password or Google).
Client POSTs the Firebase ID token to /api/auth/session.
Server creates an httpOnly session cookie (anima-session, 5 days) via firebase-admin.
proxy.ts checks for the cookie on protected routes; the dashboard layout verifies it with the Admin SDK.
Sign out: DELETE on the session API plus Firebase signOut.

Forms use Zod validation (lib/validations/auth.ts). Firebase errors map to readable messages in lib/auth/errors.ts.

Website stack

Layer	Technology
Framework	Next.js 16, React 19, TypeScript
Styling	Tailwind CSS v4, shadcn/radix-ui
Auth	Firebase client + firebase-admin session cookies
Validation	Zod
Package manager	Bun

Architecture

End-to-end data flow:

Flow

Browser
  → Firebase Auth (client SDK)
  → POST /api/auth/session (ID token → httpOnly cookie)
  → Next.js dashboard (memory chat → /api/messages BFF)

Next.js server
  → anima FastAPI with X-Api-Key + X-User-Id (Firebase UID)

anima API
  → mem.add(enrich=False) → 201 response
  → background: profile + entity graph
  → anima.Memory (same library as pip / MCP)
  → SQLite + FAISS + optional graph store

Repository layout

anima/
├── src/anima/     # Memory library (core product)
├── api/           # FastAPI dashboard backend
├── website/       # Next.js marketing + auth (this app)
├── benchmarks/    # LoCoMo evaluation
└── tests/         # pytest suite

Integration status

Area	Status
Landing + product sections	Done
Firebase auth + session cookies	Done
Protected /dashboard shell	Done
Next.js BFF → anima API routes	Done
Dashboard memory chat + quota	Done
Memory graph / stats charts	In progress

Development

Python

uv sync --extra dev
uv run pytest -v
uv run ruff check src tests

# LoCoMo benchmarks (optional)
uv sync --extra benchmark
uv run python benchmarks/run_locomo.py

Website

website/

bun run dev      # development
bun run build    # production build
bun run lint     # ESLint

Makefile (repo root)

Target	Action
make install	uv sync
make test	pytest
make lint	ruff
make benchmark	LoCoMo evaluation

Publish

uv build
uv publish   # requires PyPI credentials

License: MIT. Package includes py.typed for type checkers.