anima v0.1

Long-term memory for AI agents.

This guide covers the Python memory library, FastAPI backend, MCP integration, and the Next.js website (Firebase auth + dashboard).

Overview

Anima extracts structured facts from agent conversations, deduplicates them, and supersedes outdated facts instead of deleting history. Retrieval combines FAISS semantic search with BM25, fused with confidence and recency. User profiles and an optional entity graph stay updated on every write.

Three ways to use the same memory engine:

  • Python API — embed directly in your agent (Memory class)
  • MCP server — six tools for Cursor, Claude Desktop, or any MCP client
  • HTTP API — FastAPI service in api/ for dashboards and BFF patterns

The website (this Next.js app) is marketing, Firebase auth, and a dashboard with memory chat wired to the HTTP API via a server-side BFF (/api/messages, /api/memories, etc.).

The Python package, MCP server, and HTTP API all use the same Memory class, so production embeddings, dedup, and supersession behave identically across all three. The HTTP API differs only in when it responds: it returns once facts are stored, and profile/graph updates run in the background.

Getting started

Requirements

  • Python 3.12+ with uv
  • Optional: OpenRouter API key for LLM extraction
  • Website: Bun and a Firebase project

Clone and install (library + API)

Repository root
git clone https://github.com/themillenniumfalcon/anima.git
cd anima
uv sync
cp .env.example .env
# OPENROUTER_API_KEY — extraction, dedup, profile, graph
# EMBEDDER=st (default) or openai + OPENAI_API_KEY

cp api/.env.example api/.env
# INTERNAL_API_KEY (match website ANIMA_INTERNAL_API_KEY)

Run tests (no API key required)

uv run pytest -v

Tests pass an explicit constant embed_fn for fast offline vectors. Without OPENROUTER_API_KEY, extraction falls back to storing the raw message as a single fact.
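For your own offline tests, a deterministic stand-in embedder can be very small. The sketch below assumes embed_fn takes a string and returns a list of floats; the exact signature anima expects is not stated in this guide, and fake_embed is a hypothetical name. It derives a repeatable unit vector from a hash instead of returning one constant vector, so distinct texts still get distinct embeddings.

```python
import hashlib
import math


def fake_embed(text: str, dim: int = 384) -> list[float]:
    """Hypothetical offline embedder: deterministic, unit-length, no model download."""
    values: list[float] = []
    counter = 0
    # Stretch SHA-256 digests into `dim` pseudo-random floats in [-1, 1].
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        values.extend(b / 127.5 - 1.0 for b in digest)
        counter += 1
    values = values[:dim]
    # Normalise so cosine similarity and inner product agree.
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]
```

These vectors carry no semantic meaning, so they are only suitable for exercising storage and plumbing, not for judging retrieval quality.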

Start the HTTP API

uv run --directory api python src/cli.py
# → http://127.0.0.1:8000/docs (Swagger UI)

Run the website

website/
cd website
cp .env.example .env.local
# Fill NEXT_PUBLIC_FIREBASE_* and FIREBASE_* admin vars
bun install
bun dev
# → http://localhost:3000

Python API

Import Memory and Config from the anima package. All storage is scoped by user_id.

Minimal example
import asyncio
from anima import Config, Memory

async def main():
    mem = Memory(config=Config.from_env())

    await mem.add(
        "User: I moved to Berlin last week.",
        user_id="alice",
    )
    result = mem.search("Where does Alice live?", user_id="alice")
    print(result.injected_prompt)

    page = mem.list("alice", limit=20)
    print(f"{len(page.items)} of {page.total} memories")

asyncio.run(main())

Core methods

Method                        Description
mem.add()                     Extract and store facts from a conversation turn
mem.add(..., enrich=False)    Skip profile/graph on the hot path; call mem.enrich() after
mem.enrich()                  Update user profile + entity graph for saved memories
mem.reindex_vectors()         Re-embed SQL rows into FAISS after an embedder change
mem.search()                  Hybrid recall; returns injected_prompt for your LLM
mem.list()                    Paginated memory listing
mem.pin()                     Mark permanent; exempt from decay/TTL
mem.export()                  JSON snapshot of memories + profile
mem.import_memories()         Restore from export
mem.ingest_file()             Chunk and index Markdown documents
mem.watch()                   Signal-based auto-save on keyword triggers
mem.decay_memories()          Apply confidence decay lifecycle
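mem.ingest_file() chunks Markdown before indexing, but this guide does not describe the ingester's strategy. As a rough illustration only, a heading-aware splitter could look like the sketch below; chunk_markdown and max_chars are hypothetical names, not part of the anima API.

```python
import re


def chunk_markdown(text: str, max_chars: int = 800) -> list[str]:
    """Split at headings, then pack paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole (sketch limitation),
    and '#' lines inside code fences are treated as headings.
    """
    # Zero-width split in front of every ATX heading line.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks: list[str] = []
    for section in sections:
        current = ""
        for para in re.split(r"\n\s*\n", section.strip()):
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                chunks.append(current)  # flush and start a new chunk
                current = para
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

Splitting at headings first keeps each chunk inside one section, which tends to make retrieved snippets easier to interpret.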

Embeddings (default)

Memory(config=Config.from_env()) picks a production embedder from anima.embeddings: local sentence-transformers (all-MiniLM-L6-v2, 384-dim) unless OPENAI_API_KEY is set (then OpenAI text-embedding-3-small). Stale FAISS indexes with the wrong dimension are removed and current SQL memories are re-embedded on startup.

from anima import Config, Memory, build_embedder

# Optional explicit wiring:
embed_fn, dim = build_embedder()
cfg = Config.from_env()
cfg.embedding_dim = dim
mem = Memory(config=cfg, embed_fn=embed_fn)

# Override entirely:
mem = Memory(config=cfg, embed_fn=my_embed_fn)
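The startup check for stale indexes described above comes down to comparing the dimension stored with the index against the embedder's dimension. A minimal sketch of that decision, assuming the stored dimension can be read back (plan_index_action and its return labels are hypothetical, not anima internals):

```python
from typing import Optional


def plan_index_action(stored_dim: Optional[int], embedder_dim: int) -> str:
    """Decide what to do with an on-disk vector index at startup."""
    if stored_dim is None:
        return "create"   # no index on disk yet
    if stored_dim != embedder_dim:
        return "rebuild"  # stale index: drop it and re-embed the SQL rows
    return "reuse"
```

For example, switching from the 384-dim local model to a 1536-dim OpenAI model would yield "rebuild".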

MCP server

Stdio transport — add to Cursor or Claude Desktop MCP config. Run from the repo root so uv resolves the workspace.

mcp.json
{
  "mcpServers": {
    "anima": {
      "command": "uv",
      "args": ["run", "anima-mcp"],
      "cwd": "/path/to/anima"
    }
  }
}

After pip install anima-mem, use command: "anima-mcp". MCP uses the same Memory() defaults as the Python API (production embedder + dedup). Set OPENROUTER_API_KEY in the environment where the MCP process runs.

Tools

Tool                  Description
memory_add            Store a conversation turn
memory_search         Hybrid semantic + BM25 search
memory_profile        Get or update user profile
memory_list           Paginated memory listing
memory_delete         Remove a memory
memory_ingest_file    Chunk and index a Markdown file

HTTP API

FastAPI service in api/. Firebase is not used here — the Next.js server verifies users, then calls anima with a shared secret and the Firebase UID.

Authentication

Production: Next.js sends X-Api-Key (or Authorization: Bearer) and X-User-Id. The browser must never call anima directly with X-User-Id.

Local dev: leave INTERNAL_API_KEY empty, set ALLOW_DEV_USER_HEADER=true, use X-Dev-User-Id.

Local curl
curl -X POST http://127.0.0.1:8000/api/messages \
  -H "Content-Type: application/json" \
  -H "X-Dev-User-Id: alice" \
  -d '{"content": "I moved to Berlin last week."}'

curl "http://127.0.0.1:8000/api/search?q=Where+does+Alice+live" \
  -H "X-Dev-User-Id: alice"
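The two authentication modes above can be summarised as one resolution function. This is a sketch of the rules as described in this section, not the API's actual code; resolve_user_id is a hypothetical helper, and its parameters mirror the INTERNAL_API_KEY and ALLOW_DEV_USER_HEADER settings.

```python
import hmac
from typing import Mapping, Optional


def resolve_user_id(
    headers: Mapping[str, str],
    internal_api_key: str,
    allow_dev_user_header: bool,
) -> Optional[str]:
    """Return the caller's user ID, or None if the request should be rejected."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}

    if internal_api_key:
        # Production: require the shared secret, then trust X-User-Id.
        supplied = h.get("x-api-key")
        auth = h.get("authorization", "")
        if supplied is None and auth.startswith("Bearer "):
            supplied = auth[len("Bearer "):]
        if supplied is not None and hmac.compare_digest(
            supplied.encode("utf-8"), internal_api_key.encode("utf-8")
        ):
            return h.get("x-user-id") or None
        return None

    # No shared secret configured: only the explicit dev bypass is accepted.
    if allow_dev_user_header:
        return h.get("x-dev-user-id") or None
    return None
```

Note that X-Dev-User-Id is ignored whenever a shared secret is configured, which is what keeps the dev bypass from leaking into production in this sketch.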

Endpoints

Endpoint                       Description
GET /health                    Liveness and version
GET /api/quota                 Message count and limit per user
POST /api/messages             Extract and store from content body
GET /api/memories              Paginated memory list
GET /api/memories/{id}         Single memory (owner only)
DELETE /api/memories/{id}      Delete memory
POST /api/memories/{id}/pin    Pin memory
GET /api/search?q=             Hybrid search + injected_prompt
GET /api/profile               User profile and prompt
GET /api/stats                 Counts and timeline for charts

The per-user message quota defaults to 10 (MESSAGE_LIMIT in api/.env); quota state is stored in api/data/.
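To make the quota behaviour concrete, here is a minimal in-memory sketch of a per-user counter with the same default limit. MessageQuota is a hypothetical class for illustration; the real API persists its state under api/data/ rather than in a dict.

```python
class MessageQuota:
    """In-memory per-user counter; a stand-in for persisted quota state."""

    def __init__(self, limit: int = 10) -> None:
        self.limit = limit
        self._used: dict[str, int] = {}

    def remaining(self, user_id: str) -> int:
        """Messages the user may still send, never below zero."""
        return max(self.limit - self._used.get(user_id, 0), 0)

    def consume(self, user_id: str) -> bool:
        """Count one message; return False if the user is already at the limit."""
        if self.remaining(user_id) == 0:
            return False
        self._used[user_id] = self._used.get(user_id, 0) + 1
        return True
```

A handler would call consume() before running extraction and reject the request when it returns False.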

Latency

POST /api/messages returns after extract, embed, dedup, and store (mem.add(..., enrich=False)). Profile and entity-graph LLM calls run in a background task so the dashboard shows "Stored N fact(s)" sooner. Pip/MCP mem.add() runs enrichment inline unless you opt out.
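The respond-first, enrich-later ordering can be illustrated with plain asyncio. This is a sketch of the pattern only, not the FastAPI handler: store_facts and enrich are hypothetical stand-ins for the hot path and the profile/graph calls, and the events list exists just to show the ordering.

```python
import asyncio


async def store_facts(content: str, events: list[str]) -> int:
    """Stand-in for the hot path (extract, embed, dedup, store)."""
    events.append("stored")
    return 1


async def enrich(user_id: str, events: list[str]) -> None:
    """Stand-in for the slower profile and entity-graph updates."""
    await asyncio.sleep(0.01)
    events.append("enriched")


async def handle_message(user_id: str, content: str, events: list[str]):
    stored = await store_facts(content, events)
    # Schedule enrichment without awaiting it, so the response is not delayed.
    task = asyncio.create_task(enrich(user_id, events))
    return {"status": 201, "stored": stored}, task


async def demo() -> list[str]:
    events: list[str] = []
    response, task = await handle_message("alice", "I moved to Berlin last week.", events)
    events.append(f"responded {response['status']}")
    await task  # awaited here only so the demo can observe completion
    return events
```

Running asyncio.run(demo()) records the response before enrichment finishes, which is the trade-off the HTTP API makes for lower perceived latency.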

Memory pipeline

Hot path (every mem.add() and POST /api/messages):

  1. LLM extraction — OpenRouter turns the message into atomic facts (and an optional episode summary).
  2. Embed — production vectors via sentence-transformers or OpenAI.
  3. Dedup / supersession — FAISS neighbors for the same user_id; cosine similarity + employment/location slot rules trigger LLM review; old facts get valid_until / superseded_by (never hard-deleted). Raw message text is passed into dedup so markers like "now work at" survive extraction normalization.
  4. Persist — SQLite + FAISS index save.
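The supersession step in the hot path can be sketched without any vector search. The example below is a simplification under stated assumptions: it matches facts by a hypothetical slot label alone, whereas the real pipeline combines cosine similarity, slot rules, and LLM review. Only the valid_until and superseded_by field names come from this guide.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    id: int
    user_id: str
    text: str
    slot: Optional[str] = None  # hypothetical label such as "location"
    valid_until: Optional[datetime] = None
    superseded_by: Optional[int] = None


def supersede(store: list[Fact], new: Fact, now: Optional[datetime] = None) -> list[Fact]:
    """Append `new` and close any active fact with the same user and slot."""
    now = now or datetime.now(timezone.utc)
    closed: list[Fact] = []
    for old in store:
        same_slot = new.slot is not None and old.slot == new.slot
        if old.user_id == new.user_id and same_slot and old.valid_until is None:
            old.valid_until = now          # end of validity, not deletion
            old.superseded_by = new.id     # pointer to the replacing fact
            closed.append(old)
    store.append(new)
    return closed


def active(store: list[Fact], user_id: str) -> list[Fact]:
    """Facts for a user that have not been superseded."""
    return [f for f in store if f.user_id == user_id and f.valid_until is None]
```

Because old rows are closed rather than removed, the full history remains queryable while active() returns only current facts.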

Enrichment (pip/MCP inline; HTTP API in background after 201):

  1. Profile — incremental JSON patch per new fact.
  2. Entity graph — optional NetworkX edges from each fact.
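The profile step above applies an incremental patch per fact. The patch format is not specified in this guide; the sketch below assumes a recursive merge in the spirit of JSON Merge Patch, where None removes a key. apply_patch is a hypothetical helper.

```python
from copy import deepcopy
from typing import Any


def apply_patch(profile: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Return a new profile with `patch` merged in; a None value removes the key."""
    result = deepcopy(profile)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)
        elif isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested objects instead of replacing them wholesale.
            result[key] = apply_patch(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result
```

Patching per fact keeps each LLM call small, since only the changed fields need to be produced rather than the whole profile.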

Search: mem.search() fuses FAISS + BM25 with confidence and recency, returning injected_prompt for your agent LLM.
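One common way to combine these four signals is a weighted sum over normalised scores. The sketch below shows that approach; the weights, the 30-day recency half-life, and the names Candidate and fuse are all assumptions for illustration, since this guide does not specify anima's fusion formula.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    memory_id: int
    semantic: float    # cosine similarity from the vector index, in [0, 1]
    bm25: float        # raw BM25 score, unbounded
    confidence: float  # stored confidence, in [0, 1]
    age_days: float


def fuse(
    candidates: list[Candidate],
    w_semantic: float = 0.5,
    w_bm25: float = 0.3,
    w_confidence: float = 0.1,
    w_recency: float = 0.1,
    half_life_days: float = 30.0,
) -> list[tuple[int, float]]:
    """Rank candidates by a weighted sum of normalised signals (highest first)."""
    # Scale BM25 by the best score in this result set so it lands in [0, 1].
    top_bm25 = max((c.bm25 for c in candidates), default=0.0) or 1.0
    ranked = []
    for c in candidates:
        recency = 0.5 ** (c.age_days / half_life_days)  # exponential decay
        score = (
            w_semantic * c.semantic
            + w_bm25 * (c.bm25 / top_bm25)
            + w_confidence * c.confidence
            + w_recency * recency
        )
        ranked.append((c.memory_id, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Normalising BM25 per query matters here: without it, the unbounded keyword score would dominate the bounded similarity and confidence terms.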

Capabilities

Feature              Details
Fact extraction      LLM-backed; OpenRouter by default
Embeddings           ST or OpenAI by default; required for dedup/search
Temporal validity    Supersede, don’t delete; full history
Dedup                Similarity + slot rules + LLM review band
Hybrid search        FAISS + BM25 + confidence + recency
User profiles        Updated on enrich (every add() by default)
Entity graph         Optional NetworkX relationships
Lifecycle            Decay, TTL, watch() keyword auto-save
Ingestion            Markdown chunking via ingester registry

Configuration

Python library (.env at repo root)

Variable                Purpose
OPENROUTER_API_KEY      LLM extraction, dedup, profiles, graph
OPENROUTER_MODEL        Default model (e.g. claude-3.5-haiku)
DATABASE_URL            SQLAlchemy URL (default SQLite)
VECTOR_STORE_PATH       FAISS index directory
GRAPH_STORE_PATH        Entity graph JSON directory
EMBEDDER                st (default) or openai
ST_MODEL                Local model (default all-MiniLM-L6-v2, 384-dim)
OPENAI_API_KEY          Use OpenAI embeddings when set
EMBED_MODEL             OpenAI embedding model (default text-embedding-3-small)
EMBEDDING_DIM           1536 for OpenAI; ST ignores and uses model dim
DECAY_RATE_PER_WEEK     Confidence decay (default 0.05)
DECAY_MIN_CONFIDENCE    Threshold before deletion
WATCH_KEYWORDS          Triggers for watch() auto-save
LOG_LEVEL               Logging verbosity
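To give a feel for how DECAY_RATE_PER_WEEK and DECAY_MIN_CONFIDENCE interact, the sketch below assumes multiplicative decay, where confidence shrinks by the rate each week. That formula is an assumption; this guide does not state whether anima's decay is multiplicative or linear. Both function names are hypothetical.

```python
def decayed_confidence(
    confidence: float,
    weeks: float,
    rate_per_week: float = 0.05,
    pinned: bool = False,
) -> float:
    """Confidence after `weeks`, assuming multiplicative decay; pinned is exempt."""
    if pinned:
        return confidence
    return confidence * (1.0 - rate_per_week) ** weeks


def below_threshold(confidence: float, min_confidence: float) -> bool:
    """True once a memory has decayed past the configured minimum."""
    return confidence < min_confidence
```

Under this assumption and the default rate of 0.05, a fact that starts at confidence 1.0 drops below 0.5 during its fourteenth week.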

HTTP API (api/.env)

Variable                 Purpose
HOST / PORT              Server bind (default 127.0.0.1:8000)
CORS_ORIGINS             Allowed origins (Next.js dev URL)
INTERNAL_API_KEY         Shared secret with website server
ALLOW_DEV_USER_HEADER    Local X-Dev-User-Id bypass
MESSAGE_LIMIT            Per-user message quota (default 10)
EMBEDDER / ST_MODEL      Same embedding options as library
LOG_LEVEL                INFO or DEBUG for [dedup] / [memory] traces

Website (website/.env.local)

Variable                  Purpose
NEXT_PUBLIC_FIREBASE_*    Client Firebase config (6 vars)
FIREBASE_PROJECT_ID       Admin SDK project ID
FIREBASE_CLIENT_EMAIL     Service account email
FIREBASE_PRIVATE_KEY      Service account key (use \n for newlines)
ANIMA_API_URL             anima FastAPI base URL (default http://127.0.0.1:8000)
ANIMA_INTERNAL_API_KEY    Server-only; same as api INTERNAL_API_KEY

Website & authentication

Routes: / (landing), /login, /signup, /dashboard (protected).

Sign-in flow

  1. Client signs in with Firebase Auth (email/password or Google).
  2. Client POSTs the Firebase ID token to /api/auth/session.
  3. Server creates an httpOnly session cookie (anima-session, 5 days) via firebase-admin.
  4. proxy.ts checks cookie presence on protected routes; dashboard layout verifies the cookie with Admin SDK.
  5. Sign out: DELETE session API + Firebase signOut.

Forms use Zod validation (lib/validations/auth.ts). Firebase errors map to readable messages in lib/auth/errors.ts.

Website stack

Layer              Technology
Framework          Next.js 16, React 19, TypeScript
Styling            Tailwind CSS v4, shadcn/radix-ui
Auth               Firebase client + firebase-admin session cookies
Validation         Zod
Package manager    Bun

Architecture

End-to-end data flow:

Flow
Browser
  → Firebase Auth (client SDK)
  → POST /api/auth/session (ID token → httpOnly cookie)
  → Next.js dashboard (memory chat → /api/messages BFF)

Next.js server
  → anima FastAPI with X-Api-Key + X-User-Id (Firebase UID)

anima API
  → mem.add(enrich=False) → 201 response
  → background: profile + entity graph
  → anima.Memory (same library as pip / MCP)
  → SQLite + FAISS + optional graph store

Repository layout

anima/
├── src/anima/     # Memory library (core product)
├── api/           # FastAPI dashboard backend
├── website/       # Next.js marketing + auth (this app)
├── benchmarks/    # LoCoMo evaluation
└── tests/         # pytest suite

Integration status

Area                               Status
Landing + product sections         Done
Firebase auth + session cookies    Done
Protected /dashboard shell         Done
Next.js BFF → anima API routes     Done
Dashboard memory chat + quota      Done
Memory graph / stats charts        In progress

Development

Python

uv sync --extra dev
uv run pytest -v
uv run ruff check src tests

# LoCoMo benchmarks (optional)
uv sync --extra benchmark
uv run python benchmarks/run_locomo.py

Website

website/
bun run dev      # development
bun run build    # production build
bun run lint     # ESLint

Makefile (repo root)

Target            Action
make install      uv sync
make test         pytest
make lint         ruff
make benchmark    LoCoMo evaluation

Publish

uv build
uv publish   # requires PyPI credentials

License: MIT. Package includes py.typed for type checkers.