API Reference
Base URL: http://localhost:3030 (or your deployment URL)
Authentication: Authorization: Bearer <token> header on all endpoints except /health. The token is the single API key configured via the env var named by server.auth_token_env (default LOOMEM_AUTH_TOKEN); if no key is configured the server runs in local passthrough mode and accepts all requests.
When no stream is specified, data is read from and written to the default stream __user_default__.
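The authentication rule above can be sketched as a small request builder. This is a minimal sketch using only the Python standard library; `build_request` and `BASE_URL` are hypothetical names, and the JSON content type is an assumption based on the request samples in this reference.

```python
import json
import os
import urllib.request

BASE_URL = "http://localhost:3030"  # default base URL; replace with your deployment URL


def build_request(path, payload=None, method="POST", token=None):
    """Build (but do not send) a JSON request with the Bearer header attached."""
    if token is None:
        # LOOMEM_AUTH_TOKEN is the default env var name; yours may differ
        # if server.auth_token_env is set to something else.
        token = os.environ.get("LOOMEM_AUTH_TOKEN")
    headers = {"Content-Type": "application/json"}
    if token:
        # In local passthrough mode (no key configured) the header can be omitted.
        headers["Authorization"] = f"Bearer {token}"
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


req = build_request("/v1/store", {"content": "User prefers dark mode"}, token="example-token")
# To send it: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the header logic easy to inspect and test without a running server.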
Storage
POST /v1/store
Store a memory chunk.
Request:
{
"content": "User prefers dark mode in all tools",
"stream": "100",
"level": 0,
"metadata": { "source": "user-stated" },
"user_id": "alice",
"app_id": "claude"
}
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| content | string | yes | - | Memory text (1 to 1M chars) |
| stream | string | no | __user_default__ | Namespace ID |
| level | int | no | 0 | Memory tier (0 = raw, 1 = consolidated) |
| metadata | object | no | null | Custom JSON metadata |
| user_id | string | no | null | Creator identifier |
| app_id | string | no | null | Application identifier |
Response:
{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"status": "stored"
}
Side effects:
- Pre-ingestion sanitization (HTML stripping, instruction injection detection)
- PII redaction (phones, emails, PESEL, blocklist words)
- Entity extraction (dictionary + LLM queue)
- Embedding generation (async queue)
- Tantivy indexing
- Graph population (stream-scoped)
- Surprise scoring (importance adjustment)
- Contradiction detection against existing memories
LLM-based knowledge extraction from full conversation transcripts is available via the memory_ingest MCP tool — see MCP Tools Reference.
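As an illustration of the POST /v1/store request fields, the sketch below builds a payload and checks the content length on the client before sending. `store_payload` is a hypothetical helper. Two assumptions are labeled in the comments: that "1M" means one million characters, and that omitting an optional field is equivalent to sending its documented default.

```python
MAX_CONTENT_CHARS = 1_000_000  # assumption: "1M chars" read as one million


def store_payload(content, stream=None, level=0, metadata=None, user_id=None, app_id=None):
    """Return a dict suitable for the JSON body of POST /v1/store."""
    # len() counts Unicode code points; the server may count differently.
    if not 1 <= len(content) <= MAX_CONTENT_CHARS:
        raise ValueError("content must be between 1 and 1M characters")
    payload = {"content": content, "level": level}
    optional = {"stream": stream, "metadata": metadata, "user_id": user_id, "app_id": app_id}
    # Assumption: leaving a field out lets the server apply its default
    # (for example the __user_default__ stream).
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


body = store_payload(
    "User prefers dark mode in all tools",
    stream="100",
    metadata={"source": "user-stated"},
)
```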
Search
POST /v1/search
Hybrid search across all memory tiers.
Request:
{
"query": "What IDE does the user prefer?",
"top_k": 5,
"stream": "100",
"date_from": "2026-01-01",
"date_to": "2026-04-03",
"trace": true,
"dry_run": false
}
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| query | string | yes | - | Search query |
| top_k | int | no | 10 | Max results |
| stream | string | no | __user_default__ | Namespace filter |
| streams | string[] | no | null | Multi-namespace search |
| date_from | string | no | null | ISO date lower bound |
| date_to | string | no | null | ISO date upper bound |
| entity | string | no | null | Filter by entity name |
| trace | bool | no | false | Include debug trace info |
| dry_run | bool | no | false | Skip implicit boost / access tracking |
| valid_at | int (unix sec) | no | null | Bitemporal time-travel: return only chunks whose [valid_from, valid_until] covers this timestamp. Open intervals (None) are unbounded on that side. |
| include_superseded | bool | no | false | Include old versions |
| fact_type | string | no | null | Filter: preference, project, fact |
| subject | string | no | null | Filter by subject entity |
| min_confidence | f64 | no | null | Minimum extraction confidence |
Response:
{
"results": [
{
"chunk_id": "abc-123",
"content": "User switched from VSCode to Cursor (reason: speed)",
"score_final": 0.87,
"trace_info": {
"level": "L1",
"source": "consolidation",
"is_latest": true,
"created_at": 1743638400,
"memory_type": "static",
"importance": 1.2,
"access_count": 3,
"version": 2,
"superseded_by": null
}
}
],
"trace_metadata": {
"total_results_before_topk": 42,
"dedup_removed": 3,
"search_latency_us": 1200,
"query_complexity": "medium"
}
}
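The search request and response shapes above can be exercised with a short sketch. `search_payload` and `top_hits` are hypothetical helpers, not part of any client library. The sketch assumes `valid_at` is expressed in UTC-based unix seconds and that unset filters can simply be left out of the body.

```python
from datetime import datetime, timezone


def search_payload(query, top_k=10, stream=None, valid_at=None, **filters):
    """Return a dict suitable for the JSON body of POST /v1/search."""
    payload = {"query": query, "top_k": top_k}
    if stream is not None:
        payload["stream"] = stream
    if valid_at is not None:
        # valid_at is documented as unix seconds. Naive datetimes are rejected
        # here so that local-time interpretation cannot shift the timestamp.
        if valid_at.tzinfo is None:
            raise ValueError("valid_at must be timezone-aware")
        payload["valid_at"] = int(valid_at.timestamp())
    # Remaining documented filters (entity, fact_type, min_confidence, ...).
    payload.update({key: value for key, value in filters.items() if value is not None})
    return payload


def top_hits(response, min_score=0.0):
    """Return (chunk_id, score_final, content) tuples, highest score first."""
    hits = [
        (r["chunk_id"], r["score_final"], r["content"])
        for r in response.get("results", [])
        if r["score_final"] >= min_score
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


body = search_payload(
    "What IDE does the user prefer?",
    top_k=5,
    stream="100",
    valid_at=datetime(2025, 4, 3, tzinfo=timezone.utc),
    fact_type="preference",
)
```

Passing `dry_run=True` through `**filters` may be useful for evaluation runs, since the table above says it skips implicit boosts and access tracking.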
POST /v1/context-pack
Smart context packing for system prompt injection. Assembles a token-budgeted context window from profile, relevant memories, and recent activity.
Request:
{
"query": "working on dashboard project",
"stream": "100",
"budget_tokens": 2000,
"sections": ["profile", "relevant", "recent"]
}
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | null | Topic focus |
| stream | string | __user_default__ | Namespace |
| budget_tokens | int | 4000 | Max tokens in response |
| sections | string[] | all | Which sections to include |
Section token allocation:
- profile: 20% of budget
- relevant: 50% of budget
- recent: 30% of budget
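To make the split concrete, the percentages above can be applied to a given budget. This is only a worked illustration: `section_budgets` is a hypothetical helper, and how the server rounds, or redistributes tokens when a section is omitted, is not stated here.

```python
SECTION_PERCENT = {"profile": 20, "relevant": 50, "recent": 30}


def section_budgets(budget_tokens=4000, sections=("profile", "relevant", "recent")):
    """Apply the documented percentage split using integer arithmetic."""
    # Assumption: each requested section keeps its fixed share and unused
    # shares are not reassigned to the remaining sections.
    return {name: budget_tokens * SECTION_PERCENT[name] // 100 for name in sections}


example = section_budgets(2000)
# For the request above (budget_tokens = 2000):
# profile 400, relevant 1000, recent 600
```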
Response:
{
"context": "## Profile\nUser is a software engineer...\n\n## Relevant\n...\n\n## Recent\n...",
"sources": [
{ "chunk_id": "abc-123", "score": 0.87, "section": "relevant" }
],
"total_tokens": 1856,
"sections_included": ["profile", "relevant", "recent"],
"sections_truncated": []
}
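A caller may want to sanity-check the packed context before injecting it into a system prompt. The sketch below is one way to do that against the response shape shown above; `pack_problems` is a hypothetical helper, and it assumes `sections_truncated` lists sections that were cut to fit the budget.

```python
def pack_problems(pack, budget_tokens, required=("profile",)):
    """Return a list of human-readable issues found in a context-pack response."""
    problems = []
    if pack["total_tokens"] > budget_tokens:
        problems.append(f"over budget: {pack['total_tokens']} > {budget_tokens}")
    for section in required:
        if section not in pack["sections_included"]:
            problems.append(f"missing section: {section}")
    for section in pack["sections_truncated"]:
        # Assumption: a truncated section is usable but incomplete.
        problems.append(f"truncated section: {section}")
    return problems


response = {
    "context": "## Profile\n...",
    "sources": [{"chunk_id": "abc-123", "score": 0.87, "section": "relevant"}],
    "total_tokens": 1856,
    "sections_included": ["profile", "relevant", "recent"],
    "sections_truncated": [],
}
issues = pack_problems(response, budget_tokens=2000)
```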
Memory management
POST /v1/delete
Delete a single chunk by ID.
{ "id": "abc-123" }
Response: { "status": "deleted", "id": "abc-123" }
DELETE /api/memories/{id}
REST-style alternative to POST /v1/delete. Accepts an optional ?ns=<namespace> query parameter.
Response: { "deleted": true, "id": "abc-123" }
POST /v1/purge-namespace
Delete all chunks in a stream.
{
"stream": "100",
"dry_run": true,
"confirmed": false
}
Response:
{
"status": "ok",
"stream": "100",
"dry_run": true,
"deleted_count": 42,
"deleted_ids": ["abc-123", "def-456", "..."]
}
Set dry_run: false and confirmed: true to actually delete.
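The dry-run and confirm flags suggest a two-step workflow: preview first, then delete. The sketch below shows one possible guard around that flow. `safe_purge` is a hypothetical helper, and `send` stands for any callable you supply that posts a payload to /v1/purge-namespace and returns the parsed JSON response.

```python
def purge_payloads(stream):
    """Return the (preview, commit) request bodies for one stream."""
    preview = {"stream": stream, "dry_run": True, "confirmed": False}
    commit = {"stream": stream, "dry_run": False, "confirmed": True}
    return preview, commit


def safe_purge(stream, send, max_delete):
    """Preview the purge and only commit if the count is within max_delete."""
    preview, commit = purge_payloads(stream)
    report = send(preview)  # nothing is deleted by the dry run
    if report["deleted_count"] > max_delete:
        raise RuntimeError(
            f"refusing to purge {report['deleted_count']} chunks (limit {max_delete})"
        )
    return send(commit)
```

The `max_delete` ceiling is a client-side safety choice for this sketch, not a server feature.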
Graph
GET /v1/graph/entity/{name}?stream={stream_id}
Returns the entity node with its edges and chunk references. The stream parameter is required because the graph is isolated per stream.
{
"entity": {
"id": "ent-123",
"name": "Cursor",
"type": "TECHNOLOGY",
"aliases": ["cursor IDE"],
"chunk_count": 2
},
"neighbors": [
{ "entity": "Alice", "entity_type": "PERSON", "relation": "uses", "direction": "incoming" }
],
"chunk_ids": ["abc-123", "def-456"]
}
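Entity names can contain spaces or other characters that are not safe in a URL path, so the name should be percent-encoded when building the request URL. A minimal sketch with the standard library follows; `entity_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode


def entity_url(base_url, name, stream):
    """Build the GET /v1/graph/entity/{name}?stream={stream_id} URL."""
    # safe="" also encodes "/" so a name cannot add extra path segments.
    path = "/v1/graph/entity/" + quote(name, safe="")
    return f"{base_url}{path}?{urlencode({'stream': stream})}"


url = entity_url("http://localhost:3030", "Cursor IDE", "100")
```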
GET /v1/graph/stats
{
"total_entities": 42,
"total_edges": 87,
"avg_chunks_per_entity": 3.2
}
POST /v1/build-graph
Rebuild the entity graph from all stored entities. Useful after bulk ingestion.
POST /v1/extract-entities
Trigger LLM NER backfill on chunks missing entity extraction. Runs asynchronously.
Synthesis
Version history of a chunk is available via the memory_history MCP tool; the synthesized user profile via the memory_profile MCP tool. See MCP Tools Reference.
GET /v1/generate-memory-md
Generate a MEMORY.md proposal from top chunks.
{
"proposal": "# Memory\n\n## Identity\n- Software engineer...\n\n## Preferences\n...",
"metadata": { "chunks_considered": 200, "sections": 15 }
}
Consolidation & maintenance
POST /v1/dream
Trigger dream consolidation on the current stream.
{
"stream": "100",
"chunks_processed": 50,
"groups_found": 12,
"facts_merged": 8,
"contradictions_resolved": 2,
"cost_usd": 0.04,
"duration_ms": 3200
}
Memory quality analysis is available via the memory_reflect MCP tool.
POST /v1/boost
Boost a chunk's importance to 1.5.
{ "id": "abc-123" }
POST /v1/embed-missing
Backfill embeddings for chunks that don't have them.
{
"status": "ok",
"total_missing": 15,
"embedded": 15,
"failed": 0
}
POST /v1/retag-all
Re-extract entities on all chunks.
POST /v1/score-all
Recompute importance scores for all chunks via embedding similarity.
POST /v1/re-embed-all
Regenerate all embeddings (useful when switching providers).
POST /v1/reset-importance
Reset all accumulated implicit boosts back to defaults.
POST /v1/reset-backfill
Clear LLM entity extraction markers to allow re-processing.
Admin
GET /v1/status
Engine health and statistics.
{
"status": "ok",
"uptime_secs": 86400,
"config_summary": {
"vector_enabled": true,
"tantivy_enabled": true,
"scheduler_enabled": true,
"rocksdb_keys": 311,
"tantivy_docs": 115,
"embeddings_count": 112
}
}
GET /v1/namespaces
{
"namespaces": {
"personal": "100",
"work": "110"
}
}
GET /v1/whoami
Returns the auth context for the current caller (active stream, role, accessible streams).
GET /health
Liveness check. No authentication required.
{ "status": "ok" }
Ambient memory
POST /v1/ambient returns a small, token-budgeted set of plain-fact memory snippets for injection into an agent's context at the start of a turn. See the ambient endpoint contract.
Background workers
POST /admin/workers/pause
Pause all background workers (consolidation, decay, compaction, backup, clustering, purge, stats). Useful for eval runs where you need a frozen database.
Requires admin token.
Response:
{
"paused": true,
"message": "All workers paused"
}
POST /admin/workers/resume
Resume all background workers.
Response:
{
"paused": false,
"message": "All workers resumed"
}
GET /admin/workers/status
Check if workers are paused.
Response:
{
"paused": false
}
Pause state does not survive server restart — workers resume automatically on startup.
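For eval runs, it can help to guarantee that workers are resumed even if the run fails. One way to sketch this is a context manager around the pause and resume endpoints. `workers_paused` is a hypothetical helper, and `send` stands for any callable you supply that performs an authenticated request with an admin token.

```python
from contextlib import contextmanager


@contextmanager
def workers_paused(send):
    """Pause background workers for the duration of a with-block."""
    send("POST", "/admin/workers/pause")
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, so workers are not left paused.
        send("POST", "/admin/workers/resume")


# Usage with a stand-in sender that only records calls:
log = []
with workers_paused(lambda method, path: log.append((method, path))):
    pass  # run the evaluation against the frozen database here
```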
MCP endpoint
POST /mcp
MCP JSON-RPC 2.0 endpoint. Used by Claude Desktop, claude.ai connectors, and other MCP clients. The server identifies itself as loomem-memory.
See MCP Tools Reference for the 14 available memory_* tools.
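For readers unfamiliar with the wire format, the sketch below builds JSON-RPC 2.0 envelopes of the kind an MCP client posts to /mcp. `tools/list` and `tools/call` are standard MCP methods; `jsonrpc_request` is a hypothetical helper, and the empty `arguments` object for memory_profile is an assumption, since tool parameters are defined in the MCP Tools Reference. A real client normally performs the MCP `initialize` handshake before calling tools.

```python
import itertools
import json

_request_ids = itertools.count(1)


def jsonrpc_request(method, params=None):
    """Return a JSON-RPC 2.0 request object with a fresh numeric id."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


list_tools = jsonrpc_request("tools/list")
# Assumption: memory_profile accepts an empty arguments object.
call_profile = jsonrpc_request("tools/call", {"name": "memory_profile", "arguments": {}})
wire_body = json.dumps(call_profile)
```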
OAuth endpoints
For MCP Remote Connector authorization:
POST /oauth/register Dynamic Client Registration (RFC 7591)
GET /oauth/authorize Authorization page (user enters API key)
POST /oauth/authorize Submit authorization
POST /oauth/token Exchange code for access token
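The registration and token steps can be sketched with the field names defined by RFC 7591 and the OAuth 2.0 authorization code grant. Which fields this server actually requires is not stated here, so treat the values as assumptions: the helper names are hypothetical, `token_endpoint_auth_method: "none"` assumes a public client, and `code_verifier` is included only if PKCE is in use.

```python
from urllib.parse import urlencode


def registration_body(client_name, redirect_uri):
    """JSON body for POST /oauth/register (RFC 7591 client metadata)."""
    return {
        "client_name": client_name,
        "redirect_uris": [redirect_uri],
        "grant_types": ["authorization_code"],
        "response_types": ["code"],
        "token_endpoint_auth_method": "none",  # assumption: public client
    }


def token_form(code, client_id, redirect_uri, code_verifier=None):
    """Form-encoded body for POST /oauth/token (authorization code grant)."""
    fields = {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    }
    if code_verifier is not None:
        fields["code_verifier"] = code_verifier  # only when PKCE is used
    return urlencode(fields)


form = token_form("abc", "cid", "http://localhost:8765/cb")
```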