Configuration Reference

All settings live in config.toml. There are no hardcoded defaults — every value must be explicitly configured.

Override the config file path with the LOOMEM_CONFIG environment variable.
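
As an illustration only, the lookup could be sketched as follows. The helper name `resolve_config_path` is hypothetical, the fallback to `config.toml` is taken from the environment variable table at the end of this reference, and treating an empty value as unset is an assumption.

```python
import os
from pathlib import Path

DEFAULT_CONFIG = "config.toml"  # default listed in the environment variable table


def resolve_config_path(env=None):
    """Return the config file path, preferring LOOMEM_CONFIG when it is set."""
    env = os.environ if env is None else env
    override = env.get("LOOMEM_CONFIG")
    # Assumption: an empty value is treated the same as an unset variable.
    return Path(override) if override else Path(DEFAULT_CONFIG)
```

Passing a plain dict instead of the real environment, for example `resolve_config_path({"LOOMEM_CONFIG": "/etc/loomem/prod.toml"})`, makes the behaviour easy to check in isolation.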

Authentication is configured via [server].auth_token_env (see [server]) — there is no separate [auth] section.

[storage]

[storage]
data_dir = "./data"
vector_enabled = true

Key	Type	Default	Description
`data_dir`	string	"./data"	Root directory for RocksDB, Tantivy, embeddings
`vector_enabled`	bool	true	Enable vector (embedding) search

[storage.rocksdb]

[storage.rocksdb]
max_open_files = 1000
compression = "lz4"
write_buffer_size = 67108864
max_write_buffer_number = 3

Key	Type	Description
`max_open_files`	int	File descriptor pool size
`compression`	string	`"lz4"` or `"snappy"` — LZ4 recommended
`write_buffer_size`	int	Bytes per write buffer (64 MB)
`max_write_buffer_number`	int	Number of write buffers
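
The two buffer settings together bound how much memory write buffers can occupy. The arithmetic below is a rough sketch that assumes the values are passed straight through to RocksDB and that the bound applies per column family; `write_buffer_budget` is a hypothetical helper.

```python
MIB = 1024 * 1024


def write_buffer_budget(write_buffer_size, max_write_buffer_number):
    """Approximate upper bound, in bytes, on in-memory write buffers."""
    return write_buffer_size * max_write_buffer_number


# 64 MiB expressed in bytes is the value used in the example config.
assert 64 * MIB == 67108864

# With three buffers of 64 MiB each the bound is 192 MiB.
budget_mib = write_buffer_budget(67108864, 3) // MIB
```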

[storage.tantivy]

[storage.tantivy]
enabled = true
heap_size_mb = 128

Key	Type	Description
`enabled`	bool	Enable BM25 full-text search
`heap_size_mb`	int	Index writer memory pool

[storage.intent_log]

Write-ahead log for cross-store consistency.

[storage.intent_log]
enabled = true
dir = "wal"
max_size_mb = 10
sync_on_write = false

Key	Type	Description
`enabled`	bool	Enable WAL
`dir`	string	WAL directory (relative to data_dir)
`max_size_mb`	int	Rotate when log reaches this size
`sync_on_write`	bool	fsync every write (safer but slower)

[search]

[search]
top_k = 10
surprise_boost = 1.5
synonyms_file = "synonyms.toml"
entities_file = "entities.toml"
rerank_enabled = false
rerank_candidates = 20
rerank_model_dir = "models/reranker"
multi_query_enabled = false
stem_polish = true

Key	Type	Description
`top_k`	int	Default number of search results
`surprise_boost`	f64	Novelty multiplier on ingest (Titans-inspired)
`synonyms_file`	string	Path to synonym expansion map
`entities_file`	string	Path to entity dictionary
`rerank_enabled`	bool	Enable cross-encoder reranking (~97ms/pair)
`rerank_candidates`	int	Send top N to reranker
`rerank_model_dir`	string	ONNX model directory
`multi_query_enabled`	bool	Decompose complex queries into sub-queries
`stem_polish`	bool	Enable Polish language stemming

[search.cache]

[search.cache]
enabled = true
max_entries = 500
ttl_secs = 300

Semantic query cache. Identical queries within TTL return cached results.
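
To make the interaction of `max_entries` and `ttl_secs` concrete, here is a minimal sketch of an exact-match cache. The class name `QueryCache`, the injectable clock, and the oldest-first eviction policy are assumptions; the real eviction policy is not documented here.

```python
import time
from collections import OrderedDict


class QueryCache:
    """Exact-match query cache with TTL expiry and a bounded size."""

    def __init__(self, max_entries=500, ttl_secs=300, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_secs = ttl_secs
        self.clock = clock
        self._entries = OrderedDict()  # query -> (stored_at, results)

    def get(self, query):
        hit = self._entries.get(query)
        if hit is None:
            return None
        stored_at, results = hit
        if self.clock() - stored_at >= self.ttl_secs:
            del self._entries[query]  # entry outlived its TTL
            return None
        return results

    def put(self, query, results):
        self._entries.pop(query, None)
        self._entries[query] = (self.clock(), results)
        # Assumption: when the cache is full, the oldest entry is evicted.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
```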

[search.graph]

[search.graph]
enabled = true
max_hops = 1
boost_factor = 0.3
max_graph_additions = 3

Key	Type	Description
`enabled`	bool	Enable graph-enhanced search
`max_hops`	int	1 = direct neighbors, 2 = 2-hop
`boost_factor`	f64	Score multiplier for graph-discovered results (0.0 – 1.0)
`max_graph_additions`	int	Max graph-only results added to search

[search.hybrid_weights]

[search.hybrid_weights]
vector = 0.6
bm25 = 0.4

Controls the fusion ratio between vector and BM25 scores. The two weights must sum to 1.0.
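
A minimal sketch of what such a fusion could look like, assuming a plain weighted sum over scores that have already been normalized to the range 0 to 1. The function names and the linear formula are assumptions, not the documented implementation.

```python
import math


def validate_weights(vector, bm25):
    """Reject weight pairs that do not sum to 1.0."""
    if not math.isclose(vector + bm25, 1.0, abs_tol=1e-9):
        raise ValueError(f"hybrid weights must sum to 1.0, got {vector + bm25}")


def fuse(vector_score, bm25_score, vector=0.6, bm25=0.4):
    """Weighted sum of two normalized scores (assumed fusion formula)."""
    validate_weights(vector, bm25)
    return vector * vector_score + bm25 * bm25_score
```

With the example weights, a chunk that matches only on the vector side keeps 60% of its score, and one that matches only on BM25 keeps 40%.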

[search.decay]

[search.decay]
l0_lambda = 0.05
l1_lambda = 0.03

Exponential decay rate per tier. Higher = faster decay. Half-life ≈ ln(2) / lambda days.

Tier	Lambda	Approximate half-life
L0	0.05	~14 days
L1	0.03	~23 days
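
The half-lives in the table follow directly from the formula above. The sketch below reproduces them and shows the decay itself, assuming the score is multiplied by exp(-lambda × age in days); the function names are hypothetical.

```python
import math


def half_life_days(lam):
    """Days until an exponentially decaying score halves."""
    return math.log(2) / lam


def decayed(score, lam, age_days):
    """Score after age_days of exponential decay at rate lam (assumed form)."""
    return score * math.exp(-lam * age_days)


l0 = half_life_days(0.05)  # about 13.9 days
l1 = half_life_days(0.03)  # about 23.1 days
```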

[search.complexity]

[search.complexity]
enabled = false
simple_top_k = 3
medium_top_k = 10
complex_top_k = 20

Complexity-aware routing. When enabled, top_k is adjusted according to the query's complexity classification (simple, medium, or complex). It is currently disabled, so all queries use the full pipeline.

[worker]

[worker.consolidation]

[worker.consolidation]
interval_secs = 300
batch_size = 200
concurrency = 2
timeout_secs = 300
min_chunks_to_consolidate = 3
min_age_secs = 60
prompt_version = 1
consolidation_style = "structured"
similarity_threshold = 0.20

Key	Type	Description
`interval_secs`	int	Seconds between runs (300 = 5 min)
`batch_size`	int	Max chunks per stream per run
`concurrency`	int	Parallel streams
`min_chunks_to_consolidate`	int	Skip streams with fewer chunks
`min_age_secs`	int	Only consolidate chunks older than this
`consolidation_style`	string	`"structured"` = typed facts with date resolution, `"observation"` = granular facts, `"summary"` = paragraph
`similarity_threshold`	f64	Cosine similarity for topic grouping before compression (0.0 = one per chunk, 1.0 = all in one)

[worker.decay_worker]

[worker.decay_worker]
interval_secs = 3600
factor = 0.995
l0_factor = 0.990
l1_factor = 0.995
dormant_threshold = 0.01
access_boost = true
adaptive_enabled = true
adaptive_dampening = 0.5
adaptive_cap = 200

Key	Type	Description
`interval_secs`	int	Seconds between runs (3600 = every hour)
`l0_factor`, `l1_factor`	f64	Decay multiplier per hour per tier
`dormant_threshold`	f64	Score below this = marked dormant
`access_boost`	bool	Reset score on search hit
`adaptive_enabled`	bool	ACT-R-style adaptation: frequently accessed chunks decay more slowly
`adaptive_dampening`	f64	How much access_count influences decay
`adaptive_cap`	int	access_count ceiling for adaptive effect
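
To get a feel for the per-hour factors, the arithmetic below estimates how long an untouched chunk takes to halve and to cross `dormant_threshold`. It assumes one multiplication per hourly run, a starting score of 1.0, and no access boost or adaptive dampening; `hours_until` is a hypothetical helper.

```python
import math


def hours_until(target, factor, start=1.0):
    """Hours for start to fall to target when multiplied by factor each hour."""
    return math.log(target / start) / math.log(factor)


l0_half = hours_until(0.5, 0.990)      # about 69 hours
l1_half = hours_until(0.5, 0.995)      # about 138 hours
l0_dormant = hours_until(0.01, 0.990)  # about 458 hours, roughly 19 days
l1_dormant = hours_until(0.01, 0.995)  # about 919 hours, roughly 38 days
```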

[worker.compaction]

[worker.compaction]
interval_secs = 3600
timeout_secs = 300

RocksDB background compaction trigger.

[worker.clustering]

[worker.clustering]
interval_secs = 21600
max_iterations = 1000
timeout_secs = 600

The clustering worker runs k-means on L1 embeddings to group related memories.

[scheduler]

[scheduler]
enabled = true

Master switch for all background workers.

[llm]

[llm]
provider = "openai"                  # completions (consolidation / reflect)
api_key_env = "OPENAI_API_KEY"
embedding_provider = "local"         # "local" (on-device ONNX) or "openai"
embedding_model = "text-embedding-3-small"   # used when embedding_provider = "openai"
# embedding_model_path = "/abs/path"  # local model dir; default ~/.loomem/models/multilingual-e5-small
embedding_dim = 384                  # MUST match the active embedding model
compression_model = "gpt-4.1-mini"
timeout_secs = 10
fallback_to_regex = true

Key	Type	Description
`provider`	string	Completions provider for consolidation / reflect (`"openai"`)
`embedding_provider`	string	`"local"` (on-device ONNX, no API key) or `"openai"`. Missing in older configs → treated as `"openai"`
`api_key_env`	string	Environment variable name for the API key
`embedding_model`	string	Embedding model name when `embedding_provider = "openai"`
`embedding_model_path`	string?	Directory with `model.onnx` + `tokenizer.json` for local embeddings. Unset → `~/.loomem/models/multilingual-e5-small`
`embedding_dim`	int	Embedding vector dimensions — must match the active model
`compression_model`	string	Model for consolidation, extraction, dream
`timeout_secs`	int	API call timeout
`fallback_to_regex`	bool	Use regex extraction if the completions LLM is unavailable

Local embeddings (default)

Fresh installs use embedding_provider = "local": embeddings are computed on-device with an unquantized ONNX model (default multilingual-e5-small, 384-dim, good multilingual/Polish recall) via the pure-Rust tract runtime. No API key is required and no text leaves the machine. Local embedding support ships in the default build, but the model files are downloaded separately:

./scripts/fetch-embedding-model.sh         # → ~/.loomem/models/multilingual-e5-small

The completions LLM (consolidation, fact extraction, dream) is independent of the embedding provider: with no OPENAI_API_KEY set, those steps fall back to regex (when fallback_to_regex = true), while memory_store and semantic search work fully locally. Pointing provider at an OpenAI-compatible endpoint, so that completions can also run on a local LLM, is planned for a future release.

Embedding dimension & re-embedding

embedding_dim must match the active embedding model (multilingual-e5-small = 384, OpenAI text-embedding-3-small = 1536). The dimension a database was built with is recorded in its metadata; on a mismatch the server refuses to start rather than silently mixing vector sizes. To switch providers/models on an existing database, re-embed it:

loomem-server --reembed   # recompute all vectors with the configured provider (run with the server stopped)

To use the API instead of local embeddings, set embedding_provider = "openai", embedding_dim = 1536, export OPENAI_API_KEY, and re-embed.
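
The start-up check described in this section can be pictured with the following sketch. The names `check_embedding_dim` and `DimensionMismatch` are hypothetical, as is the convention that a fresh database has no recorded dimension yet; only the two model dimensions come from this reference.

```python
# Dimensions stated in this reference for the two supported models.
KNOWN_DIMS = {
    "multilingual-e5-small": 384,
    "text-embedding-3-small": 1536,
}


class DimensionMismatch(Exception):
    """Raised when config and database metadata disagree on vector size."""


def check_embedding_dim(configured_dim, recorded_dim):
    """Refuse to continue if the database was built with another dimension."""
    # Assumption: recorded_dim is None for a fresh database without metadata.
    if recorded_dim is not None and recorded_dim != configured_dim:
        raise DimensionMismatch(
            f"database uses {recorded_dim}-dim vectors but config says "
            f"{configured_dim}; re-embed before starting"
        )
```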

[server]

[server]
host = "127.0.0.1"
port = 3030
auth_token_env = "LOOMEM_AUTH_TOKEN"

Key	Type	Description
`host`	string	Bind address (`0.0.0.0` for production)
`port`	int	HTTP port (overridden by `PORT` env var)
`auth_token_env`	string	Name of the env var holding the API Bearer key (default `LOOMEM_AUTH_TOKEN`). If that env var is unset or empty, auth is disabled and the server runs in local passthrough mode
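
The two environment-driven behaviours in this table, the `PORT` override and the auth switch, could be resolved as sketched below. The function name `resolve_server` and the returned dict shape are assumptions made for illustration.

```python
import os


def resolve_server(cfg, env=None):
    """Combine the [server] table with environment overrides."""
    env = os.environ if env is None else env
    # PORT, when set, takes precedence over the configured port.
    port = int(env["PORT"]) if env.get("PORT") else cfg["port"]
    # auth_token_env names the variable; unset or empty disables auth.
    token = env.get(cfg["auth_token_env"], "")
    return {"host": cfg["host"], "port": port, "auth_enabled": bool(token)}
```

For example, with the config shown above and an empty environment, the result would bind to 127.0.0.1:3030 in local passthrough mode.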

[resource_guards]

[resource_guards]
max_cpu_cores = 1.0
max_memory_mb = 512
min_disk_space_mb = 1024
llm_timeout_secs = 30
worker_timeout_secs = 120

Startup checks. The server refuses to start if available resources are below the configured thresholds.

[streams] and [namespaces]

[streams]
shared = "001"

[streams.agents]
assistant = { raw = "100", compressed = "101" }

[namespaces]
personal = "100"
work = "110"

streams: agent-specific raw/compressed stream IDs. namespaces: human-readable name → stream ID mapping, returned by memory_namespaces.

Data written without an explicit stream goes to the built-in default stream __user_default__.
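
A rough sketch of how a write might pick its stream, using the example namespaces above. The helper `resolve_stream` is hypothetical, and rejecting unknown namespace names (rather than silently falling back) is an assumption.

```python
DEFAULT_STREAM = "__user_default__"
NAMESPACES = {"personal": "100", "work": "110"}  # values from the example config


def resolve_stream(namespace=None, namespaces=NAMESPACES):
    """Map an optional namespace name to a stream ID."""
    if namespace is None:
        # No explicit stream: use the built-in default.
        return DEFAULT_STREAM
    try:
        return namespaces[namespace]
    except KeyError:
        # Assumption: unknown names are an error, not a silent fallback.
        raise ValueError(f"unknown namespace: {namespace!r}") from None
```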

[pii]

[pii]
enabled = true
redact_phones = true
redact_emails = true
redact_ids = true
blocklist_file = "pii_blocklist.txt"
audit_log = true

PII is stripped before every LLM API call (consolidation, extraction, dream). Original content in RocksDB remains untouched.
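
The redaction step can be pictured as a pure function applied to the outgoing text, leaving the stored original alone. The sketch below is illustrative only: the regular expressions, the placeholder tokens, and the function name `redact` are assumptions and are far simpler than a production PII filter.

```python
import re

# Deliberately simple patterns, for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text, redact_emails=True, redact_phones=True):
    """Return a redacted copy of text; the input string is not modified."""
    if redact_emails:
        text = EMAIL.sub("[EMAIL]", text)
    if redact_phones:
        text = PHONE.sub("[PHONE]", text)
    return text
```

Because the function returns a new string, the caller can send the redacted copy to the LLM while the original content stays in storage unchanged.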

[cost]

[cost]
daily_cap_usd = 15.00
alert_threshold_usd = 10.00
anomaly_multiplier = 3.0
persist = true

Key	Type	Description
`daily_cap_usd`	f64	Hard stop — no LLM calls beyond this
`alert_threshold_usd`	f64	Warning threshold
`anomaly_multiplier`	f64	Anomaly alert when daily cost reaches this multiple of the typical daily cost (3.0 = 3x)
`persist`	bool	Save cost counters across restarts
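
How the three thresholds relate can be sketched as below. The function `cost_status`, the inclusive comparisons, and the way `typical_daily` is supplied are all assumptions; this reference does not say how the typical daily cost is computed.

```python
def cost_status(spent_today, typical_daily, daily_cap_usd=15.00,
                alert_threshold_usd=10.00, anomaly_multiplier=3.0):
    """Return (llm_calls_allowed, alerts) for the day's running total."""
    alerts = []
    if spent_today >= alert_threshold_usd:
        alerts.append("threshold")
    # typical_daily is supplied by the caller (hypothetical input).
    if typical_daily > 0 and spent_today >= anomaly_multiplier * typical_daily:
        alerts.append("anomaly")
    # The cap is a hard stop: once reached, no further LLM calls.
    allowed = spent_today < daily_cap_usd
    return allowed, alerts
```

Note that the anomaly alert can fire well below the warning threshold when typical spend is low, which is the point of having both.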

[knowledge_extraction]

[knowledge_extraction]
enabled = true
model = "gpt-4.1-mini"
max_transcript_tokens = 20000
dedup_cosine_threshold = 0.92
contradiction_check = true
contradiction_cosine_min = 0.5
max_facts_per_transcript = 20

Controls the memory_ingest LLM pipeline.

[entity_extraction]

[entity_extraction]
enabled = true
model = "gpt-4.1-mini"
batch_size = 20
flush_interval_secs = 3
queue_capacity = 200
confidence_threshold = 0.7
max_tokens_per_batch = 2000

Async LLM NER for entities not in the dictionary.

[contradiction]

[contradiction]
enabled = true
similarity_threshold = 0.70
max_candidates = 5
model = "gpt-4.1-mini"

When a new fact arrives, the top max_candidates most similar existing facts are checked by the LLM for contradiction. The verdict is one of: updates (supersede the old fact), extends (keep both), or none (no relation).
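
The candidate selection that precedes the LLM check might look like the sketch below. It assumes `similarity_threshold` acts as a minimum cosine similarity for a fact to be considered at all; the function names and the action labels are hypothetical.

```python
def select_candidates(scored_facts, similarity_threshold=0.70, max_candidates=5):
    """Pick the most similar existing facts to send to the LLM.

    scored_facts is an iterable of (fact_id, cosine_similarity) pairs.
    """
    eligible = [(f, s) for f, s in scored_facts if s >= similarity_threshold]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return eligible[:max_candidates]


def action_for(verdict):
    """Translate an LLM verdict into a storage action label."""
    actions = {"updates": "supersede", "extends": "keep_both", "none": "no_relation"}
    return actions[verdict]
```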

[conversation_extraction]

[conversation_extraction]
enabled = true
model = "gpt-4.1-mini"
max_tokens = 2000
confidence_threshold = 0.7
dedup_threshold = 0.80
max_extractions_per_request = 30

[profile]

[profile]
enabled = true
model = "gpt-4.1-mini"
max_chunks = 100
cache_ttl_secs = 3600
max_static_facts = 30
max_recent_items = 15

User profile synthesis. Cached for cache_ttl_secs.

[retention]

[retention]
soft_delete_days = 30
hard_purge_interval_secs = 86400

Key	Type	Description
`soft_delete_days`	int	Days to keep soft-deleted chunks before hard purge
`hard_purge_interval_secs`	int	How often the purge worker runs (seconds)

The hard-purge worker removes expired soft-deleted chunks from RocksDB, Tantivy, and the graph.
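
The expiry test applied by such a worker can be sketched in a few lines. The function name `is_purgeable` and the inclusive boundary (a chunk becomes purgeable exactly at 30 days) are assumptions.

```python
from datetime import datetime, timedelta, timezone


def is_purgeable(deleted_at, now, soft_delete_days=30):
    """True once a soft-deleted chunk has aged past the retention window."""
    return now - deleted_at >= timedelta(days=soft_delete_days)


# Example: a chunk soft-deleted on 1 January, checked 29 and 30 days later.
deleted = datetime(2025, 1, 1, tzinfo=timezone.utc)
day_29 = is_purgeable(deleted, deleted + timedelta(days=29))
day_30 = is_purgeable(deleted, deleted + timedelta(days=30))
```

With `hard_purge_interval_secs = 86400`, a chunk would be removed on the first daily run after it crosses the window, so actual removal may lag expiry by up to a day.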

[associator]

[associator]
enabled = true
k_clusters = 0
max_clusters = 50
max_iterations = 100
min_serendipity = 0.1
max_associations = 3

Key	Type	Description
`enabled`	bool	Enable ECA serendipity engine and auto-clustering
`k_clusters`	int	Fixed cluster count (0 = auto)
`max_clusters`	int	Upper bound on auto-determined clusters
`min_serendipity`	f64	Minimum score to include an association
`max_associations`	int	Max serendipitous results per search

[dream]

[dream]
enabled = true
batch_size = 50
min_group_size = 2
model = "gpt-4.1-mini"
cost_cap_usd_per_run = 0.10

Key	Type	Description
`batch_size`	int	Chunks per dream run
`min_group_size`	int	Min chunks per subject to merge
`cost_cap_usd_per_run`	f64	Safety cap per dream session

[memory_generator]

[memory_generator]
enabled = true
max_chunks = 200
max_sections = 20
model = "gpt-4.1-mini"

Controls the output of the generate-memory-md endpoint.

Environment variables

Variable	Purpose	Required
`LOOMEM_AUTH_TOKEN`	API Bearer key (name configurable via `server.auth_token_env`)	No (auth disabled if unset — local passthrough mode)
`OPENAI_API_KEY`	LLM API key (name configurable via `llm.api_key_env`)	Only for OpenAI completions (`provider = "openai"`) or OpenAI embeddings (`embedding_provider = "openai"`). With local embeddings (default) and no key, the LLM steps fall back to regex.
`LOOMEM_CONFIG`	Config file path	No (default: `config.toml`)
`PORT`	Override server port	No
`SERVER_ORIGIN`	OAuth redirect base URL	No (for MCP Remote)
`LOOMEM_LOG_FORMAT`	`"json"` for structured logs	No
`LOOMEM_AT_REST_MASTER_KEY`	Master key enabling at-rest encryption (32-byte base64)	No
`LOOMEM_AT_REST_EXPECT_ENABLED`	Refuse to start without a master key when set	No
`LOOMEM_AMBIENT_CACHE_TTL_SECS`	Cache TTL for `/v1/ambient` responses	No (default 60)
`TELEGRAM_BOT_TOKEN`, `LOOMEM_TELEGRAM_CHAT_ID`	Optional cost-alert webhook	No

For at-rest encryption, see SECURITY.md — Data at rest.