# Configuration Guide
Every parameter has a sensible default. You don’t need to change anything to get started. This guide helps you tune settings when you have a specific use case or want to optimize performance.
## Quick-start recipes

Pick the recipe closest to your use case, then fine-tune individual parameters as needed.
### Personal assistant / chatbot

Long-running conversations across many sessions. Needs to remember user preferences, past events, and facts.

```python
config = CognitiveMemoryConfig(
    extraction_mode="semantic",
    retrieval_score_exponent=0.3,
    direct_boost=0.1,
    core_access_threshold=10,
    core_session_threshold=3,
)
# Search with: top_k=20, deep_recall=False
```

This is the default configuration. Semantic extraction captures structured facts across sessions. Core promotion ensures frequently referenced information (name, preferences) becomes permanent.
### Dialog system / roleplay / trivia

Exact recall of what was said matters more than structured reasoning. Users may ask “what did X say about Y?”

```python
config = CognitiveMemoryConfig(
    extraction_mode="raw",
    retrieval_score_exponent=0.1,           # don't penalize old turns
    run_maintenance_during_ingestion=False, # no consolidation
)
# Search with: top_k=50, deep_recall=False
```

Raw mode stores every turn verbatim. Low alpha keeps old turns retrievable. Disabling maintenance prevents consolidation from merging turns.
### Hybrid recall (best of both)

Need structured facts AND exact quotes. Willing to pay the storage and LLM cost.

```python
config = CognitiveMemoryConfig(
    extraction_mode="hybrid",
    retrieval_score_exponent=0.2,
    direct_boost=0.1,
)
# Search with: top_k=30, deep_recall=True
```

Hybrid stores both extracted facts and raw turns. Deep recall ensures superseded originals can still surface when needed. Good for applications where users ask both “what is the user’s job?” and “what exactly did they say on March 12?”
### Medical / legal records

Long-term, high-fidelity recall. Nothing should be forgotten.

```python
config = CognitiveMemoryConfig(
    extraction_mode="semantic",
    retrieval_score_exponent=0.1,            # nearly ignore decay
    core_access_threshold=5,                 # promote to core faster
    core_stability_threshold=0.7,
    cold_storage_ttl_days=365,               # keep cold memories longer
    consolidation_retention_threshold=0.10,  # only consolidate very faded
    custom_extraction_instructions=(
        "Focus on medical conditions, medications, allergies, "
        "and treatment history."
    ),
)
# Search with: top_k=60, deep_recall=True
```

Low alpha means old memories rank nearly as high as recent ones. Aggressive core promotion protects important information. Long cold storage TTL and conservative consolidation prevent data loss.
### Real-time / news / rapidly changing data

Information changes frequently. Old data should fade in favor of updates.

```python
config = CognitiveMemoryConfig(
    extraction_mode="semantic",
    retrieval_score_exponent=0.7,            # strong recency bias
    direct_boost=0.05,                       # less stability reinforcement
    cold_migration_days=3,                   # move to cold faster
    cold_storage_ttl_days=30,                # purge cold data quickly
    consolidation_retention_threshold=0.30,  # consolidate earlier
)
# Search with: top_k=10, deep_recall=False
```

High alpha suppresses old memories. Fast cold migration and a short TTL keep the active set small and current.
### Batch import / benchmark

Migrating historical data. Speed matters, not real-time behavior.

```python
config = CognitiveMemoryConfig(
    run_maintenance_during_ingestion=False,
)
mem = CognitiveMemory(config=config)

for conv in conversations:
    await mem.extract_and_store(conv, session_id=..., run_tick=False)

# Run maintenance once at the end
await mem.tick()
```

Disabling maintenance during ingestion avoids O(n) cold-migration checks after every batch.
## Parameter-by-parameter guide

### Extraction mode

`extraction_mode` controls how conversations become memories. See Ingestion Pipeline for a detailed comparison.
| Mode | LLM cost | Storage | Best for |
|---|---|---|---|
| `"semantic"` | Yes | Low | Most applications |
| `"raw"` | No | Medium | Exact recall, low latency |
| `"hybrid"` | Yes | High | Both structured and verbatim recall |
### Extraction model

`extraction_model` sets the LLM used for fact extraction (semantic and hybrid modes only).
| Model | Quality | Cost | Speed |
|---|---|---|---|
| gpt-4o-mini | Good | Low | Fast |
| gpt-4o | Better | 10x | Slower |
| gpt-4.1-mini | Good | Low | Fast |
Upgrade to gpt-4o if extraction quality is the bottleneck (e.g., missing important facts, poor categorization). For most applications, gpt-4o-mini is sufficient.
### Custom extraction instructions

`custom_extraction_instructions` prepends domain-specific guidance to the extraction prompt.

```python
# Medical assistant
config = CognitiveMemoryConfig(
    custom_extraction_instructions=(
        "Focus on medical conditions, medications, dosages, allergies, "
        "and treatment outcomes. Classify all medical facts as core."
    )
)
```

```python
# E-commerce
config = CognitiveMemoryConfig(
    custom_extraction_instructions=(
        "Track product preferences, sizes, brands, past purchases, "
        "and return history."
    )
)
```

Use this when the default extraction misses domain-specific information or assigns the wrong importance or categories.
### Retrieval score exponent (alpha)

`retrieval_score_exponent` controls how much memory decay affects search ranking. The scoring formula is `score = similarity * retention^alpha`.
| Value | Behavior | Use case |
|---|---|---|
| 0.1 | Nearly ignores decay. Old and new memories rank equally. | Medical records, legal, archival |
| 0.3 | Default. Balanced recency bias. | Personal assistants, chatbots |
| 0.5 | Moderate recency bias. Recent memories preferred. | News, evolving topics |
| 0.7-1.0 | Strong recency bias. Old memories effectively suppressed. | Real-time data, trending topics |
This is the single most impactful parameter for retrieval quality. Tune this first.
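The formula above can be checked with a tiny sketch (the `rank` helper is hypothetical, not a library function) showing how alpha reorders an old-but-relevant memory against a fresh-but-weaker match:

```python
def rank(memories, alpha):
    """Score each memory with similarity * retention**alpha and sort best-first."""
    return sorted(memories,
                  key=lambda m: m["similarity"] * m["retention"] ** alpha,
                  reverse=True)

# An old, highly relevant memory vs. a fresh, weaker match.
old = {"name": "old", "similarity": 0.90, "retention": 0.30}
new = {"name": "new", "similarity": 0.75, "retention": 0.95}

# alpha=0.1 nearly ignores decay: the old memory wins.
print([m["name"] for m in rank([old, new], alpha=0.1)])  # ['old', 'new']
# alpha=0.7 applies a strong recency bias: the fresh memory wins.
print([m["name"] for m in rank([old, new], alpha=0.7)])  # ['new', 'old']
```

The same pair of memories flips order purely as a function of alpha, which is why tuning it first pays off.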
### top_k (search parameter)

Not a config parameter — `top_k` is passed to `search()`. It’s the most impactful search-time knob.
| Value | Effect | Use case |
|---|---|---|
| 5-10 | Precise, may miss relevant memories | Simple Q&A, low latency |
| 20-40 | Balanced recall and precision | Most applications |
| 50-60 | Broad recall, catches weak matches | With re-ranking, complex queries |
Our benchmarks showed +14pp accuracy improvement going from k=10 to k=60. If you’re missing relevant memories, increase k first.
### Deep recall (search parameter)

```python
results = await mem.search(query, deep_recall=True)
```

When enabled, superseded memories (originals that were consolidated into summaries) are included in search results with a penalty (`deep_recall_penalty`, default 0.5x score).
Enable when: Users need specific dates, names, or numbers that may have been lost during consolidation.
Disable when: You want clean, deduplicated results and don’t need historical granularity.
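A minimal sketch of how the penalty plays out, assuming (as described above) that a superseded memory’s score is simply multiplied by `deep_recall_penalty`; the helper `effective_score` is illustrative, not part of the API:

```python
DEEP_RECALL_PENALTY = 0.5  # default penalty factor described above

def effective_score(similarity, retention, alpha=0.3,
                    superseded=False, deep_recall=False):
    """Base score is similarity * retention**alpha. Superseded memories are
    excluded unless deep_recall is on, in which case they take the penalty."""
    if superseded and not deep_recall:
        return None  # filtered out entirely
    score = similarity * retention ** alpha
    return score * DEEP_RECALL_PENALTY if superseded else score

# A superseded original only surfaces under deep recall, at half score.
print(effective_score(0.8, 1.0, superseded=True))                    # None
print(effective_score(0.8, 1.0, superseded=True, deep_recall=True))  # 0.4
```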
### Retrieval boosting

Controls how much retrieving a memory strengthens it (spaced repetition).
| Parameter | Default | Effect of increasing |
|---|---|---|
| `direct_boost` | 0.1 | Memories stabilize faster with fewer retrievals |
| `associative_boost` | 0.03 | Linked memories strengthen more from co-retrieval |
| `max_spaced_rep_multiplier` | 2.0 | Larger bonus for accessing long-idle memories |
| `spaced_rep_interval_days` | 7.0 | Full bonus requires a longer idle period after last access |
If memories fade too fast: Increase direct_boost to 0.15-0.20.
If memories never fade: Decrease direct_boost to 0.05 and lower max_spaced_rep_multiplier to 1.5.
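The exact boost curve isn’t documented here, but one plausible shape (a linear ramp from 1x up to the multiplier cap over the interval) can be sketched as follows; treat the formula as an assumption, not the library’s actual math:

```python
def retrieval_boost(days_since_access,
                    direct_boost=0.1,
                    max_spaced_rep_multiplier=2.0,
                    spaced_rep_interval_days=7.0):
    """Stability gain for a direct retrieval. The spaced-repetition multiplier
    ramps linearly from 1.0 up to the cap as idle time approaches the interval
    (an assumed shape -- the real curve may differ)."""
    ramp = min(1.0, days_since_access / spaced_rep_interval_days)
    multiplier = 1.0 + (max_spaced_rep_multiplier - 1.0) * ramp
    return direct_boost * multiplier

print(retrieval_boost(0))   # 0.1 (just accessed: no spaced-rep bonus)
print(retrieval_boost(7))   # 0.2 (idle a full interval: full 2x bonus)
print(retrieval_boost(30))  # 0.2 (bonus is capped by the multiplier)
```

Under this reading, raising `direct_boost` scales the whole curve, while raising `max_spaced_rep_multiplier` only rewards long-idle memories.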
### Core promotion

Core memories have a high retention floor (0.60) and never get cold-migrated. Three conditions must ALL be met:
| Parameter | Default | Effect of lowering |
|---|---|---|
| `core_access_threshold` | 10 | Fewer retrievals needed for promotion |
| `core_stability_threshold` | 0.85 | Lower stability bar for promotion |
| `core_session_threshold` | 3 | Fewer distinct sessions needed |
Too few core promotions? Lower all three (e.g., 5, 0.7, 2).
Too many core promotions? Raise core_access_threshold to 15-20 and core_session_threshold to 5+. Over-promoting makes the system “sticky” — old information never fades.
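The compound requirement can be expressed as a simple check (an illustrative sketch, not the library’s internal code):

```python
def eligible_for_core(access_count, stability, session_count,
                      core_access_threshold=10,
                      core_stability_threshold=0.85,
                      core_session_threshold=3):
    """All three thresholds must be met -- lowering one does not help
    if another still blocks promotion."""
    return (access_count >= core_access_threshold
            and stability >= core_stability_threshold
            and session_count >= core_session_threshold)

# Plenty of accesses across enough sessions, but stability is too low:
print(eligible_for_core(access_count=25, stability=0.80, session_count=6))  # False
# Meets all three:
print(eligible_for_core(access_count=12, stability=0.90, session_count=4))  # True
```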
### Decay parameters

| Parameter | Default | Description |
|---|---|---|
| `faint_threshold` | 0.15 | Retention below this = “faint” memory |
Base decay rates are category-level constants:
| Category | Decay rate (days) | Floor | Behavior |
|---|---|---|---|
| Episodic | 45 | 0.02 | Fades relatively fast |
| Semantic | 120 | 0.02 | Fades slowly |
| Procedural | Infinity | 0.02 | Never decays |
| Core | 120 | 0.60 | Fades slowly, high floor |
You can’t change base decay rates via config — they’re set by memory category. To make memories last longer, increase direct_boost or use custom_extraction_instructions to bias toward higher importance scores (stability = 0.1 + importance * 0.3).
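Assuming a standard exponential decay toward the category floor (the precise curve, and how stability stretches it, may differ), the table above translates roughly to:

```python
import math

# Category-level constants from the table above: (decay rate in days, floor).
DECAY = {
    "episodic":   (45.0,     0.02),
    "semantic":   (120.0,    0.02),
    "procedural": (math.inf, 0.02),
    "core":       (120.0,    0.60),
}

def retention(category, days_since_access):
    """Assumed exponential decay toward the category floor."""
    rate, floor = DECAY[category]
    if math.isinf(rate):
        return 1.0  # procedural memories never decay
    return floor + (1.0 - floor) * math.exp(-days_since_access / rate)

print(round(retention("episodic", 45), 3))  # one time constant elapsed
print(round(retention("core", 365), 3))     # floor keeps core memories high
```

Note how the core floor of 0.60 dominates after a year, while an episodic memory at one time constant has already lost most of its retention.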
### Associations

Controls how memories link to each other and activate during retrieval.
| Parameter | Default | Effect of changing |
|---|---|---|
| `association_strengthen_amount` | 0.1 | Higher = associations build faster per co-retrieval |
| `association_retrieval_threshold` | 0.3 | Lower = more associations activate (more multi-hop, more noise) |
| `association_decay_constant_days` | 90 | Higher = associations persist longer without reinforcement |
For multi-hop reasoning: Lower association_retrieval_threshold to 0.2 and increase association_decay_constant_days to 120-180.
For precision (less noise): Raise association_retrieval_threshold to 0.4.
### Consolidation

Controls when and how fading memories get merged into summaries.
| Parameter | Default | Effect of changing |
|---|---|---|
| `consolidation_retention_threshold` | 0.20 | Higher = consolidate earlier (less storage, but lose detail sooner) |
| `consolidation_group_size` | 5 | Lower = more frequent consolidation events |
| `consolidation_similarity_threshold` | 0.70 | Lower = larger, more diverse groups |
For archival applications: Lower threshold to 0.10 (only consolidate very faded memories).
For high-throughput applications: Raise threshold to 0.30 and lower group size to 3.
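One way to picture how these three parameters interact is a greedy grouping sketch (assumed semantics; the real pipeline groups by embedding similarity and may use a different strategy):

```python
def consolidation_groups(memories, similarity,
                         retention_threshold=0.20,
                         group_size=5,
                         similarity_threshold=0.70):
    """Greedy sketch: take a faded memory as a seed, pull in up to
    group_size - 1 similar faded memories, then repeat with the next
    unused seed. Singletons are left unconsolidated."""
    faded = [m for m in memories if m["retention"] < retention_threshold]
    groups, used = [], set()
    for seed in faded:
        if seed["id"] in used:
            continue
        used.add(seed["id"])
        group = [seed]
        for other in faded:
            if other["id"] in used or len(group) >= group_size:
                continue
            if similarity(seed, other) >= similarity_threshold:
                used.add(other["id"])
                group.append(other)
        if len(group) > 1:
            groups.append(group)
    return groups

# Toy similarity: 1.0 when two memories share a topic, else 0.0.
sim = lambda a, b: 1.0 if a["topic"] == b["topic"] else 0.0
mems = [
    {"id": 1, "retention": 0.10, "topic": "travel"},
    {"id": 2, "retention": 0.15, "topic": "travel"},
    {"id": 3, "retention": 0.12, "topic": "work"},
    {"id": 4, "retention": 0.90, "topic": "travel"},  # not faded, ignored
]
print([[m["id"] for m in g] for g in consolidation_groups(mems, sim)])  # [[1, 2]]
```

Raising the retention threshold enlarges `faded`, lowering the similarity threshold enlarges each group: both trade detail for storage.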
### Tiered storage

Controls when memories leave hot storage and when cold memories expire.
| Parameter | Default | Effect of changing |
|---|---|---|
| `cold_migration_days` | 7 | Lower = memories leave hot store faster (less search noise, but may lose multi-hop paths) |
| `cold_storage_ttl_days` | 180 | Higher = cold memories survive longer before becoming stubs |
For long-term applications: Set cold_storage_ttl_days to 365+.
For ephemeral applications: Set cold_migration_days to 3 and cold_storage_ttl_days to 30.
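The two knobs can be pictured as a tier decision (illustrative only; `storage_tier` is not a library function, and the "core is never cold-migrated" rule comes from the Core promotion section above):

```python
def storage_tier(days_since_access, days_in_cold=0, is_core=False,
                 cold_migration_days=7, cold_storage_ttl_days=180):
    """Idle memories leave hot storage after cold_migration_days, and cold
    memories become stubs after the TTL. Core memories are never migrated."""
    if is_core:
        return "hot"
    if days_since_access <= cold_migration_days:
        return "hot"
    if days_in_cold > cold_storage_ttl_days:
        return "stub"
    return "cold"

print(storage_tier(3))                      # hot
print(storage_tier(30, days_in_cold=20))    # cold
print(storage_tier(300, days_in_cold=200))  # stub
print(storage_tier(300, is_core=True))      # hot (core is pinned)
```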
### Embedding

| Parameter | Default | Description |
|---|---|---|
| `embedding_model` | `"text-embedding-3-small"` | OpenAI embedding model |
| `embedding_dimensions` | 1536 | Vector dimensions |
Higher dimensions (3072 with text-embedding-3-large) give better similarity discrimination at the cost of storage and search speed. For most applications, 1536 dimensions with text-embedding-3-small is the right trade-off.
## Measuring quality

The best way to measure tuning impact:
1. Build a test set of 50-100 question-answer pairs from your domain
2. Ingest the relevant conversations
3. Run searches for each question
4. Score answers against ground truth
5. Track overall accuracy and multi-hop accuracy separately
Avoid tuning on individual examples — optimize for aggregate metrics across your test set.
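The steps above can be sketched as a small harness. Everything here except `mem.search(query, top_k=...)` is an assumption, including results exposing a `.text` attribute; the stub memory exists only to make the sketch runnable:

```python
import asyncio

async def evaluate(mem, qa_pairs, top_k=30):
    """For each (question, expected-answer substring) pair, run a search and
    count a hit when any result text contains the answer."""
    hits = 0
    for question, answer in qa_pairs:
        results = await mem.search(question, top_k=top_k)
        if any(answer.lower() in r.text.lower() for r in results):
            hits += 1
    return hits / len(qa_pairs)

# Stub memory so the harness runs without a real backend.
class _Result:
    def __init__(self, text):
        self.text = text

class _StubMemory:
    async def search(self, query, top_k=30):
        return [_Result("Alice mentioned she works as a nurse")]

accuracy = asyncio.run(evaluate(_StubMemory(), [
    ("What is Alice's job?", "nurse"),
    ("Where does Bob live?", "Paris"),
]))
print(accuracy)  # 0.5
```

Substring matching is a crude scorer; for production evaluation, grade answers with an LLM or exact-match against labeled ground truth, and report aggregate accuracy rather than per-example wins.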
## Key interactions

- `retrieval_score_exponent` and `direct_boost` interact: higher alpha makes boosting more impactful
- `consolidation_retention_threshold` and `cold_migration_days` together control when memories leave hot storage
- `core_access_threshold` and `core_stability_threshold` create a compound requirement: lowering one doesn’t help if the other is too high
- `extraction_mode="hybrid"` with `deep_recall=True` gives maximum recall at the cost of more noise — pair with higher `top_k` and re-ranking