Meta-Learned Memory
Automatically discovering how LLMs should manage their context window
How an LLM uses its context window matters a lot: the difference between a good and a bad context-management strategy can be as much as 6x on downstream tasks [1]. Right now, these strategies are hand-designed. We're trying to automate that.
The idea
We represent context management strategies as executable Python programs: each program decides what to store, how to retrieve it, and how to format it for the model. Then we search over the space of programs using LLM-guided evolution. The search loop maintains a population of strategy programs, evaluates them on a task suite, and uses an LLM to propose mutations informed by execution logs. The result is a fully automatic pipeline that discovers context strategies without human design effort.
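The search loop described above can be sketched as follows. This is an illustrative sketch only: `evaluate` (which scores a program on the task suite and returns a score plus execution logs) and `propose_mutation` (which stands in for an LLM call that rewrites a program given its logs) are hypothetical interfaces, not the actual pipeline's API.

```python
def evolve(seed_programs, evaluate, propose_mutation,
           generations=10, population_size=8):
    """LLM-guided evolutionary search over strategy programs (sketch).

    `evaluate(program)` -> (score, logs) on the task suite.
    `propose_mutation(program, logs)` -> a mutated program, proposed by
    an LLM that has seen the execution logs.
    """
    # Score the initial population once up front.
    population = [(p, *evaluate(p)) for p in seed_programs]
    for _ in range(generations):
        # Keep the best-scoring half of the population as parents.
        population.sort(key=lambda entry: entry[1], reverse=True)
        parents = population[: max(1, population_size // 2)]
        children = []
        for program, score, logs in parents:
            # Ask the LLM for a mutation informed by the execution logs.
            child = propose_mutation(program, logs)
            children.append((child, *evaluate(child)))
        # Parents compete with their children for the next generation.
        population = (parents + children)[:population_size]
    population.sort(key=lambda entry: entry[1], reverse=True)
    return population[0][0]  # best program found
```

In this skeleton the only LLM involvement is `propose_mutation`; everything else is ordinary evolutionary search, which is what makes the pipeline fully automatic once a task suite and a scoring function exist.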
Results
What we found
- The search discovers qualitatively distinct strategies. Across runs, 7 different strategy families emerged, not just minor variants of each other. Some resemble known approaches; others are genuinely novel.
- Novel strategies include an online bandit that selects among retrieval functions based on recent reward, and a dual-pool diversity scheme that maintains two separate memory pools and merges their candidates using maximal marginal relevance (MMR).
- Less context can be better. The top strategies aggressively compress and filter, suggesting that current models are hurt more by irrelevant context than by missing information.
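To make the bandit-over-retrieval idea concrete, here is a minimal epsilon-greedy sketch. The arm interface, reward signal, and hyperparameters are assumptions for illustration, not the exact strategy the search discovered.

```python
import random

class RetrievalBandit:
    """Epsilon-greedy bandit over retrieval functions (sketch).

    Each arm is a retrieval function `fn(query, memory) -> results`.
    `update` feeds back a scalar reward (e.g. downstream task success)
    and the bandit tracks a running mean reward per arm.
    """

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        return max(range(len(self.arms)), key=lambda i: self.means[i])

    def retrieve(self, query, memory):
        i = self.select()
        return i, self.arms[i](query, memory)

    def update(self, i, reward):
        # Incremental running mean of reward for arm i.
        self.counts[i] += 1
        self.means[i] += (reward - self.means[i]) / self.counts[i]
```

The appeal of this shape is that the choice of retrieval method becomes part of the strategy's state rather than a fixed design decision: if, say, embedding search stops paying off on a task, the bandit drifts toward whichever arm is currently earning reward.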
Links
- Paper (coming soon)
- Code
[1] Measured on the SWE-bench Verified suite across 64 agent configurations. See the full benchmark data for details.