Compression
Working memory compression is how Beakr keeps long-running conversations inside a model's context window. The WorkingMemory object runs four layers in order -- each cheaper than the one after it -- and stops as soon as the token count is back under threshold.
The four layers
Each layer runs only if the previous layer didn't bring tokens back under threshold. In practice, layer 3 fires rarely -- layers 0-2 handle the common case.
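The cascade described above can be sketched as a loop that checks the token count before each layer and exits early. This is a minimal illustration, not Beakr's implementation: `compress`, `count_tokens`, and the two stand-in layers are hypothetical names, the word-count tokenizer is a crude placeholder, and the real layers' behavior is not specified here.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]  # hypothetical shape: {"role": ..., "content": ...}
Layer = Callable[[List[Message]], List[Message]]


def count_tokens(messages: List[Message]) -> int:
    # Placeholder tokenizer: counts whitespace-separated words.
    return sum(len(m["content"].split()) for m in messages)


def compress(
    messages: List[Message], layers: List[Layer], threshold: int
) -> Tuple[List[Message], int]:
    """Run layers cheapest-first; stop once the count is under threshold.

    Returns the (possibly compressed) messages and how many layers ran.
    """
    layers_run = 0
    for layer in layers:
        if count_tokens(messages) < threshold:
            break  # already under threshold: skip the costlier layers
        messages = layer(messages)
        layers_run += 1
    return messages, layers_run


# Two illustrative stand-in layers (assumptions, not Beakr's real layers).
def truncate_long(messages: List[Message]) -> List[Message]:
    # Cheap: cap every message at five words.
    return [{**m, "content": " ".join(m["content"].split()[:5])} for m in messages]


def drop_oldest_half(messages: List[Message]) -> List[Message]:
    # Costlier in information lost: keep only the newer half.
    return messages[len(messages) // 2:]
```

With a two-message history of 13 words and a threshold of 10, only the first stand-in layer runs; with a threshold of 100, none do.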
Checkpoint boundary
Compression respects a checkpoint boundary: messages from before the current run's first checkpoint are eligible for clearing; messages from the current run are protected.
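One way to picture the boundary is as an index into the message history. The sketch below assumes each message carries a run identifier and a checkpoint flag; `Msg`, `run_id`, `is_checkpoint`, and `clearable_indices` are hypothetical names, and returning nothing when the current run has no checkpoint yet is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Msg:
    run_id: str                  # hypothetical: which run produced the message
    content: str
    is_checkpoint: bool = False  # hypothetical: marks a checkpoint message


def clearable_indices(messages: List[Msg], current_run: str) -> List[int]:
    """Indices of messages that compression may clear."""
    # Locate the current run's first checkpoint.
    boundary = next(
        (
            i
            for i, m in enumerate(messages)
            if m.run_id == current_run and m.is_checkpoint
        ),
        None,
    )
    if boundary is None:
        return []  # assumption: with no checkpoint yet, clear nothing
    # Eligible: before the boundary and not part of the current run.
    return [
        i for i, m in enumerate(messages[:boundary]) if m.run_id != current_run
    ]
```

For a history with two messages from an earlier run followed by the current run's checkpoint and a new question, only the first two indices are returned.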
Protected tools
Protected tools are calls whose outputs cannot be safely summarized away because later reasoning may depend on the exact result, a returned artifact identifier, or a durable write side effect. Beakr preserves these results longer than ordinary search snippets or cached reads.
- repl_run: reasoning often references prior REPL output many turns later
- artifact_create / artifact_edit: the returned ID is the only handle
- memory_write / document_edit: write-side effects are not recoverable from a summary
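The list above amounts to a membership check applied before a tool result is cleared. A minimal sketch follows, assuming tool results are available as (tool name, output) pairs; `PROTECTED_TOOLS`, `clear_tool_results`, the `"[cleared]"` placeholder, and the `web_search` tool name in the usage note are illustrative assumptions. The sketch also simplifies "preserved longer" to "never cleared in this pass".

```python
from typing import List, Tuple

# Tool names taken from the list above.
PROTECTED_TOOLS = frozenset(
    {"repl_run", "artifact_create", "artifact_edit", "memory_write", "document_edit"}
)

ToolResult = Tuple[str, str]  # hypothetical shape: (tool_name, output)


def clear_tool_results(results: List[ToolResult]) -> List[ToolResult]:
    """Replace outputs of unprotected tools with a placeholder.

    Protected tool outputs pass through untouched.
    """
    return [
        (name, output if name in PROTECTED_TOOLS else "[cleared]")
        for name, output in results
    ]
```

Calling `clear_tool_results([("web_search", "snippet"), ("repl_run", "42")])` leaves the `repl_run` output intact and replaces the search snippet with the placeholder.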