Compression

Working memory compression is how Beakr keeps long-running conversations inside a model's context window. The WorkingMemory object runs up to four layers in order, cheapest first, and stops as soon as the token count is back under the threshold.

The four layers

Each layer runs only if the previous layer did not bring the token count back under the threshold. In practice, layer 3 rarely fires; layers 0-2 handle the common case.
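The early-exit behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Beakr's implementation: `compress`, `count_tokens`, and the two toy layers are hypothetical names, the word-count tokenizer is a stand-in, and the sketch does not show what the real layers do.

```python
from typing import Callable, List, Tuple

Messages = List[str]
Layer = Callable[[Messages], Messages]


def count_tokens(messages: Messages) -> int:
    # Hypothetical stand-in: counts whitespace-separated words, not real tokens.
    return sum(len(m.split()) for m in messages)


def compress(messages: Messages, layers: List[Layer], threshold: int) -> Tuple[Messages, int]:
    """Apply layers in order and stop once the token count is under threshold.

    Returns the resulting messages and the number of layers that actually ran.
    """
    layers_run = 0
    for layer in layers:
        if count_tokens(messages) < threshold:
            break  # already under threshold: skip the remaining, costlier layers
        messages = layer(messages)
        layers_run += 1
    return messages, layers_run


# Toy layers for illustration only; they are not Beakr's layers.
def drop_oldest(messages: Messages) -> Messages:
    return messages[1:]


def keep_last_one(messages: Messages) -> Messages:
    return messages[-1:]


history = ["a b c", "d e f", "g h i"]  # 9 "tokens" under the stand-in counter
result, layers_run = compress(history, [drop_oldest, keep_last_one], threshold=7)
```

Ordering the layers from cheapest to most expensive means the costly ones are reached only when the cheaper ones were not enough, which is consistent with layer 3 firing rarely.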

Checkpoint boundary

Compression respects a checkpoint boundary: messages from before the current run's first checkpoint are eligible for clearing; messages from the current run are protected.
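A minimal sketch of that split, assuming each message records its run and whether it is a checkpoint. The `Message` fields and the `split_at_checkpoint` helper are hypothetical and not part of Beakr's API. The fallback for a run with no checkpoint yet is also an assumption, chosen to be conservative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    run_id: int                  # hypothetical: which run produced the message
    text: str
    is_checkpoint: bool = False  # hypothetical: marks a checkpoint message


def split_at_checkpoint(
    messages: List[Message], current_run_id: int
) -> Tuple[List[Message], List[Message]]:
    """Return (eligible_for_clearing, protected) around the checkpoint boundary."""
    for i, msg in enumerate(messages):
        if msg.run_id == current_run_id and msg.is_checkpoint:
            # Everything before the current run's first checkpoint may be cleared.
            return messages[:i], messages[i:]
    # Assumption: with no checkpoint in the current run, protect everything.
    return [], list(messages)


log = [
    Message(1, "earlier question"),
    Message(1, "earlier answer"),
    Message(2, "checkpoint", is_checkpoint=True),
    Message(2, "current question"),
]
eligible, protected = split_at_checkpoint(log, current_run_id=2)
```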

Protected tools

Protected tools are tools whose results cannot be safely summarized away, because later reasoning may depend on the exact output, a returned artifact identifier, or a durable write side effect. Beakr preserves these results longer than ordinary search snippets or cached reads.

repl_run: reasoning often references prior REPL output many turns later
artifact_create / artifact_edit: the returned ID is the only handle
memory_write, document_edit: write side effects are not recoverable from a summary
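One way to express "preserved longer" is to order tool results so that unprotected ones are cleared first. The sketch below assumes that interpretation. `clearing_order` and the unprotected tool names `web_search` and `file_read` are hypothetical; the protected names are the ones listed above.

```python
from typing import List, Tuple

# Tool names taken from the list above.
PROTECTED_TOOLS = frozenset(
    {"repl_run", "artifact_create", "artifact_edit", "memory_write", "document_edit"}
)

ToolResult = Tuple[str, str]  # (tool name, result text)


def clearing_order(results: List[ToolResult]) -> List[ToolResult]:
    """Order results so unprotected ones are cleared before protected ones.

    sorted() is stable and False sorts before True, so the original order is
    kept within each group.
    """
    return sorted(results, key=lambda r: r[0] in PROTECTED_TOOLS)


results = [
    ("repl_run", "42"),
    ("web_search", "snippet"),   # hypothetical unprotected tool
    ("memory_write", "ok"),
    ("file_read", "cached"),     # hypothetical unprotected tool
]
ordered = clearing_order(results)
```

A compression layer working through `ordered` from the front would reach search snippets and cached reads before any REPL output, artifact ID, or write result.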