Pregel executor

The Pregel executor drives graph execution forward in discrete super-steps. It is inspired by Google's Pregel model for large-scale graph processing and uses a Bulk Synchronous Parallel (BSP) loop: plan the runnable work, execute independent nodes in parallel, merge updates in a deterministic order, then checkpoint state. This gives Beakr deterministic ordering, native parallelism, and a checkpoint boundary at the end of every super-step, so long-running workflows can be paused, resumed, or retried without starting over.

The super-step loop

Phases 1-4 (PLAN, EXECUTE, UPDATE, CHECKPOINT) always run in this order within a single super-step; phase 5 is the loop condition. The executor halts once channel versions stop changing, so that no node has a newer input left to trigger it.
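The loop can be sketched as a small driver that is handed one callable per phase. This is a minimal illustration, not Beakr's actual executor: the function name run_supersteps, the max_steps guard, and the toy counter phases are all assumptions made for the example.

```python
def run_supersteps(plan, execute, update, checkpoint, max_steps=100):
    """Run phases 1-4 in order until PLAN finds nothing runnable.

    All four callables are hypothetical stand-ins for the real phases.
    max_steps is an assumed safety guard against graphs that never settle.
    """
    step = 0
    while step < max_steps:
        runnable = plan()            # phase 1: find runnable nodes
        if not runnable:             # phase 5: loop condition, nothing left to trigger
            break
        writes = execute(runnable)   # phase 2: run nodes against a snapshot
        update(writes)               # phase 3: apply channel reducers
        checkpoint(step)             # phase 4: persist the new snapshot
        step += 1
    return step


# Toy phases: a single "inc" node that keeps firing until count reaches 3.
state = {"count": 0}
log = []

def plan():
    return ["inc"] if state["count"] < 3 else []

def execute(runnable):
    return [("count", state["count"] + 1)]

def update(writes):
    for channel, value in writes:
        state[channel] = value

def checkpoint(step):
    log.append((step, dict(state)))

steps = run_supersteps(plan, execute, update, checkpoint)
```

In this toy run the loop stops on its own once plan() returns an empty list, which mirrors the halting rule described above.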

Phase details

PLAN -- find runnable nodes

Walk the graph. For each node, compare its input-channel versions to the versions it last consumed. Any node with a newer version on any input is runnable this super-step.
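A minimal sketch of the version comparison, assuming channel versions are plain integers and that each node records the version it last consumed per channel. The names subscriptions, channel_versions, and versions_seen are hypothetical, as are the example nodes.

```python
def plan(subscriptions, channel_versions, versions_seen):
    """Return the nodes that have a newer version on at least one input.

    subscriptions:    node name -> list of input channel names
    channel_versions: channel name -> current version (int)
    versions_seen:    node name -> {channel name -> version last consumed}
    A missing entry counts as version 0 (an assumption of this sketch).
    """
    runnable = []
    for node in sorted(subscriptions):  # sorted so the result order is stable
        seen = versions_seen.get(node, {})
        if any(
            channel_versions.get(ch, 0) > seen.get(ch, 0)
            for ch in subscriptions[node]
        ):
            runnable.append(node)
    return runnable


subscriptions = {
    "summarize": ["messages"],
    "route": ["messages", "intent"],
    "notify": ["status"],
}
channel_versions = {"messages": 3, "intent": 1, "status": 2}
versions_seen = {
    "summarize": {"messages": 3},
    "route": {"messages": 2, "intent": 1},
    "notify": {"status": 2},
}

runnable = plan(subscriptions, channel_versions, versions_seen)
# Only "route" is behind: it last consumed messages at version 2, now at 3.
```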

EXECUTE -- run nodes

Independent nodes run in parallel. Each sees an isolated snapshot of channel values: writes from this super-step are not visible until the UPDATE phase (phase 3).
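One way to picture snapshot isolation is a thread pool in which every node receives the same read-only view and returns its writes instead of applying them. This is a sketch under assumptions: the real executor may use a different concurrency mechanism, and the execute signature and the example nodes are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def execute(nodes, runnable, channels):
    """Run each runnable node against one shared, read-only snapshot.

    nodes:    node name -> callable(snapshot) returning {channel: value}
    runnable: node names selected by PLAN
    channels: current channel values (left untouched by this function)
    Returns node name -> pending writes; nothing is applied here.
    """
    # Copy first, then wrap: nodes cannot assign into the snapshot.
    # The copy is shallow, so nested mutable values would still be shared.
    snapshot = MappingProxyType(dict(channels))
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(nodes[name], snapshot) for name in runnable}
        return {name: fut.result() for name, fut in futures.items()}


channels = {"x": 2, "y": 5}
nodes = {
    "double": lambda s: {"x": s["x"] * 2},
    "sum": lambda s: {"total": s["x"] + s["y"]},
}

pending = execute(nodes, ["double", "sum"], channels)
# "sum" reads x == 2 from the snapshot, never the 4 that "double" wants to write.
```

Because the writes come back as data, the order in which the threads finish has no effect on what the next phase sees.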

UPDATE -- apply channel reducers

Collect every pending write, then apply channel-specific reducers in a deterministic order (alphabetical by channel name). Some channels append messages, others keep only the latest value, others combine running totals.
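The reducer pass could look like the sketch below. The three reducers mirror the kinds named above (append, keep latest, running total), but their names, the apply_writes signature, and the example channels are hypothetical. The text only fixes the order across channels; this sketch assumes writes to the same channel arrive in an order the caller has already made deterministic.

```python
def append_message(old, new):
    return (old or []) + [new]      # builds a new list, never mutates the old one

def last_value(old, new):
    return new                      # keep only the latest write

def running_total(old, new):
    return (old or 0) + new


def apply_writes(channels, reducers, pending_writes):
    """Apply pending (channel, value) writes, channel by channel, alphabetically."""
    new_channels = dict(channels)
    by_channel = {}
    for channel, value in pending_writes:
        by_channel.setdefault(channel, []).append(value)
    for channel in sorted(by_channel):          # deterministic channel order
        reducer = reducers[channel]
        for value in by_channel[channel]:       # caller-provided order within a channel
            new_channels[channel] = reducer(new_channels.get(channel), value)
    return new_channels


channels = {"messages": ["hi"], "status": "idle", "total": 10}
reducers = {
    "messages": append_message,
    "status": last_value,
    "total": running_total,
}
pending_writes = [
    ("total", 5),
    ("messages", "hello"),
    ("status", "running"),
    ("total", 2),
]

updated = apply_writes(channels, reducers, pending_writes)
```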

CHECKPOINT -- save to DB

Serialize the new channel snapshot into a checkpoint row. A run that stops partway through can then resume from its latest checkpoint instead of starting over.
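As an illustration only, a checkpoint row can be modeled with SQLite and JSON. The table name, column names, JSON payload, and both helper functions are assumptions made for this sketch; the actual schema and serializer are not described here.

```python
import json
import sqlite3


def save_checkpoint(conn, run_id, step, channels, versions):
    """Write one row per super-step (hypothetical schema)."""
    payload = json.dumps({"channels": channels, "versions": versions}, sort_keys=True)
    conn.execute(
        "INSERT INTO checkpoints (run_id, step, payload) VALUES (?, ?, ?)",
        (run_id, step, payload),
    )
    conn.commit()


def load_latest(conn, run_id):
    """Return (step, snapshot) for the newest checkpoint, or None if there is none."""
    row = conn.execute(
        "SELECT step, payload FROM checkpoints "
        "WHERE run_id = ? ORDER BY step DESC LIMIT 1",
        (run_id,),
    ).fetchone()
    if row is None:
        return None
    step, payload = row
    return step, json.loads(payload)


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
conn.execute(
    "CREATE TABLE checkpoints ("
    "run_id TEXT, step INTEGER, payload TEXT, PRIMARY KEY (run_id, step))"
)

save_checkpoint(conn, "run-1", 0, {"total": 10}, {"total": 1})
save_checkpoint(conn, "run-1", 1, {"total": 17}, {"total": 2})

latest = load_latest(conn, "run-1")
```

Storing the channel versions alongside the values matters for resume: without them, the PLAN phase could not tell which nodes had already consumed which inputs.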

Why Bulk Synchronous Parallel?

BSP means work happens in rounds. Nodes read a stable snapshot, run independently where possible, then publish their writes at the end of the round. That boundary is what makes parallel execution predictable.

Determinism: same graph + same inputs = same trace
Parallelism: independent operations run concurrently
Resumability: every super-step can become a resume point

Failure modes

When a node raises, the executor marks the super-step as failed and reports the error to the frontend. Writes from failed nodes are never applied.
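A rough sketch of that rule, run sequentially for clarity: a failing node contributes an error entry and no writes, and a single failure marks the whole super-step as failed. StepResult and run_step are hypothetical names. What happens to writes from nodes that succeeded in the same super-step is not specified above, so the sketch simply returns them to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class StepResult:
    status: str = "ok"                            # "ok" or "failed"
    writes: dict = field(default_factory=dict)    # node name -> pending writes
    errors: dict = field(default_factory=dict)    # node name -> error description


def run_step(nodes, runnable, snapshot):
    """Run nodes, capturing exceptions instead of letting them escape."""
    result = StepResult()
    for name in runnable:
        try:
            result.writes[name] = nodes[name](snapshot)
        except Exception as exc:
            # A failed node never produces writes; the step is marked failed.
            result.status = "failed"
            result.errors[name] = repr(exc)
    return result


def ok_node(snapshot):
    return {"total": snapshot["total"] + 1}

def boom_node(snapshot):
    raise ValueError("bad input")


result = run_step({"ok": ok_node, "boom": boom_node}, ["boom", "ok"], {"total": 1})
```

The errors mapping is the piece a caller would forward to the frontend, while result.writes can never contain an entry for a node that raised.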