Connector sync

Connectors are the intake valve of the flywheel. Beakr manages OAuth credentials and sync state -- and every sync ends with an ingestion log entry that ties new knowledge base pages back to the exact external items that produced them.

[Diagram: connector sync architecture]
PROVIDERS: Slack, Gmail, Google Drive, Notion, Confluence, Jira, GitHub
OAUTH: Token lifecycle, Token refresh, Webhooks, Connection health
CONNECTOR: State + health, Scope config, Provider type, Sync mode, Last synced
SYNC PIPELINE: 1 Enumerate items, 2 Fan out workers, 3 Download content, 4 Source dedup, 5 Compile to KB
SYNC JOB: status, item_count, error_count, duration
INGESTION LOG: sync_job_id, pages_created/updated

Beakr manages both OAuth credentials and sync state. The pipeline is five hops from trigger to knowledge base page.
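
The five hops can be sketched in a few lines of Python. This is an illustration only, not Beakr's implementation: `run_sync`, its callback parameters, the in-memory `kb` dictionary, and the rule for marking a job failed are all assumptions made for the example. The field names follow the SYNC JOB and INGESTION LOG boxes in the diagram.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import time


@dataclass
class SyncJob:
    status: str = "running"
    item_count: int = 0
    error_count: int = 0
    duration: float = 0.0


@dataclass
class IngestionLog:
    sync_job_id: int
    pages_created: int = 0
    pages_updated: int = 0


def run_sync(job_id, enumerate_items, download, kb, max_workers=4):
    """Hypothetical pipeline: enumerate, fan out, download, dedup, compile.

    enumerate_items() returns external IDs, download(item_id) returns text,
    and kb is a dict of external ID -> page text standing in for the KB.
    """
    started = time.monotonic()
    job = SyncJob()
    log = IngestionLog(sync_job_id=job_id)

    item_ids = list(enumerate_items())  # 1. enumerate items
    job.item_count = len(item_ids)

    def fetch(item_id):
        # 3. download content; capture errors so one item cannot abort the run
        try:
            return item_id, download(item_id), None
        except Exception as exc:
            return item_id, None, exc

    # 2. fan out workers (pool.map keeps results in input order)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, item_ids))

    seen = set()
    for item_id, content, error in results:
        if error is not None:
            job.error_count += 1
            continue
        if item_id in seen:  # 4. source dedup, keyed on external ID here
            continue
        seen.add(item_id)
        if item_id in kb:  # 5. compile to KB and record the outcome
            log.pages_updated += 1
        else:
            log.pages_created += 1
        kb[item_id] = content

    # Assumption: a run is "failed" only if every item errored.
    all_failed = bool(item_ids) and job.error_count == len(item_ids)
    job.status = "failed" if all_failed else "completed"
    job.duration = time.monotonic() - started
    return job, log
```

Because the ingestion log carries `sync_job_id`, every created or updated page in this sketch can be traced back to the run that produced it.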

Key models

CONNECTOR
The top-level integration record. Stores provider type, OAuth connection ID, sync mode, health status, and the timestamp of the last successful sync. One connector per provider per organization.
EXTERNAL ITEM
A normalized representation of a single source object -- a Drive file, a Slack thread, a Confluence page, a Jira ticket. Deduplicated by connector and external ID so the same item is never processed twice in the same sync.
EXTERNAL LINK
A URL reference extracted from an external item's content. Used for cross-referencing: if a Drive doc links to a Jira ticket, both items can be related in the graph.
EXTERNAL SOURCE ATTACHMENT
Binary or large-text content attached to an external item -- PDFs, images, spreadsheets. Downloaded and stored separately from the item metadata for efficient processing.
SYNC JOB
An execution record for a single sync run. Tracks status (running, completed, failed), item counts, error counts, duration, and links back to the connector that triggered it.
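
As a rough illustration of two constraints described above, the sketch below models a connector and an external item as Python dataclasses. The class names, field names, and in-memory stores are hypothetical; only the uniqueness rules (one connector per provider per organization, items keyed by connector and external ID) come from the descriptions above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Connector:
    id: int
    organization_id: int
    provider: str                      # provider type, e.g. "slack"
    oauth_connection_id: str
    sync_mode: str                     # allowed values are not specified here
    health: str = "healthy"            # hypothetical default
    last_synced_at: Optional[datetime] = None


@dataclass(frozen=True)
class ExternalItem:
    connector_id: int
    external_id: str
    title: str


class ConnectorRegistry:
    """Enforces one connector per provider per organization."""

    def __init__(self):
        self._by_key = {}

    def add(self, connector: Connector) -> None:
        key = (connector.organization_id, connector.provider)
        if key in self._by_key:
            raise ValueError(f"connector already exists for {key}")
        self._by_key[key] = connector


class ItemStore:
    """Keeps one record per (connector_id, external_id) pair."""

    def __init__(self):
        self._items = {}

    def upsert(self, item: ExternalItem) -> bool:
        """Store the item; return True only if the key was not seen before."""
        key = (item.connector_id, item.external_id)
        is_new = key not in self._items
        self._items[key] = item
        return is_new

    def __len__(self) -> int:
        return len(self._items)
```

The same external ID under two different connectors counts as two items in this sketch, since the key includes the connector.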

Sync cadence and triggers

Connectors do not currently listen for content changes from external providers. A sync runs when a connector is first configured or when a user explicitly starts one. Scheduled cadences such as nightly refreshes, and provider webhook triggers where supported, are part of the change-aware sync roadmap below. Scope controls still determine what data can enter the system, so teams can balance freshness, cost, and permissions.

Each connector has a scope mode that determines what gets synced.

Auth health monitoring

Mechanism | Trigger | What it does
OAuth provider webhooks | Real-time | Webhook events fire when a connection's OAuth token is refreshed, revoked, or encounters an error. Beakr updates the connector's health status immediately.
Scheduled token refresh | Every 2 hours | Proactively refreshes tokens before they expire. Catches cases where provider webhooks are delayed or missed.
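
The two mechanisms can be sketched together. Everything named here is an assumption for illustration: the event type strings, the health status values, and the rule that a token is refreshed when it would expire before the next scheduled run. Only the 2-hour interval and the three webhook cases (refreshed, revoked, error) come from the table.

```python
from datetime import datetime, timedelta, timezone

REFRESH_INTERVAL = timedelta(hours=2)

# Hypothetical mapping from webhook event type to connector health status.
EVENT_TO_HEALTH = {
    "refreshed": "healthy",
    "revoked": "disconnected",
    "error": "degraded",
}


def apply_webhook_event(health_by_connector, connector_id, event_type):
    """Update health immediately; return False for unrecognized events."""
    new_health = EVENT_TO_HEALTH.get(event_type)
    if new_health is None:
        return False
    health_by_connector[connector_id] = new_health
    return True


def due_for_refresh(expiry_by_connector, now, interval=REFRESH_INTERVAL):
    """Return connector IDs whose token expires before the next scheduled run."""
    deadline = now + interval
    return sorted(
        connector_id
        for connector_id, expires_at in expiry_by_connector.items()
        if expires_at <= deadline
    )


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expiries = {
    "a": now + timedelta(minutes=30),  # expires before the next run
    "b": now + timedelta(hours=5),     # safe until a later run
    "c": now - timedelta(minutes=1),   # already expired, webhook was missed
}
print(due_for_refresh(expiries, now))  # ['a', 'c']
```

Connector "c" shows why the scheduled pass exists: its token has already lapsed without a webhook arriving, and the periodic check still picks it up.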

Change detection and memory

The infrastructure for delta sync exists in the data model: each source record tracks a content fingerprint and last-modified timestamp. Delta tracking is not only a sync optimization; it is also part of the memory layer. The system can reason about what changed, when it changed, and which downstream pages, citations, graph edges, or agent answers may be affected.
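
A minimal sketch of how a fingerprint and a last-modified timestamp could drive change detection follows. It assumes a SHA-256 content hash as the fingerprint and a simple source-to-pages mapping for downstream impact; the names `SourceRecord`, `needs_download`, `classify`, and `affected_pages` are hypothetical and do not describe the actual data model.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SourceRecord:
    external_id: str
    fingerprint: str
    last_modified: datetime


def fingerprint(content: bytes) -> str:
    """Assumed fingerprint: SHA-256 of the raw content."""
    return hashlib.sha256(content).hexdigest()


def needs_download(previous: Optional[SourceRecord], remote_last_modified: datetime) -> bool:
    """Cheap pre-check: fetch only unseen items or items modified since last sync."""
    return previous is None or remote_last_modified > previous.last_modified


def classify(previous: Optional[SourceRecord], content: bytes) -> str:
    """Compare fingerprints after download to decide what happened to the item."""
    if previous is None:
        return "new"
    if fingerprint(content) == previous.fingerprint:
        return "unchanged"
    return "changed"


def affected_pages(changed_ids, pages_by_source):
    """List downstream pages that cite any changed source item."""
    pages = set()
    for external_id in changed_ids:
        pages.update(pages_by_source.get(external_id, []))
    return sorted(pages)
```

The timestamp check avoids a download when nothing has been touched, and the fingerprint catches the case where a timestamp moved but the content did not.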

Change-aware sync roadmap

1. Content-change webhooks

Automatic detection of file edits, new messages, and ticket updates from connected providers -- triggering re-sync without manual intervention.

2. Scheduled re-sync

Automatic periodic re-sync on a configurable cadence, keeping your knowledge base fresh without manual triggers.

3. Delta sync

Intelligent change detection that skips unchanged items, making re-syncs faster and more cost-efficient.