Provenance

Five layers of attribution, tracked on every page. Provenance is non-negotiable -- nothing in the knowledge base exists without a traceable chain back to a primary source, an agent run, or a human edit.

Each layer answers a different question: where did this come from, does it support or contradict the claim, when did it happen, which agent run wrote it, and what has changed since the last revision?

The five layers

Page sources

Which external documents contributed to this page? Each source links a page to its upstream connector items -- Drive files, Slack threads, Confluence pages, PubMed articles.

Section citations

Per-section references with a stance field: support, contradicts, or qualifies. Citations accumulate over time as new sources are ingested -- they are never replaced.
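The append-only behaviour can be sketched as follows. This is a minimal illustration, assuming a hypothetical in-memory record; the names Citation, Section, and add_citation are made up for the example and are not the actual schema. The stance values are the three listed above.

```python
from dataclasses import dataclass, field

# Stance values as listed in the text above.
STANCES = {"support", "contradicts", "qualifies"}


@dataclass(frozen=True)
class Citation:
    token: str   # e.g. "{{pmid:12345678}}"
    stance: str  # one of STANCES


@dataclass
class Section:
    # Hypothetical in-memory stand-in for a knowledge base section.
    heading: str
    citations: list[Citation] = field(default_factory=list)

    def add_citation(self, token: str, stance: str) -> None:
        """Append a citation; existing citations are never replaced."""
        if stance not in STANCES:
            raise ValueError(f"unknown stance: {stance!r}")
        self.citations.append(Citation(token, stance))


section = Section("Dosage")
section.add_citation("{{pmid:12345678}}", "support")
section.add_citation("{{gdrive:1abc2def}}", "contradicts")
```

Because there is no update or delete path in this sketch, a later source that disagrees with an earlier one simply adds a second citation next to the first.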

Temporal events

Time-stamped events with date precision attached to sections. Captures when something happened in the real world, not just when it was recorded in the system.
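One way to picture the distinction between event time and record time is the sketch below. The precision levels ("year", "month", "day") and all field names are assumptions for illustration; the document does not specify the real set.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical precision levels; the real set is not specified here.
PRECISIONS = ("year", "month", "day")


@dataclass(frozen=True)
class TemporalEvent:
    section_id: str
    occurred_on: date  # when it happened in the real world
    precision: str     # how much of occurred_on is actually known
    recorded_on: date  # when the system captured it

    def __post_init__(self) -> None:
        if self.precision not in PRECISIONS:
            raise ValueError(f"unknown precision: {self.precision!r}")

    def label(self) -> str:
        """Render the event date no more precisely than it is known."""
        if self.precision == "year":
            return f"{self.occurred_on.year}"
        if self.precision == "month":
            return f"{self.occurred_on.year}-{self.occurred_on.month:02d}"
        return self.occurred_on.isoformat()


event = TemporalEvent("sec-1", date(2021, 3, 1), "month", date(2024, 6, 10))
print(event.label())  # 2021-03
```

Storing the precision alongside the date avoids implying that something known only to the month happened on the first of that month.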

Ingestion events

Records of which agent run or sync job produced a given page revision. Ties the knowledge base entry back to the pipeline execution that created or updated it.

Revision history

Full content snapshots at every write. Supports diff, rollback, and audit. Every revision records who or what wrote it and the markdown content at that point. Human edits record the affiliated person or account, while agent edits record the agent run, model/config, and triggering sync or conversation when available.
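A minimal sketch of snapshot-based history, using Python's standard difflib for the diff. The class and method names are hypothetical, and treating rollback as a new write (instead of deleting later revisions) is an assumption made here to keep the audit trail intact.

```python
import difflib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    author: str    # a person or account, or an agent run identifier
    markdown: str  # full content snapshot at this write


@dataclass
class PageHistory:
    # Hypothetical in-memory stand-in for a page's revision log.
    revisions: list[Revision] = field(default_factory=list)

    def write(self, author: str, markdown: str) -> int:
        """Store a full snapshot and return its revision index."""
        self.revisions.append(Revision(author, markdown))
        return len(self.revisions) - 1

    def diff(self, old: int, new: int) -> str:
        """Unified diff between two stored snapshots."""
        a = self.revisions[old].markdown.splitlines(keepends=True)
        b = self.revisions[new].markdown.splitlines(keepends=True)
        return "".join(difflib.unified_diff(a, b, f"rev{old}", f"rev{new}"))

    def rollback(self, to: int, author: str) -> int:
        """Re-write an old snapshot as a new revision (assumed design),
        so earlier history is never rewritten."""
        return self.write(author, self.revisions[to].markdown)


history = PageHistory()
r0 = history.write("alice@example.com", "Dose: 10 mg\n")
r1 = history.write("agent-run-42", "Dose: 20 mg\n")
r2 = history.rollback(r0, "alice@example.com")
print(history.diff(r0, r1))
```

Full snapshots cost more storage than deltas, but any two revisions can be compared directly without replaying intermediate changes.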

Citation tokens

Citations are embedded as inline tokens in structured metadata, not in prose. Each token names the source type and a stable identifier:

{{pmid:12345678}}

PubMed article. The integer is the PMID, which resolves to a stable NLM URL and can be used to fetch full citation metadata.

{{gdrive:1abc2def}}

Google Drive file. The ID maps to the Drive API file resource and can be opened in the browser via the web view URL.

{{slack:C01AB/1234.5678}}

Slack message. Channel ID and message timestamp, resolvable to a deep link via the Slack API.

{{artifact:abc-123}}

Internal artifact (e.g. a Jira ticket, Confluence page, or other connector item) identified by its Beakr external item ID.

{{web:sha256-hash}}

Web page. The hash is a SHA-256 identifier derived from the canonical URL, which keeps it stable across URL reformatting.
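The token formats above could be parsed with a small regular expression. This is a sketch only: the grammar ({{type:identifier}} with no whitespace or closing brace inside the identifier) is inferred from the examples, and parse_tokens, split_slack_id, and pubmed_url are hypothetical helper names.

```python
import re
from typing import NamedTuple

# Source types taken from the token examples above.
KNOWN_TYPES = {"pmid", "gdrive", "slack", "artifact", "web"}

# Assumed grammar: {{type:identifier}}; identifier has no "}" or whitespace.
TOKEN_RE = re.compile(r"\{\{([a-z]+):([^}\s]+)\}\}")


class CitationToken(NamedTuple):
    source_type: str
    identifier: str


def parse_tokens(text: str) -> list[CitationToken]:
    """Extract every well-formed token with a known source type."""
    return [
        CitationToken(kind, ident)
        for kind, ident in TOKEN_RE.findall(text)
        if kind in KNOWN_TYPES
    ]


def split_slack_id(identifier: str) -> tuple[str, str]:
    """Split a Slack identifier into (channel ID, message timestamp)."""
    channel, _, ts = identifier.partition("/")
    if not channel or not ts:
        raise ValueError(f"malformed Slack identifier: {identifier!r}")
    return channel, ts


def pubmed_url(pmid: str) -> str:
    """Public PubMed landing page for a PMID."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


tokens = parse_tokens("{{pmid:12345678}} {{slack:C01AB/1234.5678}} {{bogus:1}}")
print(tokens)
```

Unknown source types are dropped here; a stricter variant could raise instead, so that malformed tokens are caught at write time.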

The full provenance chain

From any paragraph, the connector that delivered the source is five joins away.
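To make the chain concrete, here is a sketch using an in-memory SQLite database. The schema is entirely hypothetical: the table names, column names, and the particular path paragraph, section, citation, page source, connector item, connector are assumptions chosen to show what a five-join trace might look like, not the real data model.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE connector      (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE connector_item (id INTEGER PRIMARY KEY, connector_id INTEGER, external_id TEXT);
CREATE TABLE page_source    (id INTEGER PRIMARY KEY, connector_item_id INTEGER);
CREATE TABLE section        (id INTEGER PRIMARY KEY, heading TEXT);
CREATE TABLE citation       (id INTEGER PRIMARY KEY, section_id INTEGER, page_source_id INTEGER, stance TEXT);
CREATE TABLE paragraph      (id INTEGER PRIMARY KEY, section_id INTEGER, body TEXT);
"""

# Sample rows invented for the example.
SAMPLE = """
INSERT INTO connector      VALUES (1, 'pubmed');
INSERT INTO connector_item VALUES (1, 1, '12345678');
INSERT INTO page_source    VALUES (1, 1);
INSERT INTO section        VALUES (1, 'Dosage');
INSERT INTO citation       VALUES (1, 1, 1, 'support');
INSERT INTO paragraph      VALUES (1, 1, 'Example paragraph text.');
"""

# Five joins from a paragraph to the connector that delivered its source.
TRACE = """
SELECT c.name, ci.external_id, cit.stance
FROM paragraph p
JOIN section s         ON s.id = p.section_id
JOIN citation cit      ON cit.section_id = s.id
JOIN page_source ps    ON ps.id = cit.page_source_id
JOIN connector_item ci ON ci.id = ps.connector_item_id
JOIN connector c       ON c.id = ci.connector_id
WHERE p.id = ?
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executescript(SAMPLE)
rows = db.execute(TRACE, (1,)).fetchall()
print(rows)  # [('pubmed', '12345678', 'support')]
```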

Contradictions are quality signals.

A contradicts citation is not an error. It means the system has identified a genuine disagreement between sources and is surfacing it instead of silently picking a winner. Multiple stances on the same section -- support, contradict, qualify -- give users the context to decide which source to trust based on recency, authority, or domain expertise.
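Surfacing a disagreement, as opposed to resolving it, might look like the sketch below. The (section, token, stance) triple format and the function names are assumptions for illustration; the stance strings follow the values of the stance field described earlier.

```python
from collections import defaultdict


def stance_summary(citations):
    """Group (section, token, stance) triples by section, then by stance."""
    by_section = defaultdict(lambda: defaultdict(list))
    for section, token, stance in citations:
        by_section[section][stance].append(token)
    return by_section


def contested_sections(citations):
    """Sections with at least one supporting and one contradicting citation."""
    summary = stance_summary(citations)
    return sorted(
        section
        for section, stances in summary.items()
        if stances.get("support") and stances.get("contradicts")
    )


citations = [
    ("Dosage", "{{pmid:12345678}}", "support"),
    ("Dosage", "{{gdrive:1abc2def}}", "contradicts"),
    ("History", "{{web:sha256-hash}}", "support"),
]
print(contested_sections(citations))  # ['Dosage']
```

Nothing is discarded: both tokens for the contested section remain available, so a reader can weigh them by recency, authority, or domain expertise.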