
The knowledge base compiler

The compiler is a sandbox-first agent that analyzes raw source files -- reading documents, searching text, running code, and interpreting images -- then writes structured pages with [[links]], temporal metadata, and cited sources. It is one of three knowledge-layer agents: the compiler creates and updates pages from sources, capture proposes knowledge from conversations, and health keeps the graph connected and current.

[Diagram: raw sources (protocol-v3.pdf, outcomes-q1.xlsx, slide-deck.pptx, slack-thread.json) enter the sandbox, where the compiler agent uses its tools (document reading, text search, code execution, image analysis) under an extraction profile (goals, focus areas, key entities). The output is structured pages: Protocol V3 (topic), Head of R&D (person), Q1 Outcomes (meeting), and [[OR-3 Calibration]] (red link).]

Raw files enter the sandbox, the agent reads them with its toolset, and structured pages come out the other side.
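To make the [[links]] syntax concrete, here is a minimal sketch of how link targets could be pulled out of a page body and checked against existing pages. The function names and the regular expression are hypothetical, not Beakr's implementation, and the sketch assumes a "red link" means a link whose target page does not exist yet, as in common wiki usage.

```python
import re
from typing import List, Set

# Hypothetical pattern: matches [[Target]] and captures the target title.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def extract_links(body: str) -> List[str]:
    """Return link targets in the order they appear in the page body."""
    return [title.strip() for title in WIKILINK.findall(body)]


def red_links(body: str, existing_titles: Set[str]) -> List[str]:
    """Return targets with no matching page (assumed meaning of 'red link')."""
    return [t for t in extract_links(body) if t not in existing_titles]


body = "Calibrated per [[Protocol V3]]; see also [[OR-3 Calibration]]."
print(red_links(body, {"Protocol V3", "Q1 Outcomes"}))
```

With only Protocol V3 and Q1 Outcomes present, the call reports OR-3 Calibration as the one unresolved target.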

Page types

The compiler assigns a page_type to every page it creates. Page types carry semantic meaning that affects how content is structured, how links are weighted, and how the page appears in search results. The compiler supports the same set of page types as ingestion: topic, person, organization, decision, meeting, overview, research_note, experiment, protocol, compound, dataset, and initiative.

topic

A concept, technology, process, or domain area. The most common page type.

person

An individual -- a team member, collaborator, author, or external contact.

organization

A company, institution, lab, partner, or vendor.

decision

A specific decision with context, rationale, alternatives considered, and outcome.

meeting

A meeting, discussion, or synchronous event with attendees, agenda, and outcomes.

overview

A high-level summary that synthesizes information from multiple related pages.

research_note

A research observation, analysis note, literature summary, or interpretation that should remain traceable to source context.

experiment

A specific study, assay, run, test, or trial-like activity with objective, setup, conditions, results, interpretation, and follow-ups.

protocol

A repeatable method, SOP, process, or workflow with purpose, inputs, steps, parameters, controls, outputs, and version changes.

compound

A drug, molecule, biologic, candidate, or formulation with aliases, modality, target or mechanism, indication, status, and evidence.

dataset

A result set, analysis output, assay readout, stability data, or measurement collection with source, scope, methods, findings, and limitations.

initiative

An ongoing program or body of work with goals, scope, owners, linked compounds, experiments, protocols, datasets, decisions, status, and next steps.

index

Index pages are created by maintenance/direct-write workflows when needed, but they are not in the compiler ingestion enum.
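The page-type list above could be modeled as an enumeration, with index kept outside it. This is an illustrative sketch only: the class name, the helper function, and the error messages are assumptions, and only the twelve string values come from this page.

```python
from enum import Enum


class PageType(str, Enum):
    """The twelve page types the compiler can assign (values from this page)."""
    TOPIC = "topic"
    PERSON = "person"
    ORGANIZATION = "organization"
    DECISION = "decision"
    MEETING = "meeting"
    OVERVIEW = "overview"
    RESEARCH_NOTE = "research_note"
    EXPERIMENT = "experiment"
    PROTOCOL = "protocol"
    COMPOUND = "compound"
    DATASET = "dataset"
    INITIATIVE = "initiative"


# "index" exists as a page type but is not in the compiler ingestion enum.
MAINTENANCE_ONLY_TYPES = frozenset({"index"})


def parse_compiler_page_type(value: str) -> PageType:
    """Validate a page_type string for a compiler-created page (hypothetical helper)."""
    if value in MAINTENANCE_ONLY_TYPES:
        raise ValueError(f"{value!r} is created by maintenance workflows, not the compiler")
    try:
        return PageType(value)
    except ValueError:
        raise ValueError(f"unknown page_type: {value!r}") from None
```

Keeping index in a separate set lets a validator tell "valid but not compiler-assignable" apart from "unknown".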

Temporal metadata

The compiler extracts date-anchored events from source material -- decisions, milestones, deadlines, experiment dates -- and stores them as structured temporal metadata on each page. This enables timeline queries, chronological ordering, and staleness detection.

event_start
The start date of the event described by this page. For a meeting, the meeting date. For a decision, the date the decision was made. For a multi-day event, the first day.
event_end
The end date of the event, if it spans a range. For single-day events, this is the same as event_start or left null.
date_precision
How precise the extracted date is: day, month, or year. A source that says "Q1 2024" produces month precision; one that says "March 15, 2024" produces day. Downstream queries use precision to avoid false specificity.
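The three fields can be pictured as a small record. The sketch below is an assumption about shape, not the stored schema: the class and method names are hypothetical, and it assumes month- or year-precision dates are stored as the first day of the period. It shows how a consumer might use date_precision to avoid displaying more detail than the source gave.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

VALID_PRECISIONS = ("day", "month", "year")


@dataclass(frozen=True)
class TemporalMetadata:
    event_start: date
    event_end: Optional[date] = None  # None for single-day events
    date_precision: str = "day"

    def __post_init__(self) -> None:
        if self.date_precision not in VALID_PRECISIONS:
            raise ValueError(f"invalid date_precision: {self.date_precision!r}")
        if self.event_end is not None and self.event_end < self.event_start:
            raise ValueError("event_end precedes event_start")

    def display_start(self) -> str:
        """Render event_start no more precisely than the source stated it."""
        if self.date_precision == "year":
            return f"{self.event_start:%Y}"
        if self.date_precision == "month":
            return f"{self.event_start:%Y-%m}"
        return self.event_start.isoformat()


def chronological(items: List[TemporalMetadata]) -> List[TemporalMetadata]:
    """Order records by event_start, as a timeline query might."""
    return sorted(items, key=lambda m: m.event_start)
```

For example, a record built from "March 15, 2024" would render as 2024-03-15, while a month-precision record anchored at 2024-01-01 would render as 2024-01.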

Extraction guidance

An extraction profile tells the compiler what to focus on when processing source material. Profiles are scoped hierarchically: organization → project. A project-level profile overrides the org profile.

Profile fields include goals (what the knowledge base should help with), focus areas (topics to prioritize during extraction), and key entities (people, organizations, and concepts that should always get their own pages).

For example, a biotech research project might set focus areas to "clinical trial protocols, regulatory submissions, safety data" and key entities to researchers and partner organizations. The compiler uses these signals to decide what deserves a dedicated page, what level of detail to extract, and which links to create.

Profile scope cascades downward. An org-level profile applies to every project in the organization unless a more specific profile is set. This means you can configure baseline extraction behavior once and override it only where a project has specialized needs.
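The cascade can be sketched as a simple fallback. Everything here is hypothetical (class name, function name, example values beyond those quoted above), and it assumes a project profile replaces the org profile as a whole. This page does not say whether individual fields are merged instead.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ExtractionProfile:
    goals: Tuple[str, ...] = ()
    focus_areas: Tuple[str, ...] = ()
    key_entities: Tuple[str, ...] = ()


def resolve_profile(
    org_profile: Optional[ExtractionProfile],
    project_profile: Optional[ExtractionProfile],
) -> Optional[ExtractionProfile]:
    """Pick the most specific profile: project if set, otherwise org.

    Assumption: whole-profile override, with no field-level merging.
    """
    return project_profile if project_profile is not None else org_profile


# Illustrative values; the focus areas reuse the biotech example above.
org = ExtractionProfile(goals=("Onboard new team members quickly",))
biotech_project = ExtractionProfile(
    focus_areas=(
        "clinical trial protocols",
        "regulatory submissions",
        "safety data",
    ),
    key_entities=("researchers", "partner organizations"),
)

effective_default = resolve_profile(org, None)           # falls back to org
effective_biotech = resolve_profile(org, biotech_project)  # project wins
```

Under this assumption, a project that sets its own profile does not inherit the org-level goals. If field-level inheritance is wanted, the resolver would need to merge each field separately.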