Integrations

Beakr connects to the tools your team already uses. Connectors sync on update so knowledge stays current -- no manual re-uploads, no stale documents.

Overview

Integrations are how external systems become Beakr memory. A connector handles authentication, scope, enumeration, download or API fetch, source-record tracking, and handoff into the ingestion and compiler pipeline.

Supported integrations

Each connector pulls data from a source platform, processes it through the ingestion pipeline, and writes structured pages into the knowledge base. The tables below list supported integrations grouped by category.

Cloud storage

File-based connectors enumerate folders, download documents (PDFs, DOCX, PPTX, XLSX, images, CSVs, and more), extract text, chunk it, and embed it. When files change in the source, the connector re-ingests only the delta.

Platform | What syncs | Notes
Google Drive | All file types in selected drives or folders | Supports Shared Drives and My Drive. Google Docs exported as HTML for richer parsing.
Dropbox | Files and folders in selected paths | Business and personal accounts. Recursive enumeration with path filtering.
OneDrive | Files and folders from user or SharePoint sites | Integrated via Microsoft Graph API. Supports both personal and organizational accounts.
SharePoint | Document libraries and site pages | Site-level scoping. Lists and library content extracted and structured.
Box | Files and folders in selected directories | Enterprise Box accounts. Folder-level access control respected during enumeration.

Communication

Communication connectors use snapshot-based ingestion. Rather than syncing individual messages as documents, they capture conversations, threads, and events as temporal snapshots with speaker attribution preserved.

Platform | What syncs | Notes
Slack | Channel messages, threads, and reactions | Snapshot ingestion. Speaker names attached to each message. Thread context preserved.
Gmail | Email threads from selected labels or all mail | Thread-level ingestion. Sender, recipients, and timestamps extracted as metadata.
Microsoft Teams | Channel messages and replies | Snapshot ingestion. Team and channel hierarchy maintained. Speaker attribution preserved.
Outlook | Email messages and calendar events | Integrated via Microsoft Graph. Folder-level scoping available.

Project management

Project management connectors sync issues, pages, and workspaces into the knowledge base. Each item becomes a structured knowledge base page with metadata (status, assignee, labels) preserved.

Platform | What syncs | Notes
Jira | Issues, epics, and project metadata | Supports Jira Cloud. Issue descriptions, comments, status, and custom fields extracted.
Confluence | Spaces and pages | Full page content with attachments. Space-level scoping for selective sync.
Notion | Pages, databases, and nested content | Recursive page tree traversal. Database properties mapped to structured metadata.

Development

Platform | What syncs | Notes
GitHub | Repository files, READMEs, issues, and pull requests | Supports public and private repos. Branch-level scoping. Markdown files prioritized.

Calendar

Platform | What syncs | Notes
Google Calendar | Events, attendees, descriptions, and meeting notes | Snapshot-based. Temporal metadata extracted for timeline queries.

Research tools

Platform | What syncs | Notes
Zotero | Library items, PDFs, and annotations | Group and personal libraries. Citation metadata preserved.
Overleaf | LaTeX projects and compiled documents | Project-level sync. Compiled PDF and source .tex files both ingested.

Scientific platforms

Platform | What syncs | Notes
Benchling Enterprise | Notebook entries, protocols, sequences, and registry entities | Enterprise API integration. Structured scientific data preserved with schema context.
Labguru | Experiments, protocols, and inventory records | ELN data extracted with experimental metadata and relationships intact.

Public databases

Public database connectors query external scientific and academic databases on demand and ingest results into the knowledge base. These do not require OAuth -- they use public APIs. Beakr has handlers for 25+ public and scientific sources; the table below groups representative coverage rather than listing every endpoint individually.

Database | What syncs | Notes
PubMed / PMC / bioRxiv | Article abstracts, full text where available, preprints, metadata, and MeSH terms | Search by keyword, author, PMID, DOI, or public identifier. Full citation metadata preserved.
ClinicalTrials.gov / NIH RePORTER | Trial records, endpoints, sponsors, funded projects, and linked publications | NCT, grant, investigator, and topic lookup. Trial phase, status, and funding context tracked.
UniProt / AlphaFold / PDB | Protein entries, structures, sequences, and functional annotations | Accession and structure lookup with cross-references to genes, pathways, and literature.
KEGG / Reactome / STRING | Pathways, interactions, compounds, and gene relationships | Pathway-level ingestion with graph-ready cross-references.
OpenAlex / OpenNeuro / USPTO | Scholarly works, authors, institutions, datasets, and patent records | Broad academic and IP search. Metadata includes citations, topics, datasets, inventors, and assignees.
PubChem / ChEMBL / HMDB / openFDA | Molecules, bioactivity, metabolites, labels, adverse events, and recalls | Chemical and regulatory context for drug discovery and translational workflows.
Ensembl / ClinVar / GWAS Catalog / GDC / cBioPortal | Genes, variants, studies, cancer genomics, and cohort-level datasets | Variant and disease context with stable identifiers for downstream provenance.

How sync works

Every connector follows the same five-stage pipeline, regardless of the source platform. This consistency means that once data enters Beakr, it is structured, searchable, and attributed the same way whether it came from Slack or a PDF in Google Drive.

1. Connector configuration

Each connector is configured with a scope and a mode:

Setting | Options | Description
Scope | user, group, org | Determines who can access the synced data. User-scoped connectors are private. Org-scoped connectors share data across the organization.
Mode | all, restricted | In all mode, the connector syncs everything it has access to. In restricted mode, you select specific folders, channels, or items to sync.
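As a rough illustration, the two settings could be modeled as a small validated config object. This is a minimal sketch, not Beakr's actual schema: the class names, the `provider` and `selected_items` fields, and the validation rules are assumptions made for the example. Only the option values (user, group, org, all, restricted) come from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    USER = "user"
    GROUP = "group"
    ORG = "org"


class Mode(Enum):
    ALL = "all"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class ConnectorConfig:
    """Hypothetical connector settings; field names are illustrative."""

    provider: str
    scope: Scope
    mode: Mode
    # Folders, channels, or items to sync. Only meaningful in restricted mode.
    selected_items: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Assumed rule: restricted mode must name at least one item.
        if self.mode is Mode.RESTRICTED and not self.selected_items:
            raise ValueError("restricted mode needs at least one selected item")
        # Assumed rule: a selection makes no sense when syncing everything.
        if self.mode is Mode.ALL and self.selected_items:
            raise ValueError("selected_items is only valid in restricted mode")


config = ConnectorConfig(
    provider="google_drive",
    scope=Scope.ORG,
    mode=Mode.RESTRICTED,
    selected_items=("folders/research",),
)
```

Rejecting contradictory combinations at construction time keeps later pipeline stages from having to guess what an empty selection means.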

2. OAuth authentication

Authentication uses each provider's standard OAuth flow: when a user connects a platform, they authorize Beakr on the provider's own consent screen. Beakr manages token storage, refresh, and rotation, and OAuth tokens are never exposed to end users or stored alongside application data. Public database connectors do not require OAuth, since they use public APIs.

3. Enumeration

Provider-specific handlers list all available items from the source. Each provider has its own enumeration logic (e.g., listing files in Drive, channels in Slack, pages in Confluence). The enumeration step produces a manifest of items to ingest, filtered by the connector's scope and mode settings.
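The filtering step can be pictured as a generator that turns a raw provider listing into a manifest. The sketch below is illustrative only: `SourceItem`, `build_manifest`, and the path-prefix matching rule are assumptions, since the text does not say how restricted-mode selections are matched.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class SourceItem:
    item_id: str
    path: str  # e.g. a folder path or a channel name


def build_manifest(
    items: Iterable[SourceItem],
    mode: str,
    selected_prefixes: tuple[str, ...] = (),
) -> Iterator[SourceItem]:
    """Yield the items to ingest, applying the connector's mode."""
    for item in items:
        if mode == "all":
            yield item
        elif mode == "restricted":
            # Keep items that are a selected path or sit underneath one.
            # Matching on "prefix/" avoids catching sibling folders that
            # merely share a name prefix.
            if any(
                item.path == p or item.path.startswith(p.rstrip("/") + "/")
                for p in selected_prefixes
            ):
                yield item
        else:
            raise ValueError(f"unknown mode: {mode!r}")


listing = [
    SourceItem("1", "research/assays/plate-map.xlsx"),
    SourceItem("2", "hr/handbook.pdf"),
    SourceItem("3", "research-archive/old.docx"),
]
manifest = list(build_manifest(listing, "restricted", ("research",)))
```

Here only item 1 lands in the manifest: item 3 lives in `research-archive`, which shares a prefix with the selected folder but is not inside it.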

4. Ingestion

Each enumerated item is downloaded, parsed, chunked, and embedded. File-based items go through ingest_file_item (binary download, text extraction, chunking). Document-based items go through ingest_document_item (API-fetched content, structured parsing). Both paths produce the same output: chunked text with source metadata ready for the knowledge base.
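To make the two paths concrete, here is a minimal sketch in which both converge on the same chunk type. The function names `ingest_file_item` and `ingest_document_item` come from the text above, but their signatures, the `Chunk` type, the `chunk_text` helper, and the paragraph-packing strategy are all assumptions. Real text extraction and embedding are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_id: str
    index: int


def chunk_text(text: str, source_id: str, max_chars: int = 200) -> list[Chunk]:
    """Pack paragraphs greedily into chunks of at most max_chars characters."""
    pieces: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split any paragraph that is too long on its own.
        while len(para) > max_chars:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(para[:max_chars])
            para = para[max_chars:]
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            pieces.append(current)
            current = para
    if current:
        pieces.append(current)
    return [Chunk(text=t, source_id=source_id, index=i) for i, t in enumerate(pieces)]


def ingest_file_item(item_id: str, raw: bytes) -> list[Chunk]:
    # Stand-in for text extraction: decoding bytes as UTF-8. Real PDFs,
    # DOCX files, and so on would need a format-specific parser here.
    text = raw.decode("utf-8", errors="replace")
    return chunk_text(text, item_id)


def ingest_document_item(item_id: str, payload: dict[str, str]) -> list[Chunk]:
    # API-fetched content is already structured; here a title plus a body.
    text = f"{payload['title']}\n\n{payload['body']}"
    return chunk_text(text, item_id)
```

Because both functions return `list[Chunk]`, everything downstream can stay unaware of whether an item started life as a binary file or an API record.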

5. Compilation

The compilation step takes ingested chunks and creates or updates structured knowledge base pages. New information is merged with existing pages. Attribution is tracked at the paragraph level so every statement can be traced back to the source document and connector that produced it.
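Paragraph-level attribution can be sketched as a page whose paragraphs each carry their own list of sources. The types and the `merge` method below are hypothetical, and matching on exact paragraph text is a deliberate simplification; the text does not describe how Beakr decides that new information belongs to an existing paragraph.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribution:
    connector: str
    source_document: str


@dataclass
class Paragraph:
    text: str
    sources: list[Attribution] = field(default_factory=list)


@dataclass
class Page:
    title: str
    paragraphs: list[Paragraph] = field(default_factory=list)

    def merge(self, text: str, source: Attribution) -> None:
        """Add a paragraph, or record another source for an existing one."""
        for para in self.paragraphs:
            if para.text == text:
                # Same statement seen again: keep one paragraph, add the source.
                if source not in para.sources:
                    para.sources.append(source)
                return
        self.paragraphs.append(Paragraph(text, [source]))


page = Page("Plate assay protocol")
page.merge("The protocol uses a 96-well plate.", Attribution("google_drive", "protocol.pdf"))
page.merge("The protocol uses a 96-well plate.", Attribution("slack", "#assays"))
page.merge("Plates are read after 30 minutes.", Attribution("slack", "#assays"))
```

After these calls the page holds two paragraphs, and the first one traces back to both the Drive document and the Slack channel.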

Continuous sync

Connectors re-enumerate on a schedule and in response to webhook triggers where supported. Only changed or new items are re-ingested. Deleted items are flagged and their knowledge base contributions marked accordingly. This means the knowledge base stays current without manual intervention.
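One way to picture delta detection is a comparison between two enumerations, each mapping an item ID to a version fingerprint. This is a sketch under assumptions: `plan_sync` is a hypothetical helper, and the fingerprint could be an ETag, a revision ID, or a modified timestamp depending on what the provider exposes.

```python
def plan_sync(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare item_id -> fingerprint maps from two enumerations."""
    return {
        # Present now, absent before: ingest for the first time.
        "new": sorted(k for k in current if k not in previous),
        # Present in both with a different fingerprint: re-ingest.
        "changed": sorted(
            k for k in current if k in previous and current[k] != previous[k]
        ),
        # Absent now: flag so knowledge base contributions can be marked.
        "deleted": sorted(k for k in previous if k not in current),
    }


last_run = {"doc-a": "v1", "doc-b": "v1", "doc-c": "v4"}
this_run = {"doc-a": "v1", "doc-b": "v2", "doc-d": "v1"}
plan = plan_sync(last_run, this_run)
```

In this example `doc-a` is untouched and costs nothing, `doc-b` is re-ingested, `doc-d` is new, and `doc-c` is flagged as deleted.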

Communication connectors

Slack, Microsoft Teams, and Google Calendar use a distinct ingestion model: snapshot-based ingestion. Instead of treating each message as a separate document, the system captures conversations and events as cohesive snapshots.
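The idea can be sketched as grouping messages by thread and rendering each thread as a single time-ordered transcript with speaker names attached. The `Message` type, `build_snapshots`, and the transcript format are assumptions for illustration; the actual snapshot format is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Message:
    thread_id: str
    speaker: str
    sent_at: datetime
    text: str


def build_snapshots(messages: list[Message]) -> dict[str, str]:
    """Group messages by thread and render each thread as one transcript."""
    threads: dict[str, list[Message]] = {}
    for msg in messages:
        threads.setdefault(msg.thread_id, []).append(msg)

    snapshots: dict[str, str] = {}
    for thread_id, msgs in threads.items():
        # Order by send time so the snapshot reads as the conversation did.
        msgs.sort(key=lambda m: m.sent_at)
        snapshots[thread_id] = "\n".join(
            f"[{m.sent_at.isoformat()}] {m.speaker}: {m.text}" for m in msgs
        )
    return snapshots


utc = timezone.utc
snapshots = build_snapshots([
    Message("t1", "Ana", datetime(2024, 1, 5, 9, 5, tzinfo=utc), "Thanks, reviewing now."),
    Message("t1", "Raj", datetime(2024, 1, 5, 9, 0, tzinfo=utc), "Assay results are in."),
    Message("t2", "Lee", datetime(2024, 1, 5, 10, 0, tzinfo=utc), "Standup moved to 11."),
])
```

Keeping a reply next to the message it answers is what lets a later query recover who said what, and in response to whom.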

Connector health

Every connector has a health_status that Beakr monitors continuously. This lets administrators catch issues before they cause knowledge gaps.

Status | Meaning | Action
healthy | Connector is syncing normally. Last sync completed without errors. | None required.
degraded | Some items failed to sync but the connector is still partially operational. | Review error logs. Often caused by permission changes on individual files or folders.
expired | OAuth token has expired and could not be refreshed automatically. | Re-authenticate through the connector settings to issue a new token.
revoked | Access was revoked at the source platform (e.g., app uninstalled from Slack workspace). | Re-authorize the integration from the source platform, then re-authenticate in Beakr.

Health status is surfaced in the Beakr dashboard and through the kb_stats MCP tool. Degraded or expired connectors trigger alerts so your team can resolve the issue promptly.
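The four statuses map naturally onto an enum plus a small decision function. The sketch below is hypothetical: the status values come from the table above, but `derive_health`, its inputs, and the precedence order (revoked, then expired, then degraded) are assumptions rather than documented behavior.

```python
from enum import Enum


class HealthStatus(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    EXPIRED = "expired"
    REVOKED = "revoked"


def derive_health(failed_items: int, token_valid: bool, access_revoked: bool) -> HealthStatus:
    """Pick one status from the results of a sync run (assumed precedence)."""
    if access_revoked:
        return HealthStatus.REVOKED
    if not token_valid:
        return HealthStatus.EXPIRED
    if failed_items > 0:
        return HealthStatus.DEGRADED
    return HealthStatus.HEALTHY


def needs_alert(status: HealthStatus) -> bool:
    # The text names degraded and expired as alerting statuses. Whether
    # revoked also alerts is not stated, so it is left out here.
    return status in {HealthStatus.DEGRADED, HealthStatus.EXPIRED}


status = derive_health(failed_items=3, token_valid=True, access_revoked=False)
```

Checking the most severe condition first means a connector with both failed items and a dead token reports the problem that actually blocks syncing.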

Scope and permissions

Connectors are scoped at three levels (user, group, and org), and the scope determines both who can configure the connector and who can access the resulting knowledge.

All connector data respects Beakr's Row Level Security (RLS) policies at the database level. There is no application-side filtering -- tenant isolation is enforced by PostgreSQL RLS policies on every query. A user in one organization can never access connector data from another organization, regardless of how the application code is structured.

Custom integrations

The connector framework is designed for extensibility. Each provider is implemented as a handler module that follows a standard interface: enumerate items, download or fetch content, and yield structured documents.

Adding a new provider typically takes 48 hours from start to tested deployment. The framework handles OAuth, job scheduling, error handling, health tracking, and knowledge compilation. The provider handler only needs to implement the platform-specific enumeration and content-fetching logic.
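The division of labor described above might look like the following sketch, where a handler implements only enumeration and content fetching and a generic runner does the rest. All names here (`ProviderHandler`, `enumerate_items`, `fetch_content`, `run_handler`, `InMemoryWikiHandler`) are hypothetical; the real interface is not shown in this document.

```python
from typing import Iterator, Protocol


class ProviderHandler(Protocol):
    """Assumed shape of the platform-specific part of a connector."""

    def enumerate_items(self) -> Iterator[str]: ...

    def fetch_content(self, item_id: str) -> str: ...


class InMemoryWikiHandler:
    """Toy provider backed by a dict, standing in for a real SaaS API."""

    def __init__(self, pages: dict[str, str]) -> None:
        self._pages = pages

    def enumerate_items(self) -> Iterator[str]:
        yield from sorted(self._pages)

    def fetch_content(self, item_id: str) -> str:
        return self._pages[item_id]


def run_handler(handler: ProviderHandler) -> Iterator[dict[str, str]]:
    """Framework side: drive any handler and yield structured documents."""
    for item_id in handler.enumerate_items():
        yield {"id": item_id, "content": handler.fetch_content(item_id)}


documents = list(run_handler(InMemoryWikiHandler({
    "onboarding": "Welcome to the lab.",
    "cold-room": "Log every freezer access.",
})))
```

Using a structural `Protocol` means a new provider needs no base class: any object with these two methods can be passed to the runner.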

If your team uses a platform not listed above, contact us. Most SaaS integrations can be built and deployed within a week.

Security

Integration security is designed around the principle that Beakr should never hold credentials it does not need.