Integrations
Beakr connects to the tools your team already uses. Connectors sync on update so knowledge stays current -- no manual re-uploads, no stale documents.
Overview
Integrations are how external systems become Beakr memory. A connector handles authentication, scope, enumeration, download or API fetch, source-record tracking, and handoff into the ingestion and compiler pipeline.
Supported integrations
Each connector pulls data from a source platform, processes it through the ingestion pipeline, and writes structured pages into the knowledge base. The tables below list the supported integrations, grouped by category.
Cloud storage
File-based connectors enumerate folders, download documents (PDFs, DOCX, PPTX, XLSX, images, CSVs, and more), extract text, chunk it, and embed it. When files change in the source, the connector re-ingests only the delta.
| Platform | What syncs | Notes |
|---|---|---|
| Google Drive | All file types in selected drives or folders | Supports Shared Drives and My Drive. Google Docs exported as HTML for richer parsing. |
| Dropbox | Files and folders in selected paths | Business and personal accounts. Recursive enumeration with path filtering. |
| OneDrive | Files and folders from user or SharePoint sites | Integrated via Microsoft Graph API. Supports both personal and organizational accounts. |
| SharePoint | Document libraries and site pages | Site-level scoping. Lists and library content extracted and structured. |
| Box | Files and folders in selected directories | Enterprise Box accounts. Folder-level access control respected during enumeration. |
Communication
Communication connectors capture conversations, threads, and events as units rather than syncing individual messages as documents. Slack and Microsoft Teams use snapshot-based ingestion, which stores each conversation as a temporal snapshot with speaker attribution preserved; Gmail ingests at the thread level.
| Platform | What syncs | Notes |
|---|---|---|
| Slack | Channel messages, threads, and reactions | Snapshot ingestion. Speaker names attached to each message. Thread context preserved. |
| Gmail | Email threads from selected labels or all mail | Thread-level ingestion. Sender, recipients, and timestamps extracted as metadata. |
| Microsoft Teams | Channel messages and replies | Snapshot ingestion. Team and channel hierarchy maintained. Speaker attribution preserved. |
| Outlook | Email messages and calendar events | Integrated via Microsoft Graph. Folder-level scoping available. |
Project management
Project management connectors sync issues, pages, and workspaces into the knowledge base. Each item becomes a structured knowledge base page with metadata (status, assignee, labels) preserved.
| Platform | What syncs | Notes |
|---|---|---|
| Jira | Issues, epics, and project metadata | Supports Jira Cloud. Issue descriptions, comments, status, and custom fields extracted. |
| Confluence | Spaces and pages | Full page content with attachments. Space-level scoping for selective sync. |
| Notion | Pages, databases, and nested content | Recursive page tree traversal. Database properties mapped to structured metadata. |
Development
| Platform | What syncs | Notes |
|---|---|---|
| GitHub | Repository files, READMEs, issues, and pull requests | Supports public and private repos. Branch-level scoping. Markdown files prioritized. |
Calendar
| Platform | What syncs | Notes |
|---|---|---|
| Google Calendar | Events, attendees, descriptions, and meeting notes | Snapshot-based. Temporal metadata extracted for timeline queries. |
Research tools
| Platform | What syncs | Notes |
|---|---|---|
| Zotero | Library items, PDFs, and annotations | Group and personal libraries. Citation metadata preserved. |
| Overleaf | LaTeX projects and compiled documents | Project-level sync. Compiled PDF and source .tex files both ingested. |
Scientific platforms
| Platform | What syncs | Notes |
|---|---|---|
| Benchling Enterprise | Notebook entries, protocols, sequences, and registry entities | Enterprise API integration. Structured scientific data preserved with schema context. |
| Labguru | Experiments, protocols, and inventory records | ELN data extracted with experimental metadata and relationships intact. |
Public databases
Public database connectors query external scientific and academic databases on demand and ingest results into the knowledge base. These do not require OAuth -- they use public APIs. Beakr has handlers for 25+ public and scientific sources; the table below groups representative coverage rather than listing every endpoint individually.
| Database | What syncs | Notes |
|---|---|---|
| PubMed / PMC / bioRxiv | Article abstracts, full text where available, preprints, metadata, and MeSH terms | Search by keyword, author, PMID, DOI, or public identifier. Full citation metadata preserved. |
| ClinicalTrials.gov / NIH RePORTER | Trial records, endpoints, sponsors, funded projects, and linked publications | NCT, grant, investigator, and topic lookup. Trial phase, status, and funding context tracked. |
| UniProt / AlphaFold / PDB | Protein entries, structures, sequences, and functional annotations | Accession and structure lookup with cross-references to genes, pathways, and literature. |
| KEGG / Reactome / STRING | Pathways, interactions, compounds, and gene relationships | Pathway-level ingestion with graph-ready cross-references. |
| OpenAlex / OpenNeuro / USPTO | Scholarly works, authors, institutions, datasets, and patent records | Broad academic and IP search. Metadata includes citations, topics, datasets, inventors, and assignees. |
| PubChem / ChEMBL / HMDB / openFDA | Molecules, bioactivity, metabolites, labels, adverse events, and recalls | Chemical and regulatory context for drug discovery and translational workflows. |
| Ensembl / ClinVar / GWAS Catalog / GDC / cBioPortal | Genes, variants, studies, cancer genomics, and cohort-level datasets | Variant and disease context with stable identifiers for downstream provenance. |
How sync works
Connectors follow the same five-stage pipeline regardless of the source platform. The one exception is public database connectors, which need no OAuth step because they use public APIs. This consistency means that once data enters Beakr, it is structured, searchable, and attributed the same way whether it came from Slack or a PDF in Google Drive.
1. Connector configuration
Each connector is configured with a scope and a mode:
| Setting | Options | Description |
|---|---|---|
| Scope | user, group, org | Determines who can access the synced data. User-scoped connectors are private. Group-scoped connectors share data with members of a specific group. Org-scoped connectors share data across the organization. |
| Mode | all, restricted | In all mode, the connector syncs everything it has access to. In restricted mode, you select specific folders, channels, or items to sync. |
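
The two settings can be modeled as a small typed configuration object. The sketch below is illustrative only: `ConnectorConfig`, the `provider` and `selected` fields, and the rule that restricted mode needs a non-empty selection are assumptions, not Beakr's actual schema. Only the option values (`user`, `group`, `org`, `all`, `restricted`) come from the table above.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    USER = "user"
    GROUP = "group"
    ORG = "org"


class Mode(Enum):
    ALL = "all"
    RESTRICTED = "restricted"


@dataclass(frozen=True)
class ConnectorConfig:
    """Hypothetical configuration record for one connector."""
    provider: str                    # assumed field, e.g. "google_drive"
    scope: Scope
    mode: Mode
    selected: tuple[str, ...] = ()   # assumed field: folders or channels picked in restricted mode

    def __post_init__(self) -> None:
        # Assumption: restricted mode is only meaningful with an explicit selection.
        if self.mode is Mode.RESTRICTED and not self.selected:
            raise ValueError("restricted mode requires at least one selected item")


config = ConnectorConfig(
    provider="google_drive",
    scope=Scope.GROUP,
    mode=Mode.RESTRICTED,
    selected=("/Protocols", "/Trial data"),
)
```

Using enums rather than raw strings means a misspelled scope or mode fails at construction time instead of during a sync.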
2. OAuth authentication
When a user connects a platform, they authorize through the provider's standard OAuth flow. Beakr handles token storage, refresh, and rotation. OAuth tokens are never exposed to end users or stored alongside application data.
3. Enumeration
Provider-specific handlers list all available items from the source. Each provider has its own enumeration logic (e.g., listing files in Drive, channels in Slack, pages in Confluence). The enumeration step produces a manifest of items to ingest, filtered by the connector's scope and mode settings.
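As a rough illustration of how a manifest might be filtered by mode, the sketch below keeps everything in `all` mode and only items under a selected path in `restricted` mode. `SourceItem`, `build_manifest`, and the path-prefix matching rule are hypothetical. Real providers expose different identifiers (folder IDs, channel IDs, space keys), so the matching logic would differ per handler.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SourceItem:
    """Hypothetical enumerated item; real handlers carry provider-specific fields."""
    item_id: str
    path: str


def build_manifest(
    items: Iterable[SourceItem],
    mode: str,
    selected: Iterable[str] = (),
) -> list[SourceItem]:
    """Return the items to ingest, filtered by connector mode."""
    if mode == "all":
        return list(items)
    if mode != "restricted":
        raise ValueError(f"unknown mode: {mode!r}")
    prefixes = tuple(p.rstrip("/") for p in selected)
    # Match the selected folder itself or anything beneath it, but not
    # sibling folders that merely share a name prefix.
    return [
        item
        for item in items
        if any(item.path == p or item.path.startswith(p + "/") for p in prefixes)
    ]


listing = [
    SourceItem("1", "/Protocols/assay.docx"),
    SourceItem("2", "/Protocols-old/assay.docx"),
    SourceItem("3", "/Finance/budget.xlsx"),
]
manifest = build_manifest(listing, "restricted", selected=["/Protocols"])
```

Here `manifest` contains only item `1`: `/Protocols-old` shares a string prefix with `/Protocols` but is a different folder, which is why the comparison appends a path separator.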
4. Ingestion
Each enumerated item is downloaded, parsed, chunked, and embedded. File-based items go through ingest_file_item (binary download, text extraction, chunking). Document-based items go through ingest_document_item (API-fetched content, structured parsing). Both paths produce the same output: chunked text with source metadata ready for the knowledge base.
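The sketch below shows the shape of the two paths converging on one output type. The function names `ingest_file_item` and `ingest_document_item` come from the text above, but their signatures and bodies here are placeholders. The `Chunk` type, the fixed word-window chunking, the UTF-8 decoding, and the `title`/`body` payload keys are all assumptions, and the embedding step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical output unit: chunk text plus source metadata."""
    text: str
    source_id: str
    index: int


def chunk_text(text: str, source_id: str, max_words: int = 50) -> list[Chunk]:
    """Split text into fixed-size word windows (a stand-in chunking strategy)."""
    words = text.split()
    return [
        Chunk(" ".join(words[start:start + max_words]), source_id, n)
        for n, start in enumerate(range(0, len(words), max_words))
    ]


def ingest_file_item(source_id: str, raw: bytes) -> list[Chunk]:
    # Placeholder extraction: PDFs, DOCX, and similar formats need a real parser.
    text = raw.decode("utf-8", errors="replace")
    return chunk_text(text, source_id)


def ingest_document_item(source_id: str, payload: dict) -> list[Chunk]:
    # Placeholder parsing: assumes the API payload has "title" and "body" keys.
    text = f"{payload.get('title', '')}\n{payload.get('body', '')}"
    return chunk_text(text, source_id)


file_chunks = ingest_file_item("drive:123", b"alpha beta gamma " * 40)
doc_chunks = ingest_document_item(
    "confluence:9", {"title": "Runbook", "body": "Restart the worker."}
)
```

Both calls return `list[Chunk]`, so downstream code does not need to know which path produced a given chunk.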
5. Compilation
The compilation step takes ingested chunks and creates or updates structured knowledge base pages. New information is merged with existing pages. Attribution is tracked at the paragraph level so every statement can be traced back to the source document and connector that produced it.
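Paragraph-level attribution can be pictured as each paragraph carrying its own list of sources. The sketch below is a simplified model, not Beakr's compiler: `Attribution`, `Paragraph`, `Page`, and the exact-text matching in `merge` are assumptions made for illustration. A real merge step would need to reconcile differently worded statements.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribution:
    """Hypothetical pointer back to the connector and source document."""
    connector_id: str
    source_id: str


@dataclass
class Paragraph:
    text: str
    sources: list[Attribution] = field(default_factory=list)


@dataclass
class Page:
    title: str
    paragraphs: list[Paragraph] = field(default_factory=list)

    def merge(self, text: str, attribution: Attribution) -> None:
        """Add a new paragraph, or record another source for an existing one."""
        for paragraph in self.paragraphs:
            if paragraph.text == text:  # simplification: exact-text match only
                if attribution not in paragraph.sources:
                    paragraph.sources.append(attribution)
                return
        self.paragraphs.append(Paragraph(text, [attribution]))


drive = Attribution("conn-drive", "file-1")
slack = Attribution("conn-slack", "thread-7")

page = Page("Trial protocol")
page.merge("Dosing starts at 5 mg.", drive)
page.merge("Dosing starts at 5 mg.", slack)   # same statement, second source
page.merge("Enrollment closes in March.", slack)
```

After these calls the page has two paragraphs, and the first one lists both the Drive file and the Slack thread as sources.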
Continuous sync
Connectors re-enumerate on a schedule and in response to webhook triggers where supported. Only changed or new items are re-ingested. Deleted items are flagged and their knowledge base contributions marked accordingly. This means the knowledge base stays current without manual intervention.
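One common way to find the delta is to compare the manifest from the previous run with the current one. The sketch below assumes each item has a stable ID and some version marker (an ETag, revision number, or modified timestamp). `diff_manifests` and the `{item_id: version}` shape are hypothetical, not Beakr's internal format.

```python
def diff_manifests(
    previous: dict[str, str],
    current: dict[str, str],
) -> dict[str, list[str]]:
    """Compare {item_id: version} maps from two enumeration runs."""
    new = sorted(k for k in current if k not in previous)
    changed = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    deleted = sorted(k for k in previous if k not in current)
    return {"new": new, "changed": changed, "deleted": deleted}


previous_run = {"a": "v1", "b": "v1", "c": "v1"}
current_run = {"a": "v1", "b": "v2", "d": "v1"}
delta = diff_manifests(previous_run, current_run)
```

In this example only `b` and `d` would be re-ingested, `c` would be flagged as deleted, and `a` is left untouched.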
Communication connectors
Slack, Microsoft Teams, and Google Calendar use a distinct ingestion model: snapshot-based ingestion. Instead of treating each message as a separate document, the system captures conversations and events as cohesive snapshots.
- Speaker attribution -- every message is tagged with the speaker's name and role, so the knowledge base knows who said what.
- Thread context -- replies are grouped with their parent message. A Slack thread about a decision becomes a single, coherent knowledge unit rather than a bag of isolated messages.
- Temporal metadata -- timestamps are extracted and indexed, enabling timeline queries like "what did the team discuss about the trial protocol in March?"
- Channel and team scoping -- you choose which channels or teams to sync. Private channels require explicit opt-in.
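The grouping described above can be sketched as follows: messages are bucketed by thread, ordered by time, and flattened into a transcript that keeps speaker names. The `Message` and `Snapshot` types and the `build_snapshots` function are hypothetical. The sketch covers thread grouping, speaker attribution, and timestamps only; roles and channel scoping are left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Message:
    message_id: str
    thread_id: str     # assumed: the parent's id; equals message_id for the parent
    speaker: str
    sent_at: datetime
    text: str


@dataclass(frozen=True)
class Snapshot:
    thread_id: str
    started_at: datetime
    ended_at: datetime
    speakers: tuple[str, ...]
    transcript: str


def build_snapshots(messages: list[Message]) -> list[Snapshot]:
    """Group messages by thread and emit one snapshot per thread."""
    threads: dict[str, list[Message]] = defaultdict(list)
    for message in messages:
        threads[message.thread_id].append(message)

    snapshots = []
    for thread_id, thread in threads.items():
        thread.sort(key=lambda m: m.sent_at)
        snapshots.append(
            Snapshot(
                thread_id=thread_id,
                started_at=thread[0].sent_at,
                ended_at=thread[-1].sent_at,
                speakers=tuple(sorted({m.speaker for m in thread})),
                transcript="\n".join(f"{m.speaker}: {m.text}" for m in thread),
            )
        )
    return sorted(snapshots, key=lambda s: s.started_at)


messages = [
    Message("2", "1", "Ben", datetime(2024, 3, 4, 9, 5), "Yes, after the review."),
    Message("3", "3", "Ana", datetime(2024, 3, 4, 10, 0), "Standup moved to 11."),
    Message("1", "1", "Ana", datetime(2024, 3, 4, 9, 0), "Should we amend the protocol?"),
]
snapshots = build_snapshots(messages)
```

Because each snapshot stores `started_at` and `ended_at`, a timeline query such as "what was discussed in March" can filter on those fields without scanning individual messages.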
Connector health
Every connector has a health_status that Beakr monitors continuously. This lets administrators catch issues before they cause knowledge gaps.
| Status | Meaning | Action |
|---|---|---|
| healthy | Connector is syncing normally. Last sync completed without errors. | None required. |
| degraded | Some items failed to sync but the connector is still partially operational. | Review error logs. Often caused by permission changes on individual files or folders. |
| expired | OAuth token has expired and could not be refreshed automatically. | Re-authenticate through the connector settings to issue a new token. |
| revoked | Access was revoked at the source platform (e.g., app uninstalled from Slack workspace). | Re-authorize the integration from the source platform, then re-authenticate in Beakr. |
Health status is surfaced in the Beakr dashboard and through the kb_stats MCP tool. Degraded or expired connectors trigger alerts so your team can resolve the issue promptly.
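For teams that script against connector status, the four values and their recommended actions can be captured in a small lookup. This is a sketch based only on the table above: `HealthStatus`, `RECOMMENDED_ACTION`, and `needs_reauth` are hypothetical names, and the sketch does not show how `health_status` is returned by the dashboard or the `kb_stats` tool.

```python
from enum import Enum


class HealthStatus(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    EXPIRED = "expired"
    REVOKED = "revoked"


# Actions paraphrased from the status table.
RECOMMENDED_ACTION = {
    HealthStatus.HEALTHY: "None required.",
    HealthStatus.DEGRADED: "Review error logs.",
    HealthStatus.EXPIRED: "Re-authenticate through the connector settings.",
    HealthStatus.REVOKED: "Re-authorize at the source platform, then re-authenticate in Beakr.",
}


def needs_reauth(status: HealthStatus) -> bool:
    """True when the table's action involves re-authenticating in Beakr."""
    return status in (HealthStatus.EXPIRED, HealthStatus.REVOKED)


status = HealthStatus("expired")   # parse the raw string value
action = RECOMMENDED_ACTION[status]
```

Parsing the raw string through the enum makes an unexpected status value raise `ValueError` rather than pass silently.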
Scope and permissions
Connectors are scoped at three levels, and the scope determines both who can configure the connector and who can access the resulting knowledge:
- User scope -- the connector and its synced data are visible only to the user who created it. Useful for personal email or private cloud storage.
- Group scope -- data is shared with members of a specific group. Useful for team-level Slack channels or shared project drives.
- Org scope -- data is available to the entire organization. Appropriate for company-wide knowledge sources like Confluence spaces or shared Google Drives.
All connector data respects Beakr's Row Level Security (RLS) policies at the database level. There is no application-side filtering -- tenant isolation is enforced by PostgreSQL RLS policies on every query. A user in one organization can never access connector data from another organization, regardless of how the application code is structured.
Custom integrations
The connector framework is designed for extensibility. Each provider is implemented as a handler module that follows a standard interface: enumerate items, download or fetch content, and yield structured documents.
Adding a new provider typically takes 48 hours from start to tested deployment. The framework handles OAuth, job scheduling, error handling, health tracking, and knowledge compilation. The provider handler only needs to implement the platform-specific enumeration and content-fetching logic.
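The three-part interface described above (enumerate items, fetch content, yield structured documents) might look roughly like the sketch below. Every name here is hypothetical: `ProviderHandler`, `ItemRef`, `Document`, `run_handler`, and the `InMemoryWiki` example provider are illustrative stand-ins, not the framework's actual classes or method names.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class ItemRef:
    item_id: str
    title: str


@dataclass(frozen=True)
class Document:
    item_id: str
    title: str
    text: str


class ProviderHandler(Protocol):
    """Assumed shape of the platform-specific part of a connector."""

    def enumerate_items(self) -> Iterable[ItemRef]: ...

    def fetch_content(self, item: ItemRef) -> str: ...


def run_handler(handler: ProviderHandler) -> Iterator[Document]:
    """Framework side: drive any handler and yield structured documents."""
    for item in handler.enumerate_items():
        yield Document(item.item_id, item.title, handler.fetch_content(item))


class InMemoryWiki:
    """Toy provider backed by a dict of {page_id: (title, body)}."""

    def __init__(self, pages: dict[str, tuple[str, str]]) -> None:
        self._pages = pages

    def enumerate_items(self) -> Iterable[ItemRef]:
        return [ItemRef(page_id, title) for page_id, (title, _) in self._pages.items()]

    def fetch_content(self, item: ItemRef) -> str:
        return self._pages[item.item_id][1]


wiki = InMemoryWiki({
    "p1": ("Onboarding", "Welcome."),
    "p2": ("Runbook", "Restart the worker."),
})
documents = list(run_handler(wiki))
```

In this sketch the handler knows nothing about scheduling, retries, or compilation. It only lists items and returns their content, which matches the division of responsibility described above.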
If your team uses a platform not listed above, contact us. Most SaaS integrations can be built and deployed within a week.
Security
Integration security is designed around the principle that Beakr should never hold credentials it does not need.
- User authentication remains separate -- Beakr application login is handled through Clerk; connector OAuth grants are managed separately and scoped only to the connected provider.
- OAuth tokens managed securely -- all token lifecycle management (issuance, storage, refresh, and rotation) is handled through secure, certified infrastructure.
- No credential exposure -- OAuth tokens, API keys, and refresh tokens are stored securely and never exposed to end users or application-level code.
- Encrypted in transit and at rest -- all data pulled from connectors is encrypted with TLS 1.2+ in transit and AES-256 at rest in Beakr's infrastructure.
- Scoped access -- connectors request only the permissions needed for the configured scope. A user-scoped Google Drive connector does not request access to other users' files.
- Audit trail -- every sync event, including what was ingested and when, is logged and attributable.