Beakr

Hybrid search

Beakr combines vector similarity search with full-text keyword search into a single retrieval system. Every query runs both pipelines in parallel, blends the results, and returns chunks ranked by a weighted score — giving you semantic understanding and keyword precision at the same time.

Why hybrid search

Neither vector search nor keyword search is sufficient on its own. Each has a blind spot that the other covers:

| Approach | Strengths | Weaknesses |
|---|---|---|
| Vector search only | Understands meaning, synonyms, paraphrases | Misses exact terms — searching "BRCA2" may return results about "gene mutations" without mentioning BRCA2 |
| Keyword search only | Precise term matching, fast, predictable | Misses meaning — searching "heart attack" will not find documents that only say "myocardial infarction" |
| Hybrid (Beakr) | Semantic understanding with keyword precision | Slightly more compute per query — but the quality difference is substantial |

Beakr blends both so that a search for "quarterly revenue projections" finds documents that use that exact phrase and documents that discuss "Q3 financial forecasts" without ever using the word "revenue."

Vector search

The vector pipeline converts text into high-dimensional numerical representations (embeddings) that capture semantic meaning. Similar concepts land near each other in vector space, so searching for a query means finding the nearest embedding vectors.
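To make "finding the nearest embedding vectors" concrete, here is a minimal sketch of cosine similarity over toy 3-dimensional vectors. Beakr's real embeddings are 768-dimensional and produced by an embedding model; the vectors and labels below are purely illustrative:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (illustrative, not model output):
query = [0.9, 0.1, 0.0]   # "heart attack"
doc_a = [0.8, 0.2, 0.1]   # "myocardial infarction" — semantically close
doc_b = [0.1, 0.2, 0.9]   # unrelated topic

# The semantically related document scores higher despite no shared keywords.
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```

Because similar concepts land near each other in the embedding space, the paraphrased document wins on similarity even though it shares no literal terms with the query.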

Embedding model

Beakr uses Google Gemini embedding-2-preview to generate 768-dimensional embedding vectors. Every chunk of ingested content — text, image captions, video transcripts — is embedded and stored for semantic retrieval.

The database retains multiple embedding generations for historical compatibility. Only the current generation (Gemini, 768-dim) is used for new queries.

Distance metric and indexing

Similarity is measured using cosine distance via a vector database extension. To avoid scanning every vector on every query, the database uses optimized indexes for approximate nearest neighbor search, providing sub-linear query times even as the number of vectors grows into the millions.

| Component | Value |
|---|---|
| Vector store | Vector database extension |
| Distance metric | Cosine distance |
| Index type | Optimized approximate nearest neighbor index |
| Dimensionality | 768 |

Full-text search

The lexical pipeline uses the database's built-in full-text search engine. Each chunk's text is converted into a normalized representation that strips stop words, applies stemming (via the English dictionary), and stores word positions.

At query time, the search input is parsed and matched against the stored representations. Results are ranked using keyword relevance scoring based on term frequency and positional proximity — similar in spirit to BM25-style ranking. Documents that contain the search terms more frequently and in closer proximity receive higher scores.
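As a rough illustration of frequency-based lexical scoring — not the database's actual ranking function, which also weighs word positions and applies real stemming — consider this toy scorer. The stop-word list and normalization are simplified stand-ins:

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "is", "in", "to", "for"}  # illustrative subset

def normalize(text):
    """Lowercase, tokenize, and drop stop words — a crude stand-in for the
    database's dictionary-based normalization (which also stems words)."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]

def lexical_score(query, document):
    """Term-frequency score: how often query terms appear in the document,
    normalized by document length. Real FTS ranking also weighs proximity."""
    terms = normalize(query)
    doc_tokens = normalize(document)
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    return sum(counts[t] for t in terms) / len(doc_tokens)

print(lexical_score("revenue projections", "Quarterly revenue projections: revenue grew."))
print(lexical_score("revenue projections", "Q3 financial forecasts."))  # 0.0 — no term overlap
```

The second document scores zero here despite being topically relevant — exactly the blind spot the vector pipeline covers.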


Score blending

After both pipelines return their results, Beakr blends the scores using a weighted linear combination:

score = alpha * vector_similarity + (1 - alpha) * lexical_rank

The default value of alpha = 0.7 means semantic similarity contributes 70% of the final score and keyword relevance contributes 30%.

| Parameter | Default | Effect |
|---|---|---|
| alpha | 0.7 | Weight given to vector similarity. Higher values favor semantic understanding; lower values favor exact term matching. |
| vector_similarity | | Cosine similarity between the query embedding and chunk embedding. Range: 0 to 1. |
| lexical_rank | | Normalized keyword relevance score from full-text search. Range: 0 to 1. |
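The blending formula translates directly into code. This is an illustrative sketch — `hybrid_score` is a hypothetical name, not Beakr's actual API:

```python
def hybrid_score(vector_similarity, lexical_rank, alpha=0.7):
    """Weighted linear blend of the two pipeline scores (both in [0, 1])."""
    return alpha * vector_similarity + (1 - alpha) * lexical_rank

# A chunk with a strong semantic match but no keyword overlap:
print(hybrid_score(0.90, 0.00))  # ≈ 0.63
# A chunk with exact keyword hits but weaker semantic similarity:
print(hybrid_score(0.60, 0.95))  # ≈ 0.705
```

Note how the keyword-heavy chunk edges ahead in the second case: the 30% lexical contribution is enough to reward exact matches without letting them dominate.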

Why semantic gets more weight

In practice, most knowledge-base queries are natural-language questions ("What is our refund policy?") rather than keyword lookups ("refund-policy-v3"). Weighting semantic similarity at 70% means the system performs well for conversational queries while still boosting results that contain exact terms. The 30% keyword contribution ensures that technical identifiers, product names, and acronyms are not lost in the semantic space.

Multi-modal search

Beakr does not limit search to text. Every chunk carries a modality field that indicates its content type:

| Modality | What is embedded | Use case |
|---|---|---|
| text | Raw text content | Documents, knowledge base pages, messages |
| image | Image descriptions and captions | Diagrams, screenshots, figures |
| video | Transcripts and frame descriptions | Meeting recordings, tutorials |

All modalities share the same embedding space and the same ranking pipeline, while the underlying representations can be text-derived, modality-specific, or joint embeddings depending on the source. A text query can therefore surface relevant images or video segments alongside document chunks, because those assets carry captions, transcripts, frame descriptions, and modality metadata.
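A sketch of what a single cross-modal result list might look like — the `Chunk` type, field names, and scores here are hypothetical, for illustration only:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    content: str
    modality: str   # "text" | "image" | "video"
    score: float    # blended hybrid score

# One query, one ranked list — images and video compete directly with text:
results = [
    Chunk("Q3 revenue table", "text", 0.91),
    Chunk("Caption: bar chart of quarterly revenue", "image", 0.84),
    Chunk("Transcript: '...revenue projections for Q3...'", "video", 0.79),
]

for c in sorted(results, key=lambda c: c.score, reverse=True):
    print(c.modality, c.score)
```

Because every chunk carries a modality field, a caller can also filter to a single content type without changing the ranking logic.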

Tenant isolation in search

Every search query is automatically scoped to the authenticated tenant. This is not an application-level filter — it is enforced by PostgreSQL Row Level Security at the database layer.

The retriever joins chunks to their parent resources and relies on database-level security policies to ensure that only resources belonging to the authenticated organization are visible. Even if a query is malformed or a code path has a bug, the database will never return chunks from another tenant's data.

Tenant isolation is enforced at the database level, not in application code. It is transaction-scoped and cannot leak between requests. See Multi-tenancy & isolation for details.

How search feeds the agent

Search is one layer in Beakr's retrieval system, not the only one. When an agent processes a question, it has access to multiple retrieval strategies.

The agent decides which tools to use based on the question. A broad question ("What do we know about customer churn?") may start with hybrid search to find relevant pages, then follow links for deeper context. A precise question ("What is the BRCA2 variant classification in our latest report?") benefits from keyword-heavy retrieval.

This is the difference between search as a retrieval call and agentic search as a workflow. Beakr can search, read pages, traverse the graph, inspect provenance, compare dates, and then decide whether another retrieval step is needed before answering.

Performance at scale

The system is designed to remain fast as the knowledge base grows:

| Mechanism | What it does | Why it matters |
|---|---|---|
| Optimized vector indexes | Approximate nearest neighbor search with sub-linear complexity | Query time grows logarithmically, not linearly, with data volume |
| Full-text indexes | Inverted index for fast keyword matching | Keyword lookups remain fast regardless of corpus size |
| Tenant-scoped queries | Search only scans the authenticated tenant's data | Multi-tenant databases do not degrade per-tenant performance as total data grows |
| Parallel execution | Vector and text pipelines run concurrently | Latency is the max of the two pipelines, not the sum |
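The "latency is the max, not the sum" property can be demonstrated with two simulated pipelines run concurrently. The sleep durations are arbitrary stand-ins for real query latencies, and the function names are illustrative:

```python
import asyncio
import time

async def vector_pipeline():
    await asyncio.sleep(0.10)   # simulated ANN query time
    return {"chunk-1": 0.92}

async def lexical_pipeline():
    await asyncio.sleep(0.06)   # simulated full-text query time
    return {"chunk-1": 0.40}

async def hybrid_search():
    # asyncio.gather runs both pipelines concurrently, so total latency
    # tracks the slower pipeline rather than the sum of both.
    start = time.monotonic()
    vec, lex = await asyncio.gather(vector_pipeline(), lexical_pipeline())
    elapsed = time.monotonic() - start
    return vec, lex, elapsed

vec, lex, elapsed = asyncio.run(hybrid_search())
print(f"latency ~ {elapsed:.2f}s")  # close to 0.10s (the max), not 0.16s (the sum)
```

Running the pipelines sequentially would cost roughly 0.16s here; concurrent execution brings it down to roughly 0.10s, the slower pipeline's time.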

Beyond simple RAG

Most retrieval-augmented generation systems do one thing: embed a query, find similar chunks, and pass them to a language model. Beakr's retrieval system goes further.

The result is a retrieval system that behaves less like a search engine and more like a research assistant with structured access to your organization's knowledge.