
// PYTHON LIBRARY

search()

`df.omna.search(query, on, k)` finds the rows that best match your query by meaning — not just keywords — and returns them ranked by relevance. It runs on a local Rust kernel: ~9ms over 50,000 rows, no API key, no network.

Hybrid semantic + keyword search

Signature

python
df.omna.search(query, on, k, hybrid=True)
Parameter   Description
query       The search string, in plain language
on          The column to search
k           Number of top results to return, ranked by relevance
hybrid      Hybrid (semantic + keyword) when True (default); pure semantic when False
Note: Run [df.omna.embed("column")](/docs/embed) once before searching. search() loads the saved index automatically on every call after that.

Returns a Polars DataFrame of the matched rows, with a _score column (cosine similarity, 0–1), ordered from most to least relevant.
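
To make the _score column concrete, here is a minimal sketch of cosine similarity over toy vectors. It is not Omna's implementation: the three-dimensional "embeddings", the row names, and the `cosine_similarity` helper are all hypothetical, chosen only to show how a higher score means a closer match.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
query_vec = [1.0, 0.0, 1.0]
row_vecs = {
    "row_a": [1.0, 0.0, 1.0],  # same direction as the query
    "row_b": [1.0, 1.0, 0.0],  # partially aligned with the query
    "row_c": [0.0, 1.0, 0.0],  # orthogonal to the query
}

# Score every row against the query, then order from most to least similar.
scores = {name: cosine_similarity(query_vec, vec) for name, vec in row_vecs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these non-negative toy vectors every score falls between 0 and 1, and the rows come back ordered from most to least similar, which mirrors the ordering described above.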

Basic example

python
import polars as pl
import omna

df = pl.read_csv("documents.csv")
df.omna.embed("text")                                       # once

results = df.omna.search("insurance claim denied", on="text", k=5)
print(results)
code
 uid            document_type         domain      text                               _score
 67fccc1e207…   ClaimSummary          insurance   **Claim ID: 285-14-1755, Policy…   0.762
 b8ae088cd21…   ClaimSummary          insurance   **Claim Summary**…                 0.749
 de5bba0a2cc…   Insurance Claim Form  healthcare  **Insurance Claim Form**…          0.748

None of these rows contain the literal phrase "insurance claim denied" — Omna finds them by meaning.

Hybrid by default

Search fuses semantic (embedding) similarity with BM25 keyword matching using Reciprocal Rank Fusion. Semantics catch meaning; BM25 catches rare exact tokens the embeddings blur — part codes, IDs, surnames, acronyms.

python
results = df.omna.search("XJ9000", on="parts", k=5)               # exact code → BM25 nails it
results = df.omna.search("claim denied", on="text", k=5, hybrid=False)  # meaning only
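
For intuition about the fusion step, the following is a rough sketch of Reciprocal Rank Fusion in plain Python. It assumes the standard formulation, in which each row earns 1 / (c + rank) from every ranked list it appears in. The constant `c=60` is the value commonly used for RRF, not a documented Omna setting, and the row ids and the `reciprocal_rank_fusion` helper are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, c=60):
    """Fuse several ranked lists of row ids into a single ranking.

    Each row earns 1 / (c + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    fused_scores = defaultdict(float)
    for ranking in rankings:
        for rank, row_id in enumerate(ranking, start=1):
            fused_scores[row_id] += 1.0 / (c + rank)
    return sorted(fused_scores, key=fused_scores.get, reverse=True)

# Hypothetical rankings from the two retrievers (best match first).
semantic_ranking = ["r2", "r7", "r1"]
bm25_ranking = ["r1", "r2", "r9"]

fused = reciprocal_rank_fusion([semantic_ranking, bm25_ranking])
print(fused)
```

Rows that appear in both lists ("r2" and "r1") rise above rows found by only one retriever, which is why an exact-token hit from BM25 can surface even when its embedding similarity is mediocre.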

vs. keyword search and regex

Regex and str.contains match character patterns, not meaning. A query for "chest pain" misses "cardiac pressure", "tightness in the chest", and "angina". Semantic search encodes meaning, so synonyms, paraphrases, and related concepts all match, with no synonym list to maintain.
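
The keyword-matching gap is easy to reproduce with the standard library alone. The sketch below uses made-up clinical notes (hypothetical data, not from any Omna dataset) and a literal regex to show which rows a pattern match would find and which it would skip.

```python
import re

# Hypothetical notes, for illustration only.
notes = [
    "Patient reports chest pain after exercise",
    "Complains of cardiac pressure at night",
    "Describes tightness in the chest",
    "History of angina",
]

# A literal, case-insensitive pattern match: the regex / str.contains approach.
pattern = re.compile(r"chest pain", re.IGNORECASE)
keyword_hits = [note for note in notes if pattern.search(note)]
missed = [note for note in notes if note not in keyword_hits]

print("matched:", keyword_hits)
print("missed: ", missed)
```

Only the first note contains the literal phrase, so the other three are skipped even though they describe related symptoms. Widening the pattern by hand means maintaining a synonym list, which is the burden semantic search is meant to remove.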

All search computation happens inside your Python process. Patient records, trade secrets, and other sensitive text never leave the host machine.

What's next