

Quickstart

Go from `pip install` to semantic search, PII masking, and natural-language queries over a Polars DataFrame in about five minutes — all running locally, with no vector database and no data leaving your machine.


Install

```bash
pip install "omna[all]"
```

`omna[all]` pulls in every feature. If you only need some, install the extras you want:

| Install | What you get |
| --- | --- |
| `pip install omna` | Schema understanding (`understand_df`); depends only on polars, numpy, and rich |
| `pip install "omna[embed]"` | Local embeddings, search, and filter |
| `pip install "omna[pii]"` | PII detection and masking (`pii_report`, `mask_pii`) |
| `pip install "omna[ask]"` | Natural-language queries (`ask`); masks rows before the API call |
| `pip install "omna[all]"` | Everything above |

Note: Omna supports Python 3.10–3.12 on macOS or Linux. Windows is not supported. The PII engine runs on Apple Silicon macOS and Linux — it is not available on Intel Macs. Intel-Mac users can still use search, filter, and schema understanding.
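
The support rules in the note can be checked before installing. The following is a minimal sketch using only the standard library; `omna_support` is a hypothetical helper written for this page, not part of Omna, and it assumes Apple Silicon reports its architecture as `arm64` (a Python build running under Rosetta reports `x86_64` instead).

```python
import platform
import sys


def omna_support(system: str, machine: str, py: tuple[int, int]) -> dict[str, bool]:
    """Apply the support rules from the note to one environment (hypothetical helper)."""
    python_ok = (3, 10) <= py <= (3, 12)       # Python 3.10 through 3.12
    os_ok = system in ("Darwin", "Linux")      # macOS or Linux; Windows is unsupported
    core = python_ok and os_ok
    # The PII engine needs Linux, or macOS on Apple Silicon (assumed to report "arm64").
    intel_mac = system == "Darwin" and machine != "arm64"
    return {"core": core, "pii": core and not intel_mac}


# Check the interpreter that is running this script.
print(omna_support(platform.system(), platform.machine(), sys.version_info[:2]))
```

If `pii` comes back `False` on an otherwise supported machine, installing `omna[embed]` instead of `omna[all]` keeps search, filter, and schema understanding available.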

No API key is required for search, filter, embed, pii_report, mask_pii, or understand_df. Only [ask()](/docs/ask) needs an ANTHROPIC_API_KEY.
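
Because only `ask()` needs a key, a script can check for it up front and fall back to the local features when it is missing. This is a minimal sketch that assumes the key is supplied through the `ANTHROPIC_API_KEY` environment variable; `has_anthropic_key` is a hypothetical helper, not an Omna function.

```python
import os


def has_anthropic_key(env=os.environ) -> bool:
    """Return True when ANTHROPIC_API_KEY is set to a non-empty value."""
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


if not has_anthropic_key():
    # Assumption: everything except ask() still works without the key.
    print("ANTHROPIC_API_KEY is not set; skipping ask() and using local features only.")
```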

Your first five minutes

Importing omna registers the .omna namespace on every Polars DataFrame — no other setup needed.

```python
import polars as pl
import omna

df = pl.read_csv("documents.csv")

# 1. explore the schema (no LLM, no network)
omna.understand_df(df)

# 2. audit for PII before anything touches the data
df.omna.pii_report()

# 3. redact PII; an audit log is saved automatically
clean = df.omna.mask_pii()

# 4. build a search index once
clean.omna.embed("text")

# 5. search by meaning (semantic + keyword, fused)
results = clean.omna.search("insurance claim denied", on="text", k=5)

# 6. or take everything above a similarity threshold
flagged = clean.omna.filter("insurance claim denied", on="text", threshold=0.73)

# 7. ask a question in plain English (needs ANTHROPIC_API_KEY)
results.omna.ask("What personal data do these documents expose?")
```

That is the whole library: understand your data, protect it, index it, and query it by meaning rather than by exact keyword match alone.
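
Step 5 describes search as "semantic + keyword, fused". This page does not say how Omna combines the two rankings, so the sketch below uses reciprocal rank fusion, one common technique, purely as an illustration. The function name, the `rrf_k` constant, and the document IDs are all assumptions made for the example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], rrf_k: int = 60) -> list[str]:
    """Merge several ranked lists of IDs into one (illustrative, not Omna's code)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # An item scores higher the nearer it sits to the top of each list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic = ["doc_b", "doc_a", "doc_c"]   # hypothetical order from embedding similarity
keyword = ["doc_a", "doc_d", "doc_b"]    # hypothetical order from keyword matching
fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one method still appears further down.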

Why local-first

  • No API key for core features, and no vendor account to manage.
  • No data leaves your machine — embedding, search, filter, and PII masking all run in-process on a Rust kernel.
  • No per-call cost and no rate limits — process a million rows offline, in CI, or on a plane.
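
To illustrate why in-process search needs no separate vector database, here is a minimal pure-Python sketch of similarity search over vectors held in memory. It is not Omna's Rust kernel: the toy 2-D vectors, the helper names, and the choice of cosine similarity with an inclusive threshold are all assumptions. It also shows how a top-`k` search differs from a threshold filter.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], vectors: list[list[float]], k: int) -> list[int]:
    """Row indices of the k most similar vectors, best first."""
    scored = sorted(
        ((cosine(query, v), i) for i, v in enumerate(vectors)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [i for _, i in scored[:k]]


def above_threshold(query: list[float], vectors: list[list[float]], threshold: float) -> list[int]:
    """Row indices of every vector whose similarity is at least the threshold."""
    return [i for i, v in enumerate(vectors) if cosine(query, v) >= threshold]


vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]  # toy embeddings, one per row
query = [1.0, 0.0]

print(top_k(query, vectors, k=1))             # a fixed number of results
print(above_threshold(query, vectors, 0.73))  # however many rows clear the bar
```

A top-`k` search always returns a fixed number of rows, whereas a threshold filter returns as many or as few as qualify, which suits flagging workloads.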

What's next