Quickstart
Go from `pip install` to semantic search, PII masking, and natural-language queries over a Polars DataFrame in about five minutes — all running locally, with no vector database and no data leaving your machine.
Install
`pip install "omna[all]"`

`omna[all]` pulls in every feature. If you only need some of them, install just the extras you want:
| Install | What you get |
|---|---|
| `pip install omna` | Schema understanding (`understand_df`); depends only on polars, numpy, and rich |
| `pip install "omna[embed]"` | Local embeddings, search, and filter |
| `pip install "omna[pii]"` | PII detection and masking (`pii_report`, `mask_pii`) |
| `pip install "omna[ask]"` | Natural-language queries (`ask`); masks rows before the API call |
| `pip install "omna[all]"` | Everything above |
Note: Omna supports Python 3.10–3.12 on macOS or Linux. Windows is not supported. The PII engine runs on Apple Silicon macOS and Linux — it is not available on Intel Macs. Intel-Mac users can still use search, filter, and schema understanding.
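The compatibility note above can be turned into a quick pre-install check. The following is a minimal sketch using only the standard library; `omna_support` is a hypothetical helper, not part of Omna, and it assumes that `platform.machine()` reports `x86_64` (or `i386`) on Intel Macs.

```python
import platform
import sys


def omna_support(version=None, system=None, machine=None):
    """Summarise the compatibility note for one machine.

    Hypothetical helper, not part of Omna. Arguments default to the
    current interpreter and host, which keeps the function easy to test.
    """
    version = tuple(version or sys.version_info[:2])
    system = system or platform.system()  # "Darwin", "Linux", "Windows", ...
    machine = (machine or platform.machine()).lower()  # "arm64", "x86_64", ...

    python_ok = (3, 10) <= version <= (3, 12)
    os_ok = system in {"Darwin", "Linux"}
    core = python_ok and os_ok

    # The note excludes the PII engine on Intel Macs only.
    intel_mac = system == "Darwin" and machine in {"x86_64", "i386"}
    return {"core": core, "pii": core and not intel_mac}


print(omna_support())
```

A result with `core` true and `pii` false corresponds to the Intel-Mac case in the note: search, filter, and schema understanding remain usable, while PII detection and masking do not.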
No API key is required for `search`, `filter`, `embed`, `pii_report`, `mask_pii`, or `understand_df`. Only [ask()](/docs/ask) needs an `ANTHROPIC_API_KEY`.
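Because only `ask()` depends on a key, a script can check the environment up front and fall back to the local features when the key is missing. This is a small sketch using the standard library; `has_anthropic_key` is a hypothetical helper, not an Omna function, and it assumes the key is read from the `ANTHROPIC_API_KEY` environment variable.

```python
import os


def has_anthropic_key(env=None):
    """Return True when ANTHROPIC_API_KEY is set to a non-empty value.

    Hypothetical helper, not part of Omna. `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


if has_anthropic_key():
    print("ask() can be used in this session")
else:
    print("ANTHROPIC_API_KEY is not set; skip ask() and use the local features")
```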
Your first five minutes
Importing `omna` registers the `.omna` namespace on every Polars DataFrame; no other setup is needed.
import polars as pl
import omna
df = pl.read_csv("documents.csv")
# 1 — explore the schema (no LLM, no network)
omna.understand_df(df)
# 2 — audit for PII before anything touches the data
df.omna.pii_report()
# 3 — redact PII; an audit log is saved automatically
clean = df.omna.mask_pii()
# 4 — build a search index once
clean.omna.embed("text")
# 5 — search by meaning (semantic + keyword, fused)
results = clean.omna.search("insurance claim denied", on="text", k=5)
# 6 — or take everything above a similarity threshold
flagged = clean.omna.filter("insurance claim denied", on="text", threshold=0.73)
# 7 — ask a question in plain English (needs ANTHROPIC_API_KEY)
results.omna.ask("What personal data do these documents expose?")

That is the whole library: understand your data, protect it, index it, and query it by meaning rather than by keywords.
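Steps 5 and 6 differ in how they cut off the ranked results: `search` returns a fixed number of rows (`k`), while `filter` returns however many rows clear a similarity `threshold`. The toy sketch below illustrates that difference with plain cosine similarity over hand-written two-dimensional vectors. It is not Omna's implementation: the vectors, the `top_k` and `above_threshold` helpers, and the use of cosine similarity alone (without the keyword fusion mentioned in step 5) are assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, rows, k):
    """Return the k best-scoring row names, however weak the last match is."""
    ranked = sorted(rows.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


def above_threshold(query, rows, threshold):
    """Return every row name whose score reaches the threshold."""
    return [name for name, vec in rows.items() if cosine(query, vec) >= threshold]


# Hypothetical embeddings; real ones have hundreds of dimensions.
rows = {
    "claim denied": [1.0, 0.0],
    "claim appeal": [0.8, 0.6],
    "lunch menu": [0.0, 1.0],
}
query = [1.0, 0.0]

print(top_k(query, rows, k=2))
print(above_threshold(query, rows, threshold=0.9))
```

A fixed `k` always returns that many rows, even when the last ones are poor matches, whereas a threshold can return anything from zero rows to the whole frame. That is why a top-k cut suits browsing and a threshold suits flagging.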
Why local-first
- No API key for core features, and no vendor account to manage.
- No data leaves your machine — embedding, search, filter, and PII masking all run in-process on a Rust kernel.
- No per-call cost and no rate limits — process a million rows offline, in CI, or on a plane.