Quickstart

Go from `pip install` to semantic search, PII masking, and natural-language queries over a Polars DataFrame in about five minutes. Search, filtering, and PII masking run locally, with no vector database and no data leaving your machine; only `ask()` calls an external API, and it masks rows first.

Install

```bash
pip install "omna[all]"
```

omna[all] pulls in every feature. If you only need some, install the extras you want:

| Install | What you get |
| --- | --- |
| `pip install omna` | Schema understanding (`understand_df`); depends only on `polars`, `numpy`, `rich` |
| `pip install "omna[embed]"` | Local embeddings, `search`, and `filter` |
| `pip install "omna[pii]"` | PII detection and masking (`pii_report`, `mask_pii`) |
| `pip install "omna[ask]"` | Natural-language queries (`ask`); masks rows before the API call |
| `pip install "omna[all]"` | Everything above |

Note: Omna supports Python 3.10–3.12 on macOS or Linux. Windows is not supported. The PII engine runs on Apple Silicon macOS and Linux — it is not available on Intel Macs. Intel-Mac users can still use search, filter, and schema understanding.
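If you want to check an environment against the note above before installing, a small standard-library sketch like the following may help. The helper `omna_support` is hypothetical and not part of Omna; it assumes Apple Silicon reports its architecture as `arm64` and that the PII engine runs on any Linux architecture, which the note does not spell out.

```python
import platform
import sys


def omna_support(system: str, machine: str, version: tuple[int, int]) -> dict[str, bool]:
    """Summarize the support note for one environment (hypothetical helper)."""
    python_ok = (3, 10) <= version <= (3, 12)
    os_ok = system in ("Darwin", "Linux")  # Windows is not supported
    core = python_ok and os_ok
    # Assumption: an Intel Mac is any macOS machine that is not "arm64".
    intel_mac = system == "Darwin" and machine != "arm64"
    return {"core": core, "pii": core and not intel_mac}


# Check the interpreter that is running this script.
print(omna_support(platform.system(), platform.machine(), sys.version_info[:2]))
```

A result with `core` true and `pii` false corresponds to the Intel-Mac case: search, filter, and schema understanding remain usable.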

No API key is required for search, filter, embed, pii_report, mask_pii, or understand_df. Only [ask()](/docs/ask) needs an ANTHROPIC_API_KEY.
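Because only `ask()` depends on the key, a script can check for it up front and fall back to the local features when it is missing. The sketch below uses only the standard library; `can_call_ask` is a hypothetical helper, not an Omna function.

```python
import os


def can_call_ask(env=os.environ) -> bool:
    """Return True if ANTHROPIC_API_KEY is set to a non-empty value."""
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


if not can_call_ask():
    print("ANTHROPIC_API_KEY is not set; skip ask() and use the local features.")
```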

Your first five minutes

Importing omna registers the .omna namespace on every Polars DataFrame — no other setup needed.

```python
import polars as pl
import omna

df = pl.read_csv("documents.csv")

# 1. Explore the schema (no LLM, no network)
omna.understand_df(df)

# 2. Audit for PII before anything touches the data
df.omna.pii_report()

# 3. Redact PII; an audit log is saved automatically
clean = df.omna.mask_pii()

# 4. Build a search index once
clean.omna.embed("text")

# 5. Search by meaning (semantic + keyword, fused)
results = clean.omna.search("insurance claim denied", on="text", k=5)

# 6. Or take everything above a similarity threshold
flagged = clean.omna.filter("insurance claim denied", on="text", threshold=0.73)

# 7. Ask a question in plain English (needs ANTHROPIC_API_KEY)
results.omna.ask("What personal data do these documents expose?")
```

That is the whole library: understand your data, protect it, index it, and query it by meaning, not just by keywords.
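Steps 5 and 6 differ in how they cut off the results: `k` returns a fixed number of rows, while `threshold` returns however many rows score high enough. The toy sketch below illustrates that difference in plain Python. It is not Omna's implementation; it assumes cosine similarity as the score, and the function names and 2-D vectors are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(scores, k):
    """Indices of the k highest scores, best first (fixed-size result)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def above_threshold(scores, threshold):
    """Indices of every score at or above threshold (variable-size result)."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Toy 2-D vectors standing in for real embeddings.
docs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
query = [1.0, 0.0]
scores = [cosine(query, d) for d in docs]

print(top_k(scores, 2))               # always two rows
print(above_threshold(scores, 0.73))  # as many rows as pass the cutoff
```

A threshold suits flagging or auditing tasks where the number of matches is unknown in advance; a fixed `k` suits ranked retrieval.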

Why local-first

No API key for core features, and no vendor account to manage.
No data leaves your machine — embedding, search, filter, and PII masking all run in-process on a Rust kernel.
No per-call cost and no rate limits — process a million rows offline, in CI, or on a plane.

What's next

Library overview →
Semantic search: `search()` →
Mask PII: `mask_pii()` →