// PYTHON LIBRARY

filter()

`df.omna.filter(query, on, threshold)` returns **every** row whose semantic similarity to your query is above the threshold, not just the top k. Use it when you want all the matches, not a ranked sample.

Everything above a threshold

Signature

python

df.omna.filter(query, on, threshold=0.3)

| Parameter | Description |
| --- | --- |
| `query` | The query string, in plain language |
| `on` | The column to filter |
| `threshold` | Minimum similarity (0–1). Default `0.3`. Raise for precision, lower for recall |

Note: Run [df.omna.embed("column")](/docs/embed) once before filtering.
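To make the threshold concrete, here is a minimal sketch of what a threshold filter does conceptually. It is not omna's implementation: it assumes cosine similarity as the metric, treats the cutoff as inclusive, and uses made-up three-dimensional vectors in place of real embeddings. The names `cosine_similarity`, `threshold_filter`, `rows`, and `query_vec` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def threshold_filter(rows, query_vec, threshold=0.3):
    """Keep every row whose similarity to the query is at least `threshold`.

    Rows come back in their original order; nothing is ranked or truncated.
    The inclusive comparison (>=) is an assumption of this sketch.
    """
    return [
        text
        for text, vec in rows
        if cosine_similarity(vec, query_vec) >= threshold
    ]


# Hypothetical toy "embeddings" (text, vector); real embeddings are much larger.
rows = [
    ("claim denied by insurer", [0.9, 0.1, 0.0]),
    ("appeal a rejected claim", [0.8, 0.3, 0.1]),
    ("quarterly sales report", [0.1, 0.9, 0.2]),
    ("office lunch menu", [0.0, 0.1, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]  # stand-in for the embedded query

matches = threshold_filter(rows, query_vec, threshold=0.73)
print(matches)
```

With these toy vectors, a cutoff of 0.73 keeps the two claim-related rows, and lowering it to 0.2 admits a third.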

Example

python

import polars as pl
import omna

df = pl.read_csv("documents.csv")
df.omna.embed("text")                                       # once

filtered = df.omna.filter("insurance claim denied", on="text", threshold=0.73)
# → every document above 0.73 similarity — all semantically related to claim denials

search() vs. filter()

| Use | When |
| --- | --- |
| [`search(query, on, k)`](/docs/search) | You want the top k most relevant rows, ranked |
| `filter(query, on, threshold)` | You want every row above a similarity cutoff |

Raise the threshold for higher precision (fewer, tighter matches); lower it for higher recall (more, looser matches).
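The difference between the two selection rules, and the effect of moving the cutoff, can be sketched on a handful of precomputed scores. This is an illustration only: the scores are invented, `top_k` and `above_threshold` are hypothetical helpers rather than omna functions, and the inclusive comparison is an assumption.

```python
# Hypothetical similarity scores for five documents against one query.
scores = {
    "doc_a": 0.91,
    "doc_b": 0.78,
    "doc_c": 0.74,
    "doc_d": 0.41,
    "doc_e": 0.12,
}


def top_k(scores, k):
    """search-style selection: the k highest-scoring ids, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def above_threshold(scores, threshold):
    """filter-style selection: every id scoring at least `threshold`."""
    return [doc for doc, score in scores.items() if score >= threshold]


top_two = top_k(scores, k=2)          # fixed size, regardless of score values
strict = above_threshold(scores, 0.73)  # fewer, tighter matches
loose = above_threshold(scores, 0.3)    # more, looser matches

print(top_two, strict, loose)
```

Here the top-2 selection drops `doc_c` even though it scores 0.74, while the 0.73 cutoff keeps it. Lowering the cutoff to 0.3 also admits `doc_d`.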

What's next

Top-k ranked results: `search()`

Audit for PII: `pii_report()`