
// PYTHON LIBRARY

filter()

`df.omna.filter(query, on, threshold)` returns **every** row whose semantic similarity to your query is above the threshold, not just the top k. Use it when you want all the matches rather than a ranked sample.

Everything above a threshold

Signature

```python
df.omna.filter(query, on, threshold=0.3)
```

| Parameter | Description |
| --- | --- |
| `query` | The query string, in plain language |
| `on` | The column to filter |
| `threshold` | Minimum similarity (0–1). Default `0.3`. Raise for precision, lower for recall |

Note: Run [df.omna.embed("column")](/docs/embed) once before filtering.

Example

```python
import polars as pl
import omna

df = pl.read_csv("documents.csv")
df.omna.embed("text")  # run once before filtering

filtered = df.omna.filter("insurance claim denied", on="text", threshold=0.73)
# → every document scoring above 0.73 similarity to the query,
#   i.e. the ones semantically related to claim denials
```

search() vs. filter()

| Use | When |
| --- | --- |
| [search(query, on, k)](/docs/search) | You want the top k most relevant rows, ranked |
| `filter(query, on, threshold)` | You want every row above a similarity cutoff |
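
To make the difference concrete, here is a minimal sketch in plain Python. It does not call omna. The `scores` list and the helpers `top_k` and `above_threshold` are hypothetical stand-ins for the two selection rules, and treating a score equal to the cutoff as a match is an assumption.

```python
# Hypothetical similarity scores for six rows (not produced by omna).
scores = [0.91, 0.78, 0.74, 0.41, 0.35, 0.12]


def top_k(scores, k):
    """Search-style selection: indices of the k highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def above_threshold(scores, threshold):
    """Filter-style selection: indices of every score at or above the cutoff.

    Whether omna treats a score exactly equal to the threshold as a match
    is an assumption made here.
    """
    return [i for i, s in enumerate(scores) if s >= threshold]


print(top_k(scores, 2))               # [0, 1]
print(above_threshold(scores, 0.73))  # [0, 1, 2]
print(above_threshold(scores, 0.3))   # [0, 1, 2, 3, 4]
```

The top-k rule always returns exactly k rows, however weak the last one is. The threshold rule returns as many or as few rows as clear the cutoff.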

Raise the threshold for higher precision (fewer, tighter matches); lower it for higher recall (more, looser matches).
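
The trade-off can be illustrated on toy data. The sketch below is plain Python and does not use omna. The scores and relevance labels are invented for illustration, and `precision_recall` is a hypothetical helper.

```python
# Hypothetical (similarity score, is_relevant) pairs; the labels are made up.
rows = [
    (0.91, True),
    (0.78, True),
    (0.74, False),
    (0.41, True),
    (0.35, False),
    (0.12, False),
]


def precision_recall(rows, threshold):
    """Precision and recall of keeping every row scoring at or above threshold."""
    kept = [relevant for score, relevant in rows if score >= threshold]
    total_relevant = sum(relevant for _, relevant in rows)
    hits = sum(kept)
    # Convention chosen for this sketch: an empty result counts as precision 1.0.
    precision = hits / len(kept) if kept else 1.0
    recall = hits / total_relevant if total_relevant else 1.0
    return precision, recall


for threshold in (0.75, 0.3):
    precision, recall = precision_recall(rows, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.75: precision=1.00, recall=0.67
# threshold=0.3: precision=0.60, recall=1.00
```

On this toy data, the higher cutoff keeps only relevant rows but misses one, while the lower cutoff catches every relevant row at the cost of two irrelevant ones.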

What's next