// REFERENCE

Pricing & Savings Math

How Omna calculates and displays token savings — and what the numbers actually mean.

5 min read · What it costs, what you save

The Token Tax

AI model APIs charge per token. One token is roughly 4 characters of English text. Here's what it costs to send a file directly:

File	Tokens	Claude Sonnet cost (input)
1,000-row CSV	~50,000	~$0.15
10,000-row CSV	~500,000	Rejected — exceeds context window
1M-row parquet	~50,000,000	Rejected — exceeds context window

Claude's context window is 200,000 tokens. Files larger than that are rejected before any billing occurs. As the table shows, even a 10,000-row CSV is well over the limit.
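
The arithmetic behind the table can be sketched as follows. This is a minimal illustration, not Omna's code: the function and constant names are hypothetical, the rate and window size are the figures quoted on this page, and the 4-characters-per-token rule is only a rough estimate, not a real tokenizer.

```python
from typing import Optional

# Figures quoted on this page; the names themselves are hypothetical.
INPUT_USD_PER_M_TOKENS = 3.00
CONTEXT_WINDOW_TOKENS = 200_000
CHARS_PER_TOKEN = 4  # rule of thumb only, not a real tokenizer


def estimate_tokens(char_count: int) -> int:
    """Approximate token count from a character count (rounds up)."""
    return -(-char_count // CHARS_PER_TOKEN)  # ceiling division


def direct_send_cost(tokens: int) -> Optional[float]:
    """Input cost in USD, or None when the request exceeds the window."""
    if tokens > CONTEXT_WINDOW_TOKENS:
        return None  # rejected before any billing occurs
    return tokens * INPUT_USD_PER_M_TOKENS / 1_000_000
```

With these assumptions, a file of about 50,000 tokens costs about $0.15 to send, while a 500,000-token file returns `None` because it would be rejected.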

What Omna changes

Instead of sending the whole file, Omna sends only the relevant rows — typically 100–500 rows, fitting comfortably within the token budget:

Without Omna	With Omna
Full 50,000-row file (~2.5M tokens at ~50 tokens per row): rejected	300 relevant rows (~15,000 tokens): ~$0.045
You send only the first 200K tokens (arbitrary rows)	Claude reads up to 200K tokens, most relevant rows first
Answer may hallucinate missing data	Answer is grounded in the matching rows
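
The "relevant rows sorted first" idea can be illustrated with a small budget-packing sketch. This page does not describe how Omna actually ranks rows, so the relevance scores, the tuple layout, and the function name below are all assumptions made for illustration.

```python
from typing import List, Tuple

CONTEXT_WINDOW_TOKENS = 200_000  # window size quoted on this page


def pack_by_relevance(
    rows: List[Tuple[str, float, int]],
    budget: int = CONTEXT_WINDOW_TOKENS,
) -> List[str]:
    """Pick rows in descending relevance order until the budget is full.

    Each row is a hypothetical (text, relevance_score, token_count) tuple.
    Packing stops at the first row that would push the total over budget.
    """
    picked: List[str] = []
    used = 0
    for text, _score, tokens in sorted(rows, key=lambda r: r[1], reverse=True):
        if used + tokens > budget:
            break
        picked.append(text)
        used += tokens
    return picked
```

Stopping at the first row that does not fit keeps the output a strict prefix of the ranking. A real implementation might instead skip oversized rows and keep filling the budget.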

The four savings scenarios

Omna classifies every slice into one of four scenarios and shows honest math for each.

Token Tax Killed — small file, narrow query

Your file fits in Claude's 200K window, and Omna found a small relevant slice.

Example: a 50-row transaction file totaling 1,545 tokens. Question: "transactions over $500". Slice: 11 rows, 361 tokens.

Without Omna:  $0.005  (1,545 tokens × $3.00 / 1M)
With Omna:     $0.001  (361 tokens × $3.00 / 1M)
Token Tax saved: $0.004
Token reduction: −77%
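
The figures in this example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the rounding to three decimals is an assumption chosen to match the numbers shown in the example.

```python
from typing import Dict

INPUT_USD_PER_M_TOKENS = 3.00  # Claude Sonnet input rate quoted on this page


def input_cost(tokens: int) -> float:
    """Input cost in USD for a given token count."""
    return tokens * INPUT_USD_PER_M_TOKENS / 1_000_000


def savings_summary(file_tokens: int, slice_tokens: int) -> Dict[str, float]:
    """Costs, dollar savings, and percentage token reduction for one slice."""
    without = input_cost(file_tokens)
    with_omna = input_cost(slice_tokens)
    return {
        "without_usd": round(without, 3),
        "with_usd": round(with_omna, 3),
        # Savings are computed from unrounded costs, then rounded once.
        "saved_usd": round(without - with_omna, 3),
        "reduction_pct": round(100 * (file_tokens - slice_tokens) / file_tokens),
    }


summary = savings_summary(1_545, 361)
```

For the example above, `summary` holds costs of about $0.005 and $0.001, savings of about $0.004, and a 77% token reduction.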

All Rows Matched — small file, broad query

Your file fits in the window, but your question matched most of the rows — Omna kept almost everything.

Example: 50-row file, question "show me all transactions." Omna returns 48 rows.

Without Omna:  $0.005
With Omna:     $0.005  (nearly the same — honest reporting)
Token Tax saved: ~$0

Omna shows this honestly rather than inflating the savings number. The value here is PII masking and structured attachment, not token reduction.

Token Tax Crushed — big file, narrow query

Your file is bigger than Claude's 200K window. Omna's slice fits cleanly under the limit.

Example: 15,000-row finance CSV, 1.74M tokens. Question: "which firms are based in NYC?" Slice: 80 rows, 82K tokens — well under the 200K window.

Without Omna baseline: $0.60  (200K tokens, the most the context window accepts)
With Omna:             $0.25  (82K tokens)
Token Tax saved: $0.35

Why the baseline is 200K, not the full file: Claude would never process 1.74M tokens — it would reject the request. The realistic "without Omna" scenario is you sending the first 200K tokens and getting an answer based on arbitrary rows.
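
The capped baseline can be written out as a short calculation. This is a sketch with hypothetical names, using only the 200K window and the $3.00 per 1M input rate quoted on this page.

```python
CONTEXT_WINDOW_TOKENS = 200_000
INPUT_USD_PER_M_TOKENS = 3.00


def baseline_tokens(file_tokens: int) -> int:
    """Tokens billed in the 'without Omna' case, capped at the window size."""
    return min(file_tokens, CONTEXT_WINDOW_TOKENS)


def capped_savings_usd(file_tokens: int, slice_tokens: int) -> float:
    """Savings against the capped baseline rather than the full file size."""
    billed_with_omna = min(slice_tokens, CONTEXT_WINDOW_TOKENS)
    saved = baseline_tokens(file_tokens) - billed_with_omna
    return saved * INPUT_USD_PER_M_TOKENS / 1_000_000


# The 1.74M-token file is compared at 200K tokens, not 1.74M.
saved = capped_savings_usd(1_740_000, 82_000)
```

Here `saved` comes to about $0.354, which rounds to the $0.35 shown above. Comparing against the uncapped 1.74M tokens would instead claim roughly $4.97 in savings that could never have been billed.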

Token Tax Redirected — big file, broad query

Your file is bigger than 200K tokens, and even Omna's relevant slice exceeds 200K tokens. The token cost is the same either way, but Omna's 200K tokens contain the most relevant rows rather than whichever rows happen to come first.

Without Omna: $0.60  (first 200K tokens — arbitrary rows)
With Omna:    $0.60  (top 200K tokens — ranked by relevance)
Token Tax saved: $0 in cost, but quality win

The result card labels this "Token Tax redirected into relevance" with an amber pill. No fake dollar amount is shown — the value is answer quality, not cost.
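
The four scenarios reduce to two questions: does the file fit in the window, and how big is the slice? The sketch below shows one way to express that. It is not Omna's classifier: the function name is hypothetical, and the 90% cutoff for "matched most of the rows" is an assumed threshold, since this page does not state the real one.

```python
CONTEXT_WINDOW_TOKENS = 200_000
NEARLY_ALL_RATIO = 0.9  # assumed cutoff for "matched most of the rows"


def classify_slice(file_tokens: int, slice_tokens: int) -> str:
    """Map a (file size, slice size) pair to one of the four scenario names."""
    file_fits = file_tokens <= CONTEXT_WINDOW_TOKENS
    slice_fits = slice_tokens <= CONTEXT_WINDOW_TOKENS
    if file_fits:
        # Small file: the only question is how much the slice trimmed.
        if slice_tokens >= NEARLY_ALL_RATIO * file_tokens:
            return "All Rows Matched"
        return "Token Tax Killed"
    # Big file: the only question is whether the slice fits in the window.
    if slice_fits:
        return "Token Tax Crushed"
    return "Token Tax Redirected"
```

For instance, `classify_slice(1_545, 361)` falls under "Token Tax Killed" and `classify_slice(1_740_000, 82_000)` under "Token Tax Crushed".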

Pricing rates used

Model	Input rate	Used for
Claude Sonnet 4.6	$3.00 / 1M tokens	Desktop capsule result card; browser extension popup

Rates are updated when Anthropic changes pricing. The rate lives in one place per surface: extension/src/pricing.ts for the browser extension, and native/src/card.rs (CLAUDE_SONNET_INPUT_USD_PER_M_TOKENS, line 101) for the desktop capsule and tray. Both constants hold the same value.

Output tokens (Claude's response) are not included in Omna's savings math. Omna only affects what Claude reads, not what it writes.

The lifetime counter

The menu bar shows:

1.2M token tax intercepted
~$3.60 on API · ~240 extra prompts
47 files sliced · 4m ago

Token tax intercepted: cumulative tokens trimmed across all slices since install
~$X on API: estimated API cost of those trimmed tokens at Claude Sonnet rates
~X extra prompts: rough estimate of how many additional Claude messages that cost would have paid for (at average prompt size)
Files sliced: total number of file drops processed
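
The first two counter lines can be derived from the cumulative token count alone. In the sketch below the function name is hypothetical, and the average prompt size of 5,000 tokens is an assumption inferred from the sample display (1.2M tokens divided by 240 prompts); the value Omna actually uses is not stated on this page.

```python
from typing import List

INPUT_USD_PER_M_TOKENS = 3.00  # Claude Sonnet input rate quoted on this page
AVG_PROMPT_TOKENS = 5_000      # assumption inferred from the sample display


def counter_lines(tokens_trimmed: int) -> List[str]:
    """Format the token, dollar, and extra-prompt figures for the menu bar."""
    usd = tokens_trimmed * INPUT_USD_PER_M_TOKENS / 1_000_000
    extra_prompts = tokens_trimmed // AVG_PROMPT_TOKENS
    return [
        f"{tokens_trimmed / 1_000_000:.1f}M token tax intercepted",
        f"~${usd:.2f} on API · ~{extra_prompts} extra prompts",
    ]


lines = counter_lines(1_200_000)
```

With 1,200,000 trimmed tokens this reproduces the first two lines of the sample display. The formatting here only handles counts in the millions; smaller totals would need a different unit.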

These numbers count up permanently and survive app restarts. They're stored in ~/Library/Application Support/Omna/stats.json.

What "theoretical" means

Sometimes the result card labels savings as "Theoretical." This happens when the file is small enough to fit in Claude's window but you use Claude through a subscription plan (free or paid) rather than per-token API billing. The dollar figure is what an API user would save; it does not apply to you if you aren't billed per token.

Omna never claims savings based on token counts that Claude would have rejected (files over 200K tokens). If the file would have been rejected before any billing, the "without Omna" baseline is capped at 200K tokens.

Omna pricing

Free: $0 forever — the Python library (MIT), the Mac app, the Chrome extension, basic PII masking, and local semantic search.
Pro · Team: $20/user/month — advanced masking and custom rules, audit log exports, team sharing, usage analytics, and priority support.
Enterprise: custom — on-prem / VPC deployment, SSO/SAML, SOC 2 & HIPAA compliance, custom models, and SLAs. Contact us at omna.dev.

The first 10,000 users who sign up get Omna free during early access and lock in that free access. Claim your spot in the extension popup or at omna.dev/pricing.