
// USING OMNA

Supported File Formats

Omna can slice any of these file types before sending them to the AI. Each format is handled differently, based on its structure.

What you can drop in

Tabular files (rows and columns)

These are sliced row-by-row. Omna finds the rows most relevant to your question and sends only those.

Format | Extensions | Notes
CSV | .csv | Most common. All columns preserved.
Excel | .xlsx, .xls, .xlsb | All sheets are concatenated into one table.
OpenDocument Spreadsheet | .ods | Same as Excel handling.
Parquet | .parquet | Column-store format common in data engineering. Fully supported, with no size limit beyond the 10 GB file cap.

Output format: A clean CSV with a header row and only the relevant rows. Column names are preserved exactly as they appear in the original.

What counts as a "relevant row": Omna scores every row against your question using a combination of keyword matching (BM25) and semantic similarity (sentence embeddings). Rows that score below a minimum threshold are dropped. For numeric questions ("over $500", "between 10 and 20"), an exact arithmetic filter runs first: every row that satisfies the condition is kept, regardless of its semantic score.
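The selection logic above can be sketched roughly as follows. This is a minimal illustration, not Omna's implementation: it assumes a whitespace tokenizer and standard Okapi BM25 parameters (k1 = 1.5, b = 0.75), leaves out the sentence-embedding score entirely, and assumes that rows failing the numeric filter are still eligible if their keyword score clears the threshold. The names `bm25_scores`, `select_rows`, and `min_score` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization.
    return text.lower().split()


def bm25_scores(rows, query, k1=1.5, b=0.75):
    """Score each row (a string) against the query with Okapi BM25."""
    docs = [tokenize(r) for r in rows]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: how many rows contain each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def select_rows(rows, query, numeric_filter=None, min_score=0.1):
    """rows: list of dicts. numeric_filter: optional predicate on a row.

    A row is kept if it satisfies the numeric filter, or if its keyword
    score reaches min_score (a hypothetical threshold).
    """
    texts = [" ".join(str(v) for v in row.values()) for row in rows]
    scores = bm25_scores(texts, query)
    kept = []
    for row, score in zip(rows, scores):
        passes_filter = numeric_filter is not None and numeric_filter(row)
        if passes_filter or score >= min_score:
            kept.append(row)
    return kept


rows = [
    {"item": "laptop", "amount": 1200},
    {"item": "coffee", "amount": 4},
    {"item": "monitor", "amount": 650},
    {"item": "pen", "amount": 2},
]
kept = select_rows(rows, "coffee purchases over 500",
                   numeric_filter=lambda r: r["amount"] > 500)
print([r["item"] for r in kept])  # ['laptop', 'coffee', 'monitor']
```

In this example the laptop and monitor rows survive because they satisfy the arithmetic condition, the coffee row survives on keyword score alone, and the pen row is dropped.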


Documents (text with structure)

These are sliced section-by-section. Omna finds the sections most relevant to your question.

Format | Extensions | Notes
Word | .docx, .doc, .odt, .rtf | Sliced by paragraph or heading section.
PDF | .pdf | Sliced by page. Each page becomes one searchable unit.

Output format: A .txt file with structural markers. Word docs include [§ Heading Name] markers before each section. PDFs include [p. N] markers before each page's content.

Why PDFs are sliced by page: Unlike tabular data, where each row is an independent fact, PDFs contain flowing text where context within a page matters. Splitting mid-paragraph would break meaning, so treating one page as one chunk gives a safe, predictable unit.
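As an illustration of the output format described above, here is a small sketch of how the markers could be assembled once the relevant pages or sections are known. The function names and input shapes are hypothetical, and text extraction is assumed to have happened already.

```python
def render_pdf_slice(pages, relevant_page_numbers):
    """pages: dict mapping 1-based page number -> extracted page text."""
    parts = []
    for number in sorted(relevant_page_numbers):
        # One [p. N] marker before each kept page's content.
        parts.append(f"[p. {number}]\n{pages[number].strip()}")
    return "\n\n".join(parts)


def render_doc_slice(sections):
    """sections: list of (heading, text) pairs already judged relevant."""
    parts = []
    for heading, text in sections:
        # One [§ Heading Name] marker before each kept section.
        parts.append(f"[§ {heading}]\n{text.strip()}")
    return "\n\n".join(parts)


pages = {1: "Intro", 2: "Terms", 3: "Pricing"}
print(render_pdf_slice(pages, [3, 1]))
```

Keeping the page numbers in the output lets the AI cite where a passage came from, even though it only receives a subset of the document.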


Plain text files

These are sliced line-by-line, except JSON, which is sliced by top-level element.

Format | Extensions | Notes
Plain text | .txt, .md, .log | Each non-empty line is one searchable unit.
JSON | .json | Each top-level array element or top-level object key is one unit.
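The JSON unit rule in the table can be sketched with the standard `json` module. The function name `json_units` is hypothetical, and treating a bare scalar document as a single unit is an assumption, since the table only covers arrays and objects.

```python
import json


def json_units(text):
    """Split a JSON document into top-level searchable units.

    Returns (key, value) pairs: the index for array elements,
    the key for object members.
    """
    data = json.loads(text)
    if isinstance(data, list):
        return list(enumerate(data))
    if isinstance(data, dict):
        return list(data.items())
    # Assumption: a bare scalar (string, number, true/false/null) is one unit.
    return [(None, data)]


print(json_units('[{"id": 1}, {"id": 2}]'))   # [(0, {'id': 1}), (1, {'id': 2})]
print(json_units('{"a": 1, "b": [2, 3]}'))    # [('a', 1), ('b', [2, 3])]
```

Nested structures stay inside their top-level unit, so a deeply nested object under one key is kept or dropped as a whole.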

Markdown files: Treated as plain text. Omna doesn't parse markdown structure — each line is indexed independently. For large .md files with clear headings, results will typically cluster around the relevant sections naturally.

Log files: Log line format varies widely. Omna indexes each line as a unit and uses keyword + semantic search to surface the relevant entries. Works well for error logs, access logs, and structured log lines.


File size limits

Limit | Value | Why
Maximum file size | 10 GB | Hard cap per file
Maximum rows for full indexing | 5,000,000 rows | Above this, only BM25 keyword index is built (no embeddings)
Minimum row length for embeddings | 50 tokens | Very short rows (like single IDs) are BM25-only

Files above 5M rows can still be sliced: keyword search runs on all rows, and the semantic reranking step is applied only to the rows that survive the BM25 pass.
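The limits above amount to a small decision rule, sketched here. The function names and mode labels are hypothetical, and "10 GB" is assumed to mean 10 × 10^9 bytes; the page does not say whether a decimal or binary gigabyte is meant.

```python
MAX_FILE_BYTES = 10 * 10**9          # assumption: decimal gigabytes
MAX_ROWS_FULL_INDEX = 5_000_000
MIN_TOKENS_FOR_EMBEDDING = 50


def plan_indexing(file_bytes, row_count):
    """Pick an indexing mode for a file (labels are illustrative only)."""
    if file_bytes > MAX_FILE_BYTES:
        return "rejected"
    if row_count > MAX_ROWS_FULL_INDEX:
        # Keyword index over all rows; semantic rerank on BM25 survivors.
        return "keyword_then_rerank"
    return "full"


def uses_embedding(row_token_count):
    """Rows shorter than the minimum are handled by BM25 only."""
    return row_token_count >= MIN_TOKENS_FOR_EMBEDDING


print(plan_indexing(2 * 10**9, 7_500_000))  # keyword_then_rerank
print(uses_embedding(12))                   # False
```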


Images and unsupported formats

Screenshots and images (OCR-sliced)

.png, .jpg, .jpeg, .webp, .heic files are supported via OCR.

When you drop a screenshot onto the capsule or attach it in the browser extension, Omna uses macOS Vision to extract the text, then slices and masks it exactly like a .txt file. The AI receives extracted text (~200–400 tokens) instead of the raw image (~1,600 Vision API tokens).

AI Vision APIs charge ~1,600 tokens per image regardless of content. OCR + slicing can cut that by 70–90% when the image contains relevant text.
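The savings figure follows from simple arithmetic on the approximate token counts quoted above. A quick check, using a hypothetical helper `token_savings`:

```python
def token_savings(image_tokens, text_tokens):
    """Fraction of tokens saved by sending extracted text instead of the image."""
    return 1 - text_tokens / image_tokens


print(token_savings(1600, 400))  # 0.75
print(token_savings(1600, 200))  # 0.875
```

With 200 to 400 tokens of extracted text against roughly 1,600 tokens for the raw image, the reduction is 75% to 87.5%, which sits inside the 70–90% range stated above.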

Unsupported image formats: .gif, .bmp, and .svg are not screenshot formats. Omna does not slice them; they are passed to the AI unchanged.

Other unsupported formats

Format | Status
Images (.gif, .bmp, .svg) | Not supported (not screenshot formats)
Audio (.mp3, .m4a, .wav) | Not supported
Video (.mp4, .mov) | Not supported
Encrypted / password-protected files | Not supported (Omna cannot read encrypted content)
ZIP / RAR archives | Not supported (extract first, then drop the individual file)
Binary files (.exe, .bin, .dmg) | Not supported

Extension vs. desktop app — same formats?

Yes. The browser extension uses the same slicing engine as the desktop capsule (the Mac app). When you attach a file on claude.ai or chatgpt.com, the extension sends it to the Mac app for slicing and swaps the result in — same formats, same quality, same output.

One difference: The extension does its own PDF and Word text extraction in Chrome (using pdfjs-dist and mammoth) before sending the text to the Mac app. The Mac app handles all formats natively. The end result for the user is identical.


How Omna picks the output format

Input | Output
Tabular (CSV, Excel, Parquet) | .csv (header + relevant rows)
PDF | .txt (page markers + relevant page content)
Word / RTF / ODF | .txt (section markers + relevant sections)
Plain text / log / markdown | .txt (relevant lines)
JSON | .json (relevant top-level elements)
Image (PNG/JPG/JPEG/WEBP/HEIC) | .txt (OCR'd text, relevant lines sliced)

The output file is attached to the AI chat as if you'd attached it yourself. The AI sees the sliced file — it never sees the original.
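The input-to-output mapping above can be expressed as a lookup on the file extension. This is only a sketch: the table name and function are hypothetical, the extensions are taken from the format lists on this page, and mapping .ods to .csv is an assumption based on the note that it is handled the same way as Excel.

```python
from pathlib import Path

OUTPUT_EXTENSION = {
    # Tabular inputs become a CSV with a header and the relevant rows.
    ".csv": ".csv", ".xlsx": ".csv", ".xls": ".csv", ".xlsb": ".csv",
    ".ods": ".csv",  # assumption: same as Excel handling
    ".parquet": ".csv",
    # Documents and plain text become .txt.
    ".pdf": ".txt", ".docx": ".txt", ".doc": ".txt", ".odt": ".txt",
    ".rtf": ".txt", ".txt": ".txt", ".md": ".txt", ".log": ".txt",
    # JSON stays JSON.
    ".json": ".json",
    # OCR-sliced images become .txt.
    ".png": ".txt", ".jpg": ".txt", ".jpeg": ".txt",
    ".webp": ".txt", ".heic": ".txt",
}


def output_extension(filename):
    """Return the sliced output extension, or None if the format is not sliced."""
    suffix = Path(filename).suffix.lower()
    return OUTPUT_EXTENSION.get(suffix)


print(output_extension("report.XLSX"))  # .csv
print(output_extension("clip.mp4"))     # None
```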