Benchmarks

Start Here

Goal	Use
Best default	`gemini:gemini-2.5-flash-lite` + default `ExtractOptions(...)`
Best provenance	`mode="hybrid"` with `document_input="native"` and `page_input="image"`
Fastest	`gemini:gemini-2.5-flash-lite` + default path + fewer repair attempts (`max_repair_attempts=1`)

Tip

Start with the default recipe unless page-level provenance or strict latency targets are the main requirement.

Best default:

from parsantic.extract import ExtractOptions, extract

# `document` and `Schema` are supplied by the caller.
result = extract(
    document,
    Schema,
    model="gemini:gemini-2.5-flash-lite",
    options=ExtractOptions(
        repair="targeted",
        max_repair_attempts=2,  # allow up to two repair attempts
    ),
)

Best provenance:

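from parsantic.extract import ExtractOptions, extract

# `document` and `Schema` are supplied by the caller.
result = extract(
    document,
    Schema,
    model="gemini:gemini-2.5-flash-lite",
    options=ExtractOptions(
        # Hybrid mode: native document input plus page images,
        # chosen here for page-level provenance.
        mode="hybrid",
        document_input="native",
        page_input="image",
        repair="targeted",
        max_repair_attempts=2,
    ),
)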

Fastest:

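from parsantic.extract import ExtractOptions, extract

# `document` and `Schema` are supplied by the caller.
result = extract(
    document,
    Schema,
    model="gemini:gemini-2.5-flash-lite",
    options=ExtractOptions(
        repair="targeted",
        max_repair_attempts=1,  # one repair attempt instead of two, to cut latency
    ),
)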

Benchmark Labels

Benchmark label	Library config
`document_auto`	default whole-document path from `ExtractOptions(...)`
`document_grounded`	`ExtractOptions(strategy=Strategy(plan="document_grounded"))`
`hybrid_targeted`	`ExtractOptions(mode="hybrid", document_input="native", page_input="image", repair="targeted")`

How To Read The Tables

Accuracy: share of fields whose extracted value matches the expected value exactly
Wrong values: share of returned fields whose value was wrong
Source grounding: rate of correct source scope or page attribution
Latency: total runtime, in seconds

Higher is better for Accuracy and Source grounding. Lower is better for Wrong values and Latency.

Each strategy snapshot holds the model fixed, so differences between rows reflect the strategy rather than the model. The page-scale snapshot uses the default model throughout, so latency growth with page count is easier to read.
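The first two metrics can be illustrated with a small scoring helper. This is a minimal sketch, not the benchmark's actual scoring code: `score_fields`, the field names, and the choice to count returned fields with no reference value as wrong are all assumptions made for illustration.

```python
from typing import Any, Mapping


def score_fields(
    expected: Mapping[str, Any],
    returned: Mapping[str, Any],
) -> dict[str, float]:
    """Score one extraction against reference values (illustrative only)."""
    # Accuracy: share of expected fields whose returned value matches exactly.
    exact = sum(
        1 for key, value in expected.items()
        if key in returned and returned[key] == value
    )
    accuracy = exact / len(expected) if expected else 0.0

    # Wrong values: share of returned fields that disagree with the reference.
    # Assumption: a returned field with no reference value counts as wrong.
    wrong = sum(
        1 for key, value in returned.items()
        if key not in expected or expected[key] != value
    )
    wrong_values = wrong / len(returned) if returned else 0.0

    return {"accuracy": accuracy, "wrong_values": wrong_values}


# Hypothetical reference and extraction results.
expected = {"title": "Report A", "year": 2021, "author": "Lee"}
returned = {"title": "Report A", "year": 2020}

scores = score_fields(expected, returned)
print(scores)
```

Note that the two rates use different denominators: accuracy is measured against the expected fields, wrong values against the fields actually returned, so a missing field lowers accuracy without raising the wrong-value rate.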

Current Snapshot

Oncology

Model: `gemini:gemini-3.1-flash-lite-preview`

Strategy	Accuracy	Wrong values	Source grounding	Latency
`document_auto`	`0.639`	`0.333`	`0.611`	`7.23s`
`document_grounded`	`0.639`	`0.333`	`0.611`	`9.97s`

Nasal Melanoma

Model: `gemini:gemini-3.1-flash-lite-preview`

Strategy	Accuracy	Wrong values	Source grounding	Latency
`document_auto`	`0.831`	`0.158`	`0.419`	`14.02s`
`document_grounded`	`0.831`	`0.158`	`0.419`	`15.68s`
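In both snapshots above, the two strategies report identical accuracy, wrong-value, and grounding figures, so latency is the only column that separates them. The sketch below expresses that gap as a relative overhead; the helper name `relative_overhead` is hypothetical, and the numbers are copied from the two tables.

```python
# Latency values (seconds) copied from the Oncology and Nasal Melanoma tables.
latency_s = {
    "Oncology": {"document_auto": 7.23, "document_grounded": 9.97},
    "Nasal Melanoma": {"document_auto": 14.02, "document_grounded": 15.68},
}


def relative_overhead(baseline: float, candidate: float) -> float:
    """Return how much slower `candidate` is than `baseline`, as a fraction."""
    return (candidate - baseline) / baseline


overheads = {
    name: relative_overhead(row["document_auto"], row["document_grounded"])
    for name, row in latency_s.items()
}

for name, value in overheads.items():
    print(f"{name}: document_grounded is {value:.1%} slower than document_auto")
```

On these figures `document_grounded` is roughly 38% slower on Oncology and roughly 12% slower on Nasal Melanoma.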

Page Scale

Model: `gemini:gemini-2.5-flash-lite`

Strategy	5 pages	10 pages	15 pages	Slope (s/page)
`document_auto`	`7.12s`	`10.79s`	`16.06s`	`0.89`
`document_grounded`	`6.47s`	`10.57s`	`15.04s`	`0.86`
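The page does not say how the slope column was computed. Assuming an ordinary least-squares fit of latency against page count, the reported values can be reproduced from the table as follows (`least_squares_slope` is a hypothetical helper name).

```python
def least_squares_slope(xs: list[float], ys: list[float]) -> float:
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Page counts and latencies (seconds) copied from the Page Scale table.
pages = [5, 10, 15]
latency_s = {
    "document_auto": [7.12, 10.79, 16.06],
    "document_grounded": [6.47, 10.57, 15.04],
}

slopes = {
    strategy: least_squares_slope(pages, values)
    for strategy, values in latency_s.items()
}

for strategy, slope in slopes.items():
    print(f"{strategy}: {slope:.2f} s/page")
```

With three evenly spaced page counts the least-squares slope reduces to the difference between the 15-page and 5-page latencies divided by 10, which gives 0.89 and 0.86 s/page, matching the table.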

Useful Knobs

Quality:

model
prompt wording
prompt examples
repair
max_repair_attempts
structured_output

Provenance:

strategy or mode
document_input
page_input

Latency:

model
max_repair_attempts
max_workers
whether you use whole-document or hybrid extraction
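When adjusting the latency knobs, it helps to time each configuration the same way. The following is a minimal sketch of a timing harness; `median_latency` and `fake_extract` are hypothetical names, and `fake_extract` is only a stand-in for a real extraction call with the options under test.

```python
import statistics
import time
from typing import Callable


def median_latency(fn: Callable[[], object], runs: int = 3) -> float:
    """Call `fn` several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def fake_extract() -> dict:
    """Stand-in for a real extraction call; it only sleeps briefly."""
    time.sleep(0.01)
    return {"status": "ok"}


seconds = median_latency(fake_extract, runs=3)
print(f"median latency: {seconds:.3f}s")
```

Using the median over a few runs reduces the influence of a single slow request; swap `fake_extract` for a function that wraps the configuration being compared.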