How It Works

Insurance documents go in. Evidence-backed data comes out.

PolDex is the vertical API for insurance extraction. Developers, operators, and agents use different surfaces, but every route feeds the same classifier, extraction core, credit ledger, export model, and deterministic output contracts.

Core Model

One rail for the insurance document lifecycle.

Commercial P&C was the first proof wedge; the release-ready surface now spans the full 56-schema insurance universe. The workflow is the same for every schema: classify, extract, verify evidence, export, deliver, and reuse the same API contract.

01

Classify the document

PolDex identifies the insurance document family, schema fit, source type, and evidence posture inside the 56-schema release-ready universe.

02

Extract to a schema

The live schema contracts enforce required fields, evidence expectations, conflicts, confidence, unresolved items, abstention rules, and schema-scoped output shaping.

03

Return system-ready truth

Results leave as canonical JSON plus CSV, XLSX, ZIP artifacts, signed links, webhooks, and connector events.

04

Prove the line

Every schema passes the same release gate of 100+ documents, and that gate is what backs the public 99%+ accuracy posture.

01

Async, not synchronous

PolDex returns 202 Accepted immediately. No long blocking request cycle.

02

One production pipeline

API, processor, playground, signed links, and email intake all converge on the same extraction core.

03

Estimate before processing

Paid production work shows credit estimates before holds are created and work begins.

04

Deterministic output

Canonical JSON is enforced first, scoped to the requested schema, then CSV and XLSX are derived from that same result.

05

Signed delivery

Webhooks are signed. Polling remains available. Delivery retries and DLQ states are explicit.

Full Lifecycle

Step-by-step flow.

01

Choose how you want to use PolDex

Developers use the API. Non-developers use the processor review cockpit. Proof and playground stay free for evaluation. Signed links and email intake support operational handoff.

# Direct API
POST /v1/extract

# No-code processor
POST /v1/batches/estimate
POST /v1/batches

# Playground
POST /v1/playground/run

# Agent interfaces
npx -y @poldex/mcp-server@0.0.3
poldex extract policy.pdf

# Signed intake links
POST /v1/intake/links
POST /v1/intake/links/{token}/submit

# Email intake
POST /v1/intake/email/register

02

Verify credits and estimate before processing

Paid production work starts by verifying the API key and estimating credits from source type, page count, and complexity band. The user sees the estimate before work begins.

POST /v1/batches/estimate

{
  "estimated_credits": 6,
  "items": [
    {
      "source_name": "contractor-coi.pdf",
      "estimated_pages": 4,
      "estimated_credits": 6
    }
  ],
  "available_credits": 100
}
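The estimate can act as a gate before any credits are held. The following is a minimal sketch, assuming only the `estimated_credits` and `available_credits` fields from the response above; `should_confirm` and `max_credits` are hypothetical caller-side names, not part of the PolDex API.

```python
from typing import Any


def should_confirm(estimate: dict[str, Any], max_credits: int) -> bool:
    """Return True only if the estimate fits the balance and the caller's budget.

    `max_credits` is a hypothetical caller-side budget, not a PolDex field.
    """
    needed = estimate["estimated_credits"]
    available = estimate["available_credits"]
    return needed <= available and needed <= max_credits


# Top-level fields taken from the estimate response shown above.
estimate = {"estimated_credits": 6, "available_credits": 100}

print(should_confirm(estimate, max_credits=10))  # True
print(should_confirm(estimate, max_credits=5))   # False
```

A caller would only proceed to POST /v1/batches when this check passes.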

03

Confirm and queue work

When the user confirms the estimate, PolDex holds credits, creates jobs or batch items, and returns quickly. Long documents run asynchronously by default.

HTTP 202 Accepted

{
  "batch_id": "batch_01hx4mz9p3kqa8",
  "status": "queued",
  "credits_held": 6,
  "items": [
    {
      "item_id": "item_01hx4n0",
      "source_name": "contractor-coi.pdf",
      "status": "queued"
    }
  ]
}
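Because the request returns 202 right away, a client that does not use webhooks has to poll for the outcome. The sketch below shows one way to do that. It is an illustration, not the official client: `wait_for_job` is a hypothetical helper, the status fetcher is injected so the example runs without network access, and the set of terminal statuses is an assumption based on the `complete` and `dlq` values that appear on this page.

```python
import time
from typing import Callable

# Statuses seen on this page; the full terminal set is an assumption.
TERMINAL_STATUSES = {"complete", "dlq"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or attempts run out."""
    for _ in range(max_attempts):
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal status in time")


# Stand-in for GET /v1/jobs/{job_id}: replays canned responses, no network call.
responses = iter([{"status": "queued"}, {"status": "queued"}, {"status": "complete"}])
job = wait_for_job(lambda: next(responses), sleep=lambda _s: None)
print(job["status"])  # complete
```

In real use, `fetch_status` would wrap an authenticated HTTP GET, and the interval would be tuned to stay well under any rate limit.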

04

Safe document retrieval

Direct browser uploads stay capped for launch safety. Large documents use URL-based async processing. Workers stream document content in bounded chunks and avoid loading large PDFs as single blobs.

# Internal worker behavior
# streamed retrieval -> temp storage
# bounded memory profile
# cleanup after processing
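The worker behavior described above (streamed retrieval, bounded memory, cleanup) can be sketched as follows. This is not PolDex's internal code: `stream_to_temp`, the 64 KiB chunk size, and the size cap are assumptions chosen for illustration, and an in-memory stream stands in for the remote document.

```python
import io
import os
import tempfile
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # hypothetical bound; the real chunk size is not documented


def stream_to_temp(source: BinaryIO, max_bytes: int) -> str:
    """Copy `source` to a temp file in fixed-size chunks, enforcing a size cap."""
    written = 0
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as out:
            # Only one chunk is held in memory at a time.
            while chunk := source.read(CHUNK_SIZE):
                written += len(chunk)
                if written > max_bytes:
                    raise ValueError("document exceeds size cap")
                out.write(chunk)
    except Exception:
        os.remove(path)  # cleanup on failure
        raise
    return path


# In-memory stand-in for a remote PDF (9 header bytes + 200,000 filler bytes).
path = stream_to_temp(io.BytesIO(b"%PDF-1.7 " + b"x" * 200_000), max_bytes=1_000_000)
try:
    print(os.path.getsize(path))  # 200009
finally:
    os.remove(path)  # cleanup after processing
```

Peak memory stays near one chunk regardless of document size, which is the point of avoiding a single large blob.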

05

Schema-aware extraction

FastScript is the managed extraction engine across the completed schema families. Underlying model routing and failover stay internal while customer-facing output remains stable.

{
  "schema_id": "commercial_gl",
  "extractor": "fastscript-engine",
  "model_name": "fastscript-engine",
  "fallback_reason": null
}

06

Truth and conflict resolution

Facts are normalized, evidence-linked, truth-scored, and filtered back to the requested schema. Contradictions between forms and endorsements are returned as explicit conflicts.

{
  "facts": [{
    "field": "policy_number",
    "status": "active",
    "evidence": [{ "page": 1, "citation": "Declarations" }]
  }],
  "conflicts": []
}
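A consumer of this payload will usually want to check for conflicts first and then read the evidence attached to each fact. The sketch below assumes only the fields visible in the sample above; `evidence_backed_fields` is a hypothetical helper, and raising on any conflict is one possible policy, not a documented requirement.

```python
import json

# Sample payload copied from the response shown above.
payload = json.loads("""
{
  "facts": [{
    "field": "policy_number",
    "status": "active",
    "evidence": [{ "page": 1, "citation": "Declarations" }]
  }],
  "conflicts": []
}
""")


def evidence_backed_fields(result: dict) -> dict[str, list[int]]:
    """Map each fact that carries evidence to the page numbers it cites."""
    if result["conflicts"]:
        # Hypothetical policy: send conflicted results to human review.
        raise ValueError("unresolved conflicts; route to review")
    return {
        fact["field"]: [ev["page"] for ev in fact["evidence"]]
        for fact in result["facts"]
        if fact["evidence"]
    }


print(evidence_backed_fields(payload))  # {'policy_number': [1]}
```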

07

Result bundle generation

One canonical JSON result is produced, then CSV and XLSX exports are generated from the same source truth. Processor downloads inherit source filenames so bulk users do not lose track of files.

{
  "source_name": "contractor-coi.pdf",
  "result_links": {
    "json": "https://.../contractor-coi.json",
    "csv": "https://.../contractor-coi.csv",
    "xlsx": "https://.../contractor-coi.xlsx"
  }
}
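To illustrate what "generated from the same source truth" means, here is a minimal sketch that flattens canonical facts into CSV. The column layout and the `facts_to_csv` helper are assumptions for illustration; the actual PolDex CSV schema is not specified on this page.

```python
import csv
import io


def facts_to_csv(source_name: str, facts: list[dict]) -> str:
    """Flatten facts into one CSV row per (fact, evidence) pair."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Hypothetical column layout, not the documented export format.
    writer.writerow(["source_name", "field", "status", "page", "citation"])
    for fact in facts:
        for ev in fact["evidence"]:
            writer.writerow(
                [source_name, fact["field"], fact["status"], ev["page"], ev["citation"]]
            )
    return buf.getvalue()


facts = [
    {
        "field": "policy_number",
        "status": "active",
        "evidence": [{"page": 1, "citation": "Declarations"}],
    }
]
print(facts_to_csv("contractor-coi.pdf", facts))
```

Because the CSV is a pure function of the JSON result, the two formats cannot drift apart.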

08

Delivery to your system

Developers receive data through signed webhooks, polling, or signed links. Processor users review facts and evidence, save decisions or field notes, then download JSON, CSV, XLSX, manifest CSV, ZIP, or combined exports.

POST https://your.app/webhook
X-PolDex-Signature: t=1745280000,v1=...

GET /v1/jobs/{job_id}
-> { "status": "complete", "result_links": { ... } }
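A receiver should verify the signature header before trusting a webhook body. The page shows the header format (`t=...,v1=...`) but not the signing scheme, so the sketch below makes a loud assumption: that `v1` is a hex HMAC-SHA256 of `"<t>.<raw body>"` under a shared secret, a convention used by some other webhook providers. Check the PolDex API reference for the real scheme before relying on this. `verify_signature` and the five-minute tolerance are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_signature(
    header: str,
    body: bytes,
    secret: bytes,
    tolerance_s: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check a `t=<unix>,v1=<hex>` header against the raw request body.

    ASSUMPTION: v1 = HMAC-SHA256(secret, "<t>.<body>") as hex. Not confirmed.
    """
    parts = {k: v for k, _, v in (p.partition("=") for p in header.split(","))}
    timestamp, received = parts.get("t", ""), parts.get("v1", "")
    if not timestamp.isdigit():
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_s:
        return False  # reject stale or replayed deliveries
    signed = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


secret = b"whsec_example"  # placeholder secret
body = b'{"status": "complete"}'
ts = "1745280000"
sig = hmac.new(secret, ts.encode() + b"." + body, hashlib.sha256).hexdigest()
header = f"t={ts},v1={sig}"

print(verify_signature(header, body, secret, now=1745280010))         # True
print(verify_signature(header, b"tampered", secret, now=1745280010))  # False
```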

09

Settlement, retries, and DLQ

Successful jobs capture held credits. URL jobs reconcile holds after fetch/page estimation. Internal failures release holds. Delivery retries are bounded and terminal failures keep replay metadata.

{
  "status": "dlq",
  "failure_code": "delivery_failed_max_retries",
  "credits_captured": 0
}
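The bounded-retry behavior described above can be sketched as a loop with exponential backoff that ends in a DLQ record. This is an illustration, not PolDex's delivery code: the attempt limit, the delays, the `delivered` status, and the `deliver_with_retries` helper are all assumptions. Only the `dlq` status and the `delivery_failed_max_retries` code come from the sample above.

```python
import time
from typing import Callable


def deliver_with_retries(
    send: Callable[[], bool],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try `send` up to `max_attempts` times, doubling the delay between tries."""
    for attempt in range(1, max_attempts + 1):
        if send():
            return {"status": "delivered", "attempts": attempt}
        if attempt < max_attempts:
            sleep(base_delay_s * 2 ** (attempt - 1))
    # Retries exhausted: record a terminal failure instead of looping forever.
    return {
        "status": "dlq",
        "failure_code": "delivery_failed_max_retries",
        "attempts": max_attempts,
    }


delays: list[float] = []
outcomes = iter([False, False, True])  # fails twice, then succeeds
print(deliver_with_retries(lambda: next(outcomes), sleep=delays.append))
# {'status': 'delivered', 'attempts': 3}
print(delays)  # [1.0, 2.0]
print(deliver_with_retries(lambda: False, max_attempts=3, sleep=lambda _s: None)["status"])  # dlq
```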

Intake Surfaces

How documents enter the live system.

Direct API

Best for engineering teams that want full control with POST /v1/extract and webhook orchestration.

Agent interfaces

Best for AI agents and terminal environments that need MCP, CLI, OpenAPI, discovery files, and safe confirmation before paid extraction.

Processor review cockpit

Best for operators who paste an API key, add files or URLs, approve an estimate, and download named JSON, CSV, or XLSX outputs.

Playground

Non-developers can run a URL, an upload, or pasted text through /playground against the same backend contracts.

Signed intake links

Ops teams can receive private intake links with strict expiry and submit documents without handling raw API authentication.

Email intake

Registered inboxes can forward attachments directly into intake. Jobs are created on the same extraction engine and return signed result links.

Workflow Connections

PolDex fits around existing systems.

PolDex connects to existing workflows without replacing them: API for developers, webhooks for systems, processor review cockpit for ops teams, email intake for inbox workflows, signed links for third-party submissions, and CSV/XLSX exports for legacy systems.

Developer systems

Use the API, idempotency keys, polling, signed webhooks, and stable result contracts to wire PolDex into internal platforms.

Operations teams

Use the processor, email intake, signed links, review notes, decisions, and named exports when a team needs results without opening Postman.

Legacy imports

CSV, XLSX, manifest files, and signed result links let teams move extracted data into spreadsheets and older systems.

Connector roadmap

Prebuilt connectors wrap the same rails for Reducto parse imports, Zapier, Make, n8n, Airtable, Google Sheets, Excel, OneDrive, SharePoint, Slack, and Teams.

Insurance systems

AMS, CRM, underwriting, compliance, and procurement connectors are sequenced by customer demand, not guessed upfront.

No workflow lock-in

The API remains the product boundary. Connectors are adapters, not a replacement dashboard or forced workflow suite.

Credits

Credits are estimated, held, captured, or released.

PolDex does not ask users to guess cost. Short/simple documents start at 1 credit, while larger documents and batches are estimated before processing. Invalid, unsafe, or inaccessible inputs are rejected before capture.

Estimate

Page and complexity bands produce a visible estimate before production work starts.

Hold

Confirmed work holds the estimated credits so balances cannot go negative.

Capture

Successfully completed results capture the held credits, and the capture is recorded in the immutable ledger.

Release

System-side failures release held credits. Invalid user inputs are rejected before capture.
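The four states above form a small state machine. The following toy model shows how holds keep the balance from going negative and how capture and release differ. It is a sketch under stated assumptions: `CreditLedger` and its method names are hypothetical, and the real ledger is a server-side component whose internals are not described here.

```python
from dataclasses import dataclass, field


@dataclass
class CreditLedger:
    available: int
    held: dict[str, int] = field(default_factory=dict)
    # Append-only log of (kind, job_id, credits); entries are never edited.
    entries: list[tuple[str, str, int]] = field(default_factory=list)

    def hold(self, job_id: str, credits: int) -> None:
        """Reserve credits up front so the balance cannot go negative."""
        if credits > self.available:
            raise ValueError("insufficient credits")  # the API reports this as 402
        self.available -= credits
        self.held[job_id] = credits
        self.entries.append(("hold", job_id, credits))

    def capture(self, job_id: str) -> None:
        """Success path: the held credits are spent."""
        credits = self.held.pop(job_id)
        self.entries.append(("capture", job_id, credits))

    def release(self, job_id: str) -> None:
        """System-side failure path: the held credits return to the balance."""
        credits = self.held.pop(job_id)
        self.available += credits
        self.entries.append(("release", job_id, credits))


ledger = CreditLedger(available=100)
ledger.hold("job_a", 6)
ledger.capture("job_a")   # job succeeded
ledger.hold("job_b", 4)
ledger.release("job_b")   # internal failure
print(ledger.available)   # 94
print([kind for kind, _, _ in ledger.entries])  # ['hold', 'capture', 'hold', 'release']
```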

Outputs

One truth, multiple formats.

Canonical JSON

System-of-record payload with schema, evidence, conflicts, and metadata.

CSV export

Flat operational views generated from the canonical JSON result for quick downstream use.

Excel export

XLSX workbook output for broker and ops workflows that still run in spreadsheet environments.

Failure Behavior

Failures are bounded and explicit.

Extraction failure

Internal extraction failures release held credits and return explicit failure codes.

Delivery retry + DLQ

Delivery is retried with backoff. Terminal failures move to the DLQ and can be replayed from admin.

Rate and credit controls

Insufficient credits return 402. Abuse controls return 429 with stable, contract-safe errors.
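A client can branch on these status codes directly. The mapping below is a minimal sketch: the 202, 402, and 429 meanings come from this page, while the action names and the `next_action` helper are hypothetical.

```python
def next_action(status_code: int) -> str:
    """Map a PolDex HTTP status to a hypothetical client-side action."""
    if status_code == 202:
        return "poll_or_await_webhook"  # work was accepted and queued
    if status_code == 402:
        return "top_up_credits"         # insufficient credits
    if status_code == 429:
        return "back_off_and_retry"     # rate or abuse control triggered
    return "inspect_error"              # anything else needs a closer look


for code in (202, 402, 429, 500):
    print(code, next_action(code))
```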

Run the full flow.

Use live proof, playground, or direct API integration on the live product surface.