Most production AI workflows need structured data, not prose. You want an invoice parsed into fields, a sentiment classified as positive/neutral/negative, or an entity extracted as a typed object. There are three common ways to get that structure, and each has different reliability tradeoffs.
Three approaches compared
| Method | Reliability | Flexibility | Supported by |
|---|---|---|---|
| Prompt ("output JSON") | ~70-85% | Any schema | All models |
| JSON Mode | ~95% | Valid JSON, no schema | GPT-4o, Gemini |
| Structured Output / Function Calling | ~99.9% | Any schema | GPT-4o, Claude, Gemini |
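The gap between the rows comes down to which kind of failure is still possible: prompted output may not be JSON at all, JSON mode yields valid JSON that can still miss the schema, and schema enforcement rules out both. The sketch below illustrates those three outcomes on hand-written sample replies. It uses only the standard library, and the `REQUIRED_FIELDS` mapping and `check_output` helper are hypothetical names introduced for illustration, not part of any model API.

```python
import json

# Hypothetical required fields and accepted types, for illustration only.
REQUIRED_FIELDS = {
    "vendor_name": str,
    "invoice_number": str,
    "total_amount": (int, float),
}


def check_output(raw: str) -> str:
    """Classify a raw model reply as 'not_json', 'schema_mismatch', or 'ok'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Typical of prompted JSON: extra prose around the payload.
        return "not_json"
    if not isinstance(data, dict):
        return "schema_mismatch"
    for name, expected_type in REQUIRED_FIELDS.items():
        # Missing keys come back as None and fail the type check.
        if not isinstance(data.get(name), expected_type):
            return "schema_mismatch"
    return "ok"


samples = [
    'Sure! Here is the JSON: {"vendor_name": "Acme"}',
    '{"vendor": "Acme", "total": "412.50"}',
    '{"vendor_name": "Acme", "invoice_number": "INV-7", "total_amount": 412.5}',
]
results = [check_output(s) for s in samples]
print(results)  # ['not_json', 'schema_mismatch', 'ok']
```

The second sample is the case JSON mode alone cannot prevent: it parses cleanly but uses the wrong field names and a string where a number is expected.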
Structured outputs with Pydantic (recommended)
from pydantic import BaseModel
from moltbot import Client


class Invoice(BaseModel):
    vendor_name: str
    invoice_number: str
    total_amount: float
    line_items: list[dict]
    due_date: str | None


client = Client()

# Schema enforced at API level: no hallucinated fields
invoice = client.extract(
    model="gpt-4o",
    schema=Invoice,
    content=pdf_text,  # text previously extracted from the invoice PDF
    prompt="Extract all invoice fields.",
)
print(invoice.total_amount)  # always a float, never a string
Handling failures gracefully
- Retry with feedback: Pass the validation error back to the model: "Your output was invalid because: [error]. Please retry."
- Partial extraction: Extract field by field for complex documents. This is more reliable than full-schema extraction.
- Confidence scores: Include a `confidence: float` field. Flag low-confidence extractions for human review.
- Fallback cascade: Structured output → JSON mode → prompted JSON → manual review queue.
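As a rough illustration of retry-with-feedback combined with a confidence flag, here is a minimal sketch using only the standard library. Everything in it is an assumption made for the example: `call_model` stands in for whatever client call you use, `validate` is a hand-written check where a Pydantic model would normally sit, and the 0.8 threshold is arbitrary.

```python
import json
from typing import Callable, Optional


def validate(raw: str) -> dict:
    """Parse a reply and check two fields; raise ValueError with a readable reason."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON ({exc.msg})") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    if not isinstance(data.get("total_amount"), (int, float)):
        raise ValueError("total_amount must be a number")
    if not isinstance(data.get("confidence"), (int, float)):
        raise ValueError("confidence must be a number")
    return data


def extract_with_retry(
    call_model: Callable[[str], str],  # hypothetical: takes a prompt, returns raw text
    prompt: str,
    max_attempts: int = 3,
    min_confidence: float = 0.8,  # arbitrary threshold for this sketch
) -> tuple[Optional[dict], bool]:
    """Return (data, needs_review). data is None if every attempt failed."""
    message = prompt
    for _ in range(max_attempts):
        raw = call_model(message)
        try:
            data = validate(raw)
        except ValueError as err:
            # Feed the validation error back to the model on the next attempt.
            message = f"{prompt}\n\nYour output was invalid because: {err}. Please retry."
            continue
        return data, data["confidence"] < min_confidence
    return None, True  # out of attempts: send to human review


# Demo with a fake model that fails once, then returns a low-confidence answer.
replies = iter([
    '{"total_amount": "412.50", "confidence": 0.9}',
    '{"total_amount": 412.5, "confidence": 0.62}',
])
seen: list[str] = []


def fake_model(message: str) -> str:
    seen.append(message)
    return next(replies)


data, needs_review = extract_with_retry(fake_model, "Extract all invoice fields.")
print(data, needs_review)
```

In the demo, the first reply is rejected because `total_amount` is a string, the second attempt receives that error in its prompt, and the accepted result is still flagged for review because its confidence falls below the threshold.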
When to use each method
- Function calling / Structured outputs: Any production pipeline. Always prefer when the model supports it.
- JSON mode: When you need valid JSON but no fixed schema (e.g., "extract all metadata from this document").
- Prompted JSON: Prototyping only, or models without structured output support.
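This preference order is also the order of the fallback cascade: try the strictest method first and step down only when it is unavailable or fails. A minimal sketch of that control flow follows. The three extractor functions are hypothetical stubs that simulate one possible run, not real API calls; in practice each would wrap the corresponding request to your model provider.

```python
from typing import Callable, Optional

Extractor = Callable[[str], dict]


def run_cascade(
    text: str, extractors: list[tuple[str, Extractor]]
) -> tuple[str, Optional[dict]]:
    """Try each extractor in order; return (method_name, data) for the first success."""
    for name, extractor in extractors:
        try:
            return name, extractor(text)
        except (NotImplementedError, ValueError):
            # Method unsupported by the model, or its output failed validation.
            continue
    return "manual_review", None  # nothing worked: queue for a human


# Hypothetical stubs simulating one run of the cascade.
def structured_output(text: str) -> dict:
    raise NotImplementedError("model has no structured output support")


def json_mode(text: str) -> dict:
    raise ValueError("valid JSON, but required fields are missing")


def prompted_json(text: str) -> dict:
    return {"vendor_name": "Acme", "total_amount": 412.5}


method, result = run_cascade(
    "invoice text goes here",
    [
        ("structured_output", structured_output),
        ("json_mode", json_mode),
        ("prompted_json", prompted_json),
    ],
)
print(method, result)
```

Recording which method produced each result, as the returned name does here, makes it easier to see how often a pipeline is actually falling back.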
Schema-enforced extraction on MoltBot
Pydantic enforcement, auto-retry on validation failure, confidence scoring. 14-day free trial.