Recipe: AI-Powered Extraction
Use LLM extraction for pages that are hard to scrape with CSS or XPath selectors.
curl -X POST https://api.verid.dev/v1/monitors \
  -H "Authorization: Bearer $VERID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product Page AI Monitor",
    "url": "https://example.com/product/42",
    "schedule_interval_seconds": 86400,
    "extract_config": {
      "method": "prompt",
      "prompt": "Extract the product name, current price (as a number without currency symbol), availability status, and any active discount percentage from this page.",
      "schema": {
        "name": "string",
        "price": "number",
        "available": "boolean",
        "discount_pct": "number or null"
      }
    },
    "diff_predicate": {
      "type": "composite",
      "operator": "OR",
      "conditions": [
        { "type": "field_changes", "field": "available" },
        { "type": "field_decreases_by_percent", "field": "price", "threshold": 5 }
      ]
    },
    "deliveries": [
      { "type": "webhook", "url": "https://your-app.com/hooks/product-change" }
    ]
  }'
LLM extraction caching
LLM results are cached for 30 days using a content hash. If the page content doesn't change between runs, the cached result is returned at no LLM cost.
LLM calls are only counted against your monthly quota on cache misses.
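The caching behavior described above can be modeled with a small in-memory sketch. This is not Verid's implementation: the class name ExtractionCache, the choice of SHA-256, and hashing only the page content (rather than, say, content plus prompt) are all assumptions made for illustration.

```python
import hashlib
import time

CACHE_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, as stated above


class ExtractionCache:
    """Hypothetical model of a content-hash cache in front of an LLM call."""

    def __init__(self, extract_fn, now_fn=time.time):
        self._extract_fn = extract_fn  # stands in for the real LLM extraction
        self._now_fn = now_fn          # injectable clock, handy for testing
        self._entries = {}             # content hash -> (stored_at, result)
        self.llm_calls = 0             # incremented on cache misses only

    def get(self, page_content):
        # Assumption: the cache key is a SHA-256 hash of the page content.
        key = hashlib.sha256(page_content.encode("utf-8")).hexdigest()
        now = self._now_fn()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
            return entry[1]  # cache hit: no LLM call, nothing counted
        # Cache miss (new content or expired entry): call the LLM and count it.
        result = self._extract_fn(page_content)
        self.llm_calls += 1
        self._entries[key] = (now, result)
        return result
```

In this model, two runs over an unchanged page produce one LLM call, while any change to the page content produces a new hash and therefore a new call.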
Models used
Verid runs your prompt against a fast, JSON-capable LLM. If the primary model returns malformed JSON or is unreachable, Verid automatically falls back to a secondary model so extraction stays resilient. You don't pick the model; Verid manages this for you.
The page content sent to the model is truncated at 50,000 characters.
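As a rough illustration of the truncation and fallback behavior, the sketch below treats each model as a plain callable. The function name extract_with_fallback and the callable signature are hypothetical, and it assumes truncation keeps the first 50,000 characters, which the text above does not specify.

```python
import json

MAX_CONTENT_CHARS = 50_000  # truncation limit stated above


def extract_with_fallback(page_content, prompt, primary, secondary):
    """Try `primary`, then `secondary`.

    Each model is a hypothetical callable taking (prompt, content) and
    returning raw text that is expected to be JSON.
    """
    # Assumption: truncation keeps the first 50,000 characters.
    content = page_content[:MAX_CONTENT_CHARS]
    last_error = None
    for model in (primary, secondary):
        try:
            raw = model(prompt, content)
            return json.loads(raw)  # malformed JSON raises JSONDecodeError
        except (json.JSONDecodeError, ConnectionError) as exc:
            last_error = exc  # remember the failure and try the next model
    raise RuntimeError("both models failed") from last_error
```

Because of the character limit, it is worth checking that the fields you care about appear early enough in the page to survive truncation.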
Prompt rules
- Minimum 10 characters, maximum 2,000 characters
- Be specific about the keys you want returned
- Mention how to handle missing fields (e.g. "use null if not listed")
- Provide a schema when you need a consistent output shape
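The rules above can be checked on the client before creating a monitor. The following is a minimal sketch, assuming the length limits are inclusive; validate_prompt and build_prompt are hypothetical helpers, not part of any Verid SDK.

```python
MIN_PROMPT_CHARS = 10
MAX_PROMPT_CHARS = 2_000


def validate_prompt(prompt):
    """Raise ValueError if the prompt length is outside the documented limits."""
    length = len(prompt)
    if length < MIN_PROMPT_CHARS or length > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt must be {MIN_PROMPT_CHARS}-{MAX_PROMPT_CHARS} characters, "
            f"got {length}"
        )


def build_prompt(fields):
    """Build a prompt that names every key and says how to handle missing ones.

    `fields` maps each desired output key to a short description.
    """
    lines = [f'- "{key}": {description}' for key, description in fields.items()]
    prompt = (
        "Return a JSON object with exactly these keys:\n"
        + "\n".join(lines)
        + "\nUse null for any field that is not listed on the page."
    )
    validate_prompt(prompt)
    return prompt
```

For example, build_prompt({"name": "product name", "price": "current price as a number without currency symbol"}) yields a prompt that lists both keys explicitly and includes the null-handling instruction.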