POST /v1/extract
Return the structured data a page already declares — JSON-LD, OpenGraph, and meta tags.
Deterministic, no AI. Onto parses the structured data the page itself publishes — it never invents fields. For free-form questions over a page, use /v1/read and let your own model reason over the clean Markdown.
Endpoint
http
POST https://api.buildonto.dev/v1/extract
Authorization: Bearer onto_sk_...
Content-Type: application/json
Request body
url (string, required): The public URL to extract structured data from.
fresh (boolean, optional): If true, bypass the 1-hour cache. Default: false.
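As a rough sketch of the body described above, the two fields can be typed and serialized as shown here. The names `ExtractRequest` and `buildExtractBody` are illustrative and not part of any Onto SDK; only `url` and `fresh` come from this page. The local http/https check is an assumption about what "public URL" means, not a documented rule.

```ts
// Hypothetical type: only `url` and `fresh` are documented fields.
interface ExtractRequest {
  url: string;
  fresh?: boolean;
}

// Hypothetical helper: checks the URL locally, then returns the JSON body.
function buildExtractBody(url: string, fresh = false): string {
  const parsed = new URL(url); // throws a TypeError on malformed input
  // Assumption: a "public URL" is an http(s) URL.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  const body: ExtractRequest = { url };
  if (fresh) body.fresh = true; // omitted otherwise, since false is the default
  return JSON.stringify(body);
}
```

Checking the URL before sending is optional; the endpoint reports INVALID_URL (400) on its own, so this only saves a round trip.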
PDFs are supported. Point this endpoint at a PDF and Onto extracts its text. PDFs declare no HTML-level structured data, so structured comes back empty and the AIO score is null. Image-only PDFs return IMAGE_PDF (422) and are automatically refunded.
Response
Success (200): the parsed structured data (arrays/maps; empty when the page declares none), counts per type, and the AIO trust score for the page. To extract across a whole site in one call, use /v1/batch with mode: "extract".
json
{
"status": "success",
"url": "https://vercel.com",
"title": "Vercel: Build and deploy…",
"aio_score": 90,
"grade": "Excellent",
"hallucination_risk": "low",
"structured": {
"jsonLd": [
{ "@context": "https://schema.org", "@type": "SoftwareApplication", "name": "Vercel" }
],
"openGraph": {
"og:title": "Vercel: Build and deploy the best web experiences…",
"og:type": "website"
},
"meta": {
"description": "Vercel provides the developer tools…",
"author": "Vercel"
}
},
"counts": { "json_ld": 1, "open_graph": 13, "meta": 10 },
"cache": { "hit": false, "ttl_seconds": 3600 }
}
Errors: INVALID_URL (400), UNAUTHORIZED (401), PAYMENT_REQUIRED (402), ROBOTS_BLOCKED (403), WAF_BLOCKED (403), URL_NOT_FOUND (404), IMAGE_PDF (422), RATE_LIMITED (429). See error codes.
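The sample response and error list above could be captured in types roughly like this. Field names are taken from the sample; the interface and helper names are illustrative. `aio_score` is marked nullable because of the PDF note, and whether other fields can also be null is not stated here. The retry policy in `isRetryable` is an assumption for the sketch, not documented Onto behavior.

```ts
// Shape inferred from the sample response; the interface name is illustrative.
interface ExtractResponse {
  status: string;
  url: string;
  title: string;
  aio_score: number | null; // null for PDFs, per the PDF note
  grade: string;
  hallucination_risk: string;
  structured: {
    jsonLd: Record<string, unknown>[];
    openGraph: Record<string, string>;
    meta: Record<string, string>;
  };
  counts: { json_ld: number; open_graph: number; meta: number };
  cache: { hit: boolean; ttl_seconds: number };
}

// Error codes and HTTP statuses copied from the list above.
const ERROR_STATUS = {
  INVALID_URL: 400,
  UNAUTHORIZED: 401,
  PAYMENT_REQUIRED: 402,
  ROBOTS_BLOCKED: 403,
  WAF_BLOCKED: 403,
  URL_NOT_FOUND: 404,
  IMAGE_PDF: 422,
  RATE_LIMITED: 429,
} as const;

type ExtractErrorCode = keyof typeof ERROR_STATUS;

// Assumed policy (not from the Onto docs): only rate limiting is worth retrying.
function isRetryable(code: ExtractErrorCode): boolean {
  return code === 'RATE_LIMITED';
}

// True when the page declared at least one JSON-LD node, OpenGraph tag, or meta tag.
function hasStructuredData(r: ExtractResponse): boolean {
  const { jsonLd, openGraph, meta } = r.structured;
  return (
    jsonLd.length > 0 ||
    Object.keys(openGraph).length > 0 ||
    Object.keys(meta).length > 0
  );
}
```

Checking `hasStructuredData` before reading individual keys avoids treating an empty result, such as a PDF, as a failure.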
Examples
cURL:
bash
curl -X POST https://api.buildonto.dev/v1/extract \
-H "Authorization: Bearer $ONTO_API_KEY" \
-H "Content-Type: application/json" \
-d '{ "url": "https://vercel.com" }'
Node (fetch):
ts
const res = await fetch('https://api.buildonto.dev/v1/extract', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.ONTO_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ url: 'https://vercel.com' }),
});
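// Fail fast on non-2xx responses (see the error codes above) before reading the body.
if (!res.ok) throw new Error(`Extract failed with HTTP ${res.status}`);
const { structured } = await res.json();
console.log(structured.jsonLd, structured.openGraph['og:title']);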
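Once `structured.jsonLd` is in hand, a common next step is picking out nodes of one schema.org type. The helper below is a minimal sketch; `findJsonLdByType` and `JsonLdNode` are illustrative names, not part of the Onto API. It handles `@type` given as a string or as an array, and it does not descend into nested `@graph` containers.

```ts
type JsonLdNode = Record<string, unknown>;

// Returns the top-level nodes whose @type equals, or includes, the requested type.
function findJsonLdByType(nodes: JsonLdNode[], type: string): JsonLdNode[] {
  return nodes.filter((node) => {
    const t = node['@type'];
    return Array.isArray(t) ? t.includes(type) : t === type;
  });
}
```

With the sample response above, `findJsonLdByType(structured.jsonLd, 'SoftwareApplication')` would return the single Vercel node, and an empty array for a page that declares no JSON-LD.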