POST /v1/extract

Return the structured data a page already declares — JSON-LD, OpenGraph, and meta tags.

Deterministic, no AI. Onto parses the structured data the page itself publishes — it never invents fields. For free-form questions over a page, use /v1/read and let your own model reason over the clean Markdown.

Endpoint

http
POST https://api.buildonto.dev/v1/extract
Authorization: Bearer onto_sk_...
Content-Type: application/json

Request body

url (string, required): The public URL to extract structured data from.
fresh (boolean, optional): If true, bypass the 1-hour cache. Default: false.

PDFs are supported. Point this endpoint at a PDF and Onto extracts its text. PDFs declare no HTML-level structured data, so structured comes back empty and the AIO score is null. Image-only PDFs return IMAGE_PDF (422) and are automatically refunded.
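
The two request fields can be captured in a small type. The sketch below is illustrative only: `ExtractRequest` and `buildExtractBody` are hypothetical names, not part of any Onto SDK, and the local http/https check is an assumption about what counts as a "public URL".

```typescript
// Request body for POST /v1/extract, using only the two documented fields.
interface ExtractRequest {
  url: string;     // required: the public URL to extract structured data from
  fresh?: boolean; // optional: true bypasses the 1-hour cache (default: false)
}

// Hypothetical helper: checks the URL locally, then serializes the body.
function buildExtractBody(url: string, fresh = false): string {
  const parsed = new URL(url); // throws a TypeError on a malformed URL
  // Assumption: only http(s) URLs are worth sending; the API may accept or reject others.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  // Omit `fresh` unless it is true, so the server-side default applies.
  const body: ExtractRequest = fresh ? { url, fresh: true } : { url };
  return JSON.stringify(body);
}

console.log(buildExtractBody('https://vercel.com', true));
// prints {"url":"https://vercel.com","fresh":true}
```

Rejecting a malformed URL before the request is sent avoids a round trip that would end in INVALID_URL (400).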

Response

Success (200): the parsed structured data (arrays/maps; empty when the page declares none), counts per type, and the AIO trust score for the page. To extract across a whole site in one call, use /v1/batch with mode: "extract".

json
{
  "status": "success",
  "url": "https://vercel.com",
  "title": "Vercel: Build and deploy…",
  "aio_score": 90,
  "grade": "Excellent",
  "hallucination_risk": "low",
  "structured": {
    "jsonLd": [
      { "@context": "https://schema.org", "@type": "SoftwareApplication", "name": "Vercel" }
    ],
    "openGraph": {
      "og:title": "Vercel: Build and deploy the best web experiences…",
      "og:type": "website"
    },
    "meta": {
      "description": "Vercel provides the developer tools…",
      "author": "Vercel"
    }
  },
  "counts": { "json_ld": 1, "open_graph": 13, "meta": 10 },
  "cache": { "hit": false, "ttl_seconds": 3600 }
}
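
For typed clients, the response shape can be written down as an interface. This is a sketch inferred from the sample above, not an official schema: the type names and the `jsonLdTypes` helper are hypothetical, field optionality is assumed, and `aio_score` is typed as nullable because the PDF note says the score is null for PDFs.

```typescript
type JsonLdNode = Record<string, unknown>;

// Shape inferred from the sample response; not an official schema.
interface ExtractResponse {
  status: string;
  url: string;
  title: string;
  aio_score: number | null; // null for PDFs
  grade: string;
  hallucination_risk: string;
  structured: {
    jsonLd: JsonLdNode[];               // empty array when the page declares none
    openGraph: Record<string, string>;  // empty map when the page declares none
    meta: Record<string, string>;
  };
  counts: { json_ld: number; open_graph: number; meta: number };
  cache: { hit: boolean; ttl_seconds: number };
}

// Hypothetical helper: collect every @type declared in the JSON-LD blocks.
// JSON-LD allows @type to be a single string or an array of strings.
function jsonLdTypes(res: ExtractResponse): string[] {
  const types: string[] = [];
  for (const node of res.structured.jsonLd) {
    const t = node['@type'];
    if (typeof t === 'string') {
      types.push(t);
    } else if (Array.isArray(t)) {
      for (const item of t) {
        if (typeof item === 'string') types.push(item);
      }
    }
  }
  return types;
}

// Values copied from the sample response above.
const sample: ExtractResponse = {
  status: 'success',
  url: 'https://vercel.com',
  title: 'Vercel: Build and deploy…',
  aio_score: 90,
  grade: 'Excellent',
  hallucination_risk: 'low',
  structured: {
    jsonLd: [{ '@context': 'https://schema.org', '@type': 'SoftwareApplication', name: 'Vercel' }],
    openGraph: { 'og:type': 'website' },
    meta: { author: 'Vercel' },
  },
  counts: { json_ld: 1, open_graph: 13, meta: 10 },
  cache: { hit: false, ttl_seconds: 3600 },
};

console.log(jsonLdTypes(sample)); // prints [ 'SoftwareApplication' ]
```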

Errors: INVALID_URL (400), UNAUTHORIZED (401), PAYMENT_REQUIRED (402), ROBOTS_BLOCKED (403), WAF_BLOCKED (403), URL_NOT_FOUND (404), IMAGE_PDF (422), RATE_LIMITED (429). See error codes.
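
A client can branch on the HTTP status alone. The sketch below encodes the code-to-status pairs listed above. The names `ERROR_STATUS`, `candidateCodes`, and `isRetryable` are hypothetical, the error response body is not shown here so it is deliberately not parsed, and treating only RATE_LIMITED as retryable is an assumption rather than documented behavior.

```typescript
// Documented error codes and their HTTP statuses.
const ERROR_STATUS = {
  INVALID_URL: 400,
  UNAUTHORIZED: 401,
  PAYMENT_REQUIRED: 402,
  ROBOTS_BLOCKED: 403,
  WAF_BLOCKED: 403,
  URL_NOT_FOUND: 404,
  IMAGE_PDF: 422,
  RATE_LIMITED: 429,
} as const;

type ErrorCode = keyof typeof ERROR_STATUS;

// Which documented codes could a given status correspond to?
// 403 is ambiguous (ROBOTS_BLOCKED or WAF_BLOCKED) without reading the body.
function candidateCodes(status: number): ErrorCode[] {
  return (Object.keys(ERROR_STATUS) as ErrorCode[]).filter(
    (code) => ERROR_STATUS[code] === status,
  );
}

// Assumption: only rate limiting is worth retrying; the other codes describe
// the request, the account, or the target page, and would fail again unchanged.
function isRetryable(status: number): boolean {
  return status === ERROR_STATUS.RATE_LIMITED;
}

console.log(candidateCodes(403)); // prints [ 'ROBOTS_BLOCKED', 'WAF_BLOCKED' ]
console.log(isRetryable(429));    // prints true
```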

Examples

cURL:

bash
curl -X POST https://api.buildonto.dev/v1/extract \
  -H "Authorization: Bearer $ONTO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://vercel.com" }'

Node (fetch):

ts
const res = await fetch('https://api.buildonto.dev/v1/extract', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.ONTO_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url: 'https://vercel.com' }),
});
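// Non-2xx responses carry one of the error codes listed above.
if (!res.ok) throw new Error(`Extract failed with HTTP ${res.status}`);
const { structured } = await res.json();
// jsonLd is an array; openGraph and meta are maps (empty when the page declares none).
console.log(structured.jsonLd, structured.openGraph['og:title']);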