Error codes
Every error returns JSON with a stable code and a human-readable message.
Error shape
All Read API errors share the same body. The HTTP status code mirrors the standard meaning; the code field is what you should branch on in code (messages can change wording without notice).
```json
{
  "status": "error",
  "code": "RATE_LIMITED",
  "message": "Monthly quota exceeded",
  "retry_after": "2026-06-01T00:00:00.000Z"
}
```
Codes
Every code links to its dedicated reference page with full handling guidance, an example response body, retry semantics, and a TypeScript snippet.
| Code | HTTP | When | What to do |
|---|---|---|---|
| INVALID_URL | 400 | The request body wasn't valid JSON, or the url field was missing or unparseable as a URL. | See handling → |
| UNAUTHORIZED | 401 | The Bearer token is missing, malformed, or has been revoked. | See handling → |
| PAYMENT_REQUIRED | 402 | You're past the plan cap AND your credit balance is zero. | See handling → |
| WAF_BLOCKED | 403 | The target site's WAF or CDN refused the Onto crawler (origin returned 401 / 403). | See handling → |
| ROBOTS_BLOCKED | 403 | The target site's robots.txt explicitly disallows GPTBot or a wildcard user-agent. | See handling → |
| PLAN_RESTRICTED | 403 | The action requires a higher plan tier, e.g. purchasing credits on the Free tier. | See handling → |
| URL_NOT_FOUND | 404 | The target URL returned 404; the page doesn't exist on the origin. | See handling → |
| IMAGE_PDF | 422 | The URL resolved to a PDF that contains only scanned images, with no machine-readable text to extract. | See handling → |
| RATE_LIMITED | 429 | Monthly request quota exceeded; your plan is hard-capped (typically Free). | See handling → |
| CONCURRENT_LIMIT | 429 | Too many in-flight requests for your tier; back off for ~1 second and retry. | See handling → |
| EXTRACTION_FAILED | 500 | The cleaning engine errored mid-process; rare and usually transient. | See handling → |
| TIMEOUT | 504 | The target site took longer than 10 seconds to respond. | See handling → |
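The table above can be mirrored in a client as a typed lookup. This is a minimal sketch, not part of any official SDK: the names `HTTP_STATUS_BY_CODE`, `OntoErrorCode`, `OntoErrorBody`, and `isOntoError` are hypothetical, and `retry_after` is typed as an optional ISO 8601 string only because that is how the example body shows it.

```ts
// Hypothetical client-side mirror of the error table; not an official SDK.
const HTTP_STATUS_BY_CODE = {
  INVALID_URL: 400,
  UNAUTHORIZED: 401,
  PAYMENT_REQUIRED: 402,
  WAF_BLOCKED: 403,
  ROBOTS_BLOCKED: 403,
  PLAN_RESTRICTED: 403,
  URL_NOT_FOUND: 404,
  IMAGE_PDF: 422,
  RATE_LIMITED: 429,
  CONCURRENT_LIMIT: 429,
  EXTRACTION_FAILED: 500,
  TIMEOUT: 504,
} as const;

type OntoErrorCode = keyof typeof HTTP_STATUS_BY_CODE;

interface OntoErrorBody {
  status: 'error';
  code: OntoErrorCode;
  message: string;
  retry_after?: string; // assumption: ISO 8601 timestamp, as in the example body
}

// Type guard: true only for bodies whose `code` is one of the documented codes.
function isOntoError(body: unknown): body is OntoErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    b.status === 'error' &&
    typeof b.code === 'string' &&
    Object.prototype.hasOwnProperty.call(HTTP_STATUS_BY_CODE, b.code) &&
    typeof b.message === 'string'
  );
}
```

Narrowing to `OntoErrorCode` lets the compiler flag a `switch` that misses a documented code, which may be useful if new codes are added later.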
Always branch on code, not on message or status code alone. Two 403s (WAF_BLOCKED vs ROBOTS_BLOCKED) mean very different things: the former is "site doesn't like crawlers", the latter is "site explicitly opted out". Two 429s (RATE_LIMITED vs CONCURRENT_LIMIT) need very different retry strategies.
Handling
Suggested branching in a Node client:
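```ts
// Runs inside an async function. `retry` and `notifyOps` are your own
// (hypothetical) helpers; they are not part of the API.
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

const r = await fetch(url, opts);
const data = await r.json();
if (!r.ok) {
  switch (data.code) {
    case 'CONCURRENT_LIMIT': {
      // Slot is full; retry after the suggested delay (typically 1s).
      // retry_after can be an ISO timestamp (see the example body), so
      // treat it as seconds only when it is numeric.
      const delaySec = typeof data.retry_after === 'number' ? data.retry_after : 1;
      await sleep(delaySec * 1000);
      return retry();
    }
    case 'RATE_LIMITED':
    case 'PAYMENT_REQUIRED':
      // Quota exhausted (hard cap, or plan cap with zero credits).
      // Don't retry blindly; escalate.
      notifyOps(`Onto quota exhausted: ${data.message}`);
      throw new Error(data.message);
    case 'ROBOTS_BLOCKED':
      // The site asked us not to crawl. Honor it; skip silently.
      return null;
    case 'WAF_BLOCKED':
    case 'URL_NOT_FOUND':
      // Origin issue; skip this URL.
      return null;
    case 'TIMEOUT':
    case 'EXTRACTION_FAILED':
      // Transient; retry once with fresh=true.
      return retry({ fresh: true });
    default:
      throw new Error(`Onto ${data.code}: ${data.message}`);
  }
}
```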
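Because the two 429s call for different retry strategies, it can help to isolate the delay decision in one pure function. The following is a sketch under stated assumptions, not documented behavior: `retryDelayMs` and `ErrorBody` are hypothetical names, the cap of five attempts is arbitrary, and `retry_after` on RATE_LIMITED is assumed to be the ISO timestamp at which the quota resets (as the example body suggests).

```ts
// Hypothetical helper: how long to wait before retrying, or null to give up.
type ErrorBody = { code: string; message: string; retry_after?: string | number };

function retryDelayMs(body: ErrorBody, attempt: number, now: Date = new Date()): number | null {
  switch (body.code) {
    case 'CONCURRENT_LIMIT':
      // Short back-off of about 1 second; the attempt cap (an assumption)
      // prevents an endless retry loop.
      return attempt < 5 ? 1000 : null;
    case 'RATE_LIMITED': {
      // Assumption: retry_after is an ISO timestamp marking the quota reset.
      if (typeof body.retry_after !== 'string') return null;
      const resetAt = Date.parse(body.retry_after);
      return Number.isNaN(resetAt) ? null : Math.max(0, resetAt - now.getTime());
    }
    case 'TIMEOUT':
    case 'EXTRACTION_FAILED':
      // Usually transient: retry once, immediately.
      return attempt === 0 ? 0 : null;
    default:
      // Auth, billing, robots, WAF, 404, and scanned-PDF errors will not
      // resolve by retrying.
      return null;
  }
}
```

For RATE_LIMITED the returned delay can be very long (up to the rest of the month), so a caller would more likely schedule or alert on it than sleep for it. Taking `now` as a parameter keeps the function deterministic in tests.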