Error codes

Every error returns JSON with a stable code and a human-readable message.

Error shape

All Read API errors share the same body. The HTTP status follows its standard meaning, but the code field is what your client should branch on, because message wording can change without notice.

json
{
  "status": "error",
  "code": "RATE_LIMITED",
  "message": "Monthly quota exceeded",
  "retry_after": "2026-06-01T00:00:00.000Z"
}
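
A typed guard can make the shape above easier to rely on in client code. The following is a minimal sketch, not an official SDK type: the names `OntoErrorBody` and `isErrorBody` are hypothetical, and treating `retry_after` as optional is an assumption based only on the sample body.

```typescript
// Hypothetical type name; the fields mirror the sample error body.
interface OntoErrorBody {
  status: 'error';
  code: string;
  message: string;
  // Assumption: not every error carries retry_after, so it is optional here.
  retry_after?: string | number;
}

// Narrow an unknown parsed JSON value to the error shape.
function isErrorBody(value: unknown): value is OntoErrorBody {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.status === 'error' &&
    typeof v.code === 'string' &&
    typeof v.message === 'string'
  );
}
```

Checking the parsed body with a guard like this before reading `code` keeps a malformed or unexpected response from being treated as a known error.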

Codes

Every code links to its dedicated reference page with full handling guidance, an example response body, retry semantics, and a TypeScript snippet.

| Code | HTTP | When | What to do |
| --- | --- | --- | --- |
| INVALID_URL | 400 | The request body wasn't valid JSON, or the url field was missing or unparseable as a URL. | See handling → |
| UNAUTHORIZED | 401 | The Bearer token is missing, malformed, or has been revoked. | See handling → |
| PAYMENT_REQUIRED | 402 | You're past the plan cap AND your credit balance is zero. | See handling → |
| WAF_BLOCKED | 403 | The target site's WAF or CDN refused the Onto crawler (origin returned 401 / 403). | See handling → |
| ROBOTS_BLOCKED | 403 | The target site's robots.txt explicitly disallows GPTBot or a wildcard user-agent. | See handling → |
| PLAN_RESTRICTED | 403 | The action requires a higher plan tier, e.g. purchasing credits on the Free tier. | See handling → |
| URL_NOT_FOUND | 404 | The target URL returned 404; the page doesn't exist on the origin. | See handling → |
| IMAGE_PDF | 422 | The URL resolved to a PDF that contains only scanned images, with no machine-readable text to extract. | See handling → |
| RATE_LIMITED | 429 | Monthly request quota exceeded; your plan is hard-capped (typically Free). | See handling → |
| CONCURRENT_LIMIT | 429 | Too many in-flight requests for your tier; back off for ~1 second and retry. | See handling → |
| EXTRACTION_FAILED | 500 | The cleaning engine errored mid-process; rare, usually transient. | See handling → |
| TIMEOUT | 504 | The target site took longer than 10 seconds to respond. | See handling → |

Always branch on code, not on message or status code alone. Two 403s (WAF_BLOCKED vs ROBOTS_BLOCKED) mean very different things: the former is "site doesn't like crawlers", the latter is "site explicitly opted out". Two 429s (RATE_LIMITED vs CONCURRENT_LIMIT) need very different retry strategies.
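
To see why the status code alone is ambiguous, the table can be written as a lookup and queried in reverse. This is an illustrative sketch only: `HTTP_STATUS_BY_CODE` and `codesForStatus` are hypothetical names, and the values are copied from the table on this page.

```typescript
// Code-to-status pairs as listed in the table on this page.
const HTTP_STATUS_BY_CODE = {
  INVALID_URL: 400,
  UNAUTHORIZED: 401,
  PAYMENT_REQUIRED: 402,
  WAF_BLOCKED: 403,
  ROBOTS_BLOCKED: 403,
  PLAN_RESTRICTED: 403,
  URL_NOT_FOUND: 404,
  IMAGE_PDF: 422,
  RATE_LIMITED: 429,
  CONCURRENT_LIMIT: 429,
  EXTRACTION_FAILED: 500,
  TIMEOUT: 504,
} as const;

type ErrorCode = keyof typeof HTTP_STATUS_BY_CODE;

// Return every error code that shares the given HTTP status.
function codesForStatus(status: number): ErrorCode[] {
  return (Object.keys(HTTP_STATUS_BY_CODE) as ErrorCode[]).filter(
    (code) => HTTP_STATUS_BY_CODE[code] === status,
  );
}

console.log(codesForStatus(403)); // three distinct codes share 403
console.log(codesForStatus(429)); // two distinct codes share 429
```

A client that switches on `r.status` would collapse each of those groups into a single branch, which is the mistake the guidance above warns against.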

Handling

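Suggested branching in a Node client. The snippet is meant to sit inside an async function, and sleep, retry, and notifyOps stand for helpers your application provides: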

ts
const r = await fetch(url, opts);
const data = await r.json();

if (!r.ok) {
  switch (data.code) {
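    case 'CONCURRENT_LIMIT': {
      // Slot is full; retry after a short delay (typically 1s).
      // retry_after is treated as seconds only when it is numeric. The sample
      // error body shows an ISO timestamp, which would give NaN if multiplied.
      const delayMs =
        typeof data.retry_after === 'number' ? data.retry_after * 1000 : 1000;
      await sleep(delayMs);
      return retry();
    }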

    case 'RATE_LIMITED':
    case 'PAYMENT_REQUIRED':
      // Out of plan + credits. Don't retry blindly — escalate.
      notifyOps(`Onto quota exhausted: ${data.message}`);
      throw new Error(data.message);

    case 'ROBOTS_BLOCKED':
      // The site asked us not to crawl. Honor it; skip silently.
      return null;

    case 'WAF_BLOCKED':
    case 'URL_NOT_FOUND':
      // Origin issue; skip this URL.
      return null;

    case 'TIMEOUT':
    case 'EXTRACTION_FAILED':
      // Transient; retry once with fresh=true.
      return retry({ fresh: true });

    default:
      throw new Error(`Onto: ${data.code} — ${data.message}`);
  }
}
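
The retry delay is the one place where the format of `retry_after` matters. The sample error body shows it as an ISO timestamp, while the snippet above reads it as a number of seconds, so a defensive client may want to accept both. A minimal sketch under that assumption follows; `retryDelayMs` is a hypothetical helper, and the two accepted formats are a guess rather than documented behavior.

```typescript
// Convert a retry_after value into a wait time in milliseconds.
// Assumption: numbers are seconds, strings are ISO 8601 timestamps.
// Anything else (or an unparseable string) falls back to fallbackMs.
function retryDelayMs(
  retryAfter: unknown,
  now: number = Date.now(),
  fallbackMs: number = 1000,
): number {
  if (typeof retryAfter === 'number' && Number.isFinite(retryAfter)) {
    return Math.max(0, retryAfter * 1000);
  }
  if (typeof retryAfter === 'string') {
    const at = Date.parse(retryAfter);
    // Date.parse returns NaN for strings it cannot read.
    if (!Number.isNaN(at)) return Math.max(0, at - now);
  }
  return fallbackMs;
}
```

In the CONCURRENT_LIMIT branch this could be used as `await sleep(retryDelayMs(data.retry_after))`. For RATE_LIMITED the computed delay may be days long, which is one more reason to escalate rather than wait.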