Edge middleware
The runtime piece — detects AI agents, serves clean Markdown, lets humans see the regular site.
What it does
On every request, the Onto middleware:
| Step | Action |
|---|---|
| 1 | Inspect the User-Agent. Match against the AI_BOTS registry (GPTBot, ClaudeBot, PerplexityBot, etc.) and the Accept header (text/markdown). |
| 2 | If not a bot → NextResponse.next() with identifying headers only. Zero impact on humans. |
| 3 | If a bot → fetch the pre-built /.onto/{route}.md (compiled by onto-next at build time). |
| 4 | Fire telemetry to /api/track (fire-and-forget; logs the hit + bot name + byte counts). |
| 5 | Fetch active per-route injection from /api/injections (Pro+ feature) — append to payload. |
| 6 | Return clean Markdown response with X-Onto-* headers. |
Setup
Create middleware.ts at your project root (or update existing):
middleware.ts
import { ontoMiddleware as middleware } from '@ontosdk/next/middleware';
export { middleware };
export const config = {
matcher: ['/((?!api|_next|.*\\..*).*)'],
};The matcher excludes /api/*, Next internals, and anything with a dot (static assets, the /.onto/*.md targets) so the middleware doesn't recurse on its own subrequest.
Bot detection
A request is treated as a bot when EITHER:
| Signal | Examples |
|---|---|
| User-Agent matches AI_BOTS | GPTBot, ClaudeBot, PerplexityBot, Googlebot, Bingbot, anthropic-ai, OAI-SearchBot |
| Accept: text/markdown header | Explicit content negotiation — used by Claude Code, Cursor, MCP clients, your own agents |
| ?onto query param | Debug — preview the bot view in your browser without changing UA |
Response headers
| Header | Meaning |
|---|---|
| Content-Type | text/markdown; charset=utf-8 (on bot match) or untouched |
| X-Onto-Bot | name + company of detected bot, e.g. 'GPTBot (OpenAI)' |
| X-Onto-Matched | 'true' when bot/markdown branch hit |
| X-Onto-Identified | 'true' on every request that has any detected bot (even non-matched) |
| X-Onto-Trace | The raw User-Agent we received — useful for debugging |
| X-Onto-Debug | Present only when ?onto query param was used |
| Cache-Control | no-store, must-revalidate on bot responses |
Debugging
Three ways to verify the middleware is working:
bash
# 1) Browser: add ?onto to any page URL
https://yoursite.com/pricing?onto
# 2) Terminal: bot UA
curl -sI -A 'GPTBot/1.0' https://yoursite.com/pricing
# 3) Terminal: explicit content negotiation
curl -sI -H 'Accept: text/markdown' https://yoursite.com/pricingLook for
X-Onto-Matched: true + Content-Type: text/markdown. If you see X-Vercel-Cache: HIT, the CDN is serving from cache (middleware didn't run, telemetry didn't fire). Cache headers on our responses say no-store but Vercel sometimes caches static rewrites independently.