Edge middleware

The runtime piece: it detects AI agents, serves them clean Markdown, and lets humans see the regular site.

What it does

On every request, the Onto middleware:

| Step | Action |
| --- | --- |
| 1 | Inspect the request: match the User-Agent against the AI_BOTS registry (GPTBot, ClaudeBot, PerplexityBot, etc.) and check the Accept header for text/markdown. |
| 2 | If not a bot → NextResponse.next() with identifying headers only. Zero impact on humans. |
| 3 | If a bot → fetch the pre-built /.onto/{route}.md (compiled by onto-next at build time). |
| 4 | Fire telemetry to /api/track (fire-and-forget; logs the hit, the bot name, and byte counts). |
| 5 | Fetch the active per-route injection from /api/injections (Pro+ feature) and append it to the payload. |
| 6 | Return a clean Markdown response with X-Onto-* headers. |
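The steps above can be sketched as a small control-flow function. This is only an illustration of the ordering, not the SDK's implementation: the IncomingRequest, OutgoingResponse, and FlowDeps shapes, the handle function, and the blank-line separator before the injection are all hypothetical, and the dependencies are synchronous here for brevity where a real middleware would use async fetches.

```typescript
// Hypothetical shapes for illustration; these are not the SDK's real types.
interface IncomingRequest {
  path: string;             // e.g. "/pricing"
  botLabel: string | null;  // outcome of step 1; null means human traffic
}

interface OutgoingResponse {
  kind: 'pass-through' | 'markdown';
  headers: Record<string, string>;
  body?: string;
}

interface FlowDeps {
  loadMarkdown(path: string): string;                               // step 3
  track(hit: { path: string; bot: string; size: number }): void;   // step 4
  loadInjection(path: string): string | null;                       // step 5
}

function handle(req: IncomingRequest, deps: FlowDeps): OutgoingResponse {
  // Step 2: humans continue to the regular page (identifying headers omitted here).
  if (req.botLabel === null) {
    return { kind: 'pass-through', headers: {} };
  }

  // Step 3: load the pre-built Markdown for this route.
  // How the root route "/" maps to a file is not covered by this sketch.
  const markdown = deps.loadMarkdown(`/.onto${req.path}.md`);

  // Step 4: record the hit. Character count stands in for a byte count.
  deps.track({ path: req.path, bot: req.botLabel, size: markdown.length });

  // Step 5: append the per-route injection when one is active.
  // The blank-line separator is an assumption.
  const injection = deps.loadInjection(req.path);
  const body = injection ? `${markdown}\n\n${injection}` : markdown;

  // Step 6: return Markdown with the matched marker.
  return {
    kind: 'markdown',
    headers: {
      'Content-Type': 'text/markdown; charset=utf-8',
      'X-Onto-Matched': 'true',
    },
    body,
  };
}
```

Passing the dependencies in as an object keeps the ordering of steps 3 to 6 easy to test without a network.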

Setup

Create middleware.ts at your project root (or update the existing one):

middleware.ts
import { ontoMiddleware as middleware } from '@ontosdk/next/middleware';

export { middleware };

export const config = {
  matcher: ['/((?!api|_next|.*\\..*).*)'],
};

The matcher excludes /api/*, Next.js internals (/_next/*), and any path containing a dot (static assets and the /.onto/*.md targets), so the middleware doesn't recurse on its own subrequest.
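To see which paths the matcher skips, the pattern can be approximated as a plain anchored regular expression. Next.js compiles matcher strings itself, so treat this as a sketch for reasoning about the exclusions, not as the exact compiled pattern; matcherApprox and middlewareRuns are hypothetical names.

```typescript
// Approximation of the matcher as an anchored RegExp (an assumption about
// how the pattern behaves, not the pattern Next.js actually compiles).
const matcherApprox = /^\/((?!api|_next|.*\..*).*)$/;

// Returns true when the middleware would run for this pathname.
function middlewareRuns(pathname: string): boolean {
  return matcherApprox.test(pathname);
}

const samplePaths = [
  '/',                      // runs
  '/pricing',               // runs
  '/api/track',             // skipped: starts with "api"
  '/_next/static/chunk.js', // skipped: Next.js internals
  '/logo.png',              // skipped: contains a dot
  '/.onto/pricing.md',      // skipped: contains a dot, so no recursion
];

samplePaths.forEach((p) => {
  console.log(p, middlewareRuns(p) ? 'runs' : 'skipped');
});
```

Under this approximation the pattern is broader than the prose suggests: a page route such as /apiary (it starts with "api") or /docs/v1.2 (it contains a dot) would also be skipped.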

Bot detection

A request is treated as a bot when ANY of these signals is present:

| Signal | Examples |
| --- | --- |
| User-Agent matches AI_BOTS | GPTBot, ClaudeBot, PerplexityBot, Googlebot, Bingbot, anthropic-ai, OAI-SearchBot |
| Accept: text/markdown header | Explicit content negotiation, used by Claude Code, Cursor, MCP clients, your own agents |
| ?onto query param | Debug: preview the bot view in your browser without changing UA |
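The three signals can be combined roughly as follows. This is a minimal sketch, not the SDK's detection code: SAMPLE_BOTS is an illustrative subset standing in for the real AI_BOTS registry, and the Detection shape, the detect function, the case-insensitive matching, and the order in which the signals are checked are all assumptions.

```typescript
// Illustrative subset only; the real AI_BOTS registry ships with the SDK
// and its shape and contents may differ.
const SAMPLE_BOTS: { token: string; label: string }[] = [
  { token: 'GPTBot', label: 'GPTBot (OpenAI)' },
  { token: 'ClaudeBot', label: 'ClaudeBot (Anthropic)' },
  { token: 'PerplexityBot', label: 'PerplexityBot (Perplexity)' },
];

interface Detection {
  isBot: boolean;
  signal: 'user-agent' | 'accept' | 'debug' | null;
  label: string | null; // the kind of value reported in X-Onto-Bot
}

function detect(userAgent: string, accept: string, search: string): Detection {
  // Signal 1: the User-Agent contains a known bot token.
  const ua = userAgent.toLowerCase();
  for (const bot of SAMPLE_BOTS) {
    if (ua.indexOf(bot.token.toLowerCase()) !== -1) {
      return { isBot: true, signal: 'user-agent', label: bot.label };
    }
  }

  // Signal 2: the client explicitly asks for Markdown.
  if (accept.toLowerCase().indexOf('text/markdown') !== -1) {
    return { isBot: true, signal: 'accept', label: null };
  }

  // Signal 3: the ?onto debug parameter, with or without a value.
  const query = search.replace(/^\?/, '');
  const hasOnto = query.split('&').some((pair) => pair.split('=')[0] === 'onto');
  if (hasOnto) {
    return { isBot: true, signal: 'debug', label: null };
  }

  return { isBot: false, signal: null, label: null };
}

console.log(detect('Mozilla/5.0 (compatible; GPTBot/1.0)', '*/*', ''));
```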

Response headers

| Header | Meaning |
| --- | --- |
| Content-Type | text/markdown; charset=utf-8 (on bot match) or untouched |
| X-Onto-Bot | Name and company of the detected bot, e.g. 'GPTBot (OpenAI)' |
| X-Onto-Matched | 'true' when the bot/Markdown branch was hit |
| X-Onto-Identified | 'true' on every request that has any detected bot (even non-matched) |
| X-Onto-Trace | The raw User-Agent we received; useful for debugging |
| X-Onto-Debug | Present only when the ?onto query param was used |
| Cache-Control | no-store, must-revalidate on bot responses |
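As a rough illustration of how the table translates into a bot response, the sketch below assembles the header set for a matched request. The BotHit shape and buildBotHeaders are hypothetical. The 'true' value for X-Onto-Debug is an assumption, since the table only says the header is present, and so is the choice to attach X-Onto-Trace to every bot response.

```typescript
// Hypothetical input shape; not part of the SDK.
interface BotHit {
  botLabel: string | null; // e.g. 'GPTBot (OpenAI)'; null when matched via Accept or ?onto
  userAgent: string;       // raw User-Agent as received
  debug: boolean;          // true when the ?onto query param was used
}

function buildBotHeaders(hit: BotHit): Record<string, string> {
  const headers: Record<string, string> = {
    'Content-Type': 'text/markdown; charset=utf-8',
    'X-Onto-Matched': 'true',
    'X-Onto-Trace': hit.userAgent,
    'Cache-Control': 'no-store, must-revalidate',
  };

  // Only set the bot headers when a registry entry was actually identified.
  if (hit.botLabel !== null) {
    headers['X-Onto-Bot'] = hit.botLabel;
    headers['X-Onto-Identified'] = 'true';
  }

  // Assumed value: the table only states that the header is present.
  if (hit.debug) {
    headers['X-Onto-Debug'] = 'true';
  }

  return headers;
}

console.log(buildBotHeaders({ botLabel: 'GPTBot (OpenAI)', userAgent: 'GPTBot/1.0', debug: false }));
```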

Debugging

Three ways to verify the middleware is working:

bash
# 1) Browser: add ?onto to any page URL
https://yoursite.com/pricing?onto

# 2) Terminal: bot UA
curl -sI -A 'GPTBot/1.0' https://yoursite.com/pricing

# 3) Terminal: explicit content negotiation
curl -sI -H 'Accept: text/markdown' https://yoursite.com/pricing

Look for X-Onto-Matched: true and Content-Type: text/markdown in the response headers. If you see X-Vercel-Cache: HIT, the CDN served the response from cache, so the middleware did not run and telemetry did not fire. Bot responses are sent with Cache-Control: no-store, but Vercel sometimes caches static rewrites independently of that header.
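
The same checks can be scripted. The sketch below classifies a set of response headers using only the rules described above; the Verdict type and the diagnose function are hypothetical helpers, not part of the SDK.

```typescript
type Verdict = 'markdown-served' | 'cached-by-cdn' | 'not-matched';

function diagnose(headers: Record<string, string>): Verdict {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const h: Record<string, string> = {};
  Object.keys(headers).forEach((name) => {
    const value = headers[name];
    if (value !== undefined) {
      h[name.toLowerCase()] = value;
    }
  });

  // A CDN cache hit means the middleware did not run for this request,
  // so check it before trusting any X-Onto-* header.
  if ((h['x-vercel-cache'] || '').toUpperCase() === 'HIT') {
    return 'cached-by-cdn';
  }

  const matched = h['x-onto-matched'] === 'true';
  const isMarkdown = (h['content-type'] || '').indexOf('text/markdown') !== -1;
  return matched && isMarkdown ? 'markdown-served' : 'not-matched';
}

console.log(
  diagnose({
    'Content-Type': 'text/markdown; charset=utf-8',
    'X-Onto-Matched': 'true',
  })
);
```

Feed it the header map from whichever HTTP client you use, for example the output of the curl commands above parsed into key/value pairs.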