March 21, 2026

LLMs are crawling your Next.js site right now

Kacper Siniło
They're downloading your full HTML page - RSC payloads, hydration scripts, font preloads, inline styles, the works - just to pull out a product title, a price, and a description. 26 KB parsed. 101 bytes kept.

That's not a failure of the LLM. It's a failure of the server.

HTTP solved this decades ago. It's called content negotiation. The client sends an Accept header telling the server what format it wants. The server responds accordingly.
Browser:    Accept: text/html
LLM agent:  Accept: text/markdown
API client: Accept: application/json
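To make the mechanism concrete, here is a minimal sketch of server-side negotiation over those three formats. It is an illustration only, not how any particular framework implements it: `parseAccept` and `negotiate` are hypothetical names, and the q-value handling is deliberately simplified (exact type matches only, no `text/*` wildcards).

```typescript
// Representations this hypothetical server can produce.
type Format = "text/html" | "text/markdown" | "application/json";

const SUPPORTED: Format[] = ["text/html", "text/markdown", "application/json"];

// Parse "text/html;q=0.5, text/markdown" into entries sorted by q, highest first.
function parseAccept(header: string): { type: string; q: number }[] {
  return header
    .split(",")
    .map((part) => {
      const pieces = part.trim().split(";");
      const type = (pieces[0] ?? "").trim().toLowerCase();
      const qParam = pieces
        .slice(1)
        .map((p) => p.trim())
        .find((p) => p.startsWith("q="));
      const q = qParam ? Number(qParam.slice(2)) : 1; // q defaults to 1
      return { type, q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.type !== "" && entry.q > 0)
    .sort((a, b) => b.q - a.q);
}

// Pick the most-preferred supported format; fall back to HTML.
function negotiate(acceptHeader: string | null): Format {
  if (!acceptHeader) return "text/html";
  for (const { type } of parseAccept(acceptHeader)) {
    if (type === "*/*") return "text/html";
    const match = SUPPORTED.find((format) => format === type);
    if (match) return match;
  }
  return "text/html";
}
```

Falling back to HTML keeps browsers and unknown clients on the default path; only a client that explicitly asks for another format gets one.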
Backend developers do this routinely. Express, Django, Rails - they all support it out of the box. Next.js doesn't. So you're stuck with two bad options:
  1. Separate endpoints like /api/products/123.md - duplicates your routing, drifts out of sync, and forces clients to know about a non-standard URL scheme.
  2. Markdown-only pages - breaks the experience for humans.
I built next-md-negotiate to close this gap. Same URL. Same route. Different response based on what the client actually wants.
Browser   → GET /products/42  Accept: text/html     → Normal Next.js page
LLM agent → GET /products/42  Accept: text/markdown → Clean Markdown
No new URLs. No duplicate routing. The client just sets a header.

The library hooks into Next.js at the rewrite layer (or middleware, your choice). The flow looks like this:
  1. Request comes in with Accept: text/markdown
  2. If a configured route matches, rewrite to /md-api/...
  3. Catch-all handler runs your Markdown function
  4. Respond with 200 text/markdown
If the header is missing or the route isn't configured, the normal Next.js page renders as usual. The library checks for text/markdown, application/markdown, or text/x-markdown in the Accept header. A matching request is rewritten internally to a catch-all API handler that calls your Markdown function. Your browser users never see any of this. They get the same HTML they always did.
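The decision described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's actual code: `wantsMarkdown`, `patternToRegExp`, and `resolveRewrite` are hypothetical names, and the rewrite target is assumed to be `/md-api` followed by the original path.

```typescript
// The three Markdown MIME types the post says are recognized.
const MARKDOWN_TYPES = ["text/markdown", "application/markdown", "text/x-markdown"];

function wantsMarkdown(accept: string | null): boolean {
  if (!accept) return false;
  const lower = accept.toLowerCase();
  return MARKDOWN_TYPES.some((type) => lower.includes(type));
}

// Turn "/products/[productId]" into a RegExp that matches "/products/42".
// Only simple [param] segments are handled in this sketch.
function patternToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("/")
    .map((segment) =>
      /^\[[^\]]+\]$/.test(segment)
        ? "[^/]+"
        : segment.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"),
    )
    .join("/");
  return new RegExp(`^${source}$`);
}

// Returns the internal rewrite target, or null to let the normal page render.
// ASSUMPTION: the target is "/md-api" + the original pathname.
function resolveRewrite(
  pathname: string,
  accept: string | null,
  patterns: string[],
): string | null {
  if (!wantsMarkdown(accept)) return null;
  const matched = patterns.some((pattern) => patternToRegExp(pattern).test(pathname));
  return matched ? `/md-api${pathname}` : null;
}
```

A `null` result covers both fall-through cases from the text: no Markdown in the Accept header, or no configured route for the path.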
Bash
npm install next-md-negotiate
Or scaffold everything automatically:
Bash
npx next-md-negotiate init
Ts
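// md.config.ts
import { createMdVersion } from "next-md-negotiate";
// `db` below stands in for your own data-access layer; import it from wherever it lives in your app.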

export const mdConfig = [
  createMdVersion("/products/[productId]", async ({ productId }) => {
    const product = await db.products.find(productId);
    return `# ${product.name}\n\nPrice: ${product.price}\n\n${product.description}`;
  }),

  createMdVersion("/blog/[slug]", async ({ slug }) => {
    const post = await db.posts.find(slug);
    return `# ${post.title}\n\n${post.content}`;
  }),
];
Parameters are type-safe - { productId } is inferred directly from the [productId] in the pattern.
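For the curious, this kind of inference can be done with TypeScript template literal types. The sketch below is one way it could work, not the library's actual typing: `ParamNames`, `ParamsOf`, and `defineRoute` are hypothetical names invented for illustration.

```typescript
// Collect every "[name]" segment of a route pattern as a union of string literals.
type ParamNames<P extends string> =
  P extends `${string}[${infer Name}]${infer Rest}`
    ? Name | ParamNames<Rest>
    : never;

// Map those names to string-valued params.
type ParamsOf<P extends string> = { [K in ParamNames<P>]: string };

// Hypothetical helper (NOT createMdVersion): the generic P captures the literal
// pattern, so the callback's params are typed from the pattern itself.
function defineRoute<P extends string>(
  pattern: P,
  render: (params: ParamsOf<P>) => string,
): { pattern: P; render: (params: ParamsOf<P>) => string } {
  return { pattern, render };
}

// `productId` is known to be a string here; a typo such as `productID`
// would be a compile-time error.
const productRoute = defineRoute(
  "/products/[productId]",
  ({ productId }) => `# Product ${productId}`,
);
```

Calling `productRoute.render({ productId: "42" })` returns `"# Product 42"`.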
Ts
// next.config.ts
import { createRewritesFromConfig } from "next-md-negotiate";
import { mdConfig } from "./md.config";

export default {
  async rewrites() {
    return {
      beforeFiles: createRewritesFromConfig(mdConfig),
    };
  },
};
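If you want a mental model of what ends up in `beforeFiles`, Next.js rewrites accept a `has` condition that can match on a request header. The sketch below builds one such rule per route. It is a guess at the shape, not the library's real output: `toRewrite` is a hypothetical name, the `/md-api` destination prefix is an assumption, the header value is written on the assumption that it is matched as a regular expression against the whole header, and only simple `[param]` segments are converted.

```typescript
// Local types mirroring the shape of a Next.js rewrite rule with a header condition.
interface HeaderCondition {
  type: "header";
  key: string;
  value?: string;
}

interface RewriteRule {
  source: string;
  destination: string;
  has?: HeaderCondition[];
}

// Hypothetical helper: convert one route pattern into one rewrite rule.
function toRewrite(pattern: string): RewriteRule {
  // "/products/[productId]" -> "/products/:productId" (Next.js rewrite path syntax)
  const source = pattern.replace(/\[([^\]]+)\]/g, ":$1");
  return {
    source,
    // ASSUMPTION: the catch-all handler lives under /md-api.
    destination: `/md-api${source}`,
    has: [
      {
        type: "header",
        key: "accept",
        value: ".*(text/markdown|application/markdown|text/x-markdown).*",
      },
    ],
  };
}
```

Because the rule only fires when the header condition holds, requests without a Markdown Accept header never reach the rewrite and render the page as before.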
And create the catch-all handler:
Ts
// app/md-api/[...path]/route.ts  (App Router)
import { createMdHandler } from "next-md-negotiate";
import { mdConfig } from "@/md.config";

export const GET = createMdHandler(mdConfig);
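Conceptually, the catch-all handler has to do three things: match the incoming path against the configured patterns, extract the parameters, and return the rendered Markdown with the right content type. Here is a rough sketch of that logic. It is not the source of `createMdHandler`: `matchRoute` and `handleMarkdown` are hypothetical names, and the result is a plain object rather than a real `Response` to keep the example dependency-free.

```typescript
type Renderer = (params: Record<string, string>) => Promise<string> | string;

interface Route {
  pattern: string;
  render: Renderer;
}

interface MdResult {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Match "/products/42" against "/products/[productId]" and collect params.
function matchRoute(pattern: string, pathname: string): Record<string, string> | null {
  const patternSegments = pattern.split("/").filter(Boolean);
  const pathSegments = pathname.split("/").filter(Boolean);
  if (patternSegments.length !== pathSegments.length) return null;

  const params: Record<string, string> = {};
  for (let i = 0; i < patternSegments.length; i++) {
    const expected = patternSegments[i]!;
    const actual = pathSegments[i]!;
    const dynamic = /^\[([^\]]+)\]$/.exec(expected);
    if (dynamic) {
      params[dynamic[1]!] = decodeURIComponent(actual);
    } else if (expected !== actual) {
      return null; // static segment mismatch
    }
  }
  return params;
}

// Run the first matching route's Markdown function; 404 if nothing matches.
// In a real route handler you would wrap the result in `new Response(body, { status, headers })`.
async function handleMarkdown(routes: Route[], pathname: string): Promise<MdResult> {
  for (const route of routes) {
    const params = matchRoute(route.pattern, pathname);
    if (params) {
      const body = await route.render(params);
      return {
        status: 200,
        headers: { "Content-Type": "text/markdown; charset=utf-8" },
        body,
      };
    }
  }
  return { status: 404, headers: { "Content-Type": "text/plain" }, body: "Not found" };
}
```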
That's it. Three files, zero config duplication.

257x smaller payloads. That's the ratio between the Next.js HTML response and the equivalent Markdown for a simple product page: roughly 26 KB of HTML against 101 bytes of Markdown in the example above. For LLMs, this means:
  • Fewer tokens consumed - you're not burning context window on script tags and hydration data
  • Better extraction accuracy - no parsing HTML soup to find the three fields that matter
  • Faster responses - less data over the wire, less processing on the model side
For you, it means:
  • Single source of truth - one URL, one routing layer, multiple representations
  • No drift - your Markdown definitions live next to your page definitions
  • Standard HTTP - any client that sets an Accept header gets the right format
If you already have a middleware.ts handling auth, i18n, or redirects, you can integrate content negotiation there instead of using rewrites:
Ts
// middleware.ts
import { createNegotiatorFromConfig } from "next-md-negotiate";
import { mdConfig } from "./md.config";

const md = createNegotiatorFromConfig(mdConfig);

export function middleware(request: Request) {
  const mdResponse = md(request);
  if (mdResponse) return mdResponse;

  // ...your other middleware logic
}
Same config, same single source of truth. Just a different integration point.
Bash
# Normal HTML
curl http://localhost:3000/products/42

# Markdown for LLMs
curl -H "Accept: text/markdown" http://localhost:3000/products/42
Two commands to see the difference.
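An agent written in TypeScript would do the same thing as the second curl command with `fetch`. The sketch below assumes the dev server from the curl example is running on localhost:3000; `markdownHeaders` and `fetchPreferMarkdown` are hypothetical helper names, not part of the library.

```typescript
// Headers an agent might send to ask for the Markdown representation.
function markdownHeaders(): Record<string, string> {
  return { Accept: "text/markdown" };
}

// Fetch a page and report which representation the server actually returned,
// so the caller can fall back to HTML parsing if the route has no Markdown version.
async function fetchPreferMarkdown(
  url: string,
): Promise<{ isMarkdown: boolean; body: string }> {
  const res = await fetch(url, { headers: markdownHeaders() });
  const contentType = res.headers.get("content-type") ?? "";
  return { isMarkdown: contentType.includes("markdown"), body: await res.text() };
}

// Usage (requires the server to be running):
// fetchPreferMarkdown("http://localhost:3000/products/42").then(console.log);
```

Checking the response's Content-Type matters because an unconfigured route still answers with HTML, as described earlier.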
The repo is MIT-licensed and works with both App Router and Pages Router. GitHub: github.com/kasin-it/next-md-negotiate
Bash
npm install next-md-negotiate
npx next-md-negotiate init
If you're building anything that LLMs or AI agents interact with, HTTP already has the answer. We just need Next.js to speak the language.