llms.txt: Should You Add One in 2026? What the 300k-Domain Data Actually Shows.

llms.txt adoption is surging, but the public data on whether it actually lifts AI citations tells a different story than LinkedIn does. Here's the honest read.

By Jhonty Barreto

Founder of SEO Engico | May 1, 2026 | 10 min read


Every week a client asks me the same thing. "Should we add llms.txt?"

For the last six months I've been giving the same shrug answer. "Probably won't hurt." That wasn't good enough. So I sat down with the public data, the spec, the major adopters, and what I'm seeing across our agency portfolio, and tried to give a real answer.

This post walks through what llms.txt is, who has actually adopted it, what the largest study to date found across 300,000 domains, and what I'd recommend you do about it. Spoiler: the answer is more boring than the LinkedIn hype suggests.

What is llms.txt?

llms.txt is a markdown file you place at the root of your domain (yoursite.com/llms.txt) that gives large language models a curated, token-efficient map of your most important content. It was proposed by Jeremy Howard at Answer.AI on September 3, 2024, the same Jeremy Howard who co-founded fast.ai and built the practical deep learning courses that trained a generation of ML engineers.

Howard's logic was simple. Most web pages are 90% navigation, ads, scripts, and chrome. LLMs have a finite context window. If you can hand them a clean markdown index of your best content, they don't have to chew through your nav bar to find it.

The official spec at llmstxt.org defines a strict format. An H1 with the site name. An optional blockquote summary. Optional prose context. Then H2 sections containing markdown bullet lists of links, each with a short description after a colon. There's also an "Optional" H2 reserved for content that LLMs can skip when context budget gets tight.

That's it. The whole spec fits on one page.

How llms.txt differs from robots.txt

This is where most articles confuse people. They're not the same thing. They're not even cousins.

robots.txt is permission. It tells crawlers what they're allowed to access. It's exclusion-focused.

llms.txt is recommendation. It tells LLMs what's worth reading. It's curation-focused.

As Carolyn Shelby at Search Engine Land put it bluntly, llms.txt is "a map, not a muzzle." If you're still confused on the older standard, our robots.txt SEO guide walks through what robots.txt actually controls in 2026.
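To make the contrast concrete, here's a minimal, hypothetical robots.txt fragment showing the permission model (the bot name and paths are illustrative; the llms.txt format gets its own full example further down):

```text
# robots.txt is about access, not curation.
# Block one AI crawler from a directory, allow everything else.
User-agent: GPTBot
Disallow: /internal/

User-agent: *
Allow: /
```

Nothing in there says what's *good*. That's the gap llms.txt is trying to fill.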

Who has actually adopted llms.txt?

The list of adopters looks impressive on paper. Anthropic, Stripe, Cloudflare, Mintlify, Perplexity, and Vercel all maintain llms.txt files. The llms-txt-hub directory on GitHub indexes over 1,000 known implementations, and the SecretiveShell Awesome-llms-txt repo tracks even more.

Cloudflare went further than most. Their developer docs publish a full llms.txt with product-specific sub-files for Workers, Vectorize, Agents, and the rest of the developer platform. Mintlify auto-generates llms.txt for every customer site, hosting it at both the root and at /.well-known/llms.txt for compatibility.

So the story you read on LinkedIn is: "everyone's adopting it."

The reality in the actual data is more boring. SE Ranking's analysis of about 300,000 domains found 10.13% adoption overall. The Rankability report found that just 3 of the top 1,000 websites have one. Most adopters are devtools, SaaS docs, and AI-native companies. The long tail of regular business sites? Almost no one.

That mismatch matters. If the brands AI tools cite most rarely have llms.txt, the file isn't doing the heavy lifting people claim.

What the largest study to date actually found

The most serious public analysis on llms.txt and AI citations comes from SE Ranking. Their team looked at roughly 300,000 domains, segmented by adoption, and ran correlation tests and machine learning models against AI citation frequency on ChatGPT, Perplexity, and Google AI Mode.

Their headline conclusion, published in Search Engine Journal, was direct: "llms.txt doesn't seem to directly impact AI citation frequency. At least not yet."

Three things to note about that finding.

  1. The study controlled for site authority and content quality, so the result isn't simply "big sites get cited more." It's specifically asking whether the file itself adds lift, and the answer was no measurable effect.
  2. The sample skews toward sites that adopted llms.txt early. If anything, the data is biased toward showing a positive effect. It still didn't.
  3. The major platforms have not publicly confirmed they ingest llms.txt as a ranking or retrieval signal. Google has framed AI Mode and AI Overviews as evolutions of Search using existing systems. OpenAI's crawler documentation focuses on robots.txt, not llms.txt. Anthropic publishes their own llms.txt but hasn't confirmed Claude treats third-party llms.txt files specially.

What I'm seeing across our agency portfolio lines up with the SE Ranking finding. None of our clients with llms.txt deployed have reported a clear, isolated citation lift that couldn't be explained by something else they were doing in parallel (publishing more, earning links, fixing schema). The one segment where I do see anecdotal lift is documentation-style SaaS sites, which is the use case llms.txt was designed for.

Where llms.txt does seem to matter

If you read between the lines of the SE Ranking data and Mintlify's customer reports, the same pattern keeps showing up.

SaaS documentation sites see something. Mintlify customers, Stripe-style API docs, dev tool reference content. These are the implementations where Perplexity in particular seems to pull from llms.txt-mapped pages more often. That tracks. llms.txt was built to help AI tools navigate technical documentation, and Perplexity's retrieval is more aggressive on markdown-style content than the other major engines.

Local service businesses, ecommerce stores, and most B2B sites show no measurable lift. AI Mode is going to cite Google Business Profile data, Reddit threads, and review sites long before it reads your llms.txt. For a clearer view of how AI engines actually choose what to cite across platforms, our breakdown of how to get your brand into AI answers covers the levers that actually move the needle. The first-500-words ChatGPT citation study shows where in your content AI tools actually pull quotes from. Hint: it's not your llms.txt.

What an llms.txt file actually looks like

If you've never seen one, here's the format from the official spec.

# SEO Engico

> SEO Engico is an SEO agency that helps businesses get found in Google and AI search platforms.

We focus on technical SEO, AI search optimisation, and link building for B2B and local service businesses.

## Services

- [Technical SEO Audit](https://seoengico.com/services/technical-seo-audit.md): Full technical health audit with crawl, index, and Core Web Vitals analysis.
- [Link Building](https://seoengico.com/link-building.md): White-hat link acquisition for SaaS and service businesses.

## Guides

- [Robots.txt SEO Guide](https://seoengico.com/blog/robots-txt-seo-guide.md): How to optimise robots.txt for crawl budget in 2026.
- [LLM Optimization Guide](https://seoengico.com/blog/llm-optimization-how-to-get-your-brand-into-ai-answers.md): Practitioner guide to AI citations.

## Optional

- [About SEO Engico](https://seoengico.com/about.md): Background on the team and methodology.

A few rules from the spec worth knowing:

  1. The H1 is the only required line.
  2. Each link should have a .md suffix pointing to a clean markdown version of the page (a separate but complementary proposal Howard made in the same announcement).
  3. The "Optional" H2 is special. LLMs can drop it when context is tight.
  4. Keep descriptions short and informational. No marketing fluff.
  5. List your most important content first.
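If you want to sanity-check a hand-written file against those rules, a rough validator is easy to sketch. This is my own illustrative helper, not an official tool, and it only covers the structural basics (required H1, link-line shape):

```python
import re

def check_llms_txt(text):
    """Rough structural checks for an llms.txt file, per the llmstxt.org spec.

    Returns a list of problems; an empty list means the basics look right.
    """
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    # Rule 1: the H1 site name is the only required element.
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 site name on the first line")
    # Each bullet should be '[title](url)' with an optional ': description'.
    for line in lines:
        if line.startswith("- ") and not re.match(
            r"- \[[^\]]+\]\([^)]+\)(: .+)?$", line
        ):
            problems.append(f"malformed link line: {line!r}")
    return problems
```

Run it over your draft before you deploy; it catches the broken-link-syntax mistakes I see most often in hand-rolled files.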

If you run docs on Mintlify, Vercel, or any modern doc platform, llms.txt is generated automatically. You don't have to write it.

Should you implement llms.txt? My actual recommendation

This is where I disappoint everyone selling llms.txt courses.

Implement it if

  • You run technical documentation for a SaaS, API, or developer tool. The Perplexity behaviour is real for this segment, and Anthropic, Stripe, and Cloudflare all do it for a reason.
  • You publish on a platform that generates it automatically. There's zero cost to leaving it on.
  • You want to future-proof. The major platforms might start using it. The cost of having one is essentially nothing.
  • You have structured, evergreen content that AI tools should know about (knowledge bases, glossaries, comprehensive guides).

Don't bother if

  • You run a local service business. As noted above, AI Mode leans on Google Business Profile data, Reddit threads, and review sites; it isn't reading your llms.txt.
  • You expect a quick citation lift. The public data does not support that expectation.
  • You're trading off actual content work for it. An hour spent improving a pillar page beats ten hours perfecting an llms.txt file every single time.
  • Your dev team is busy with real technical SEO debt. Fix that first.

For a clearer view of what's actually working in AI search across all three platforms, our AI search platform citation strategy piece breaks down the platform-specific tactics that move citations. The get cited in ChatGPT and AI Overviews guide goes into the structural content moves that matter more than any robots-style file.

A 5-step implementation checklist if you do decide to add it

If you've decided to ship llms.txt anyway, here's how to do it without wasting time.

  1. List your top 20 highest-value pages. Service pages, pillar guides, glossary entries, key blog posts. Anything you'd want an AI to cite.
  2. Generate clean markdown versions at the same URL with .md appended. Most CMSs can do this with a small plugin or function.
  3. Write the llms.txt file using the format above. H1 site name, blockquote summary, H2 sections with bullet lists. Use the official spec as your reference.
  4. Deploy at /llms.txt at the root domain. Optionally also at /.well-known/llms.txt for forward compatibility.
  5. Verify with curl: curl -I https://yoursite.com/llms.txt should return a 200, content-type text/plain or text/markdown.

That's a 30-minute job for most sites. If your dev team is quoting you a sprint, something's wrong.
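Steps 1 through 3 are mechanical enough to script. Here's a minimal sketch, assuming you already have your page list; the function name and data shape are my own, not part of any official tooling:

```python
def build_llms_txt(site_name, summary, sections):
    """Assemble an llms.txt body following the llmstxt.org format.

    `sections` maps an H2 section name to a list of
    (title, url, description) tuples, most important first.
    """
    parts = [f"# {site_name}", "", f"> {summary}", ""]
    for section, links in sections.items():
        parts.append(f"## {section}")
        parts.extend(f"- [{title}]({url}): {desc}" for title, url, desc in links)
        parts.append("")  # blank line between sections
    return "\n".join(parts).rstrip() + "\n"
```

Feed your top-20 list from step 1 into it, write the result to /llms.txt, and steps 1 through 4 are done in a few dozen lines of code.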

What's actually driving AI citations right now

I'll close with what I'd put my time into instead, based on tracking citations across our agency portfolio over the last 18 months.

  • Pillar content depth. AI tools cite the most comprehensive single page on a topic. Length matters less than coverage.
  • Structured early content. Question-format H2s and concise answers in the first 30% of the page get cited disproportionately.
  • Brand mentions across the open web. Wikipedia, industry publications, Reddit, niche forums. AI models recognise brands by web-wide footprint, not by your own site.
  • Schema markup. Article, FAQ, HowTo, Product. Clean structured data helps every retrieval system, not just Google.
  • Original data and research. AI tools love a defensible statistic. Run a study, publish a report, give them something to quote.

llms.txt is fine. It's free to deploy, it's interesting from a standards perspective, and it might matter more in 18 months than it does today. But if you're spending real budget on it, in 2026, expecting near-term citation gains, the public data isn't on your side and our client data isn't either.

Build the better page first. Add the llms.txt file in the last 30 minutes of the project, not the first.
