What Is llms.txt? The Proposed AI Content Standard

llms.txt is a proposed Markdown file that gives AI systems a curated guide to your key content. Learn what it is, who actually uses it, and whether you need one.

llms.txt is a proposed convention — a plain Markdown file placed at a website's root (/llms.txt) — that gives AI systems a curated, human-readable guide to a site's most important content, often with a short authoritative summary of the brand. Introduced in September 2024 by Jeremy Howard of Answer.AI, it aims to help language models find and understand the right pages without wading through cluttered HTML. It's important to be clear up front: as of 2026, llms.txt is a community convention, not a formally adopted standard, and its real-world usefulness is narrower than much of the hype suggests.

This guide explains what llms.txt is, how it differs from robots.txt, the honest state of its adoption, whether AI engines actually use it, and whether you should add one.

What is llms.txt and what's in it?

An llms.txt file is written in Markdown because Markdown is clean and easy for language models to parse. A typical file starts with the site or project name, follows with a one-to-three sentence summary that serves as the AI's "mental model" of the brand, and then lists curated links to key pages with short descriptions. A companion format, llms-full.txt, bundles fuller content into one document for deeper ingestion. The pitch is token efficiency: instead of parsing navigation, scripts and ads, an AI reads a tidy roadmap of what matters.
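To make that layout concrete, here is a minimal Python sketch that renders such a file from structured data. It assumes the layout commonly described for the proposal: a "# Name" heading, a "> summary" blockquote, then "## Section" headings with bulleted links. The class and function names (Link, Section, render_llms_txt) and every URL are hypothetical placeholders, not part of any official tooling.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(name: str, summary: str, sections: list[Section]) -> str:
    """Render a minimal llms.txt document as a Markdown string."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.description:
                entry += f": {link.description}"
            lines.append(entry)
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical example site: every name and URL below is a placeholder.
EXAMPLE = render_llms_txt(
    name="Example Co",
    summary="Example Co makes invoicing software for small businesses.",
    sections=[
        Section(
            "Docs",
            [
                Link(
                    "Quickstart",
                    "https://example.com/docs/quickstart",
                    "Set up an account in five minutes",
                ),
                Link("API reference", "https://example.com/docs/api"),
            ],
        ),
    ],
)

if __name__ == "__main__":
    print(EXAMPLE)
```

Saving the printed output as /llms.txt at the site root would give a file in the shape described above. Generating it from data rather than by hand is only one option; a hand-written file works just as well.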

How is llms.txt different from robots.txt?

| | robots.txt | llms.txt |
| --- | --- | --- |
| Purpose | Tells crawlers what they may or may not access | Points AI systems to your most important content |
| Tone | Restrictive ("don't go here") | Curatorial ("here's the good stuff") |
| Status | Long-established, widely respected standard | Community convention, not formally standardized |
| Format | Plain directives | Markdown |

A useful way to hold it: robots.txt is about access control, while llms.txt is about guidance. Keep the two consistent with each other. A common mistake is to let them contradict, for example by listing a page in llms.txt that robots.txt tells crawlers not to access.
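One way to catch such contradictions is to check every link in llms.txt against the rules in robots.txt. The sketch below uses Python's standard urllib.robotparser for the rule matching. The function name blocked_links, the sample file contents, and the regular expression for Markdown links are assumptions made for illustration. A real check would load both files from your own site and test the user-agent tokens you care about.

```python
from __future__ import annotations

import re
from urllib.robotparser import RobotFileParser

# Matches Markdown links of the form [title](http...) and captures the URL.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")

# Hypothetical file contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""

LLMS_TXT = """\
# Example Co

> Example Co makes invoicing software for small businesses.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Start here
- [Runbook](https://example.com/internal/runbook): Ops notes
"""


def blocked_links(llms_txt: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return URLs linked from llms.txt that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    urls = LINK_PATTERN.findall(llms_txt)
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    for url in blocked_links(LLMS_TXT, ROBOTS_TXT):
        print(f"Listed in llms.txt but disallowed by robots.txt: {url}")
```

With the sample files above, only the /internal/runbook link is reported. The fix is either to drop that link from llms.txt or to relax the robots.txt rule, depending on which file reflects your real intent.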

What's the honest state of adoption?

Adoption is real but modest, and concentrated in specific sectors. Studies across large samples of domains have found roughly one in ten sites publishing an llms.txt — figures commonly cited in the 5–15% range. Early adopters were technically sophisticated, developer-facing companies (Stripe, Vercel, Cloudflare, Anthropic and similar), and CMS tooling like Yoast and Webflow has begun making it easy to generate one. But adoption among large mainstream publishers remains low, and there's no central governing body — it's maintained as a community project rather than ratified by a standards organization.

Do AI engines actually use llms.txt?

This is where honesty matters most. As of 2026, no major AI search engine has publicly confirmed that it reads llms.txt as a first-class input for choosing what to cite in answers, and independent analyses have found little to no measurable improvement in AI-search citation from having one. Server-log studies show AI search and answer crawlers rarely request the file at all. Where llms.txt clearly does work is a different layer: developer and IDE agents — tools like Cursor, GitHub Copilot and other coding assistants — and documentation platforms routinely fetch /llms.txt to orient themselves before pulling specific pages. So the realistic read is that llms.txt is a developer-experience and "agentic web" signal today, not a proven lever for ChatGPT, Gemini, Claude or Perplexity citations.
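You can run a small version of that server-log check on your own site. The sketch below counts requests for /llms.txt and /llms-full.txt per user agent. It assumes access logs in the common "combined" format, and the user-agent tokens and sample log lines are synthetic placeholders, not real crawler names or real traffic. Substitute the tokens that the crawlers and agents you care about actually send.

```python
from __future__ import annotations

import re
from collections import Counter

# Assumes the "combined" access-log format:
# ip - - [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

TARGET_PATHS = ("/llms.txt", "/llms-full.txt")

# Synthetic log lines with made-up agents, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleIDEAgent/1.0"',
    '203.0.113.6 - - [01/Jan/2026:00:00:02 +0000] "GET /pricing HTTP/1.1" 200 2048 "-" "ExampleSearchBot/2.0"',
    '203.0.113.5 - - [01/Jan/2026:00:00:03 +0000] "GET /llms.txt?src=ide HTTP/1.1" 200 512 "-" "ExampleIDEAgent/1.0"',
    '203.0.113.7 - - [01/Jan/2026:00:00:04 +0000] "GET /llms-full.txt HTTP/1.1" 200 9000 "-" "curl/8.0"',
]


def count_llms_txt_requests(log_lines, agent_tokens):
    """Count requests for the llms.txt files, grouped by matching agent token."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # Skip lines that do not look like combined-format entries.
        path = match.group("path").split("?", 1)[0]  # Ignore any query string.
        if path not in TARGET_PATHS:
            continue
        agent = match.group("agent").lower()
        label = next((t for t in agent_tokens if t.lower() in agent), "other")
        counts[label] += 1
    return counts


if __name__ == "__main__":
    tokens = ["ExampleIDEAgent", "ExampleSearchBot"]
    for label, hits in count_llms_txt_requests(SAMPLE_LOG, tokens).most_common():
        print(f"{label}: {hits}")
```

On the synthetic sample, the hypothetical IDE agent accounts for two requests, one request falls under "other", and the hypothetical search bot never asks for the file. A count like this only shows who fetched the file, not whether it influenced any answer.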

Should you add an llms.txt?

For most sites, llms.txt is best understood as a low-cost, low-downside bet with real value in narrow cases and optionality for the future. It's worth doing if you publish developer documentation or have a plausible agent-driven use case (agentic commerce, vendor research, comparison), since those agents genuinely consume it. Even otherwise, creating one takes little time, gives you a clean human- and machine-readable brand summary, and positions you if a major AI platform later adopts it. What it is not is a substitute for the things that actually drive AI visibility today: retrievable, authoritative, well-structured content. Ship it if it's cheap, keep it consistent with robots.txt, and don't expect a citation spike.

llms.txt checklist

  1. Understand its status: a community convention, not a formal standard.
  2. Prioritize it if you're developer-facing or have an agent use case.
  3. Write a clear brand summary and curate links to key pages.
  4. Keep it consistent with robots.txt to avoid contradictions.
  5. Don't expect AI-search citation gains — invest in real content too.
  6. Revisit if a major AI platform formally adopts it.
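The structural item on this checklist (a brand summary plus curated links) can be spot-checked automatically. The sketch below is a hypothetical lint function, not an official validator. It assumes the file should open with a "# Name" heading, contain a "> summary" blockquote, and include at least one Markdown link. Treating the summary as expected is this sketch's own assumption rather than a confirmed requirement of the proposal.

```python
from __future__ import annotations

import re

# Matches at least one Markdown link with an absolute http(s) URL.
LINK_PATTERN = re.compile(r"\[[^\]]+\]\(https?://[^)\s]+\)")


def lint_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("first non-empty line should be a '# Name' heading")
    if not any(line.startswith("> ") for line in lines):
        problems.append("no '> summary' blockquote found")
    if not LINK_PATTERN.search(text):
        problems.append("no Markdown links to key pages found")
    return problems


# Hypothetical file contents, for illustration only.
GOOD = """\
# Example Co

> Example Co makes invoicing software for small businesses.

## Docs

- [Quickstart](https://example.com/docs/quickstart): Start here
"""

if __name__ == "__main__":
    for problem in lint_llms_txt(GOOD) or ["no problems found"]:
        print(problem)
```

A check like this only covers form. Whether the summary is accurate and the links are the right ones still needs a human read.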

Frequently asked questions

What is llms.txt?

llms.txt is a proposed convention: a Markdown file at a website's root that gives AI systems a curated guide to the site's most important content, usually with a short brand summary. It was introduced in September 2024 by Jeremy Howard of Answer.AI.

Is llms.txt an official standard?

No. As of 2026 it is a community convention maintained as an open project, not a standard ratified by a body like the IETF or W3C. Adoption is real but modest, commonly cited in the 5–15% range of sites.

How is llms.txt different from robots.txt?

robots.txt controls what crawlers may access (restrictive); llms.txt points AI systems to your most important content (curatorial). robots.txt is a long-established standard, while llms.txt is a newer community convention.

Do AI search engines use llms.txt?

No major AI search engine has confirmed using it as a first-class input for citation, and studies show little measurable citation gain. Its confirmed value is with developer and IDE agents and documentation platforms that fetch it to orient themselves.

Should I add an llms.txt file?

It's a low-cost, low-downside bet, most worthwhile for developer-facing sites or those with agent-driven use cases. It's not a substitute for retrievable, authoritative, well-structured content, which is what actually drives AI visibility today.

Written by

Federico Ergang

Cliro cofounder & CEO

Federico Ergang is cofounder and CEO of Cliro, the AI visibility and GEO platform for Latin America.
