What is llms.txt?

Understand the llms.txt proposed standard: its format, its purpose for large language models, and how it differs from robots.txt and sitemap.xml.

llms.txt is a proposed standard for a single Markdown file, placed at the root of your site, that gives large language models a curated map of your documentation. Jeremy Howard of Answer.AI proposed it on 3 September 2024. This guide explains what the file is, what problem it solves, and how it relates to standards you already know.

When an AI assistant answers a question about your product, it often needs to read your documentation at inference time. But a full documentation site is hard for a model to consume: navigation, scripts, and styling wrap each page, and the whole site is far too large to fit in a model’s context window.

llms.txt addresses this by offering a clean, curated index. Instead of crawling and parsing your entire site, a model can read one structured file that points to the pages that matter, each described in plain language.

How llms.txt differs from robots.txt and sitemap.xml

llms.txt is often confused with two files it sits alongside. They serve different purposes.

| File | Audience | Purpose |
| --- | --- | --- |
| robots.txt | Crawlers | Controls which paths a crawler may access. |
| sitemap.xml | Search engines | Lists every indexable URL for discovery. |
| llms.txt | Large language models | Provides a curated, human-readable overview for use at inference time. |

The distinction is curation. sitemap.xml aims to be complete and lists all of your pages. llms.txt is selective: it highlights the content most useful to a model and leaves out the rest.

The specification defines a precise structure built from standard Markdown, so both people and programs can read it:

  1. An H1 heading with the name of the project or site. This is the only required element.
  2. A blockquote with a short summary of the project, capturing the key information needed to understand the rest of the file.
  3. Zero or more Markdown sections of free-form detail (paragraphs, lists) with no headings.
  4. Zero or more H2 sections, each containing a list of links. Each list item is a Markdown link, optionally followed by a colon and a note.

A minimal example:

    # Example project

    > A short description of what the project does and who it is for.

    Additional context about the project in plain prose, with no heading.

    ## Docs

    - [Quickstart](https://example.com/quickstart.md): Get running in five minutes
    - [API reference](https://example.com/api.md): Full endpoint reference

    ## Optional

    - [Changelog](https://example.com/changelog.md): Release history
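
Because the structure is regular, a program can read it with very little code. The following is a minimal Python sketch, not a reference parser: the names `Link`, `LlmsTxt`, `parse_llms_txt`, and `links_for_context` are hypothetical, and the line-by-line approach assumes a simple file like the example above (one link per list item, no nested lists). It also assumes, following the proposal, that a section named Optional holds links a consumer may skip when it needs a shorter context.

```python
import re
from dataclasses import dataclass, field

# One list item: "- [title](url)" optionally followed by ": note".
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


@dataclass
class Link:
    title: str
    url: str
    note: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    details: list[str] = field(default_factory=list)
    sections: dict[str, list[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, blockquote, free-form detail, and H2 link sections."""
    doc = LlmsTxt()
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif current is not None:
            # Inside an H2 section, keep only lines that look like links.
            m = LINK_RE.match(line)
            if m:
                doc.sections[current].append(
                    Link(m["title"], m["url"], m["note"] or "")
                )
        elif line.startswith(">"):
            doc.summary = (doc.summary + " " + line[1:].strip()).strip()
        else:
            doc.details.append(line)
    return doc


def links_for_context(doc: LlmsTxt, include_optional: bool = False) -> list[Link]:
    """Flatten all links, skipping the 'Optional' section unless requested."""
    return [
        link
        for name, links in doc.sections.items()
        if include_optional or name != "Optional"
        for link in links
    ]


SAMPLE = """# Example project

> A short description of what the project does and who it is for.

Additional context about the project in plain prose, with no heading.

## Docs

- [Quickstart](https://example.com/quickstart.md): Get running in five minutes
- [API reference](https://example.com/api.md): Full endpoint reference

## Optional

- [Changelog](https://example.com/changelog.md): Release history
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title)                      # Example project
print(list(doc.sections))             # ['Docs', 'Optional']
print(len(links_for_context(doc)))    # 2
```

A production parser would more likely build on a real Markdown library; the regular expression here is only enough for the simple link form shown in the example.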

Two related conventions often accompany an llms.txt file:

  • Markdown versions of pages. The proposal recommends serving a clean .md version of each page at the same URL with .md appended (for example, quickstart.md). The links in llms.txt point to these Markdown versions so models receive content without site chrome. See Serve a Markdown version of every page.
  • llms-full.txt. Many tools also generate an llms-full.txt file that inlines the full text of the documentation in one file, rather than linking out. This is a widely adopted companion to llms.txt, not part of the original specification.
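
The URL rule for Markdown versions can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than part of any tool: `markdown_url` is a hypothetical helper, and the handling of URLs that end in a slash follows the proposal's suggestion to append `index.html.md` when there is no file name.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the URL where a Markdown version of page_url would be served."""
    parts = urlsplit(page_url)
    path = parts.path
    if not path or path.endswith("/"):
        # No file name in the URL: append index.html.md (assumption from the proposal).
        path = (path or "/") + "index.html.md"
    elif not path.endswith(".md"):
        # Ordinary page: same URL with .md appended.
        path += ".md"
    # Query string and fragment are carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(markdown_url("https://example.com/quickstart"))  # https://example.com/quickstart.md
print(markdown_url("https://example.com/docs/"))       # https://example.com/docs/index.html.md
```

URLs that already end in `.md` are returned unchanged, so the helper can be applied to links taken from an existing llms.txt file without doubling the suffix.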

llms.txt is a community proposal, not an official or ratified standard, and adoption is still evolving.

  • A growing number of documentation platforms generate the file automatically, and many sites now publish one.
  • Support among AI vendors is not universal, and there is ongoing public debate about how many major systems actually read llms.txt today versus crawling sites directly.