Much as robots.txt gives search engines direction on how to crawl a website, llms.txt aims to create a standardized way for AI systems to access and interpret web content.
LLMs.txt is a proposed standard by Jeremy Howard that allows website owners to provide LLM-friendly content in a structured format. The standard addresses a critical limitation of large language models: their context windows are often too small to handle entire websites, and converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text can be challenging and imprecise.
At its core, llms.txt is a markdown file placed in a website's root directory that provides AI models with the following:
Unlike robots.txt, which tells crawlers how to crawl and interact with a website, llms.txt provides AIs with content and shapes how that content is presented. We can:
The file is written in plain markdown, so both humans and LLMs can read it. You can think of it as a detailed sitemap.
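To make the "detailed sitemap in markdown" idea concrete, here is a minimal sketch of how such a file could be generated. It assumes the layout described in the public llms.txt proposal as I understand it: a title heading, a short blockquote summary, then sections of annotated links. The `Link`, `Section` and `render_llms_txt` names and the example.com URLs are made up for illustration and are not part of any official tool.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    """One entry in a section: a titled URL with an optional note."""
    title: str
    url: str
    note: str = ""


@dataclass
class Section:
    """A group of related links under one heading."""
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render an llms.txt-style markdown document (assumed layout, see lead-in)."""
    lines = [f"# {site_name}", "", f"> {summary}"]
    for section in sections:
        lines += ["", f"## {section.heading}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.note:
                entry += f": {link.note}"
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical site data, purely for illustration.
example = render_llms_txt(
    "Example Site",
    "Documentation and guides for the Example product.",
    [
        Section("Docs", [
            Link("Quick start", "https://example.com/docs/quickstart.md",
                 "Install and first steps"),
            Link("API reference", "https://example.com/docs/api.md"),
        ]),
    ],
)
print(example)
```

Saving the returned string as `llms.txt` in the site's root directory would give a file in the shape described above. Treat this as a starting point rather than a reference implementation.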
I created a tool that flattens pages or whole websites into an llms.txt file, but there are already several tools that help website owners build their llms.txt files:
The llms.txt standard is gaining traction, with several organizations already implementing llms.txt files on their documentation sites:
Exposure of content to competitors for analysis
Technically, a file like this, with all your content laid out in markdown and especially in full, would make it much easier for competitors to export everything you have published. This has always been possible, but never as easily as with a dedicated markdown file.
Potential security risks that require careful management
Some experts remain sceptical about the utility of llms.txt. Brett Tabke, CEO of Pubcon and WebmasterWorld, has argued that existing tools like XML sitemaps and robots.txt already serve similar purposes and that the distinction between search engines and LLMs is increasingly blurred. However, in my opinion, implementing it does no harm unless the costs outweigh the potential gain.