This workflow automatically generates an llms.txt file (following the llmstxt.org specification) for any given website. It uses ScrapegraphAI to crawl and scrape pages, an OpenAI chat model to process content, and finally uploads the generated file via FTP.
## llms.txt Generation

The workflow fully automates the creation of a compliant llms.txt file, eliminating the need for manual documentation and reducing maintenance time.
Using OpenAI and ScrapeGraphAI, the system intelligently analyzes the scraped page content, producing high-quality output specifically optimized for AI systems and LLM indexing.
The crawler automatically extracts all internal links from the website, making the workflow scalable to sites of any size.
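The link-extraction step can be approximated in plain Python. This is an illustrative sketch, not the workflow's actual implementation: the class name, example URLs, and HTML snippet are all hypothetical, and only internal links (same domain as the base URL) are kept.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class InternalLinkParser(HTMLParser):
    """Collects <a href> targets that stay on the base URL's domain."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.domain = urlparse(base_url).netloc
        self.links: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links, then keep only same-domain URLs.
        absolute = urljoin(self.base_url, href)
        if urlparse(absolute).netloc == self.domain:
            self.links.add(absolute)

# Hypothetical page fragment: one internal link, one external link.
html = '<a href="/docs">Docs</a> <a href="https://other.com/x">Ext</a>'
parser = InternalLinkParser("https://example.com/")
parser.feed(html)
internal_links = sorted(parser.links)
```

In a real crawl the HTML would come from an HTTP response for each page, and the collected links would be queued for scraping in turn.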
Pages are automatically grouped into meaningful sections, which improves readability and machine interpretability.
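One simple way to group pages into sections is by the first URL path segment. This is a sketch under that assumption; the workflow's AI agent may group pages semantically instead, and the function name and example URLs here are illustrative.

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_section(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs by their first path segment, e.g. /docs/* -> 'docs'."""
    sections: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        # The root page has an empty path; file it under 'home'.
        section = path.split("/")[0] if path else "home"
        sections[section].append(url)
    return dict(sections)

groups = group_by_section([
    "https://example.com/",
    "https://example.com/docs/setup",
    "https://example.com/docs/api",
    "https://example.com/blog/post-1",
])
```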
The workflow preserves the original language of the website content, ensuring consistency and localization for international projects.
After generation, the workflow converts the output into a .txt file and uploads it directly to an FTP server or CDN, enabling instant deployment without manual intervention.
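The convert-and-upload step corresponds roughly to the following sketch using Python's standard `ftplib`. The host, credentials, and remote path are placeholders, and `as_txt_bytes` is a hypothetical helper approximating n8n's file-conversion step:

```python
import io
from ftplib import FTP

def as_txt_bytes(markdown: str) -> bytes:
    """Approximate n8n's 'convert to file' step: UTF-8 encode the Markdown."""
    return markdown.encode("utf-8")

def upload_llms_txt(markdown: str, host: str, user: str, password: str,
                    remote_path: str = "llms.txt") -> None:
    """Upload the generated Markdown as llms.txt via a binary FTP STOR."""
    payload = io.BytesIO(as_txt_bytes(markdown))
    with FTP(host) as ftp:  # connects, and sends QUIT on exit
        ftp.login(user, password)
        ftp.storbinary(f"STOR {remote_path}", payload)

# Example call (placeholder credentials; not executed here):
# upload_llms_txt(llms_md, "ftp.example.com", "user", "secret",
#                 "/public_html/llms.txt")
```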
The entire process — from crawling to publishing — is automated inside n8n, significantly reducing operational effort for SEO teams, developers, and AI optimization workflows.
The generated llms.txt file helps AI systems and LLM crawlers discover, understand, and index your site's content.
The workflow is built from reusable components, making it easy to extend, customize, or integrate into larger automation systems.
The process begins when the workflow is manually triggered. It then:

- crawls the target domain and collects its internal links;
- calls the Scraper tool (via ScrapegraphAI) to scrape each URL's content;
- generates a compliant file (llms.txt) following the official spec.

The AI agent is explicitly forbidden from inventing content: it must call the Scraper tool for every URL before describing it. The output is pure Markdown, starting with #.
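The agent's final assembly step can be sketched as a small renderer that follows the llmstxt.org shape: an H1 title, a blockquote summary, then H2 sections of link lists. The function name, example site, and descriptions below are illustrative, not taken from the workflow:

```python
def build_llms_txt(site_name: str, summary: str,
                   sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a minimal llms.txt: '# Title', '> summary', then one
    '## Section' per group with '- [title](url): description' entries."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section, pages in sections.items():
        lines.append(f"## {section}")
        for title, url, desc in pages:
            lines.append(f"- [{title}]({url}): {desc}")
        lines.append("")
    return "\n".join(lines)

doc = build_llms_txt(
    "Example",
    "Short summary of the site.",
    {"Docs": [("Setup", "https://example.com/docs/setup", "Install guide")]},
)
```

Note the output starts with `#`, matching the agent's pure-Markdown constraint described above.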
To use this workflow in n8n, follow these steps:
In the OpenAI Chat Model node, the template references gpt-5.4-mini (note: this may be a custom name or typo; usual models are gpt-4o-mini or gpt-4).

Go to Credentials in n8n and add:

- ScrapegraphAI API (your ScrapegraphAI account)
- OpenAI API (your OpenAI account)
- FTP (e.g., a BunnyCDN FTP account)

In the Set domain node, change your_domain to your target domain (e.g., example.com).
Do not include https:// – only the domain name.
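If you want to sanitize the value before pasting it into the Set domain node, a small helper like this strips the scheme and path (the function name is hypothetical; the node itself simply expects a bare domain):

```python
from urllib.parse import urlparse

def normalize_domain(value: str) -> str:
    """Accept 'example.com' or 'https://example.com/path' and return
    just the bare domain the Set domain node expects."""
    value = value.strip()
    if "://" in value:
        return urlparse(value).netloc
    # No scheme: drop any trailing path segments.
    return value.split("/")[0]
```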
In the Wait node, change the amount (default 20) to a higher value if the target site is large or slow to crawl.
In the Upload to FTP node, update the path field.
Currently it is:

`=/YOUR_PATH/{{$binary.data.fileName}}`

Change `YOUR_PATH` to the actual remote directory (e.g., /public_html/).
The file will be saved as llms.txt.
The prompt inside the LLMS.txt Agent node can be adapted to fit your specific use case.
Check your FTP server for the generated llms.txt.
Test it by opening the file in a text editor – it should be pure Markdown starting with `# Site name`.
👉 Subscribe to my new YouTube channel. Here I’ll share videos and Shorts with practical tutorials and FREE templates for n8n.
Contact me for consulting and support, or add me on LinkedIn.