AI Documentation Crawler & Knowledge Base Builder
This n8n template automatically crawls technical documentation websites, scrapes their content, and converts it into clean, structured, developer-friendly documentation.
Each page is saved as a Google Doc, and the docs are organized into Google Drive folders, making it easy to build or maintain an internal knowledge base.
Who’s it for
- Developer teams maintaining internal or external documentation
- SaaS companies onboarding users or support teams
- AI builders creating documentation-based knowledge bases
- Anyone who wants to turn raw docs into structured, readable references
How it works / What it does
- Manual Trigger
  - The workflow starts manually whenever you want to crawl or refresh documentation.
- Documentation Discovery (Crawler)
  - The workflow crawls a root documentation URL and generates a sitemap of all discoverable documentation pages.
- URL Processing
  - The sitemap is split into individual URLs.
  - The workflow dynamically analyzes URL depth to recreate the documentation hierarchy.
- Folder Structure Creation
  - A parent folder is created in Google Drive for the service.
  - Subfolders are automatically generated to mirror the documentation structure (based on URL paths).
- Content Scraping
  - Each documentation page is scraped using the Olostep API.
  - Clean markdown content is extracted from the page.
- Information Extraction
  - AI extracts structured technical details such as:
    - API summaries
    - cURL examples
    - Authentication methods
    - Key notes and pitfalls
- AI Documentation Generation
  - An AI agent transforms the scraped content into a polished, human-readable API reference or guide.
- Document Creation
  - A Google Doc is created for each documentation page.
  - The generated content is inserted into the document and saved in the correct folder.
- Rate Control
  - A wait step prevents API throttling during large documentation crawls.
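The rate control step can be pictured as a fixed pause between consecutive page requests. The sketch below is a minimal illustration outside n8n, not the template's actual node configuration: `process_with_delay`, `handler`, and the one-second default are hypothetical names and values chosen for this example.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_with_delay(
    items: Iterable[T],
    handler: Callable[[T], R],
    delay_seconds: float = 1.0,  # assumed value; tune to the API's limits
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call handler on each item, pausing between consecutive calls."""
    results: List[R] = []
    for index, item in enumerate(items):
        # No pause before the first item; wait before every later one.
        if index > 0:
            sleep(delay_seconds)
        results.append(handler(item))
    return results
```

Passing `sleep` as a parameter keeps the pause easy to replace in tests. In the workflow itself, the wait step plays this role between scraping calls.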
The result is a fully structured documentation library generated automatically from live documentation websites.
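The URL processing and folder structure steps described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the template's own logic: it assumes a page's URL depth is its number of path segments below the root URL, that the last segment names the page, and that every earlier segment becomes a folder. `path_segments`, `plan_folders`, and the `docs.example.com` URLs are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple
from urllib.parse import urlparse


def path_segments(url: str) -> List[str]:
    """Return the non-empty path segments of a URL."""
    return [part for part in urlparse(url).path.split("/") if part]


def plan_folders(
    urls: Iterable[str], root_url: str
) -> Tuple[List[Tuple[str, ...]], Dict[str, Tuple[str, ...]]]:
    """Derive the folders to create and the folder each page belongs to."""
    root = path_segments(root_url)
    folders = set()
    placement: Dict[str, Tuple[str, ...]] = {}
    for url in urls:
        segments = path_segments(url)
        # Make paths relative to the root documentation URL.
        if segments[: len(root)] == root:
            segments = segments[len(root):]
        # Assumption: the last segment is the page, the rest are folders.
        parents = tuple(segments[:-1])
        placement[url] = parents
        # Register every ancestor so parent folders exist before children.
        for depth in range(1, len(parents) + 1):
            folders.add(parents[:depth])
    # Shallow folders first, so each parent is created before its subfolders.
    ordered = sorted(folders, key=lambda folder: (len(folder), folder))
    return ordered, placement


if __name__ == "__main__":
    root = "https://docs.example.com/docs"
    pages = [
        "https://docs.example.com/docs/api/auth/tokens",
        "https://docs.example.com/docs/api/users",
        "https://docs.example.com/docs/quickstart",
    ]
    folders, placement = plan_folders(pages, root)
    print(folders)
```

Sorting by depth means a parent folder is always created before its subfolders, which matches the order in which folders would need to be created in Google Drive.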
How to set up
- Import the template into your n8n workspace.
- Set the root documentation URL you want to crawl.
- Connect your Google Drive and Google Docs accounts.
- Add your Olostep API key and AI model credentials.
- Execute the workflow to generate your documentation library.
Requirements
- n8n account (cloud or self-hosted)
- Olostep API key
- Google Drive & Google Docs access
- AI model provider (OpenAI or Gemini)
How to customize the workflow
- Limit the number of pages crawled per run.
- Adjust AI prompts to match your documentation style.
- Store results in Notion, Confluence, or Markdown files instead of Google Docs.
- Add vector storage (Pinecone, Supabase) to turn docs into an AI knowledge base.
- Schedule automatic re-crawls to keep documentation up to date.
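As one example of the customizations above, storing results as Markdown files instead of Google Docs could look roughly like the sketch below. It is an assumption-laden illustration that runs outside n8n: `slugify`, `write_markdown_page`, and the file naming scheme are hypothetical, and the folder parts are assumed to come from the same URL-path hierarchy the workflow uses for Google Drive.

```python
import re
from pathlib import Path
from typing import Sequence, Union


def slugify(name: str) -> str:
    """Turn a title or path segment into a safe lowercase file name."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or "index"  # fall back when nothing usable remains


def write_markdown_page(
    base_dir: Union[str, Path],
    folder_parts: Sequence[str],
    title: str,
    markdown: str,
) -> Path:
    """Write one generated page to base_dir/<folders>/<title>.md."""
    folder = Path(base_dir).joinpath(*[slugify(part) for part in folder_parts])
    # Create the whole folder chain; existing folders are left untouched.
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / f"{slugify(title)}.md"
    target.write_text(markdown, encoding="utf-8")
    return target
```

For instance, `write_markdown_page("kb", ["API", "Auth"], "Create Token", text)` would write the page to `kb/api/auth/create-token.md`.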
👉 This template turns complex technical documentation into an organized, searchable knowledge base — automatically.