This n8n template transforms any website into a fully functional RAG-ready chatbot knowledge base. It crawls sites using the AI Training Data Scraper community node, chunks content intelligently, generates embeddings, and stores everything in Pinecone for semantic search-powered conversations. Perfect for turning documentation, blogs, or marketing sites into instant AI chat assistants.
Use cases
Convert documentation sites into intelligent support chatbots
Build product knowledge bases from marketing websites
Create internal search tools from company intranets
Power customer support agents with scraped competitor analysis
Generate training data for fine-tuning company-specific AI models
Good to know
This workflow connects to external services requiring API credentials. Works on n8n Cloud and self-hosted instances. Initial setup takes 10 minutes including community node installation.
Requirements
n8n Cloud or self-hosted instance
Community Node
Apify API key
OpenAI API key
Pinecone account & index
Customising this workflow
Replace Pinecone with Qdrant, Weaviate, or pgvector for self-hosted vector storage
Swap OpenAI embeddings with Ollama or Hugging Face for zero-cost processing
Add content filtering by language, code presence, or section type
Extend with conversation memory using Redis or Postgres
Build a frontend dashboard for managing multiple website indexes
Add multi-site RAG (query across multiple domains simultaneously)