Try It
This n8n template provides a self-hosted RAG (retrieval-augmented generation) implementation built on Ollama and Qdrant.
How it works
- Provides one workflow to maintain the knowledge base and another to query it.
- Uploaded documents are embedded and saved into the Qdrant vector store.
- When a query is made, the most relevant documents are retrieved from the vector store and sent to the LLM as context for generating a response.
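For the curious, here is a minimal sketch of what that retrieval step boils down to against the two services set up below (Ollama for embeddings, Qdrant for search); the query text is illustrative and jq is assumed to be installed:
QUERY="What do the uploaded documents say about pricing?"
# Embed the query with the nomic-embed-text model (returns a 768-dim vector)
VECTOR=$(curl -s http://localhost:11434/api/embeddings \
  -d "{\"model\": \"nomic-embed-text\", \"prompt\": \"$QUERY\"}" | jq -c '.embedding')
# Fetch the most relevant documents from the Qdrant collection
curl -s -X POST http://localhost:6333/collections/knowledge-base/points/search \
  -H 'Content-Type: application/json' \
  -d "{\"vector\": $VECTOR, \"limit\": 3}"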
How to use
- Start the workflow by clicking Execute workflow.
- Use the file upload form to upload a document into the knowledge base (the Qdrant vector store).
- Click Open chat to start asking questions related to the uploaded documents.
Setup steps
The steps below show how to set up on Amazon Linux; consult your OS documentation for the equivalent steps.
- Install Ollama
mkdir ollama
cd ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
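On Linux the install script normally registers and starts a systemd service named ollama (an assumption worth verifying on your distro):
sudo systemctl status ollama
# If no service was created, run the server in the foreground instead:
ollama serve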
- Install the required models (on Amazon Linux)
ollama pull llama3:8b
ollama pull mistral:7b
ollama pull nomic-embed-text:latest
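- Verify the models are available
ollama list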
- Access Ollama via http://localhost:11434
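A quick smoke test; the root endpoint replies with "Ollama is running":
curl http://localhost:11434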
- Fire up Qdrant (e.g. via Docker)
docker run -p 6333:6333 qdrant/qdrant
- Access the Qdrant dashboard via http://localhost:6333/dashboard
- Create a Qdrant collection named knowledge-base with a vector size of 768 (the embedding dimension of nomic-embed-text), e.g. via the REST API as shown below.
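A sketch using the Qdrant REST API (Cosine is a common distance metric for these embeddings; adjust if your setup expects another):
curl -X PUT http://localhost:6333/collections/knowledge-base \
  -H 'Content-Type: application/json' \
  -d '{"vectors": {"size": 768, "distance": "Cosine"}}'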
- NB: Don't forget to mount a persistent Docker volume for Qdrant if you want the data to survive container restarts, e.g.:
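(The host directory below is just an illustration; /qdrant/storage is where Qdrant keeps its data inside the container.)
docker run -p 6333:6333 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant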
- Point the workflow nodes to your on-premise Qdrant and Ollama runtimes.
Need Help?
Join the Discord or ask in the Forum!
Happy RAGing!