
Implement on-prem RAG with Qdrant and Ollama for a self-hosted KB

Created by: Mabura Ze Guru (zeguru)

Last update: 5 hours ago


This n8n template provides a self-hosted RAG (retrieval-augmented generation) implementation.

How it works

  • Provides one workflow to maintain the knowledge base and another to query it.
  • Uploaded documents are saved into the Qdrant vector store.
  • When a query is made, the most relevant documents are retrieved from the vector store and sent to the LLM as context for generating a response (a curl sketch of this retrieve-then-generate flow follows this list).
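Outside n8n, the same flow can be exercised directly against the two services. This is a minimal sketch, assuming Ollama on localhost:11434, Qdrant on localhost:6333, the knowledge-base collection from the setup steps below, and jq installed; the question text and the top-3 limit are illustrative only.

  # 1. Embed the question with the same model used at indexing time (768-dim output).
  QUERY="What do the uploaded documents say about onboarding?"
  VECTOR=$(curl -s http://localhost:11434/api/embeddings \
    -d "{\"model\": \"nomic-embed-text\", \"prompt\": \"$QUERY\"}" | jq -c '.embedding')

  # 2. Retrieve the 3 most similar chunks from Qdrant.
  CONTEXT=$(curl -s -X POST http://localhost:6333/collections/knowledge-base/points/search \
    -H 'Content-Type: application/json' \
    -d "{\"vector\": $VECTOR, \"limit\": 3, \"with_payload\": true}" \
    | jq -r '[.result[].payload | tostring] | join("\n")')

  # 3. Ask the chat model to answer from the retrieved context only.
  jq -n --arg ctx "$CONTEXT" --arg q "$QUERY" \
    '{model: "llama3:8b", stream: false,
      prompt: ("Answer using only this context:\n" + $ctx + "\n\nQuestion: " + $q)}' \
    | curl -s http://localhost:11434/api/generate -d @- | jq -r '.response'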

How to use

  • Start the workflow by clicking Execute workflow.
  • Use the file upload form to upload a document into the knowledge base (the Qdrant collection).
  • Click Open chat to start asking questions about the uploaded documents.

Setup steps

The steps below show how to set up on Amazon Linux; consult your OS documentation for the equivalent steps.

  • Install Ollama on-prem:
mkdir ollama
cd ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
  • Install the required models (on Amazon Linux):
 ollama pull llama3:8b
 ollama pull mistral:7b
 ollama pull nomic-embed-text:latest
  • Access Ollama via http://localhost:11434
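To confirm the service is reachable and the embedding model is installed, two quick checks against Ollama's REST API (the prompt text is illustrative):

  # List the locally installed models.
  curl http://localhost:11434/api/tags

  # Request a test embedding; the response should contain a 768-number "embedding" array.
  curl http://localhost:11434/api/embeddings \
    -d '{"model": "nomic-embed-text", "prompt": "hello world"}'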
  • Fire up Qdrant (e.g. via Docker):
    docker run -p 6333:6333 qdrant/qdrant
  • Access Qdrant via http://localhost:6333/dashboard
  • Create a Qdrant collection named knowledge-base with a vector size of 768 (the output dimension of nomic-embed-text), as shown below.
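A minimal sketch of creating that collection through Qdrant's REST API; the Cosine distance metric is an assumption here, so adjust it if your setup expects a different one:

  curl -X PUT http://localhost:6333/collections/knowledge-base \
    -H 'Content-Type: application/json' \
    -d '{"vectors": {"size": 768, "distance": "Cosine"}}'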
  • NB: Do not forget a persistent Docker volume for Qdrant if you want the data to survive container restarts; see the example below.
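One way to do that, with the host directory qdrant_storage as an illustrative example:

  docker run -p 6333:6333 \
    -v "$(pwd)/qdrant_storage:/qdrant/storage" \
    qdrant/qdrant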
  • Point the n8n nodes to the respective on-premises Qdrant and Ollama endpoints.

Need Help?

Join the Discord or ask in the Forum!

Happy RAGing!