RAG over a PDF with Weaviate
This workflow lets you upload a PDF file and ask questions about it using the Question and Answer Chain and Weaviate Vector Store nodes.
Who it's for
This workflow is the simplest possible implementation of RAG with Weaviate in n8n. It's intended as an extendable template for RAG over your own documents.
Prerequisites
- An existing Weaviate cluster. You can view instructions for setting up a local cluster with Docker here or a Weaviate Cloud cluster here.
- API keys to generate embeddings and power chat models. We use OpenAI, but feel free to swap in other models as you like.
- A self-hosted n8n instance. See this video for how to get set up in just three minutes.
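If you opt for a local Weaviate cluster, a minimal Docker Compose file is one common way to run it. The sketch below is an assumption based on Weaviate's standard setup, not part of this workflow; the image tag, ports, and environment variables may differ from what Weaviate's current docs recommend:

```yaml
# docker-compose.yml — minimal local Weaviate cluster (sketch; verify against Weaviate docs)
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.25.0  # pin to a current version
    ports:
      - "8080:8080"    # REST API
      - "50051:50051"  # gRPC
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"  # fine locally; use API keys in production
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
```

Start it with `docker compose up -d`, then point the Weaviate Vector Store node at `http://localhost:8080`.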
How it works
Part 1: Manually upload data
In this example, we manually upload a 100+ page article from arXiv called "A Survey of Large Language Models". You can replace this step with your own, more advanced data pipeline if you wish.
Part 2: Embed and load data into Weaviate collection
Here, we generate embeddings for the full text of the article and store them in Weaviate.
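Under the hood, a step like this typically splits the document into overlapping chunks before embedding, so that each vector covers a retrievable passage. The n8n nodes handle this for you; the sketch below just illustrates the idea in plain Python (the function name and the 1000/200 character sizes are illustrative assumptions, not the node's actual defaults):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character chunks for embedding.

    Overlap keeps context that straddles a chunk boundary retrievable
    from at least one chunk. Sizes here are illustrative, not n8n defaults.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back to create the overlap
    return chunks
```

Each chunk would then be sent to the embedding model, and the resulting vectors stored as objects in a Weaviate collection.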
Part 3: Perform RAG over PDF file with Weaviate
In this part of the workflow, you enter your query in the Chat node and get a RAG response grounded in the document via the Question and Answer Chain node.
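Conceptually, the Question and Answer Chain retrieves the chunks most similar to your query from Weaviate and feeds them to the chat model as context. A minimal sketch of that prompt-assembly step (the wording of the template is an assumption; the actual chain's prompt will differ):

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a grounded prompt from retrieved chunks (illustrative template)."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The chat model's answer is then grounded in the retrieved passages rather than in its parametric knowledge alone.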
How to run the workflow
- Go through the prerequisites: create a Weaviate cluster (local or cloud), set up self-hosted n8n, and add your API keys and other credentials.
- Select the embedding and chat models you'd like to use.
- Upload a PDF file you want to ask questions about.
- Execute the rest of the workflow.