Build a RAG System with Automatic Citations using Qdrant, Gemini & OpenAI

Created by: Davide || n3witalia


This workflow implements a Retrieval-Augmented Generation (RAG) system that:

  • Stores vectorized documents in Qdrant,
  • Retrieves relevant content based on user input,
  • Generates AI answers using Google Gemini,
  • Automatically cites the document sources (from Google Drive).

Workflow Steps

  1. Create Qdrant Collection
    An HTTP Request node calls the Qdrant REST API to create a new collection with the specified vector size (1536) and Cosine distance.

  2. Load Files from Google Drive
    The workflow lists all files in a Google Drive folder, downloads them as plain text, and loops through each.

  3. Text Preprocessing & Embedding

    • Documents are split into chunks (500 characters, with 50-character overlap).
    • Embeddings are generated with OpenAI (the text-embedding-3-small model is assumed, which produces the 1536-dimensional vectors the collection expects).
    • Metadata (file name and ID) is attached to each chunk.
  4. Store in Qdrant
    All vectors, along with their metadata, are inserted into the Qdrant collection (a Python sketch of steps 3 and 4 follows this list).

  5. Chat Input & Retrieval

    • When a chat message is received, the question is embedded and matched against Qdrant.
    • The top 5 most relevant document chunks are retrieved.
    • A Gemini model is used to generate the answer based on those sources.
  6. Source Aggregation & Response

    • File IDs and names are deduplicated.

    • The AI response is combined with a list of cited documents (filenames).

    • Final output:

      AI Response
      
      Sources: ["Document1", "Document2"]
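
These steps are all handled by n8n nodes (recursive text splitter, OpenAI embeddings, Qdrant vector store). Purely as an illustration of what steps 3 and 4 amount to, here is a minimal Python sketch. It assumes the openai and qdrant-client packages, a reachable Qdrant instance, and the text-embedding-3-small model; the simple character splitter stands in for the recursive splitter, and names such as QDRANT_URL and the payload fields are placeholders mirroring the workflow, not its actual implementation.

```python
# Minimal sketch of steps 3-4: chunk -> embed -> store in Qdrant.
# Assumes the `openai` and `qdrant-client` packages; QDRANT_URL is a placeholder.
import uuid

from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

QDRANT_URL = "http://localhost:6333"   # replace with your Qdrant endpoint
COLLECTION = "negozio-emporio-verde"

openai_client = OpenAI()               # reads OPENAI_API_KEY from the environment
qdrant = QdrantClient(url=QDRANT_URL)


def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Simple character splitter mirroring the 500/50 settings in the workflow."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


def ingest_file(file_id: str, file_name: str, text: str) -> None:
    chunks = split_text(text)
    embeddings = openai_client.embeddings.create(
        model="text-embedding-3-small",  # 1536 dimensions, matching the collection
        input=chunks,
    )
    points = [
        PointStruct(
            id=str(uuid.uuid4()),
            vector=item.embedding,
            # File name and ID are attached as metadata so answers can cite sources.
            payload={"file_id": file_id, "file_name": file_name, "text": chunk},
        )
        for chunk, item in zip(chunks, embeddings.data)
    ]
    qdrant.upsert(collection_name=COLLECTION, points=points)
```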
      

Main Advantages

  • End-to-end Automation: From document ingestion to chat response generation, fully automated with no manual steps.
  • Scalable Knowledge Base: Easy to expand by simply adding files to the Google Drive folder.
  • Traceable Responses: Each answer includes its source files, increasing transparency and trustworthiness.
  • Modular Design: Each step (embedding, storage, retrieval, response) is isolated and reusable.
  • Multi-provider AI: Combines OpenAI (for embeddings) and Google Gemini (for chat), optimizing performance and flexibility.
  • Secure & Customizable: Uses API credentials and configurable chunk size, collection name, etc.

How It Works

  1. Document Processing & Vectorization

    • The workflow retrieves documents from a specified Google Drive folder.
    • Each file is downloaded, split into chunks (using a recursive text splitter), and converted into embeddings via OpenAI.
    • The embeddings, along with metadata (file ID and name), are stored in a Qdrant vector database under the collection negozio-emporio-verde.
  2. Query Handling & Response Generation

    • When a user submits a chat message, the workflow:
      • Embeds the query using OpenAI.
      • Retrieves the top 5 relevant document chunks from Qdrant.
      • Uses Google Gemini to generate a response based on the retrieved context.
      • Aggregates and deduplicates the source file names from the retrieved chunks.
    • The final output includes both the AI-generated response and a list of source documents (e.g., Sources: ["FAQ.pdf", "Policy.txt"]), as sketched below.
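
Again purely as an illustration, not the workflow's actual implementation, here is a minimal Python sketch of the query path: embed the question, fetch the top 5 chunks from Qdrant, ask Gemini for an answer grounded in that context, then deduplicate the source file names. The gemini-1.5-flash model name is an assumption; the workflow only specifies "Google Gemini".

```python
# Sketch of the query path: embed question -> top-5 search -> Gemini answer + cited sources.
# gemini-1.5-flash is an assumed model; the workflow only says "Google Gemini".
import json

import google.generativeai as genai
from openai import OpenAI
from qdrant_client import QdrantClient

QDRANT_URL = "http://localhost:6333"
COLLECTION = "negozio-emporio-verde"

openai_client = OpenAI()
qdrant = QdrantClient(url=QDRANT_URL)
genai.configure(api_key="GEMINI_API_KEY")  # placeholder


def answer(question: str) -> str:
    # 1. Embed the user question with the same model used at ingestion time.
    query_vector = openai_client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding

    # 2. Retrieve the top 5 most similar chunks from Qdrant.
    hits = qdrant.search(
        collection_name=COLLECTION, query_vector=query_vector, limit=5
    )

    # 3. Ask Gemini to answer using only the retrieved context.
    context = "\n\n".join(hit.payload["text"] for hit in hits)
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(
        f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
    )

    # 4. Deduplicate source file names and append them to the reply.
    sources = sorted({hit.payload["file_name"] for hit in hits})
    return f"{response.text}\n\nSources: {json.dumps(sources)}"
```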

Set Up Steps

  1. Configure Qdrant Collection

    • Replace QDRANTURL and COLLECTION in the "Create collection" HTTP node to initialize the Qdrant collection with:
      • Vector size: 1536 (OpenAI embedding dimension).
      • Distance metric: Cosine.
    • Ensure the "Clear collection" node is configured to reset the collection when needed (a sketch of the equivalent REST calls follows these steps).
  2. Google Drive & OpenAI Integration

    • Link the Google Drive node to the target folder (Test Negozio in this example).
    • Verify OpenAI and Google Gemini API credentials are correctly set in their respective nodes.
  3. Metadata & Output Customization

    • Adjust the "Aggregate" and "Response" nodes if additional metadata fields are needed.
    • Modify the "Output" node to format the response (e.g., changing Sources: {{...}} to match your preferred style).
  4. Testing

    • Trigger the workflow manually to test document ingestion.
    • Use the chat interface to verify responses include accurate source attribution.
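
For reference, a minimal sketch of what the "Create collection" and "Clear collection" HTTP nodes send to Qdrant, using Python's requests library. QDRANTURL and COLLECTION are the same placeholders referred to in the note below; deleting and recreating the collection is just one simple way to reset it, so adapt this to your own reset strategy.

```python
# Sketch of the "Create collection" / "Clear collection" HTTP calls.
# QDRANTURL and COLLECTION are placeholders, as noted below.
import requests

QDRANTURL = "http://localhost:6333"
COLLECTION = "negozio-emporio-verde"

# Clear collection: deleting and recreating is one simple reset strategy.
requests.delete(f"{QDRANTURL}/collections/{COLLECTION}")

# Create collection: 1536-dimensional vectors with Cosine distance,
# matching the OpenAI text-embedding-3-small embeddings.
requests.put(
    f"{QDRANTURL}/collections/{COLLECTION}",
    json={"vectors": {"size": 1536, "distance": "Cosine"}},
)
```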

Note: Replace placeholder values (e.g., QDRANTURL) with actual endpoints before deployment.


Need help customizing?

Contact me for consulting and support, or add me on LinkedIn.