Build a RAG system for PDF documents with Google Drive, Unstructured, and OpenAI

Created by

Tomas Lubertino

Last update

A day ago

How it works

Google Drive Trigger detects new files in a selected folder and downloads them.
The files are sent to Unstructured, where they are split into smaller pieces (chunks).
The chunks are prepared and sent to OpenAI, where they are converted into vectors (embeddings).
The embeddings are recombined with their original data and the payload is prepared for upsert into the Pinecone index.
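
The last two steps can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual node code: the function names are hypothetical, and it assumes each Unstructured chunk carries `element_id`, `text`, and `metadata.filename` fields and that the OpenAI embeddings response lists one item per input under `data`, each with an `index` and an `embedding`.

```python
from typing import Any

EMBEDDING_MODEL = "text-embedding-3-small"  # model named in the setup steps


def build_embedding_request(chunks: list[dict[str, Any]]) -> dict[str, Any]:
    """Build the JSON body for an embeddings request: one input string per chunk."""
    return {"model": EMBEDDING_MODEL, "input": [chunk["text"] for chunk in chunks]}


def build_upsert_payload(
    chunks: list[dict[str, Any]], embedding_response: dict[str, Any]
) -> dict[str, Any]:
    """Pair each chunk with its vector and shape the body for a Pinecone upsert."""
    # Each embedding is tagged with the index of its input string, so sort on
    # that index instead of relying on the order of the response.
    data = sorted(embedding_response["data"], key=lambda item: item["index"])
    if len(data) != len(chunks):
        raise ValueError("expected exactly one embedding per chunk")

    vectors = []
    for chunk, item in zip(chunks, data):
        vectors.append(
            {
                "id": chunk["element_id"],  # assumed unique per chunk
                "values": item["embedding"],
                # Keep the original text next to the vector so a retrieved
                # match can be shown to the chat model later.
                "metadata": {
                    "text": chunk["text"],
                    "filename": chunk.get("metadata", {}).get("filename", ""),
                },
            }
        )
    return {"vectors": vectors}
```

Storing the chunk text in `metadata` is one design choice among several; it avoids a second lookup at query time at the cost of a larger index.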

Set up steps

In Pinecone, create an index with 1536 dimensions and configure it for text-embedding-3-small.
Copy the host URL and paste it into the 'Pinecone Upsert' node. It should look something like this: https://{your-index-name}.pinecone.io/vectors/upsert.
Add Google Drive, OpenAI and Pinecone credentials in n8n.
Point the trigger to your ingest folder (you can use this article for a demo).
Click the 'Open chat' button and enter the following: Which Git provider do the authors use?
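
For readers who want to see what the 'Pinecone Upsert' node is configured to send, the request can be approximated outside n8n. The sketch below only builds the request and does not send it. The helper names are hypothetical, the host `my-index.pinecone.io` used in the usage note is a placeholder, and the `Api-Key` header is assumed to be how the Pinecone credential is passed. It also checks each vector against the 1536 dimensions the index was created with, since a mismatched vector would be rejected.

```python
import json
import urllib.request

EXPECTED_DIMENSIONS = 1536  # index size from the setup steps


def upsert_url(index_host: str) -> str:
    """Turn the host copied from Pinecone into the upsert endpoint URL."""
    host = index_host.strip().rstrip("/")
    if not host.startswith("https://"):
        host = "https://" + host
    return host + "/vectors/upsert"


def build_upsert_request(
    index_host: str, api_key: str, payload: dict
) -> urllib.request.Request:
    """Validate vector sizes and build (but do not send) the upsert request."""
    for vector in payload["vectors"]:
        if len(vector["values"]) != EXPECTED_DIMENSIONS:
            raise ValueError(
                f"vector {vector['id']} has {len(vector['values'])} dimensions, "
                f"expected {EXPECTED_DIMENSIONS}"
            )
    return urllib.request.Request(
        upsert_url(index_host),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Calling `build_upsert_request("my-index.pinecone.io", api_key, payload)` returns a request object that `urllib.request.urlopen` could send once a real host and key are supplied.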