This template monitors a Google Drive folder, converts PDF documents into clean text chunks with Unstructured, generates OpenAI embeddings, and upserts vectors into Pinecone. It’s a practical, production-ready starting point for Retrieval-Augmented Generation (RAG) that you can plug into a chatbot, semantic search, or internal knowledge tools.
How it works
- Google Drive Trigger detects new files in a selected folder and downloads them.
- The files are sent to Unstructured, where they are split into smaller pieces (chunks).
- The chunks are batched and sent to OpenAI, where they are converted into vectors (embeddings).
- The embeddings are recombined with their original data and the payload is prepared for upsert into the Pinecone index.
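The recombination step can be sketched in Python as follows. This is a minimal illustration, not the template's actual node code. It assumes each chunk is a dict with `element_id`, `text`, and `metadata.filename` keys (the shape Unstructured typically returns), and that the embeddings response follows the OpenAI format of a `data` list whose items carry `index` and `embedding`. The function name `build_upsert_payload` and the choice of metadata fields are hypothetical.

```python
from typing import Any, Dict, List


def build_upsert_payload(
    chunks: List[Dict[str, Any]],
    embedding_response: Dict[str, Any],
    namespace: str = "",
) -> Dict[str, Any]:
    """Pair each chunk with its embedding and shape the result for a Pinecone upsert."""
    # Each embedding item carries the position of its input text in "index",
    # so sort on it rather than trusting the order of the response.
    items = sorted(embedding_response["data"], key=lambda item: item["index"])
    if len(items) != len(chunks):
        raise ValueError(
            f"expected {len(chunks)} embeddings, got {len(items)}"
        )

    vectors = []
    for chunk, item in zip(chunks, items):
        vectors.append(
            {
                # Assumption: the chunk's element_id is unique enough to use as the vector id.
                "id": chunk["element_id"],
                "values": item["embedding"],
                # Keep the original text so a retrieval query can return it.
                "metadata": {
                    "text": chunk["text"],
                    "filename": chunk.get("metadata", {}).get("filename", ""),
                },
            }
        )

    payload: Dict[str, Any] = {"vectors": vectors}
    if namespace:
        payload["namespace"] = namespace
    return payload
```

Storing the chunk text in the vector metadata is one common choice for RAG, since it lets the chat step read the matched passage straight from the query result without a second lookup.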
Set up steps
- In Pinecone, create an index with 1536 dimensions and configure it for text-embedding-3-small.
- Copy the index host URL and paste it into the 'Pinecone Upsert' node. It should look something like this: https://{your-index-name}.pinecone.io/vectors/upsert.
- Add Google Drive, OpenAI and Pinecone credentials in n8n.
- Point the trigger to your ingest folder (you can use this article for a demo).
- Click the 'Open chat' button and enter the following: Which Git provider do the authors use?
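To make the Pinecone settings above concrete, here is a hedged Python sketch of the kind of HTTP request the 'Pinecone Upsert' node sends. It is not the node's real implementation. The helper name `build_upsert_request` is hypothetical, the host and key values are placeholders, and the 1536-dimension check simply mirrors the index size chosen for text-embedding-3-small.

```python
import json
import urllib.request
from typing import Any, Dict

EXPECTED_DIM = 1536  # must match the dimension of the Pinecone index
UPSERT_PATH = "/vectors/upsert"


def build_upsert_request(
    index_host: str, api_key: str, payload: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a POST request for Pinecone's upsert endpoint."""
    # Reject vectors whose length does not match the index before calling the API.
    for vector in payload["vectors"]:
        if len(vector["values"]) != EXPECTED_DIM:
            raise ValueError(
                f"vector {vector['id']} has {len(vector['values'])} "
                f"dimensions, expected {EXPECTED_DIM}"
            )

    # Accept the host with or without the upsert path already appended.
    url = index_host.rstrip("/")
    if not url.endswith(UPSERT_PATH):
        url += UPSERT_PATH

    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upsert. Checking the dimension locally is optional, but it surfaces a mismatch between the embedding model and the index early, with a clearer message.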