📺 Full walkthrough video: https://youtu.be/r5kN_la0O7I
Author: Cole Medin
Who it's for
This workflow is for developers, data engineers, and knowledge management teams who need to automatically ingest documents stored in Google Drive into a searchable vector database — supporting RAG (retrieval-augmented generation) pipelines or semantic search applications.
How it works
- One-time setup: A chat trigger runs SQL queries to create the required Postgres tables (documents, document_metadata, document_rows) and the vector similarity match function in Supabase/Postgres.
- Trigger: Two Google Drive triggers detect newly created or updated files in a watched folder and pass them into a batch loop.
- Clean old data: For each file, stale document rows and vector embeddings are deleted from Supabase before re-processing.
- Metadata upsert & download: Document metadata (ID, title, URL) is upserted into Postgres, then the file binary is downloaded from Google Drive.
- Route by file type: A Switch node directs each file to the correct extractor — PDF, Word/Office document, Excel spreadsheet, or CSV.
- Tabular data storage: Excel and CSV rows are inserted as raw JSONB records into Postgres and aggregated into a summary.
- Embedding & storage: All extracted text (documents, PDFs, tabular summaries) is chunked with a character text splitter, embedded via OpenAI, and inserted into the Supabase vector store.
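The chunking step above can be sketched in pure Python. This is an illustrative approximation of a character text splitter, not the workflow's exact node logic; the default `chunk_size` and `chunk_overlap` values are assumptions you would tune to your documents.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Character-based splitter: fixed-size windows, each overlapping
    the previous one by chunk_overlap characters so context is not
    lost at chunk boundaries."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must be larger than chunk_overlap")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Each resulting chunk would then be embedded and inserted into the Supabase vector store alongside its file metadata, so stale chunks can be deleted by file ID on the next sync.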
How to set up
- [ ] Connect Google Drive OAuth2 credentials to the two trigger nodes and the download node
- [ ] Add Supabase credentials to the delete and vector store insert nodes
- [ ] Add Postgres credentials to all Postgres nodes (table creation, metadata upsert, schema update, row insert)
- [ ] Set your OpenAI API key in the OpenAI Embeddings node
- [ ] Run the setup flow once via the chat trigger to create all database tables and the vector match function
- [ ] Set the Google Drive folder ID to watch in both trigger nodes
- [ ] Tune the Character Text Splitter chunk size and overlap to fit your document sizes
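The one-time setup flow issues SQL roughly like the following. The column names, the 1536-dimension vector (matching text-embedding-3-small), and the `match_documents` signature shown here are assumptions reconstructed from the table names above, not the workflow's literal queries:

```python
# Hypothetical DDL approximating what the setup chat trigger creates.
# Schema details (columns, dimensions, function signature) are assumed.
SETUP_SQL = """
create extension if not exists vector;

create table if not exists document_metadata (
    id    text primary key,   -- Google Drive file ID
    title text,
    url   text
);

create table if not exists document_rows (
    id         bigserial primary key,
    dataset_id text references document_metadata (id),
    row_data   jsonb           -- one raw spreadsheet/CSV row per record
);

create table if not exists documents (
    id        bigserial primary key,
    content   text,
    metadata  jsonb,
    embedding vector(1536)     -- dimension of text-embedding-3-small
);

-- Similarity search over documents.embedding using cosine distance.
create or replace function match_documents(
    query_embedding vector(1536),
    match_count     int   default 5,
    filter          jsonb default '{}'
) returns table (id bigint, content text, metadata jsonb, similarity float)
language sql as $$
    select id, content, metadata,
           1 - (embedding <=> query_embedding) as similarity
    from documents
    where metadata @> filter
    order by embedding <=> query_embedding
    limit match_count;
$$;
"""
```

Running the chat-trigger setup once executes statements of this shape against your Supabase/Postgres instance; after that the ingestion flow only reads and writes these tables.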
Requirements
- Google Drive account (OAuth2)
- Supabase project with pgvector extension enabled
- Postgres database (can be the Supabase Postgres instance)
- OpenAI API key
How to customize
- Add file types: Extend the Switch node with additional branches (e.g., PowerPoint, plain text) and pair each with an appropriate extractor.
- Swap the embedding model: Replace text-embedding-3-small with a larger OpenAI model or an alternative provider (e.g., Cohere, Mistral) in the embeddings node.
- Connect a RAG chatbot: Pipe the Supabase vector store into an AI Agent or chain node to build a document Q&A assistant on top of the ingested files.
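At query time, the vector match a RAG chatbot relies on boils down to cosine similarity between the question's embedding and the stored chunk embeddings. A minimal pure-Python sketch of that ranking (the function names here are illustrative, not part of the workflow):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product of the vectors divided by
    the product of their magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Rank stored chunks by similarity to the query embedding --
    the same ordering a pgvector match function produces in SQL."""
    scored = [(content, cosine_similarity(query, emb)) for content, emb in rows]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]
```

An AI Agent node wired to the Supabase vector store performs this retrieval for you; the top-ranked chunks are then passed to the chat model as context for answering questions about the ingested files.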