Quick Overview
This workflow reindexes Markdown documentation into a Supabase Postgres pgvector table. It fetches source docs from an HTTP API, chunks them, embeds the chunks via a Supabase Edge Function, upserts the vectors, and deletes stale chunks. It runs on a daily schedule or on demand via a webhook.
How it works
- Runs daily on a schedule, or on demand when a POST request hits the webhook endpoint.
- Fetches Markdown sources (for example FAQ and blog posts) from a configured HTTP API endpoint.
- Strips frontmatter, splits content on H2 sections, chunks long sections with overlap, and batches chunks for embedding.
- For each batch, calls a Supabase Edge Function to generate embeddings for the chunk texts.
- Upserts each chunk’s source, index, content, and pgvector embedding into a Supabase Postgres rag_chunks table using conflict updates.
- After processing batches, deletes rows in rag_chunks whose updated_at timestamp is older than the current run to remove stale chunks.
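The chunking steps above can be sketched roughly as follows. This is a minimal Python illustration, not the workflow's actual code node: the constants CHUNK_SIZE, CHUNK_OVERLAP, and BATCH_SIZE, their values, the character-based chunk measure, and all function names are assumptions made for the example.

```python
import re

# Hypothetical sizing constants; the workflow's real values are not stated here.
CHUNK_SIZE = 1000     # maximum characters per chunk
CHUNK_OVERLAP = 200   # characters shared by consecutive chunks
BATCH_SIZE = 16       # chunks sent per embedding request


def strip_frontmatter(markdown: str) -> str:
    """Remove a leading YAML frontmatter block delimited by '---' lines."""
    pattern = r"\A---\r?\n.*?\r?\n---[ \t]*(?:\r?\n|\Z)"
    return re.sub(pattern, "", markdown, count=1, flags=re.DOTALL)


def split_on_h2(markdown: str) -> list[str]:
    """Split before each line starting with '## ', keeping heading and body together."""
    parts = re.split(r"(?m)^(?=## )", markdown)
    return [p.strip() for p in parts if p.strip()]


def chunk_with_overlap(text: str, size: int = CHUNK_SIZE,
                       overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Return text unchanged if short; otherwise slide a window with overlap."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    if len(text) <= size:
        return [text]
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
    return chunks


def chunk_document(source: str, markdown: str) -> list[dict]:
    """Produce rows with a per-source running chunk index."""
    rows = []
    for section in split_on_h2(strip_frontmatter(markdown)):
        for piece in chunk_with_overlap(section):
            rows.append({"source": source, "chunk_idx": len(rows), "content": piece})
    return rows


def batches(rows: list[dict], size: int = BATCH_SIZE) -> list[list[dict]]:
    """Group rows into fixed-size batches for the embedding call."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]
```

Numbering chunks per source keeps the (source, chunk_idx) pair stable across runs as long as the document does not change, which is what lets the upsert overwrite rows in place.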
Setup
- Add an HTTP Header Auth credential for the sources API request and for calling the Supabase Edge Function.
- Add Supabase Postgres credentials with access to the database where the vector table lives.
- Create a public.rag_chunks table with a pgvector embedding column (matching your model’s dimensions) and a primary key on (source, chunk_idx).
- Update the sources API URL, the Supabase Edge Function /embed URL, and (optionally) the daily schedule time and batch/chunk sizing constants to match your environment.
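As a starting point for the table and the queries it supports, the SQL below is held in Python strings so it can be pasted into a migration or a Postgres node. This is a sketch under assumptions: the embedding dimension of 1536, the column types, the psycopg-style %s placeholders, and the helper names are illustrative and not taken from the workflow. The updated_at column is included because the stale-chunk cleanup depends on it.

```python
from datetime import datetime, timezone

# Assumed dimension; set this to your embedding model's output size.
EMBEDDING_DIM = 1536

CREATE_TABLE_SQL = f"""
create extension if not exists vector;
create table if not exists public.rag_chunks (
    source     text        not null,
    chunk_idx  integer     not null,
    content    text        not null,
    embedding  vector({EMBEDDING_DIM}) not null,
    updated_at timestamptz not null default now(),
    primary key (source, chunk_idx)
);
"""

# Insert a chunk, or overwrite the existing row with the same primary key.
UPSERT_SQL = """
insert into public.rag_chunks (source, chunk_idx, content, embedding, updated_at)
values (%s, %s, %s, %s::vector, %s)
on conflict (source, chunk_idx) do update
set content = excluded.content,
    embedding = excluded.embedding,
    updated_at = excluded.updated_at;
"""

# Remove rows that the current run did not touch.
DELETE_STALE_SQL = "delete from public.rag_chunks where updated_at < %s;"


def to_pgvector_literal(embedding) -> str:
    """Format numbers as pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def upsert_params(row: dict, embedding, run_started_at: datetime) -> tuple:
    """Build the parameter tuple for UPSERT_SQL from a chunk row."""
    return (row["source"], row["chunk_idx"], row["content"],
            to_pgvector_literal(embedding), run_started_at)


# Capture one timestamp per run and reuse it for every upsert and for cleanup.
run_started_at = datetime.now(timezone.utc)
```

Writing the same run timestamp into every upserted row and then deleting with a strict less-than comparison means rows touched in this run always survive, while chunks from removed or shortened documents are dropped.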