This tutorial explains how to build the backend workflow in n8n that indexes YouTube video transcripts into a Pinecone vector database. Note: This workflow handles the processing and indexing of transcripts only; the retrieval agent (which searches these embeddings) is implemented separately.
This backend workflow performs the following tasks:
Fetch Video Records from Airtable
Retrieves video URLs and related metadata.
Scrape YouTube Transcripts Using Apify
Triggers an Apify actor to scrape transcripts with timestamps from each video.
Update Airtable with Transcript Data
Stores the fetched transcript JSON back in Airtable linked via video ID.
Process & Chunk Transcripts
Parses the transcript JSON, converts "mm:ss" timestamps to seconds, and groups entries into meaningful chunks. Each chunk is enriched with metadata, such as video title, description, start/end timestamps, and a direct URL linking to that video moment.
Generate Embeddings & Index in Pinecone
Uses OpenAI to create vector embeddings for each transcript chunk and indexes them in Pinecone. This enables efficient semantic searches later by a separate retrieval agent.
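The "mm:ss" to seconds conversion mentioned in the Process & Chunk Transcripts task can be sketched as a small helper. This is a minimal sketch rather than the workflow's actual code: the function name `timestampToSeconds` is hypothetical, and accepting "hh:mm:ss" for videos longer than an hour is an assumption the tutorial does not state.

```javascript
// Convert a transcript timestamp such as "03:25" (mm:ss) into seconds.
// Assumption: "hh:mm:ss" is also accepted for videos longer than an hour.
function timestampToSeconds(timestamp) {
  const parts = String(timestamp).trim().split(':').map(Number);
  if (parts.length < 2 || parts.length > 3 || parts.some(Number.isNaN)) {
    throw new Error(`Unrecognized timestamp: ${timestamp}`);
  }
  // Fold left to right: each step multiplies the running total by 60.
  return parts.reduce((total, part) => total * 60 + part, 0);
}
```

Throwing on an unrecognized format, instead of returning 0, keeps a malformed entry from silently producing a link to the start of the video.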
Airtable Search Node:
Retrieves video records (url and metadata) from your Airtable base.
Loop Over Items:
Trigger Apify Actor:
https://api.apify.com/v2/acts/topaz_sharingan~youtube-transcript-scraper-1/runs?token=<YOUR_TOKEN>
{
  "includeTimestamps": "Yes",
  "startUrls": ["{{ $json.url }}"]
}
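For readers who want to see the same call outside n8n, the request above can be assembled in plain JavaScript. This is a sketch under assumptions: the endpoint and body fields come from this tutorial, while the function name `buildActorRunRequest` and the returned object shape are hypothetical and only for illustration.

```javascript
// Actor identifier as it appears in the tutorial's endpoint URL.
const ACTOR_ID = 'topaz_sharingan~youtube-transcript-scraper-1';

// Build (but do not send) the HTTP request that starts an actor run.
// The function name and return shape are illustrative assumptions.
function buildActorRunRequest(token, videoUrl) {
  const endpoint = new URL(`https://api.apify.com/v2/acts/${ACTOR_ID}/runs`);
  // searchParams.set URL-encodes the token if it contains special characters.
  endpoint.searchParams.set('token', token);
  return {
    url: endpoint.toString(),
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ includeTimestamps: 'Yes', startUrls: [videoUrl] }),
  };
}
```

Keeping request construction separate from sending makes it easy to inspect the payload first. In a runtime with `fetch`, the result could be sent with `const { url, ...options } = buildActorRunRequest(token, videoUrl); await fetch(url, options);`.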
Wait for Processing:
Retrieve Transcript Data:
https://api.apify.com/v2/acts/topaz_sharingan~youtube-transcript-scraper-1/runs/last/dataset/items?token=<YOUR_TOKEN>
Format Transcript Data:
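// Take the JSON payload of the first incoming item.
const jsonObject = items[0].json;
// Serialize it with 2-space indentation so it can be stored as text.
const jsonString = JSON.stringify(jsonObject, null, 2);
// In "Run Once for All Items" mode, return an array of items, each with a `json` key.
return [{ json: { stringifiedJson: jsonString } }];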
Extract the Video ID:
{{$json.url.split('v=')[1].split('&')[0]}}
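The expression above relies on the URL containing `v=`, so it would fail on short `youtu.be` links. A slightly more defensive variant is sketched below in plain JavaScript. It is an illustration, not part of the original workflow: the function name `extractVideoId` is hypothetical, the idea that short links may appear in your Airtable data is an assumption, and whether the `URL` class is available inside your n8n sandbox is not confirmed here.

```javascript
// Extract the video ID from a YouTube URL.
// Returns null when no ID can be found; throws if the input is not a valid URL.
function extractVideoId(rawUrl) {
  const url = new URL(rawUrl);
  const host = url.hostname.replace(/^www\./, '');
  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    return url.pathname.split('/')[1] || null;
  }
  // Standard watch URLs carry the ID in the "v" query parameter.
  return url.searchParams.get('v');
}
```

Because the query string is parsed instead of split, the result does not depend on the order of the parameters.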
Update Airtable Record:
Retrieve Updated Records:
Parse and Chunk Transcripts:
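One way the grouping in this step could work is sketched below. It is a sketch under stated assumptions, not the workflow's actual code: the entry shape `{ start, text }` (with `start` already in seconds), the 60-second window, and the function name `chunkTranscript` are all hypothetical. The link format follows the `watch?v=VIDEOID&t=XXs` pattern used in this tutorial.

```javascript
// Group transcript entries into time-windowed chunks with timestamped links.
// Assumptions: entries look like { start: <seconds>, text: <string> } and
// are sorted by start time; maxSeconds is an arbitrary window size.
function chunkTranscript(entries, videoId, maxSeconds = 60) {
  const chunks = [];
  let current = null;
  for (const entry of entries) {
    // Open a new chunk when none exists or the window has been exceeded.
    if (!current || entry.start - current.start >= maxSeconds) {
      if (current) chunks.push(current);
      current = { start: entry.start, end: entry.start, texts: [] };
    }
    // "end" is approximated by the start time of the chunk's last entry,
    // because entry durations are not assumed to be available.
    current.end = entry.start;
    current.texts.push(entry.text);
  }
  if (current) chunks.push(current);
  return chunks.map((c) => ({
    text: c.texts.join(' '),
    start: c.start,
    end: c.end,
    url: `https://youtube.com/watch?v=${videoId}&t=${c.start}s`,
  }));
}
```

Other metadata described in this tutorial, such as the video title and description, could be merged into each returned object in the same `map` step.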
Direct URL format: https://youtube.com/watch?v=VIDEOID&t=XXs
Enrich & Split Text:
Generate Embeddings:
Index in Pinecone:
"videos"
"transcripts"
This backend workflow is dedicated to processing and indexing YouTube video transcripts so that a separate retrieval agent can perform efficient semantic searches. With this setup:
Transcripts Are Indexed:
Chunks of transcripts are enriched with metadata and stored as vector embeddings.
Instant Topic Retrieval:
A retrieval agent (implemented separately) can later query Pinecone to find the exact moment in a video where a topic is discussed, thanks to the direct URL and metadata stored with each chunk.
Scalable & Modular:
The separation between indexing and retrieval allows for easy updates and scalability.
Happy automating and enjoy building powerful search capabilities with your YouTube content!