Automates the process of generating, storing, and publishing engaging LinkedIn posts derived from books (PDFs) using AI and vector search.
🧠 Overview
This workflow:
- Watches a Google Drive folder for new or updated book PDFs.
- Extracts and embeds the content using OpenAI.
- Stores the data in a Pinecone vector database.
- Uses a LangChain agent to generate post ideas.
- Creates concise LinkedIn posts, each with a hook, an insight, and a CTA.
- Updates a Google Sheet and posts to LinkedIn.
🛠 Workflow Breakdown
📥 1. Google Drive Trigger
- Trigger: Watches a folder for new or updated PDF files.
- Action: Downloads the updated PDF.
📄 2. Extract and Embed Content
- Extract from File: Parses PDF to extract text.
- Text Splitter: Breaks text into chunks.
- Embeddings (OpenAI): Converts chunks into vector embeddings.
- Pinecone Vector Store: Saves the embeddings with the book name as namespace.
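The Text Splitter step can be pictured with a minimal Python sketch. In the workflow this is done by an n8n node, not by custom code. The chunk size and overlap below are assumed values for illustration only; the workflow's actual settings are not stated here.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    chunk_size and overlap are illustrative defaults (assumptions),
    not values taken from the workflow.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Overlapping chunks keep a sentence that straddles a boundary retrievable from at least one chunk, at the cost of embedding some text twice.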
🧠 3. Post Idea Generation (LangChain Agent)
- Uses a prompt to:
  - Search the Pinecone DB
  - Extract insights
  - Format them into 5 LinkedIn post ideas, each with a hook, an insight, and a CTA
- A memory buffer and a structured output parser keep the agent's responses consistent and machine-readable.
✍️ 4. Post Creation
- Each idea is:
  - Split into its own item
  - Rewritten by a GPT model prompt to match LinkedIn's tone
  - Kept under 600 characters
  - Styled with emojis and hashtags, following the tone guidelines
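The 600-character limit is enforced by the GPT prompt in the workflow. As a hedged illustration of what an extra guard after that step could look like, here is a small Python sketch. The function name `enforce_length` and the truncate-at-word-boundary behavior are assumptions, not part of the workflow.

```python
MAX_POST_CHARS = 600  # posts should stay under this many characters


def enforce_length(post: str, limit: int = MAX_POST_CHARS) -> str:
    """Return the post unchanged if it is under the limit.

    Otherwise truncate at a word boundary and append an ellipsis
    (hypothetical fallback; the workflow relies on the prompt instead).
    """
    post = post.strip()
    if len(post) < limit:
        return post

    cut = post[: limit - 2]  # leave room for the ellipsis and stay under the limit
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head  # back up to the last space so no word is cut in half
    return cut.rstrip() + "…"
```

A guard like this never lets an over-length post through, but it can cut off a trailing CTA or hashtag, so asking the model to regenerate may be the better choice in practice.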
📊 5. Google Sheet Integration
- Saves all generated posts to a Google Sheet.
- Marks status: "published" or "no".
🔁 6. Scheduled Publishing
- Every day:
  - Pulls an unpublished post
  - Publishes it to LinkedIn
  - Updates the post's status and timestamp in the Google Sheet
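The daily selection logic can be sketched in Python over rows represented as dictionaries. This is an illustration under assumptions: the real workflow uses n8n Google Sheets and LinkedIn nodes, the helper names are hypothetical, and the ISO date format is a guess, since the timestamp format is not specified.

```python
from datetime import date
from typing import Optional


def next_unpublished(rows: list[dict[str, str]]) -> Optional[int]:
    """Return the index of the first row not marked "published", or None."""
    for index, row in enumerate(rows):
        if row.get("published", "").strip().lower() != "published":
            return index
    return None


def mark_published(row: dict[str, str], today: date) -> dict[str, str]:
    """Return a copy of the row with its status and date updated.

    The ISO 8601 date format is an assumption made for this sketch.
    """
    return {**row, "published": "published", "date": today.isoformat()}
```

Picking the first unpublished row means posts go out in the order they were generated, and marking the row right after publishing keeps the same post from being selected again the next day.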
⚙️ Setup Guide
📂 Google Drive
- Create a folder for book PDFs
- Connect your Google Drive account to n8n
- Provide an access token with file read permission
📊 Google Sheets
- Create a Google Sheet with columns: bookname, hook, insight, cta, postContent, published, date
- Add credentials in n8n with read/write permission
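To make the column layout concrete, the following Python sketch maps one generated idea onto a row with those seven columns. The function `idea_to_row` is hypothetical, and starting new rows with the status "no" and an empty date is an assumption based on the two status values the workflow uses.

```python
COLUMNS = ("bookname", "hook", "insight", "cta", "postContent", "published", "date")


def idea_to_row(bookname: str, idea: dict[str, str], post_content: str) -> dict[str, str]:
    """Build a sheet row from an agent idea (keys "Hook", "Insight", "CTA")."""
    return {
        "bookname": bookname,
        "hook": idea["Hook"],
        "insight": idea["Insight"],
        "cta": idea["CTA"],
        "postContent": post_content,
        "published": "no",  # assumed initial status for a new post
        "date": "",         # assumed to be filled in when the post is published
    }
```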
🧠 Pinecone
- Set up a Pinecone project and index (linkdenpost)
- Namespace will be auto-named using the book filename
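How the filename becomes a namespace is not spelled out, so the sketch below shows only one plausible approach in Python: drop the extension and reduce the name to a lowercase slug. Both the helper name and the slug rules are assumptions.

```python
import re
from pathlib import PurePosixPath


def namespace_from_filename(filename: str) -> str:
    """Derive a namespace from a book filename (illustrative rules only)."""
    stem = PurePosixPath(filename).stem  # drop any directory and the extension
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return slug or "untitled"  # fallback when nothing usable remains
```

Using one namespace per book keeps each book's vectors separate inside the same index, so the agent can search a single book at a time.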
🔑 API Credentials Required
- OpenAI API (for embeddings and post generation)
- Pinecone API (for vector storage and retrieval)
- LinkedIn OAuth2 (to publish posts)
- Google Drive & Sheets credentials
🔁 Flow Summary
graph TD
A[Google Drive Trigger] --> B[Download PDF]
B --> C[Extract Text]
C --> D[Text Splitter]
D --> E[Create Embeddings]
E --> F[Pinecone Vector Store]
F --> G[LangChain Agent]
G --> H["Structured Output (5 Post Ideas)"]
H --> I[Split Ideas]
I --> J["Format as LinkedIn Post (GPT)"]
J --> K[Store in Google Sheet]
L[Schedule Trigger] --> M[Get Unpublished Post]
M --> N[Post to LinkedIn]
N --> O[Mark as Published]
🧪 Prompt Example (Used in LangChain Agent)
You are a content strategist. Search the Pinecone vector DB containing a book. Generate 5 unique LinkedIn post ideas with:
- A Hook (curiosity driven)
- Insight (summary < 100 words)
- CTA ("Agree or disagree?", etc.)
Respond in structured JSON:
[
{ "Hook": "...", "Insight": "...", "CTA": "..." },
...
]
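The structured output parser's job can be approximated by a short Python check on the agent's response. This is a sketch, not the n8n parser itself: the function name `parse_post_ideas` is hypothetical, and the rules simply mirror the prompt (5 ideas, three non-empty fields, an insight under 100 words).

```python
import json

REQUIRED_KEYS = ("Hook", "Insight", "CTA")


def parse_post_ideas(raw: str, expected_count: int = 5) -> list[dict[str, str]]:
    """Parse the agent's JSON reply and check it against the prompt's rules."""
    ideas = json.loads(raw)
    if not isinstance(ideas, list):
        raise ValueError("expected a JSON array of post ideas")
    if len(ideas) != expected_count:
        raise ValueError(f"expected {expected_count} ideas, got {len(ideas)}")

    for position, idea in enumerate(ideas, start=1):
        if not isinstance(idea, dict):
            raise ValueError(f"idea {position} is not a JSON object")
        missing = [
            key for key in REQUIRED_KEYS
            if not (isinstance(idea.get(key), str) and idea[key].strip())
        ]
        if missing:
            raise ValueError(f"idea {position} is missing: {', '.join(missing)}")
        if len(idea["Insight"].split()) >= 100:
            raise ValueError(f"idea {position} has an insight of 100 words or more")
    return ideas
```

Failing fast on a malformed reply keeps incomplete ideas out of the later formatting and sheet steps.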
✅ Output Sample
{
"Hook": "Why your lab's results might be invalid 😱",
"Insight": "ISO/IEC 17025 stresses that labs must plan and address risks to impartiality and validity.",
"CTA": "Does your lab audit for these risks?"
}
📆 Schedule Control
- Uses Schedule Trigger to post daily at a set time.
- After each run, the post's status in the Google Sheet is updated so that LinkedIn and the sheet stay in sync.
📝 Notes
- Posts remain professional and concise for a LinkedIn audience
- Works with any PDF book
- Supports multi-book pipelines
- You can filter and tag books by filename or folder for segmenting post styles