Generate LinkedIn Posts from Books using OpenAI, LangChain & Pinecone Vector Search

Last update 2 days ago


This workflow automates generating, storing, and publishing engaging LinkedIn posts derived from book PDFs, using AI and vector search.


🧠 Overview

This workflow:

  1. Watches a Google Drive folder for new or updated book PDFs.
  2. Extracts the text and embeds it using OpenAI.
  3. Stores the data in a Pinecone vector database.
  4. Uses a LangChain agent to generate post ideas.
  5. Creates concise LinkedIn posts, each with a hook, an insight, and a CTA.
  6. Updates a Google Sheet and posts to LinkedIn.

🛠 Workflow Breakdown

📥 1. Google Drive Trigger

  • Trigger: Watches a folder for new or updated PDF files.
  • Action: Downloads the updated PDF.

📄 2. Extract and Embed Content

  • Extract from File: Parses PDF to extract text.
  • Text Splitter: Breaks text into chunks.
  • Embeddings (OpenAI): Converts chunks into vector embeddings.
  • Pinecone Vector Store: Saves the embeddings with the book name as namespace.
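
The chunking step can be illustrated outside n8n. The following is a minimal sketch of a character-based splitter with overlap; the function name `split_text` and the `chunk_size` and `overlap` values are assumptions for illustration, since the template does not state the Text Splitter node's settings.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly.

    The sizes are illustrative defaults, not the workflow's actual settings.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk. Each chunk would then be sent to the embeddings step.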

🧠 3. Post Idea Generation (LangChain Agent)

  • Uses a prompt to:
    • Search Pinecone DB
    • Extract insights
    • Format into 5 LinkedIn post ideas with:
      • Hook
      • Insight
      • CTA
  • Memory buffer and structured output parser are used for clean AI interaction.
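
To show what a structured output check might look like, here is a hedged sketch that validates the agent's JSON reply before it is passed on. The `PostIdea` class and `parse_ideas` function are hypothetical names, not part of the template; the field names `Hook`, `Insight`, and `CTA` follow the prompt used by the agent.

```python
import json
from dataclasses import dataclass


@dataclass
class PostIdea:
    hook: str
    insight: str
    cta: str


def parse_ideas(raw: str, expected: int = 5) -> list[PostIdea]:
    """Parse the agent's JSON reply and reject malformed or incomplete ideas."""
    data = json.loads(raw)
    if not isinstance(data, list) or len(data) != expected:
        raise ValueError(f"expected a JSON list of {expected} ideas")

    ideas: list[PostIdea] = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("each idea must be a JSON object")
        # Collect any field that is absent or blank.
        missing = [k for k in ("Hook", "Insight", "CTA")
                   if not str(item.get(k, "")).strip()]
        if missing:
            raise ValueError(f"idea is missing fields: {missing}")
        ideas.append(PostIdea(
            hook=str(item["Hook"]).strip(),
            insight=str(item["Insight"]).strip(),
            cta=str(item["CTA"]).strip(),
        ))
    return ideas
```

Failing fast here means a truncated or off-format model reply never reaches the post-writing step.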

✍️ 4. Post Creation

  • Each idea is:
    • Split into its own item
    • Rewritten with a GPT model prompt to match LinkedIn tone
    • Kept under 600 characters
    • Styled with emojis and hashtags, following the tone guidelines
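
Because the model is only asked to follow these style rules, a small check after generation can catch posts that drift. This is a sketch under assumptions: `check_post` is a hypothetical helper, `len()` counts Unicode code points (which may differ from how LinkedIn counts characters), and only the length and hashtag rules are checked.

```python
import re

MAX_CHARS = 600  # the post should stay under this length


def check_post(post: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Return a list of style problems; an empty list means the post passes."""
    problems: list[str] = []
    if len(post) >= max_chars:
        problems.append(f"too long: {len(post)} characters (limit is under {max_chars})")
    if not re.search(r"#\w+", post):
        problems.append("no hashtag found")
    return problems


if __name__ == "__main__":
    print(check_post("Risk planning is not optional. Agree or disagree? #quality"))
```

A post that fails could be sent back through the rewrite prompt instead of being saved.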

📊 5. Google Sheet Integration

  • Saves all generated posts to a Google Sheet.
  • Marks each post's status in the published column as "published" or "no".

🔁 6. Scheduled Publishing

  • Every day:
    • Pulls an unpublished post
    • Publishes it to LinkedIn
    • Updates the post's status and timestamp in the Google Sheet
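
The daily selection logic can be sketched as follows, assuming the sheet rows have been read into dictionaries keyed by column name. `next_unpublished` and `mark_published` are hypothetical helpers; the ISO 8601 timestamp format is an assumption, as the template does not say how the date is written.

```python
from datetime import datetime, timezone
from typing import Optional


def next_unpublished(rows: list[dict]) -> Optional[int]:
    """Return the index of the first row not yet marked "published", or None."""
    for index, row in enumerate(rows):
        if str(row.get("published", "")).strip().lower() != "published":
            return index
    return None


def mark_published(row: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the row with its status and timestamp updated."""
    stamp = (now or datetime.now(timezone.utc)).isoformat(timespec="seconds")
    return {**row, "published": "published", "date": stamp}
```

Updating the row only after the LinkedIn call succeeds keeps a failed post eligible for the next run.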

⚙️ Setup Guide

📂 Google Drive

  • Create a folder for book PDFs
  • Connect your Google Drive account to n8n
  • Provide an access token with file read permission

📊 Google Sheets

  • Create a Google Sheet with columns:
    • bookname, hook, insight, cta, postContent, published, date
  • Add credentials in n8n with read/write permission
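
As an illustration of how a generated post maps onto these columns, here is a minimal sketch. `build_row` and `row_values` are hypothetical helpers; the initial `published` value of "no" follows the status values the workflow uses, and leaving `date` empty until publication is an assumption.

```python
COLUMNS = ["bookname", "hook", "insight", "cta", "postContent", "published", "date"]


def build_row(bookname: str, hook: str, insight: str, cta: str,
              post_content: str) -> dict[str, str]:
    """Build one sheet row for a newly generated, not yet published post."""
    return {
        "bookname": bookname,
        "hook": hook,
        "insight": insight,
        "cta": cta,
        "postContent": post_content,
        "published": "no",  # flipped to "published" by the scheduled run
        "date": "",         # assumed empty until the post goes out
    }


def row_values(row: dict[str, str]) -> list[str]:
    """Return the row's values in the sheet's column order."""
    return [row[name] for name in COLUMNS]
```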

🧠 Pinecone

  • Set up a Pinecone project and index (linkdenpost)
  • Namespace will be auto-named using the book filename
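
One way to derive a tidy namespace from a filename is sketched below. The normalization (lowercasing, replacing runs of other characters with hyphens) is an assumption for illustration; the template may use the filename unchanged, and `namespace_from_filename` is a hypothetical helper.

```python
import re
from pathlib import PurePosixPath


def namespace_from_filename(filename: str) -> str:
    """Turn a book filename into a lowercase, hyphen-separated namespace."""
    stem = PurePosixPath(filename).stem  # drop any folder and the extension
    slug = re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")
    return slug or "untitled"


if __name__ == "__main__":
    print(namespace_from_filename("Deep Work (2016).pdf"))
```

Keeping one namespace per book lets the agent query a single book's chunks without mixing in others.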

🔑 API Credentials Required

  • OpenAI API (for embeddings and post generation)
  • Pinecone API (for vector storage and retrieval)
  • LinkedIn OAuth2 (to publish posts)
  • Google Drive & Sheets credentials

🔁 Flow Summary

graph TD
  A[Google Drive Trigger] --> B[Download PDF]
  B --> C[Extract Text]
  C --> D[Text Splitter]
  D --> E[Create Embeddings]
  E --> F[Pinecone Vector Store]
  F --> G[LangChain Agent]
  G --> H["Structured Output (5 Post Ideas)"]
  H --> I[Split Ideas]
  I --> J["Format as LinkedIn Post (GPT)"]
  J --> K[Store in Google Sheet]
  L[Schedule Trigger] --> M[Get Unpublished Post]
  M --> N[Post to LinkedIn]
  N --> O[Mark as Published]

🧪 Prompt Example (Used in LangChain Agent)

You are a content strategist. Search the Pinecone vector DB containing a book. Generate 5 unique LinkedIn post ideas with:
- A Hook (curiosity driven)
- Insight (summary < 100 words)
- CTA ("Agree or disagree?", etc.)

Respond in structured JSON:
[
  { "Hook": "...", "Insight": "...", "CTA": "..." },
  ...
]

✅ Output Sample

{
  "Hook": "Why your lab's results might be invalid 😱",
  "Insight": "ISO/IEC 17025 stresses that labs must plan and address risks to impartiality and validity.",
  "CTA": "Does your lab audit for these risks?"
}

📆 Schedule Control

  • Uses Schedule Trigger to post daily at a set time.
  • After each post goes out, the Google Sheet row is updated so the sheet stays in sync and the same post is not published twice.

📝 Notes

  • Posts remain professional and concise for a LinkedIn audience
  • Works with any PDF book that contains extractable text
  • Supports multi-book pipelines
  • You can filter and tag books by filename or folder for segmenting post styles