Transform raw investment memorandums and financial decks into comprehensive, professional Due Diligence (DD) PDF reports. This workflow automates document parsing via LlamaParse, enriches internal data with real-time web intelligence using Decodo, and utilizes an AI Agent to synthesize structured financial analysis, risk assessments, and investment theses.

| Requirement | Type | Purpose |
|---|---|---|
| n8n instance | Essential | Core automation and workflow orchestration |
| LlamaIndex Cloud | Essential | High-accuracy document parsing (LlamaParse) |
| Pinecone | Essential | Vector database for document and web evidence storage |
| OpenAI API | Essential | LLM for embeddings and expert analysis (Embedding Small & GPT-5.2) |
| Decodo API | Essential | Real-time web searching and markdown scraping |
| R2 Bucket | Essential | Secure storage for the generated PDF reports |

poc).
Authorization: Bearer YOUR_KEY).
baseUrl to match your S3 bucket's public endpoint or CDN.

| Node | Purpose | Key Configuration |
|---|---|---|
| LlamaParse (HTTP) | Document Conversion | Uses the /parsing/upload and /job/result endpoints for high-fidelity markdown |
| Pinecone Vector Store | Context Storage | Implements namespace-based isolation using the unique dealId |
| Decodo Search/Scrape | Web Intelligence | Dynamically identifies the official domain and extracts corporate metadata |
| AI Agent | Strategic Analysis | Configured with a "Senior Investment Analyst" system prompt and 6-step retrieval logic |
| Puppeteer | PDF Generation | Renders the styled HTML report into a print-ready A4 PDF |
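
The namespace-based isolation listed for the Pinecone node can be illustrated with a small in-memory stand-in. This is a sketch of the idea only, not the Pinecone client: the `NamespacedStore` class, the `Record` fields, and the `deal-` prefix are hypothetical names chosen for illustration, and the substring match stands in for a real vector similarity search.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    id: str
    text: str
    source: str  # hypothetical label, e.g. "document" or "web"


class NamespacedStore:
    """Toy stand-in for a vector store that partitions records by namespace."""

    def __init__(self) -> None:
        self._spaces: dict[str, list[Record]] = defaultdict(list)

    def upsert(self, namespace: str, records: list[Record]) -> None:
        # Records land only in the namespace they were written to.
        self._spaces[namespace].extend(records)

    def query(self, namespace: str, term: str) -> list[Record]:
        # Only the requested namespace is searched, so evidence from
        # different deals can never mix. Substring match replaces the
        # similarity search a real vector store would run.
        candidates = self._spaces.get(namespace, [])
        return [r for r in candidates if term.lower() in r.text.lower()]


def namespace_for(deal_id: str) -> str:
    # Assumption: one namespace per deal, derived from the dealId.
    return f"deal-{deal_id}"


store = NamespacedStore()
store.upsert(namespace_for("A1"), [Record("1", "Revenue grew in FY23", "document")])
store.upsert(namespace_for("B2"), [Record("2", "Revenue declined in FY23", "document")])
hits = store.query(namespace_for("A1"), "revenue")
print([r.id for r in hits])
```

Because each query is scoped to a single namespace, a report for one deal should only ever draw on chunks ingested for that deal, even when several deals share the same index.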

The workflow uses a Multi-Query Retrieval strategy. Instead of asking one generic question, the AI Agent is instructed to run six distinct searches against the vector database (Revenue History, Key Risks, etc.). Each search targets one topic, so even in a 100-page document the agent is far less likely to miss critical financial tables or risk disclosures buried in the text.
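
A minimal sketch of the multi-query idea, assuming a generic `search(query, top_k)` callable in place of the real vector store. Only "Revenue History" and "Key Risks" are named in the description; the other four topics below are hypothetical placeholders, and `keyword_search` is a toy retriever used purely to make the example runnable.

```python
from typing import Callable

# Only the first two topics come from the workflow description;
# the remaining four are hypothetical placeholders.
QUERIES = [
    "Revenue History",
    "Key Risks",
    "Business Model",          # hypothetical
    "Market and Competition",  # hypothetical
    "Management Team",         # hypothetical
    "Use of Funds",            # hypothetical
]


def multi_query_retrieve(
    search: Callable[[str, int], list[str]],
    queries: list[str],
    top_k: int = 3,
) -> dict[str, list[str]]:
    """Run one search per topic and drop chunks already returned for an earlier topic."""
    seen: set[str] = set()
    evidence: dict[str, list[str]] = {}
    for query in queries:
        fresh = []
        for chunk in search(query, top_k):
            if chunk not in seen:
                seen.add(chunk)
                fresh.append(chunk)
        evidence[query] = fresh
    return evidence


def keyword_search(chunks: list[str]) -> Callable[[str, int], list[str]]:
    """Toy retriever: rank chunks by how many query words they contain."""

    def search(query: str, top_k: int) -> list[str]:
        words = query.lower().split()
        scored = [(sum(w in c.lower() for w in words), c) for c in chunks]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [c for _, c in ranked[:top_k]]

    return search
```

Keeping the results grouped by topic lets the downstream prompt cite which search produced which evidence, and an empty list for a topic is a visible signal that the source documents say nothing about it.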

#new-deals Slack channel.

| Problem | Cause | Solution |
|---|---|---|
| Parsing Timeout | File is too large for synchronous processing | Increase the "Wait" node duration or check LlamaParse job limits |
| Low Analysis Quality | Insufficient context in documents | Ensure documents are text-based PDFs (not scans) or enable OCR in LlamaParse |
| PDF Layout Broken | CSS incompatibility in Puppeteer | Simplify CSS in the HTML node; avoid complex Flexbox/Grid if Puppeteer version is older |
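
For the parsing timeout row, an alternative to one long fixed wait is to poll the job status with growing delays. The sketch below assumes a `get_status` callable that wraps whatever status request the workflow makes; the status strings `"SUCCESS"` and `"ERROR"` and all parameter names are assumptions for illustration, not confirmed LlamaParse values.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    *,
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status, doubling the delay each round."""
    waited = 0.0
    delay = initial_delay
    while True:
        status = get_status()
        if status in ("SUCCESS", "ERROR"):  # assumed terminal states
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # cap the backoff
```

Small files finish after a short first wait, while large files get progressively longer pauses up to the cap, so the total time budget is controlled by `timeout` rather than by a single hard-coded duration.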

Challenge: A VC associate receives 20 pitch decks a day and spends hours manually summarizing company profiles.
Solution: This workflow parses the deck and scrapes the startup's website to verify its claims.
Result: The associate receives a 3-page PDF summary for every deck and can decide to reject or move forward in seconds.

Challenge: An analyst must review a 150-page CIM (Confidential Information Memorandum) for specific financial "red flags."
Solution: The AI Agent is prompted to look specifically for customer concentration and margin fluctuations.
Result: Consistent risk identification across all deals, regardless of which analyst is assigned to the project.

Created by: Khmuhtadin
Category: Business Intelligence | Tags: Decodo, AI, RAG, Due Diligence, LlamaIndex, Pinecone