Extract embedded images from Google Drive documents with VLM Run agent

Last update 11 days ago

⚙️ What This Workflow Does

This workflow extracts images from documents uploaded to Google Drive using the VLM Run Execute Agent, then downloads the extracted images and saves them to a designated Drive folder.

🧩 Requirements

Google Drive OAuth2 credentials
VLM Run API credentials with Execute Agent access
A reachable n8n Webhook URL (e.g., /image-extract-via-agent)

⚡Quick Setup

Configure Google Drive OAuth2 credentials, then create one folder for uploads and another for saving extracted images.
Install the verified VLM Run node: search for VLM Run in the node list and click Install. Once installed, the node is available in your workflows.
Add VLM Run API credentials for document parsing.

⚙️ How It Works

Monitor Uploads – The workflow watches a specific Google Drive folder for new file uploads (e.g., receipts, reports, or PDFs).
Download File – When a file is created, it’s automatically downloaded in binary form.
Extract Images (VLM Run) – The file is sent to the VLM Run Execute Agent, which analyzes the document and extracts image URLs via its callback.
Receive Image Links (Webhook) – The workflow’s Webhook node listens for the agent’s response containing extracted image URLs.
Split & Download – The Split Out node processes each extracted link, and the HTTP Request node downloads each image.
Save Image – Finally, each image is uploaded to your chosen Google Drive folder for storage or further processing.
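
The Split & Download steps above can be sketched outside n8n as follows. This is a minimal illustration, not the workflow's actual node code: it assumes the callback body has the `image_urls` shape shown in this document, and the helper names (`split_out`, `filename_from_url`, `download_image`) are hypothetical.

```python
import json
import urllib.request
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_out(payload: dict) -> list[dict]:
    """Turn one payload holding many URLs into one item per URL."""
    return [{"image_url": url} for url in payload.get("image_urls", [])]


def filename_from_url(url: str, fallback: str = "image.bin") -> str:
    """Use the last path segment of the URL as the file name."""
    name = PurePosixPath(urlparse(url).path).name
    return name or fallback


def download_image(url: str, timeout: float = 30.0) -> bytes:
    """Fetch the image bytes over HTTP. Performs network I/O when called."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Hypothetical callback body, shaped like the sample output in this document.
callback_body = json.loads(
    '{"image_urls": ["https://vlm.run/api/files/img1.jpg",'
    ' "https://vlm.run/api/files/img2.jpg"]}'
)
items = split_out(callback_body)
names = [filename_from_url(item["image_url"]) for item in items]
```

Each entry in `items` corresponds to one run of the download step. Calling `download_image(item["image_url"])` would return the bytes to upload to the destination folder.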

💡Why Use This Workflow

Manual image extraction from PDFs and scanned files is repetitive and error-prone.
This pipeline automates it using VLM Run, a vision-language AI service that:

Understands document layout and structure
Handles multi-page and mixed-content files
Works with both images and PDFs
Extracts accurate image data with minimal setup

For example, the output contains URLs to the extracted images:

{
  "image_urls": [
    "https://vlm.run/api/files/img1.jpg",
    "https://vlm.run/api/files/img2.jpg"
  ]
}
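
Before downloading anything, it can help to check that the callback body really has the shape shown in the sample output. The sketch below assumes the body is a JSON object with an `image_urls` list, as in that sample. The function name `extract_image_urls` is hypothetical, and the real callback may wrap these fields differently.

```python
import json
from urllib.parse import urlparse


def extract_image_urls(raw_body: str) -> list[str]:
    """Parse the callback body and return unique http(s) URLs in order."""
    data = json.loads(raw_body)
    urls = data.get("image_urls") if isinstance(data, dict) else None
    if not isinstance(urls, list):
        raise ValueError("expected an object with an 'image_urls' list")

    seen: set[str] = set()
    cleaned: list[str] = []
    for url in urls:
        if not isinstance(url, str):
            continue  # skip non-string entries instead of failing the run
        parsed = urlparse(url)
        is_web_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
        if is_web_url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned
```

Dropping duplicates and non-HTTP entries here keeps the later download step from failing on a single malformed link.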

🧠 Perfect For

Extracting photos or receipts from multi-page PDFs
Archiving embedded images from reports or invoices
Preparing image datasets for labeling or ML model training

🛠️ How to Customize

You can extend this workflow by:
Adding naming conventions or folder structures based on upload type
Integrating Slack/Email notifications when extraction completes
Including metadata logging (file name, timestamp, source) into Google Sheets or a database
Chaining with classification or OCR workflows using VLM Run’s other agents
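
As one way to approach the naming-convention and metadata-logging ideas above, the sketch below builds a file name from the source document's name, the image's position, and a UTC timestamp, then builds a row that could be appended to a sheet or table. The pattern, the `.jpg` fallback extension, and the function names are all assumptions for illustration, not part of the workflow.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_image_name(source_name: str, index: int, image_url: str,
                     when: datetime) -> str:
    """Return a name like '<source>_<index>_<UTC timestamp><ext>'."""
    stem = PurePosixPath(source_name).stem or "document"
    # Assumption: fall back to .jpg when the URL path has no extension.
    ext = PurePosixPath(urlparse(image_url).path).suffix or ".jpg"
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stem}_{index:03d}_{stamp}{ext}"


def build_log_row(source_name: str, image_name: str, when: datetime) -> dict:
    """Return the metadata fields named above: file name, timestamp, source."""
    return {
        "file_name": image_name,
        "timestamp": when.astimezone(timezone.utc).isoformat(),
        "source": source_name,
    }
```

Pass a timezone-aware `when` (for example `datetime.now(timezone.utc)`) so the timestamp does not depend on the machine's local time zone.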

⚠️ Community Node Disclaimer

This workflow uses community nodes (VLM Run) that may need additional permissions and custom setup.