HTTP Request node
Merge node
+5

Easy Image Captioning with Gemini 1.5 Pro

Published 2 months ago

Created by

jimleuk
Jimleuk

Template description

This n8n workflow demonstrates how to automate image captioning tasks using Gemini 1.5 Pro - a multimodal LLM which can accept and analyse images. This is a really simple example of how easy it is to build and leverage powerful AI models in your repetitive tasks.

How it works

  • For this demo, we'll import a public image from a popular stock photography website, Pexel.com, into our workflow using the HTTP request node.
  • With multimodal LLMs, there is little do preprocess other than ensuring the image dimensions fit within the LLMs accepted limits. Though not essential, we'll resize the image using the Edit image node to achieve fast processing.
  • The image is used as an input to the basic LLM node by defining a "user message" entry with the binary (data) type.
  • The LLM node has the Gemini 1.5 Pro language model attached and we'll prompt it to generate a caption title and text appropriate for the image it sees.
  • Once generated, the generated caption text is positioning over the original image to complete the task. We can calculate the positioning relative to the amount of characters produced using the code node.

An example of the combined image and caption can be found here: https://res.cloudinary.com/daglih2g8/image/upload/f_auto,q_auto/v1/n8n-workflows/l5xbb4ze4wyxwwefqmnc

Requirements

  • Google Gemini API Key.
  • Access to Google Drive.

Customising the workflow

  • Not using Google Gemini? n8n's basic LLM node supports the standard syntax for image content for models that support it - try using GPT4o, Claude or LLava (via Ollama).

  • Google Drive is only used for demonstration purposes. Feel free to swap this out for other triggers such as webhooks to fit your use case.

Share Template

More Product workflow templates

Google Sheets node
+5

🚀 Boost your customer service with this WhatsApp Business bot!

This n8n workflow demonstrates how to automate customer interactions and appointment management via WhatsApp Business bot. After submitting a Google Form, the user receives a notification via WhatsApp. These notifications are sent via a template message. In case user sends a message to the bot, the text and user data is stored in Google Sheets. To reply back to the user, fill in the ReplyText column and change the Status to 'Ready'. In a few seconds n8n will fetch the unsent replies and deliver them one by one via WhatsApp Business node. Customize this workflow to fit your specific needs, connect different online services and enhance your customer communication! 🎉 Setup Instructions To get this workflow up and running, you'll need to: 👇 Create a WhatsApp template message on the Meta Business portal. Obtain an Access Token and WhatsApp Business Account ID from the Meta Developers Portal. This is needed for the WhatsApp Business Node to send messages. Set up a WhatsApp Trigger node with App ID and App Secret from the Meta Developers Portal. Right after that copy the WhatsApp Trigger URL and add it as a Callback URL in the Meta Developers Portal. This trigger is needed to receive incoming messages and their status updates. Connect your Google Sheets account for data storage and management. Check out the documentation page. ⚠️ Important Notes WhatsApp allows automatic custom text messages only within 24 hours of the last user message. Outside with time frame only approved template messages can be sent. The workflow uses a Google Sheet to manage form submissions, incoming messages and prepare responses. You can replace these nodes and connect the WhatsApp bot with other systems.
eduard
Eduard
HTTP Request node
Google Drive node
Google Calendar node
+9

Actioning Your Meeting Next Steps using Transcripts and AI

This n8n workflow demonstrates how you can summarise and automate post-meeting actions from video transcripts fed into an AI Agent. Save time between meetings by allowing AI handle the chores of organising follow-up meetings and invites. How it works This workflow scans for the calendar for client or team meetings which were held online. * Attempts will be made to fetch any recorded transcripts which are then sent to the AI agent. The AI agent summarises and identifies if any follow-on meetings are required. If found, the Agent will use its Calendar Tool to to create the event for the time, date and place for the next meeting as well as add known attendees. Requirements Google Calendar and the ability to fetch Meeting Transcripts (There is a special OAuth permission for this action!) OpenAI account for access to the LLM. Customising the workflow This example only books follow-on meetings but could be extended to generate reports or send emails.
jimleuk
Jimleuk
Notion node
Code node
+6

Notion AI Assistant Generator

This n8n workflow template lets teams easily generate a custom AI chat assistant based on the schema of any Notion database. Simply provide the Notion database URL, and the workflow downloads the schema and creates a tailored AI assistant designed to interact with that specific database structure. Set Up Watch this quick set up video 👇 Key Features Instant Assistant Generation**: Enter a Notion database URL, and the workflow produces an AI assistant configured to the database schema. Advanced Querying**: The assistant performs flexible queries, filtering records by multiple fields (e.g., tags, names). It can also search inside Notion pages to pull relevant content from specific blocks. Schema Awareness**: Understands and interacts with various Notion column types like text, dates, and tags for accurate responses. Reference Links**: Each query returns direct links to the exact Notion pages that inform the assistant’s response, promoting transparency and easy access. Self-Validation**: The workflow has logic to check the generated assistant, and if any errors are detected, it reruns the agent to fix them. Ideal for Product Managers**: Easily access and query product data across Notion databases. Support Teams**: Quickly search through knowledge bases for precise information to enhance support accuracy. Operations Teams**: Streamline access to HR, finance, or logistics data for fast, efficient retrieval. Data Teams**: Automate large dataset queries across multiple properties and records. How It Works This AI assistant leverages two HTTP request tools—one for querying the Notion database and another for retrieving data within individual pages. It’s powered by the Anthropic LLM (or can be swapped for GPT-4) and always provides reference links for added transparency.
max-n8n
Max Tkacz

More AI workflow templates

OpenAI Chat Model node
SerpApi (Google Search) node

AI agent chat

This workflow employs OpenAI's language models and SerpAPI to create a responsive, intelligent conversational agent. It comes equipped with manual chat triggers and memory buffer capabilities to ensure seamless interactions. To use this template, you need to be on n8n version 1.50.0 or later.
n8n-team
n8n Team
HTTP Request node
Merge node
+7

Scrape and summarize webpages with AI

This workflow integrates both web scraping and NLP functionalities. It uses HTML parsing to extract links, HTTP requests to fetch essay content, and AI-based summarization using GPT-4o. It's an excellent example of an end-to-end automated task that is not only efficient but also provides real value by summarizing valuable content. Note that to use this template, you need to be on n8n version 1.50.0 or later.
n8n-team
n8n Team
HTTP Request node
Markdown node
+5

AI agent that can scrape webpages

⚙️🛠️🚀🤖🦾 This template is a PoC of a ReAct AI Agent capable of fetching random pages (not only Wikipedia or Google search results). On the top part there's a manual chat node connected to a LangChain ReAct Agent. The agent has access to a workflow tool for getting page content. The page content extraction starts with converting query parameters into a JSON object. There are 3 pre-defined parameters: url** – an address of the page to fetch method** = full / simplified maxlimit** - maximum length for the final page. For longer pages an error message is returned back to the agent Page content fetching is a multistep process: An HTTP Request mode tries to get the page content. If the page content was successfuly retrieved, a series of post-processing begin: Extract HTML BODY; content Remove all unnecessary tags to recude the page size Further eliminate external URLs and IMG scr values (based on the method query parameter) Remaining HTML is converted to Markdown, thus recuding the page lengh even more while preserving the basic page structure The remaining content is sent back to an Agent if it's not too long (maxlimit = 70000 by default, see CONFIG node). NB: You can isolate the HTTP Request part into a separate workflow. Check the Workflow Tool description, it guides the agent to provide a query string with several parameters instead of a JSON object. Please reach out to Eduard is you need further assistance with you n8n workflows and automations! Note that to use this template, you need to be on n8n version 1.19.4 or later.
eduard
Eduard

More Marketing workflow templates

Google Sheets node
HTTP Request node
Merge node
+4

OpenAI GPT-3: Company Enrichment from website content

Enrich your company lists with OpenAI GPT-3 ↓ You’ll get valuable information such as: Market (B2B or B2C) Industry Target Audience Value Proposition This will help you to: add more personalization to your outreach make informed decisions about which accounts to target I've made the process easy with an n8n workflow. Here is what it does: Retrieve website URLs from Google Sheets Extract the content for each website Analyze it with GPT-3 Update Google Sheets with GPT-3 data
lempire
Lucas Perret
Google Sheets node
HTTP Request node
Microsoft Excel 365 node
Gmail node
+5

Automated Web Scraping: email a CSV, save to Google Sheets & Microsoft Excel

How it works: The workflow starts by sending a request to a website to retrieve its HTML content. It then parses the HTML extracting the relevant information The extracted data is storted and converted into a CSV file. The CSV file is attached to an email and sent to your specified address. The data is simultaneously saved to both Google Sheets and Microsoft Excel for further analysis or use. Set-up steps: Change the website to scrape in the "Fetch website content" node Configure Microsoft Azure credentials with Microsoft Graph permissions (required for the Save to Microsoft Excel 365 node) Configure Google Cloud credentials with access to Google Drive, Google Sheets and Gmail APIs (the latter is required for the Send CSV via e-mail node).
mihailtd
Mihai Farcas
HTTP Request node
Merge node
+5

Personalize marketing emails using customer data and AI

This workflow uses AI to analyze customer sentiment from product feedback. If the sentiment is negative, AI will determine whether offering a coupon could improve the customer experience. Upon completing the sentiment analysis, the workflow creates a personalized email templates. This solution streamlines the process of engaging with customers post-purchase, particularly when addressing dissatisfaction, and ensures that outreach is both personalized and automated. This workflow won the 1st place in our last AI contest. Note that to use this template, you need to be on n8n version 1.19.4 or later.
n8n-team
n8n Team

Implement complex processes faster with n8n

red icon yellow icon red icon yellow icon