Automate Image Validation Tasks using AI Vision

Created by

Jimleuk

Last update a year ago

How it works

Our portraits are JPG files downloaded from our Google Drive using the Google Drive node.
Each image is resized using the Edit Image node to ensure a balance between resolution and processing speed.
In the Basic LLM node, we define a "user message" with its type set to binary (data). This lets us pass our portrait to the LLM as an input.
Our prompt contains the criteria taken from the passport photo requirements webpage, so the LLM can check whether the photo meets them.
A structured output parser turns the LLM's response into a JSON object with an "is_valid" boolean property. This makes it easy to extend the workflow, for example by branching on the result.
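
In n8n these steps are configured in nodes rather than written as code, but the logic behind them can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the nodes are implemented: the function names are hypothetical, the 1024-pixel limit is an arbitrary example value, and the only response field taken from the workflow is "is_valid".

```python
import base64
import json


def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    max_side=1024 is an assumed example value, not a setting from the workflow.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def to_data_uri(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI, a common way to send binary data to a multimodal LLM."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def parse_validation(raw: str) -> dict:
    """Extract the JSON object from an LLM reply and check its "is_valid" field."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in the response")
    data = json.loads(raw[start : end + 1])
    if not isinstance(data.get("is_valid"), bool):
        raise ValueError('expected a boolean "is_valid" property')
    return data


if __name__ == "__main__":
    print(fit_within(4000, 3000))
    print(to_data_uri(b"abc"))
    print(parse_validation('Result: {"is_valid": false}')["is_valid"])
```

Checking that "is_valid" is a real boolean, rather than a string such as "yes", is what lets later steps branch on the result safely.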

Not using Gemini? n8n's LLM node works with any compatible multimodal LLM, so feel free to swap Gemini out for OpenAI's GPT-4o or Anthropic's Claude Sonnet.
Don't need to validate portraits? Try other use cases such as document classification, security footage analysis, tagging people in photos, and more.