Back to Templates

Transcribe voice messages and classify intent with OpenAI Whisper and GPT-4o-mini

Created by

Created by: Luis R. || xiaolux
Luis R.

Last update

Last update 3 days ago

Categories

Share


Quick overview

This workflow downloads an audio file, transcribes it with OpenAI Whisper, classifies the transcript intent using OpenAI GPT-4o-mini, and returns a simple response message based on the detected category.

How it works

  1. Runs when you manually execute the workflow.
  2. Sets a sample audio URL (JFK .flac) and downloads the audio file via an HTTP request.
  3. Sends the audio file to OpenAI Whisper to generate a text transcription.
  4. Passes the transcript to OpenAI GPT-4o-mini to classify it as GREETING, QUESTION, REQUEST, or OTHER.
  5. Normalizes the model output to an uppercase intent value and routes execution based on the intent.
  6. Returns a predefined response message for the matched intent branch.

Setup

  1. Add OpenAI API credentials for both the Whisper transcription step and the GPT-4o-mini intent classification step.
  2. Replace the sample audio URL with your own audio source, or swap the manual trigger for a webhook that provides an audio URL.
  3. If you use a different audio format, ensure the downloaded file is a supported type for OpenAI transcription (and adjust the MIME type/value if you rely on it elsewhere).

Customization

  • Connect to any WhatsApp gateway — Evolution API, Twilio, or WhatsApp Cloud API
  • Add custom intent categories to match your business (COMPLAINT, APPOINTMENT, PRICING)
  • Route each intent to a different workflow — CRM update, human escalation, auto-reply
  • Swap GPT-4o-mini for Claude Haiku to reduce costs on high-volume deployments
  • Extend with RAG to give context-aware responses based on your knowledge base

Additional info

This workflow is a simplified extract from a production multi-tenant
WhatsApp AI system handling real customer conversations.

Built with: n8n · OpenAI Whisper · GPT-4o-mini · Evolution API · Docker · Oracle Cloud

Tags: whatsapp, voice, audio, transcription, whisper, intent, classification,
chatbot, ai-agent, automation, openai, gpt4o-mini, customer-support, nlp