Quick overview
This workflow downloads an audio file, transcribes it with OpenAI Whisper, classifies the transcript's intent with OpenAI GPT-4o-mini, and returns a predefined response message for the detected category.
How it works
- Runs when you manually execute the workflow.
- Sets a sample audio URL (JFK .flac) and downloads the audio file via an HTTP request.
- Sends the audio file to OpenAI Whisper to generate a text transcription.
- Passes the transcript to OpenAI GPT-4o-mini to classify it as GREETING, QUESTION, REQUEST, or OTHER.
- Normalizes the model output to an uppercase intent value and routes execution based on the intent.
- Returns a predefined response message for the matched intent branch.
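The normalization and routing steps above can be sketched outside n8n as follows. This is a minimal Python illustration, not the workflow's actual node configuration: the function names `normalize_intent` and `route`, the fallback to OTHER for unrecognized output, and the response strings are all assumptions made for the example.

```python
import re

# The four categories named in the workflow description.
INTENTS = ("GREETING", "QUESTION", "REQUEST", "OTHER")

# Hypothetical response messages; the real workflow defines its own.
RESPONSES = {
    "GREETING": "Hello! How can I help you today?",
    "QUESTION": "Thanks for your question. Let me look into that.",
    "REQUEST": "Got it. Your request has been logged.",
    "OTHER": "Sorry, I did not understand that.",
}


def normalize_intent(raw: str) -> str:
    """Map free-form model output to one uppercase intent label."""
    # Split on anything that is not a letter, so output such as
    # "Intent: question." still yields the token QUESTION.
    tokens = re.split(r"[^A-Z]+", raw.upper())
    for token in tokens:
        if token in INTENTS:
            return token
    # Assumption: anything unrecognized falls through to OTHER.
    return "OTHER"


def route(raw_model_output: str) -> str:
    """Return the predefined response for the detected intent."""
    return RESPONSES[normalize_intent(raw_model_output)]
```

Falling back to OTHER keeps the routing step from failing when the model adds punctuation, extra words, or an unexpected label.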
Setup
- Add OpenAI API credentials for both the Whisper transcription step and the GPT-4o-mini intent classification step.
- Replace the sample audio URL with your own audio source, or swap the manual trigger for a webhook that provides an audio URL.
- If you use a different audio format, make sure the downloaded file is a type that OpenAI transcription accepts, and update the MIME type anywhere else the workflow relies on it.
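One way to check the audio format before calling the transcription step is to look at the file extension in the URL. The sketch below is an assumption-laden example: the helper name `audio_mime_type` is hypothetical, the extension list reflects the formats OpenAI documents for transcription at the time of writing and should be verified against the current API reference, and the MIME values are common choices rather than values taken from the workflow.

```python
import posixpath
from urllib.parse import urlparse

# Extension -> MIME type. Both the set of extensions and the MIME values
# are assumptions for illustration; confirm them before relying on them.
SUPPORTED = {
    "flac": "audio/flac",
    "mp3": "audio/mpeg",
    "mp4": "audio/mp4",
    "mpeg": "audio/mpeg",
    "mpga": "audio/mpeg",
    "m4a": "audio/mp4",
    "ogg": "audio/ogg",
    "wav": "audio/wav",
    "webm": "audio/webm",
}


def audio_mime_type(url: str) -> str:
    """Return a MIME type for the audio URL, or raise if unsupported."""
    # Use only the path component so query strings do not affect the result.
    path = urlparse(url).path
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED:
        raise ValueError(f"unsupported audio extension: {ext!r}")
    return SUPPORTED[ext]
```

A URL without a usable extension raises `ValueError`, which is a reasonable point to stop the workflow rather than send an unknown file to the API.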
Customization
- Connect to any WhatsApp gateway, such as Evolution API, Twilio, or WhatsApp Cloud API.
- Add custom intent categories to match your business (for example COMPLAINT, APPOINTMENT, PRICING).
- Route each intent to a different workflow, such as a CRM update, human escalation, or auto-reply.
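- Swap GPT-4o-mini for another small model such as Claude Haiku if it better fits your cost or latency budget on high-volume deployments; compare current per-token pricing before switching.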
- Extend with RAG to give context-aware responses based on your knowledge base.
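Adding custom intent categories mostly means keeping the classifier prompt and the routing branches in sync with one list of labels. The following Python sketch shows one possible way to do that. The helper names `build_intents` and `build_system_prompt` are hypothetical, and the prompt wording is an assumption, not the prompt used in the workflow.

```python
# The base categories from the workflow description.
BASE_INTENTS = ["GREETING", "QUESTION", "REQUEST", "OTHER"]


def build_intents(extra):
    """Merge custom categories into the base list, keeping OTHER last."""
    merged = []
    candidates = BASE_INTENTS[:-1] + [name.strip().upper() for name in extra]
    for name in candidates:
        # Skip blanks, duplicates, and OTHER (appended once at the end).
        if name and name != "OTHER" and name not in merged:
            merged.append(name)
    merged.append("OTHER")
    return merged


def build_system_prompt(intents):
    """Build an illustrative classification prompt from the label list."""
    labels = ", ".join(intents)
    return (
        "Classify the user's message into exactly one of these intents: "
        f"{labels}. Reply with the intent label only, in uppercase."
    )


intents = build_intents(["complaint", "appointment", "pricing"])
prompt = build_system_prompt(intents)
```

Keeping OTHER as the final label preserves a catch-all branch, so messages that match none of the custom categories still receive a response.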
Additional info
This workflow is a simplified extract from a production multi-tenant WhatsApp AI system handling real customer conversations.
Built with: n8n · OpenAI Whisper · GPT-4o-mini · Evolution API · Docker · Oracle Cloud
Tags: whatsapp, voice, audio, transcription, whisper, intent, classification, chatbot, ai-agent, automation, openai, gpt4o-mini, customer-support, nlp