This workflow is designed for developers, content creators, and businesses looking to automate high-quality voice synthesis using AI voice cloning technology.
It automates the process of generating natural-sounding speech from text using a sample voice file, eliminating the need for manual voice recording and providing consistent voice output for applications like audiobooks, virtual assistants, or content localization.
The workflow receives text and voice cloning parameters via webhook, reads a sample voice file from your storage, sends the data to Zyphra's Zonos API for voice synthesis, and saves the generated audio file to your specified output location.
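The steps above can be sketched outside n8n roughly as follows. This is a minimal sketch under stated assumptions, not the workflow's actual node configuration: the function names are hypothetical, and the `speaker_audio` field (the base64-encoded voice sample) is an assumption about how the Zonos API expects the reference voice, so check Zyphra's API reference before relying on it. The HTTP call itself is deliberately left out.

```python
import base64
from pathlib import Path


def load_sample(body: dict) -> bytes:
    """Read the sample voice file named in the webhook body."""
    return Path(body["sample_voice_path"]).read_bytes()


def build_synthesis_payload(body: dict, sample_bytes: bytes) -> dict:
    """Combine the webhook body with the voice sample into one API payload.

    Assumption: the API takes the reference voice as base64 text in a
    field named "speaker_audio"; the text above does not confirm this name.
    """
    # Path fields only make sense on the local machine, so they are not forwarded.
    local_only = {"sample_voice_path", "output_path"}
    payload = {key: value for key, value in body.items() if key not in local_only}
    payload["speaker_audio"] = base64.b64encode(sample_bytes).decode("ascii")
    return payload


def save_audio(body: dict, audio: bytes, filename: str = "generated.wav") -> Path:
    """Write the synthesized audio into the requested output directory."""
    target = Path(body["output_path"]) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(audio)
    return target
```

Keeping the three steps as separate functions mirrors the workflow's separate nodes (read file, call API, write file), so each one can be checked on its own.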
You'll need:
The API supports multiple output formats through the mime_type parameter:
audio/webm
audio/ogg
audio/wav
audio/mp3 or audio/mpeg
audio/mp4 or audio/aac
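When saving the generated audio, the file extension should match the requested mime_type. The helper below is a small sketch of one way to do that; the mapping is hand-written, the extensions are conventional choices rather than anything the API returns, and `extension_for` is a hypothetical name.

```python
# Hand-written mapping from the supported mime_type values to file extensions.
# The extensions are conventional choices, not values returned by the API.
EXTENSION_BY_MIME = {
    "audio/webm": ".webm",
    "audio/ogg": ".ogg",
    "audio/wav": ".wav",
    "audio/mp3": ".mp3",
    "audio/mpeg": ".mp3",
    "audio/mp4": ".mp4",
    "audio/aac": ".aac",
}


def extension_for(mime_type: str) -> str:
    """Return the file extension for a supported mime_type, or raise ValueError."""
    key = mime_type.strip().lower()
    if key not in EXTENSION_BY_MIME:
        raise ValueError(f"unsupported mime_type: {mime_type!r}")
    return EXTENSION_BY_MIME[key]
```

Rejecting unknown values early avoids writing a file whose extension does not match its contents.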
Endpoint: POST http://localhost:5678/webhook-test/voice-clone
Headers: Content-Type: application/json
Request Body:
{
  "text": "Hello there! This voice sounds just like the sample!",
  "speaking_rate": 18,
  "sample_voice_path": "/data/output/sampleVoice.wav",
  "output_path": "/data/output/",
  "language_iso_code": "en-us",
  "mime_type": "audio/wav",
  "model": "zonos-v0.1-transformer",
  "emotion": {
    "happiness": 0.8,
    "neutral": 0.3,
    "sadness": 0.05,
    "disgust": 0.05,
    "fear": 0.05,
    "surprise": 0.05,
    "anger": 0.05,
    "other": 0.5
  }
}
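The request above can also be sent from a script. The following is a minimal sketch using only the Python standard library; `build_webhook_request` and `send` are hypothetical helper names, and the 120-second timeout is an arbitrary assumption, since synthesis time is not stated here.

```python
import json
import urllib.request

# Test webhook URL from the endpoint description above.
WEBHOOK_URL = "http://localhost:5678/webhook-test/voice-clone"


def build_webhook_request(body: dict, url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build a JSON POST request for the voice-clone webhook without sending it."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `send(build_webhook_request(body))` with the JSON body shown above as a Python dict triggers the workflow, provided the n8n instance is running and listening on the test URL.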
speaking_rate: 15 - Speech speed
language_iso_code: "en-us" - Language code
mime_type: "audio/wav" - Output audio format
model: "zonos-v0.1-transformer" - AI model to use
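If the four values listed above are treated as fallbacks for omitted fields, the merge could look like the sketch below. That the workflow applies them as defaults is an assumption, not something stated here, and `with_defaults` is a hypothetical helper.

```python
# Assumption: the documented values act as fallbacks when a field is omitted.
DEFAULTS = {
    "speaking_rate": 15,
    "language_iso_code": "en-us",
    "mime_type": "audio/wav",
    "model": "zonos-v0.1-transformer",
}


def with_defaults(body: dict) -> dict:
    """Return a copy of the request body with missing optional fields filled in."""
    merged = {**DEFAULTS, **body}  # values supplied by the caller win
    text = merged.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' must be a non-empty string")
    return merged
```

Values supplied by the caller always take precedence, so a request with "speaking_rate": 18 keeps 18.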