This workflow creates a voice AI assistant, accessible via Telegram, that uses ElevenLabs'* voice synthesis technology. Users can either clone their own voice or transform their voice using pre-existing voice models, all through simple voice messages sent to a Telegram bot.
*Only for the ElevenLabs Starter, Creator, and Pro plans.
This workflow allows users to:
- Clone their voice by sending a voice message to a Telegram bot (creates a new voice profile on ElevenLabs)
- Change their voice to a cloned voice and save the output to Google Drive
For Best Results
For optimal voice cloning via Telegram voice messages:
1. Recording Quality & Environment
- Record in a quiet room with minimal echo and background noise
- Use a consistent microphone position (10-15cm from mouth)
- Ensure clear audio without distortion or clipping
2. Content Selection & Variety
- Send 1 voice message totaling 5-10 minutes of speech
- Include diverse vocal sounds, tones, and natural speaking cadence
- Use complete sentences rather than isolated words
3. Audio Consistency
- Maintain consistent volume, tone, and distance from microphone
- Avoid interruptions, laughter, coughs, or background voices
- Speak naturally without artificial effects or filters
4. Technical Preparation
- Ensure Telegram isn't overly compressing audio (use HQ recording)
- Record all messages in the same session with same conditions
- Include both neutral speech and varied emotional expressions
How it works
- Trigger
  The workflow starts with a Telegram trigger that listens for incoming messages (text, voice notes, or photos).
- Authorization check
  A Code node checks whether the sender's Telegram user ID matches your predefined ID. If not, the process stops.
- Message routing
  A Switch node routes the message based on its type:
  - Text → Not processed further in this flow.
  - Voice message → Sent to the "Get audio" node to retrieve the audio file from Telegram.
  - Photo → Not processed further in this flow.
- Two main options
  From the "Get audio" node, the workflow splits into two possible paths:
  - Option 1: Clone voice
    The audio file is sent to ElevenLabs via an HTTP request to create a new cloned voice. The voice ID is returned and can be saved for later use.
  - Option 2: Voice changer
    The audio is sent to ElevenLabs for speech-to-speech conversion using a pre-existing cloned voice (voice ID must be set in the node parameters). The resulting audio is saved to Google Drive.
- Output
  - Cloned voice ID (for Option 1).
  - Converted audio file uploaded to Google Drive (for Option 2).
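The authorization check and message routing described above can be sketched in plain Python. This is only an illustration of the logic, not the code inside the workflow's nodes: the function names and `AUTHORIZED_USER_ID` are hypothetical, and the sketch assumes the trigger delivers a Telegram update whose `message` object carries `from.id` plus one of the `text`, `voice`, or `photo` fields.

```python
from typing import Optional

# Hypothetical placeholder: replace with your own numeric Telegram user ID.
AUTHORIZED_USER_ID = 123456789


def is_authorized(update: dict, allowed_id: int = AUTHORIZED_USER_ID) -> bool:
    """Return True only when the sender's Telegram user ID matches allowed_id."""
    sender = update.get("message", {}).get("from", {})
    return sender.get("id") == allowed_id


def route_message(update: dict) -> Optional[str]:
    """Classify the message as 'text', 'voice', or 'photo' (None if none apply)."""
    message = update.get("message", {})
    for kind in ("text", "voice", "photo"):
        if kind in message:
            return kind
    return None


def handle_update(update: dict) -> str:
    """Mirror the flow: stop unauthorized senders, forward only voice messages."""
    if not is_authorized(update):
        return "stop"
    if route_message(update) == "voice":
        return "get_audio"
    # Text and photo messages are not processed further in this flow.
    return "ignored"
```

Only a voice message from the authorized sender reaches the "Get audio" step; everything else either stops at the authorization check or is ignored after routing.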
Set up steps
- Telegram bot setup
  - Create a bot via BotFather and obtain the API token.
  - Set up the Telegram Trigger node with your bot credentials.
- Authorization configuration
  - In the "Sanitaze" Code node, replace XXX with your Telegram user ID to restrict access.
- ElevenLabs API setup
  - Get an API key from ElevenLabs.
  - Configure the HTTP Request nodes ("Create Cloned Voice" and "Generate cloned audio") with:
    - API key in the Xi-Api-Key header.
    - Appropriate endpoint URLs (including voice ID for speech-to-speech).
- Google Drive setup (for Option 2)
  - Set up Google Drive OAuth2 credentials in n8n.
  - Specify the target folder ID in the "Upload file" node.
- Voice ID configuration
  - For voice cloning: The voice name can be customized in the "Create Cloned Voice" node.
  - For voice changing: Replace XXX in the "Generate cloned audio" node URL with your ElevenLabs voice ID.
- Test the workflow
  - Activate the workflow.
  - Send a voice note from your authorized Telegram account to trigger cloning or voice conversion.
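As a rough guide to what the two HTTP Request nodes need, the sketch below builds the request settings as plain dictionaries without sending anything. The base URL and the `/voices/add` and `/speech-to-speech/{voice_id}` paths are assumptions based on the public ElevenLabs API and should be checked against the current API reference; the function names and dictionary layout are hypothetical and are not n8n's own node format.

```python
# Assumed ElevenLabs API base URL; verify against the official API reference.
API_BASE = "https://api.elevenlabs.io/v1"


def build_headers(api_key: str) -> dict:
    """Both requests authenticate with the API key in the Xi-Api-Key header."""
    return {"Xi-Api-Key": api_key}


def clone_voice_request(api_key: str, voice_name: str) -> dict:
    """Settings for the 'Create Cloned Voice' request (Option 1)."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/voices/add",  # assumed endpoint path
        "headers": build_headers(api_key),
        "form": {"name": voice_name},  # the audio file is attached separately
    }


def voice_changer_request(api_key: str, voice_id: str) -> dict:
    """Settings for the 'Generate cloned audio' request (Option 2)."""
    if not voice_id or voice_id == "XXX":
        # The XXX placeholder must be replaced with a real ElevenLabs voice ID.
        raise ValueError("Set a real ElevenLabs voice ID before running Option 2")
    return {
        "method": "POST",
        "url": f"{API_BASE}/speech-to-speech/{voice_id}",  # assumed endpoint path
        "headers": build_headers(api_key),
    }
```

Rejecting the `XXX` placeholder up front is a design choice in this sketch: it surfaces a forgotten voice ID as a clear error instead of a failed API call.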