Nano Banana/Gemini 2.5 Telegram Bot with Multi-modal Functionality

Created by: Denis || denisholc7

Last update 23 days ago


How it works

  • Multi-modal AI image generator powered by Google's Nano Banana (Gemini 2.5 Flash Image) model
  • Accepts text, images, voice messages, and PDFs via Telegram
  • Uses OpenAI GPT models for conversation and image analysis, then Nano Banana for image generation
  • Features conversation memory for iterative image modifications ("make it darker", "change to blue")
  • Processes different input types: analyzes uploaded images, transcribes voice messages, extracts PDF text
  • Converts every input into a text prompt tuned for Nano Banana
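
The input handling described above can be sketched as a small router. This is an illustrative Python sketch only, not the workflow's actual n8n nodes: the function names `classify_message` and `largest_photo_file_id` are hypothetical. The only things taken from Telegram are the field names (`text`, `photo`, `voice`, `document`, `mime_type`, `file_id`, `width`, `height`), which follow the Bot API `Message` and `PhotoSize` objects.

```python
from typing import Any, Dict


def classify_message(message: Dict[str, Any]) -> str:
    """Return 'voice', 'image', 'pdf', 'text', or 'unsupported'.

    `message` is assumed to be the `message` object of a Telegram update,
    already parsed from JSON into a dict.
    """
    if "voice" in message:
        return "voice"  # would be transcribed before prompt building
    if "photo" in message:
        return "image"  # would be sent to image analysis
    document = message.get("document")
    if document is not None:
        # Only PDFs are handled; other document types are rejected here.
        if document.get("mime_type") == "application/pdf":
            return "pdf"
        return "unsupported"
    if "text" in message:
        return "text"
    return "unsupported"


def largest_photo_file_id(message: Dict[str, Any]) -> str:
    """Pick the highest-resolution variant of an uploaded photo.

    Telegram delivers a photo as a list of sizes of the same image,
    so the one with the largest pixel area is chosen.
    """
    best = max(message["photo"], key=lambda p: p["width"] * p["height"])
    return best["file_id"]
```

Each branch would then feed its result (analysis text, transcript, extracted PDF text, or the raw message) into the same prompt-building step, which is what lets follow-ups such as "make it darker" work regardless of how the first request arrived.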

Set up steps

  • Create a Telegram bot via @BotFather and copy its API token
  • Get a Google Gemini API key from Google AI Studio for Nano Banana image generation (~$0.04/image)
  • Configure an OpenAI API key for GPT models (conversation, image analysis, voice transcription)
  • Import the workflow and configure all three API credentials in n8n
  • Update the bot token in the HTTP Request nodes used for file downloads
  • Test with text prompts, image uploads, voice messages, and PDF documents
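
The bot token has to be updated in the HTTP Request nodes because Telegram file downloads embed the token in the URL itself. The Python sketch below shows the two URLs involved, as a rough illustration rather than the workflow's exact node configuration. The helper names `get_file_url` and `download_url` are hypothetical; the URL patterns follow the Bot API `getFile` method and its file download endpoint.

```python
from urllib.parse import quote, urlencode

TELEGRAM_API = "https://api.telegram.org"


def get_file_url(bot_token: str, file_id: str) -> str:
    """URL of the getFile call that resolves a file_id to a file_path."""
    query = urlencode({"file_id": file_id})
    return f"{TELEGRAM_API}/bot{bot_token}/getFile?{query}"


def download_url(bot_token: str, file_path: str) -> str:
    """URL that serves the file bytes for a file_path returned by getFile."""
    # quote() keeps "/" intact, so nested paths like "photos/x.jpg" survive.
    return f"{TELEGRAM_API}/file/bot{bot_token}/{quote(file_path)}"


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute a real token and IDs.
    token = "123456:EXAMPLE-TOKEN"
    print(get_file_url(token, "FILE_ID_FROM_MESSAGE"))
    print(download_url(token, "photos/file_1.jpg"))
```

Since the token appears in both URLs, any node that still carries a placeholder or an old token will fail on image, voice, and PDF inputs even when plain text prompts work.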