The "WhatsApp Productivity Assistant with Memory and AI Imaging" is a comprehensive n8n workflow that transforms your WhatsApp into a powerful, multi-talented AI assistant. It's designed to handle a wide range of tasks by understanding user messages, analyzing images, and connecting to various external tools and services. The assistant can hold natural conversations, remember past interactions using a MongoDB vector store (RAG), and decide which tool is best suited for a user's request. Whether you need to check your schedule, research a topic, get the latest news, create an image, or even analyze a picture you send, this workflow orchestrates it all seamlessly through a single WhatsApp chat interface.
The workflow is structured into several interconnected components:
- The Route Message by Type (Image/Text) node routes each incoming message based on its content type. A Typing.... node sends a typing indicator to the user for a better experience. If an image is received, it is downloaded, processed via an HTTP Request, and analyzed by the Analyze image node. The Code1 node then standardizes both text and image analysis output into a single, unified input for the main AI agent.
- The AI Agent1 node receives the user's input, maintains short-term conversational memory using Simple Memory, and uses a language model (gpt-oss-120b2 or gpt-oss-120b1) to decide which tool or sub-agent to use. It orchestrates all the other agents and tools.
- A productivity sub-agent connects to Google Calendar, Google Tasks, and Gmail, allowing you to schedule events, manage to-dos, and read emails. It uses a language model (gpt-4.1-mini or gemini-2.5-flash) to understand and execute commands within these tools.
- A research sub-agent uses several search tools (Brave Web Search, Brave News Search, Wikipedia, Tavily, and a custom perprlexcia search) to find accurate, up-to-date information on the web. It uses a language model (gpt-oss-120b or gpt-4.1-nanoChat Model1) for reasoning.
- A memory sub-workflow (triggered by Webhook2) processes conversation history, extracts key information using Extract Memory Info, and stores it in a MongoDB Atlas Vector Store for long-term memory. This allows the AI agent to remember past preferences and facts.
- An image generation sub-workflow (Webhook3) is triggered when a user asks to create an image. It uses a dedicated AI Agent with MongoDB Atlas Vector Store1 for contextual image prompt generation, Clean Prompt Text1 to refine the prompt, and an HTTP Request to an external image generation API (e.g., Together.xyz). It then converts the generated image and sends it back to the user via WhatsApp.

Before importing and running this template, you will need:
- Access to the language models the workflow uses (e.g., gpt-oss-120b, gpt-5-nano, gpt-4o-mini).
- Access to an embedding model (e.g., codestral-embed-2505).
- An endpoint for the perprlexcia tool (the current URL http://self hoseted perplexcia/api/search implies a self-hosted or custom endpoint).
- Any custom endpoints (e.g., perprlexcia) must be publicly accessible.

You will need to set up the following credentials within your n8n instance:
- A credential for the WhatsApp Trigger node.
- A credential for the Send message2, Send message3, Download media, and Typing.... nodes.
- A credential for the Analyze image, Google Gemini Chat Model, gemini-2.5-flash, and Google Gemini Chat Model5 nodes.
- A credential for the Get Weather Forecast node.
- A credential for the gpt-oss-120b node.
- A credential for the MongoDB Atlas Vector Store nodes.
- A credential for the gpt-5-nano and gpt-4.1-nanoChat Model1 nodes.
- A credential for the Get many messages and Get a message nodes (ensure correct Gmail OAuth2 setup for each).
- A credential for HTTP Request5 (used in media download).
- A credential for the Brave Web Search and Brave News Search nodes.
- A credential for the gpt-4.1-mini, gpt-oss-120b, gpt-oss-120b2, and gpt-4.1-nano nodes.
- Credentials for Tavily web search (create a new one named "Tavily API Key" with Authorization: Bearer YOUR_TAVILY_API_KEY) and HTTP Request (for Together.xyz, e.g., "Together.xyz API Key").
- A credential for the codestral-embed-2505, codestral-embed-, and codestral-embed-2506 nodes.
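
The "Tavily API Key" credential passes the key as an Authorization: Bearer header. The sketch below shows what that header looks like on an outgoing request, using only Python's standard library. The endpoint URL, the request body shape, and the function name build_tavily_request are illustrative assumptions and are not taken from the workflow, so check them against the Tavily documentation before relying on them.

```python
import json
import urllib.request

# Assumption: illustrative endpoint; confirm against the Tavily API docs.
TAVILY_SEARCH_URL = "https://api.tavily.com/search"


def build_tavily_request(api_key: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer header.

    Hypothetical helper for illustration; n8n builds the equivalent
    request from the stored credential.
    """
    # Reject an empty key or the untouched placeholder from the template.
    if not api_key or api_key == "YOUR_TAVILY_API_KEY":
        raise ValueError("Replace the placeholder with a real Tavily API key")

    headers = {
        # Same header name and value format as the n8n credential.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Assumption: a minimal JSON body with a single "query" field.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        TAVILY_SEARCH_URL, data=body, headers=headers, method="POST"
    )
```

Calling build_tavily_request("your-key", "n8n workflows") returns a request object that can be inspected before it is sent, which is a quick way to confirm the header format matches what the credential is expected to produce.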