This is a multi-modal AI agent workflow using n8n and OpenAI for WhatsApp automation:
1. Triggers on incoming WhatsApp audio, image, or text messages
2. Transcribes audio and analyzes content for intent
3. Processes and interprets image-based prompts
4. Routes inputs to an AI Agent (OpenAI) with memory support
5. Responds in text or generated audio, depending on user preference

Tech Stack: n8n, OpenAI, Whisper, Code node, WhatsApp API
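
The routing described in the steps above could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual Code node: it assumes the incoming message follows the WhatsApp Cloud API webhook shape (a `type` field plus a matching `text`, `audio`, or `image` object), and the branch names, the `route_message` and `reply_format` helpers, and the `prefers_audio` flag are all hypothetical names chosen for this example.

```python
from typing import Any, Optional


def route_message(message: dict[str, Any]) -> dict[str, Any]:
    """Pick a processing branch for one incoming WhatsApp message.

    Assumes a WhatsApp Cloud API style message dict; branch names are
    illustrative and do not come from the original workflow.
    """
    msg_type = message.get("type")

    if msg_type == "text":
        # Plain text goes straight to the AI agent.
        return {"branch": "agent", "text": message["text"]["body"]}

    if msg_type == "audio":
        # Audio must be downloaded and transcribed (e.g. with Whisper) first.
        return {"branch": "transcribe", "media_id": message["audio"]["id"]}

    if msg_type == "image":
        # An image may carry an optional caption that acts as the prompt.
        image = message["image"]
        return {
            "branch": "analyze_image",
            "media_id": image["id"],
            "prompt": image.get("caption", ""),
        }

    # Anything else (stickers, locations, ...) is not handled in this sketch.
    return {"branch": "unsupported", "type": msg_type}


def reply_format(incoming_type: str, prefers_audio: Optional[bool] = None) -> str:
    """Choose the reply format.

    Assumption: an explicit user preference wins; without one, the reply
    mirrors the incoming message (audio in, audio out).
    """
    if prefers_audio is not None:
        return "audio" if prefers_audio else "text"
    return "audio" if incoming_type == "audio" else "text"


if __name__ == "__main__":
    incoming = {"type": "image", "image": {"id": "media-123", "caption": "What is this?"}}
    print(route_message(incoming))
    print(reply_format(incoming["type"]))
```

In n8n itself, the same decision is often made with a Switch node on the message type; the function form here is only meant to make the branching explicit.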