Built an end-to-end image generation agent using Google Gemini's vision model, n8n, and Telegram. A user sends a prompt in chat; the agent parses the input, calls Gemini to generate the image, formats the result, and either delivers it via Telegram or saves it to disk. The workflow includes JSON cleanup, custom filename generation, and flexible model swapping. The result is a lightweight, fully prompt-driven image generator that's fast, low-cost, and scalable.
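
The JSON cleanup and filename steps could look roughly like the Python sketch below. The workflow itself runs in n8n, so this is not the project's actual code: the function names `clean_model_json` and `make_filename`, the fence-stripping approach, and the slug-plus-timestamp naming scheme are all assumptions made for illustration.

```python
import json
import re
from datetime import datetime, timezone


def clean_model_json(raw: str) -> dict:
    """Strip Markdown code fences and stray prose, then parse the JSON object.

    Hypothetical helper: assumes the model returns one JSON object,
    possibly wrapped in a ```json ... ``` fence or surrounded by text.
    """
    text = raw.strip()
    # Remove a leading ``` or ```json fence and a trailing ``` fence, if present.
    text = re.sub(r"^```[a-zA-Z]*\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    # Keep only the outermost {...} span in case extra prose remains.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])


def make_filename(prompt: str, when: datetime, ext: str = "png", max_words: int = 6) -> str:
    """Build a filesystem-safe name from the prompt plus a UTC timestamp.

    Hypothetical naming scheme: first few prompt words, lowercased and
    hyphen-joined, followed by a compact timestamp.
    """
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "-".join(words) or "image"  # fall back if the prompt has no usable words
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{slug}_{stamp}.{ext}"
```

In this sketch, the parsed dictionary would feed the image generation call, and the generated filename would be used when saving to disk or attaching the file in Telegram. Passing the timestamp in as an argument keeps the naming function deterministic and easy to test.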