Since Friday (May 15, 2026), I have been building my own local AI chat app with the help of generative AI 🚀 The main reason is to run Gemma 4 on my laptop in an environment where I can predict what will happen. In addition, this project has been teaching me a lot about how vibe-coding tools work under the hood!
The chat app uses the following setup:
• Next.js
• Two API Routes, separating LLM inference from the back-end part of the app
• Transformers.js (running on the server side)
• The ONNX q4 version of gemma-4-e2b-it
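
To illustrate why splitting the app into two routes can be useful, here is a minimal TypeScript sketch of the idea. It is not the app's actual code: every name in it (ChatMessage, Generate, appendUserMessage, trimHistory, handleChat, echoModel) is hypothetical. The inference side is reduced to a single injected function, so the back-end logic runs without loading a model. In the real app, that function would presumably be a call to the inference API route, where Transformers.js runs the model.

```typescript
// Sketch only: all names below are hypothetical and not taken from the app.

type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// The inference side, reduced to one function type. In a real setup this
// would wrap a request to the inference API route.
type Generate = (messages: ChatMessage[]) => Promise<string>;

// Back-end concern: validate the new user input and append it to the history.
function appendUserMessage(history: ChatMessage[], text: string): ChatMessage[] {
  const content = text.trim();
  if (content.length === 0) {
    throw new Error("Message must not be empty");
  }
  return [...history, { role: "user", content }];
}

// Back-end concern: keep every system message plus the most recent
// non-system messages, so the prompt stays small for a laptop-sized model.
function trimHistory(messages: ChatMessage[], maxMessages: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  const recent = maxMessages > 0 ? rest.slice(-maxMessages) : [];
  return [...system, ...recent];
}

// Back-end entry point: prepare the prompt, delegate generation, and return
// the updated history. It never touches the model directly.
async function handleChat(
  history: ChatMessage[],
  text: string,
  generate: Generate,
  maxMessages: number = 8,
): Promise<ChatMessage[]> {
  const prompt = trimHistory(appendUserMessage(history, text), maxMessages);
  const reply = await generate(prompt);
  return [...prompt, { role: "assistant", content: reply }];
}

// Stand-in for the inference route so the sketch runs without a model.
const echoModel: Generate = async (messages) => {
  const last = messages[messages.length - 1];
  return `You said: ${last ? last.content : ""}`;
};

// Usage: one chat turn against the stand-in model.
handleChat([], "Hello", echoModel).then((history) => console.log(history));
```

Because the back-end code only depends on the Generate type, the stand-in can later be swapped for a real call to the inference route without changing the validation or history handling.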
Note that this chat app is still in development and, despite its name, does not have any built-in tools yet. However, the first step is very important in app building, isn't it? I hope that this post inspires others to build their own local AI chat app!