Audio Caption Generation From Image Using Deep Learning

Abdul Hannan Sunsara

• The Android application captures an image, passes it to a machine learning model that generates a caption, and converts that caption to audio so the user hears a description of the whole image.
• The application is designed around the needs of blind users, with the aim of giving them a better understanding of their surroundings. Built-in shortcuts let users reach the app's features and functions quickly and without relying on visual navigation.
Technology Used: Python, Flutter, Flask, CNN, RNN.
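
The caption step described above can be illustrated with a minimal sketch of greedy decoding, a common way to turn an RNN decoder's word scores into a sentence. The project does not state which decoding strategy it uses, so this is an assumption. The names greedy_caption, toy_next_word, and the feature list are hypothetical stand-ins: a real system would take features from the CNN encoder and word scores from the trained RNN.

```python
from typing import Callable, Dict, List, Sequence

START, END = "<start>", "<end>"

# Hypothetical signature: (image_features, words_so_far) -> {word: score}
NextWordFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(features: Sequence[float],
                   next_word: NextWordFn,
                   max_len: int = 20) -> str:
    """Build a caption by repeatedly taking the highest-scoring next word."""
    words = [START]
    for _ in range(max_len):
        scores = next_word(features, words)
        best = max(scores, key=scores.get)
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token


def toy_next_word(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained RNN decoder: follows a fixed word chain.

    It ignores the image features entirely; a real decoder would not.
    """
    chain = {START: "a", "a": "dog", "dog": "on", "on": "grass", "grass": END}
    target = chain.get(words[-1], END)
    return {w: (1.0 if w == target else 0.0) for w in set(chain.values())}


if __name__ == "__main__":
    fake_features = [0.1, 0.7, 0.2]  # placeholder for CNN encoder output
    print(greedy_caption(fake_features, toy_next_word))
```

In the application, the returned string would then be handed to a text-to-speech step so the caption is read aloud. The max_len cap keeps the loop from running indefinitely if the decoder never emits the end token.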

Posted Mar 5, 2024

An Android app that captures images and uses ML to create audio descriptions for blind users.

