VisioFun

VisioFun is an interactive application designed to provide entertainment and assistance to visually impaired individuals. This project leverages various technologies to offer a range of features aimed at enhancing user experience and engagement.

Features

1. Describe Image

Upload an image and receive a descriptive text generated using a combination of machine learning and natural language processing.
The application uses a pre-trained MobileNet model for image recognition and the Cohere API for generating textual descriptions.
Additionally, the generated description can be converted into speech for auditory feedback.

2. Imagine the World

Enter the name of an object or place you want to visualize, and the application generates a descriptive text.
The user provides input, prompting the application to produce descriptive text using the Cohere API.
The generated text is converted into speech for auditory output.

3. Guess the Animal Sound

Listen to various animal sounds and guess the corresponding animals.
The application provides a set of animal sounds for users to identify, enhancing auditory recognition skills.

4. Riddles

Listen to audio riddles and type your answers.
Users engage in a riddle-solving activity, enhancing cognitive abilities and critical thinking.

5. Quiz

Participate in a quiz featuring multiple-choice questions.
Users answer questions across various topics, providing an interactive learning experience.

Usage

To use VisioFun:

Clone the repository to your local machine.
Install the required dependencies by running pip install -r requirements.txt.
Run the Streamlit application by executing streamlit run main.py.
Alternatively, you can access the application through the following link: VisioFun Web App.

Tech Stack and Technologies Used

Streamlit: Framework for building interactive web applications with Python.
TensorFlow: Open-source machine learning framework for image recognition tasks.
Cohere API: Natural language processing API for generating descriptive text based on input prompts.
PIL (Python Imaging Library): Library for image processing tasks.
gTTS (Google Text-to-Speech): Library for converting text descriptions into speech.
NumPy: Fundamental package for scientific computing in Python.
IPython.display: Module for playing audio files within the application.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

VisioFun

Features

1. Describe Image

2. Imagine the World

3. Guess the Animal Sound

4. Riddles

5. Quiz

Usage

Tech Stack and Technologies Used

License

Files

README.md

Latest commit

History

README.md

File metadata and controls

VisioFun

Features

1. Describe Image

2. Imagine the World

3. Guess the Animal Sound

4. Riddles

5. Quiz

Usage

Tech Stack and Technologies Used

License