VisionMate is an assistive technology project designed to empower visually impaired individuals by providing smart, real-time guidance for safer and more independent navigation.
VisionMate uses computer vision, text recognition, and speech technology to help users:
- Recognize objects and surroundings
- Read printed or handwritten text out loud
- Control the system easily with voice commands
- Get real-time alerts about obstacles
- Add new features easily in the future
This is the early development and ideation phase. The repository will include prototypes, research notes, and starter code as the project progresses.
- ✅ Real-time object detection
- ✅ Text-to-speech to read out text
- ✅ Speech-based user controls
- ✅ Environment awareness for obstacle detection
- ✅ Modular architecture for future feature integration
| Part | Technology / Tools |
|---|---|
| Programming Language | Python |
| Computer Vision | OpenCV, Google Cloud Vision API (planned) |
| Backend Framework | Flask / Django (to be decided) |
| Frontend / App | React.js / Flutter (planned) |
| Database | MySQL |
| Accessibility APIs | Text-to-Speech / Speech-to-Text APIs |
Here's how VisionMate works, step by step:

1. **Captures input:** Uses a camera to take live pictures or video of the surroundings.
2. **Detects objects:** Uses computer vision to find and identify things like doors, obstacles, and signs.
3. **Reads text:** Uses OCR (Optical Character Recognition) to detect printed or handwritten text.
4. **Speech processing:**
   - Converts detected text to speech so the user can hear it.
   - Listens for the user's voice commands to control the system.
5. **Gives feedback:** Provides real-time audio alerts about obstacles and text information to help the user move safely.
6. **Modular design:** Built so new features and better AI models can be added easily later.
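The obstacle-feedback step could look like the minimal sketch below. The `Detection` record, the `obstacle_alerts` helper, and the 2-metre threshold are all illustrative assumptions, not code from this repository; in practice the detections would come from the vision model and the strings would be passed to a text-to-speech engine.

```python
from dataclasses import dataclass

# Hypothetical detection record; a real one would come from the vision model.
@dataclass
class Detection:
    label: str        # e.g. "door", "chair", "sign"
    distance_m: float # estimated distance in metres

ALERT_DISTANCE_M = 2.0  # assumed safety threshold, tunable per user

def obstacle_alerts(detections):
    """Return spoken-alert strings for anything closer than the threshold,
    nearest first, so the most urgent obstacle is announced first."""
    nearby = [d for d in detections if d.distance_m < ALERT_DISTANCE_M]
    nearby.sort(key=lambda d: d.distance_m)
    return [f"{d.label} {d.distance_m:.1f} metres ahead" for d in nearby]
```

Keeping the alert logic separate from the camera and speech code like this is one way to preserve the modular design goal: the same function works whatever model produces the detections.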
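For the voice-command step, a sketch of how a speech-to-text transcript might be mapped to an action is shown below. The command phrases and action names are made up for illustration, and the case-insensitive substring match is an assumption; a real system might use fuzzy or intent-based matching instead.

```python
# Hypothetical registry mapping spoken phrases to internal action names.
COMMANDS = {
    "read text": "ocr",
    "what is ahead": "describe",
    "stop": "halt",
}

def parse_command(transcript):
    """Match a speech-to-text transcript against known command phrases.
    Returns the action name, or None if nothing matched."""
    spoken = transcript.lower().strip()
    for phrase, action in COMMANDS.items():
        if phrase in spoken:
            return action
    return None
```

Because the registry is just a dictionary, a new feature can register its own phrase without changing the parser, which fits the project's modular-architecture goal.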
🚧 Work in Progress: Only starter files and prototypes are included at this stage.
Clone the repository and install dependencies:

```bash
git clone https://github.com/kaushav07/VisionMate.git
cd VisionMate
pip install -r requirements.txt
```

We'd love your help! Please see CONTRIBUTING.md to learn how you can contribute.
This project is licensed under the MIT License.