VisionMate is an assistive technology project designed to empower visually impaired individuals by providing smart, real-time guidance for safer and more independent navigation.
VisionMate uses computer vision, text recognition, and speech technology to help users:
- Recognize objects and surroundings
- Read printed or handwritten text out loud
- Control the system easily with voice commands
- Get real-time alerts about obstacles
- Add new features easily in the future
This is the early development and ideation phase. The repository will include prototypes, research notes, and starter code as the project progresses.
- ✅ Real-time object detection
- ✅ Text-to-speech to read out text
- ✅ Speech-based user controls
- ✅ Environment awareness for obstacle detection
- ✅ Modular architecture for future feature integration
| Part | Technology / Tools |
|---|---|
| Programming Language | Python |
| Computer Vision | OpenCV, Google Cloud Vision API (planned) |
| Backend Framework | Flask / Django (to be decided) |
| Frontend / App | React.js / Flutter (planned) |
| Database | MySQL |
| Accessibility APIs | Text-to-Speech / Speech-to-Text APIs |
Here's how VisionMate works, step by step:

1. **Captures input:** Uses a camera to take live pictures or video of the surroundings.
2. **Detects objects:** Uses computer vision to find and identify things like doors, obstacles, and signs.
3. **Reads text:** Uses OCR (Optical Character Recognition) to detect printed or handwritten text.
4. **Speech processing:**
   - Converts detected text to speech so the user can hear it.
   - Listens for the user's voice commands to control the system.
5. **Gives feedback:** Provides real-time audio alerts about obstacles and text information to help the user move safely.
6. **Modular design:** Built so new features and better AI models can be added easily later.
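The obstacle-feedback step could look like the minimal sketch below. The `Detection` record, the `obstacle_alerts` helper, and the 2-metre threshold are all illustrative assumptions, not code from this repository; in practice the detections would come from the vision model and the strings would be passed to a text-to-speech engine.

```python
from dataclasses import dataclass

# Hypothetical detection record; a real one would come from the vision model.
@dataclass
class Detection:
    label: str        # e.g. "door", "chair", "sign"
    distance_m: float # estimated distance in metres

ALERT_DISTANCE_M = 2.0  # assumed safety threshold, tunable per user

def obstacle_alerts(detections):
    """Return spoken-alert strings for anything closer than the threshold,
    nearest first, so the most urgent obstacle is announced first."""
    nearby = [d for d in detections if d.distance_m < ALERT_DISTANCE_M]
    nearby.sort(key=lambda d: d.distance_m)
    return [f"{d.label} {d.distance_m:.1f} metres ahead" for d in nearby]
```

Keeping the alert logic separate from the camera and speech code like this is one way to preserve the modular design goal: the same function works whatever model produces the detections.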
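For the voice-command step, a sketch of how a speech-to-text transcript might be mapped to an action is shown below. The command phrases and action names are made up for illustration, and the case-insensitive substring match is an assumption; a real system might use fuzzy or intent-based matching instead.

```python
# Hypothetical registry mapping spoken phrases to internal action names.
COMMANDS = {
    "read text": "ocr",
    "what is ahead": "describe",
    "stop": "halt",
}

def parse_command(transcript):
    """Match a speech-to-text transcript against known command phrases.
    Returns the action name, or None if nothing matched."""
    spoken = transcript.lower().strip()
    for phrase, action in COMMANDS.items():
        if phrase in spoken:
            return action
    return None
```

Because the registry is just a dictionary, a new feature can register its own phrase without changing the parser, which fits the project's modular-architecture goal.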
🚧 Work in Progress: Only starter files and prototypes are included at this stage.
Clone the repository and install dependencies:

```bash
git clone https://github.com/kaushav07/VisionMate.git
cd VisionMate
pip install -r requirements.txt
```

We'd love your help! Please see CONTRIBUTING.md to learn how you can contribute.
This project is licensed under the MIT License.