This is my B.Tech. project and its title is Smart Assistant for Visual Impaired Person.
Visually impaired people face many problems in their daily lives. It would be valuable if they could also interact with their environment with the help of the latest technology. Using technologies such as artificial intelligence, machine learning, and image and text recognition, we can help visually impaired people get information about their surroundings and make their daily lives easier.
The main idea of my project is to implement a web-based application that gives visually impaired people a way to interact with and understand their surroundings. It focuses on tools that can help them, including:
- Object Detection:
The idea is to build an application that detects the objects in front of the webcam or camera of a computer or smartphone and describes them to the user by voice.
- Voice assistant:
The idea is to build an assistant that takes the user's input by voice and performs basic tasks such as searching the web or switching between features in the web app.
- Image to text/speech:
The idea is to build a word recognition system that extracts words from a given input image, displays them as text, and reads them out by voice.
- Speech to text:
The idea is to build a speech recognition system that listens to the user's speech and converts it into text. This lets a visually impaired person write information just by saying it.
- Text to Speech:
The idea is to convert text into speech. The user passes text to the application, and the application reads it aloud on the user's behalf.
These are some of the deliverables I have planned for this project. I plan to integrate these tools into a single web application, using React.js for the frontend and Python with Flask for the backend. I am building a web application because it can run on any system in its default browser without installing anything. However, it may need an internet connection to load the application in the browser.
My aim in this project is to build a smart virtual assistant that helps visually impaired people in their daily routine. The assistant gives the user a different way to perceive their surroundings. It has the following modules:
- Object Detection
It is done using the YOLO algorithm.
- Image to text/speech converter
It is done using easyOCR.
- Speech to text converter
It is done using the Web Speech API.
- Text to Speech
It is done using the Web Speech API.
- Voice assistant
It is done using the Web Speech API.
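As an illustration of how the object detection module could hand its results to text to speech, here is a minimal sketch. It assumes the detector yields `(label, confidence)` pairs; that data shape, the function name `describe_detections`, and the `min_confidence` threshold are hypothetical and not taken from the project code.

```python
from collections import Counter


def describe_detections(detections, min_confidence=0.5):
    """Turn (label, confidence) pairs into one sentence for text to speech.

    The input shape and the threshold are assumptions made for this sketch.
    """
    # Keep only detections the model is reasonably sure about.
    labels = [label for label, confidence in detections if confidence >= min_confidence]
    if not labels:
        return "I could not detect any objects."

    # Count repeated labels; Counter keeps first-seen order.
    # Labels are not pluralised, to keep the sketch simple.
    parts = [f"{count} {label}" for label, count in Counter(labels).items()]

    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"I can see {listing}."
```

The returned sentence could then be passed to whichever text-to-speech mechanism the frontend uses.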
I plan to integrate these modules in a web application because a web application can run in any system's browser, whether on a desktop or a smartphone, without installing any software. The user only needs to open the website, and the application loads in the browser and starts working. However, an internet connection is required for this to work. I am considering using the React.js framework for the frontend and Python for the backend to implement this project and run these models in the system's browser.
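To make the frontend and backend split more concrete, here is a minimal sketch of how a Flask backend could expose the image-to-text module to the React frontend. The route `/api/ocr`, the form field name `image`, and the helper `get_reader` are hypothetical choices for this sketch, not the project's actual code. It assumes Flask and EasyOCR are installed.

```python
from functools import lru_cache

from flask import Flask, jsonify, request

app = Flask(__name__)


@lru_cache(maxsize=1)
def get_reader():
    """Create the EasyOCR reader once and reuse it for later requests."""
    # Imported lazily because loading EasyOCR (and its models) is slow.
    import easyocr

    return easyocr.Reader(["en"])


@app.route("/api/ocr", methods=["POST"])  # hypothetical route name
def ocr():
    upload = request.files.get("image")  # hypothetical form field name
    if upload is None:
        return jsonify(error="no image uploaded"), 400

    # detail=0 makes readtext return only the recognised strings.
    lines = get_reader().readtext(upload.read(), detail=0)
    return jsonify(text=" ".join(lines))
```

In this sketch the frontend would send the image as multipart form data and receive the recognised text as JSON, which it could display and read aloud.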
MERN stack has been used for the development of this website.
Prerequisites: Python, Flask, npm, pip, create-react-app, etc.
- Run `cd ./client` in the terminal to go to the frontend folder.
- Run `npm install` to install the dependencies.
- Run `npm start`.
- Install libraries like numpy, pandas, pillow, easyocr, tensorflow, openCV, keras, matplotlib, etc. by running `pip install <<Library_name>>`.
- Download the YOLO weights and put them in model_data. For reference.
- Run `python ./server.py`.
Hurray, your app is now running on port 3000 in your browser.
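Before starting the server, it can help to check which of the Python libraries listed above are still missing. The following is a small sketch using only the standard library. The mapping from pip package names to import names is an assumption about which packages the project uses (for example, `opencv-python` for OpenCV).

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements.
# Assumed mapping; adjust it to the packages the project actually needs.
REQUIRED = {
    "flask": "flask",
    "numpy": "numpy",
    "pandas": "pandas",
    "pillow": "PIL",
    "easyocr": "easyocr",
    "tensorflow": "tensorflow",
    "opencv-python": "cv2",
    "keras": "keras",
    "matplotlib": "matplotlib",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All listed packages are installed.")
```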
Link : https://www.youtube.com/watch?v=0TPdp-As1Ac
Object Detection
Image to Text
Text to Speech
Speech to Text
Assistant
- It can perform tasks like "go to", "search on Google", "play on YouTube", "weather", etc.
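The routing behind such commands can be sketched as follows. The project handles voice in the browser, so this Python version only illustrates the logic. The exact phrase patterns ("search ... on google", "play ... on youtube"), the action names, and the function `parse_command` are assumptions made for this example.

```python
from urllib.parse import quote_plus


def parse_command(transcript):
    """Map a transcribed phrase to an (action, target) pair.

    Phrase patterns and action names are hypothetical.
    Unrecognised phrases are returned as ("unknown", text).
    """
    text = transcript.strip().lower()

    if text.startswith("go to "):
        # Switch to another feature of the web app.
        return ("navigate", text[len("go to "):])

    if text.startswith("search ") and text.endswith(" on google"):
        query = text[len("search "):-len(" on google")]
        return ("open_url", "https://www.google.com/search?q=" + quote_plus(query))

    if text.startswith("play ") and text.endswith(" on youtube"):
        query = text[len("play "):-len(" on youtube")]
        return ("open_url", "https://www.youtube.com/results?search_query=" + quote_plus(query))

    if "weather" in text:
        return ("weather", "")

    return ("unknown", text)
```

For example, `parse_command("Go to object detection")` returns `("navigate", "object detection")`, which the app could use to switch to that feature.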