AryanNanda17/GestureSense

GestureSense

Exploring all possible Deep Learning Models for Webcam-Based Hand
Gesture Navigation for Enhanced Accessibility.

SRA Eklavya 2023


Aim

  • The aim of our project is to develop an intuitive and accessible navigation system that leverages deep learning and computer vision to enable users to control digital devices through natural hand gestures captured by webcams.

Description

We have implemented 3 deep learning models for gesture recognition and navigation:

1. Running Averages Model for Bg-Subtraction:-

Screenshot 2023-11-06 at 2 36 34 AM

We used a VGG-7 architecture together with a running-averages model for background subtraction to recognize gestures and drive navigation. The running average of the frames is computed with the OpenCV function "accumulateWeighted". We trained the model on a dataset that we created manually.

cv2.accumulateWeighted(src, dst, alpha)
Screenshot 2023-11-06 at 2 33 57 AM

This is the motion detection class that we used in our project.

This article (Running Average model – Background Subtraction) explains the running-averages approach to gesture recognition very clearly.

We have created our own dataset to implement this model.
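To make the running-averages idea concrete, here is a minimal sketch of the update rule that cv2.accumulateWeighted(src, dst, alpha) applies to every pixel, written in plain Python so the arithmetic is visible. The tiny frames, the threshold value, and the helper names are illustrative assumptions, not code from this repository.

```python
def accumulate_weighted(frame, background, alpha):
    """Update `background` in place: bg = (1 - alpha) * bg + alpha * frame."""
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            background[y][x] = (1.0 - alpha) * background[y][x] + alpha * pixel


def foreground_mask(frame, background, threshold):
    """Mark pixels whose absolute difference from the background exceeds `threshold`."""
    return [
        [1 if abs(pixel - bg) > threshold else 0 for pixel, bg in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]


# Hypothetical 2x3 grayscale frames: an empty scene, then one bright "hand" pixel.
empty_scene = [[10, 10, 10], [10, 10, 10]]
with_hand = [[10, 200, 10], [10, 10, 10]]

# Calibrate the background on a few frames of the empty scene.
background = [[float(p) for p in row] for row in empty_scene]
for _ in range(5):
    accumulate_weighted(empty_scene, background, alpha=0.5)

# Anything that differs strongly from the averaged background is foreground.
mask = foreground_mask(with_hand, background, threshold=25)
print(mask)
```

A larger alpha makes the background adapt faster to changes in the scene, while a smaller alpha keeps it more stable; the resulting mask is what a classifier would then see instead of the raw frame.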

2. YOLO Hand Gesture Detection Model

Screenshot 2023-11-06 at 2 43 16 AM

We used MobileNet pretrained weights from the TensorFlow model zoo to implement our YOLO model. We manually labeled the dataset for hand detection using LabelImg. The dataset has 60 images covering 4 different gestures (15 each).

Screenshot 2023-11-06 at 2 40 31 AM

Although we trained the YOLO model on only 60 images, it performed well in real time.
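As an illustration of what the labeled data looks like to a detector, the sketch below reads one annotation and converts its box to normalized center coordinates. It assumes the labels were saved in the Pascal VOC XML layout, which is one of the formats LabelImg can write; the README does not say which format was used, and the file name and label in the sample are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout; file and label names are invented.
SAMPLE_XML = """
<annotation>
  <filename>palm_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>palm</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>480</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_normalized_boxes(xml_text):
    """Return (label, cx, cy, w, h) tuples with coordinates scaled to [0, 1]."""
    root = ET.fromstring(xml_text.strip())
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        xmin = float(obj.findtext("bndbox/xmin"))
        ymin = float(obj.findtext("bndbox/ymin"))
        xmax = float(obj.findtext("bndbox/xmax"))
        ymax = float(obj.findtext("bndbox/ymax"))
        boxes.append((
            label,
            (xmin + xmax) / 2 / img_w,  # box center, x
            (ymin + ymax) / 2 / img_h,  # box center, y
            (xmax - xmin) / img_w,      # box width
            (ymax - ymin) / img_h,      # box height
        ))
    return boxes


boxes = voc_to_normalized_boxes(SAMPLE_XML)
print(boxes)
```

Normalized boxes stay valid when images are resized, which is convenient with a dataset this small.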

3. CONV3D+LSTM Model

Screenshot 2023-11-06 at 2 46 47 AM

We built a motion detection model for gesture recognition using the Jester dataset. The model consists of 10 Conv3D layers and 3 LSTM layers, which extract spatio-temporal features for motion detection. We implemented it in both TensorFlow and PyTorch. For more on the Conv3D+LSTM implementation, refer to the paper "Attention in Convolutional LSTM for Gesture Recognition".
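One way to see how a Conv3D stack feeds an LSTM is to follow the tensor shape through the layers. The sketch below does only that arithmetic, using the standard output-size formula for convolution and pooling. The clip size, filter count, and the two blocks are assumptions chosen for illustration and do not reproduce the 10-layer model in the notebooks.

```python
def conv3d_output_shape(shape, kernel, stride=(1, 1, 1), padding=(0, 0, 0)):
    """Apply out = floor((in + 2*pad - kernel) / stride) + 1 to (frames, height, width)."""
    return tuple(
        (size + 2 * p - k) // s + 1
        for size, k, s, p in zip(shape, kernel, stride, padding)
    )


# Hypothetical clip: 16 frames of 64x64 pixels (not the sizes used in the project).
shape = (16, 64, 64)
channels = 32  # assumed number of filters in the last Conv3D block

# Two illustrative blocks: a 3x3x3 convolution with padding 1 keeps the size,
# then a pooling step (same formula) halves height and width but keeps every frame.
for _ in range(2):
    shape = conv3d_output_shape(shape, kernel=(3, 3, 3), padding=(1, 1, 1))
    shape = conv3d_output_shape(shape, kernel=(1, 2, 2), stride=(1, 2, 2))

frames, height, width = shape
# Each remaining frame is flattened into one feature vector, so the LSTM
# receives a sequence of `frames` time steps.
lstm_input = (frames, channels * height * width)
print(shape, lstm_input)
```

Pooling only over height and width is a design choice in this sketch: it preserves the time axis so the LSTM still sees one step per input frame.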


Tech-Stack

Programming Language

  • Python

DL Framework

  • TensorFlow
  • Keras
  • PyTorch

Image Processing

  • OpenCV

Libraries

  • NumPy
  • Matplotlib
  • Pandas

File Structure

.
├── 3b1b notes
│   ├── Deep Learning
│   └── Linear Algebra
├── Coursera Notes
│   ├── Course_1 Neural Networks and Deep Learning (Coursera)
│   ├── Course_2 Improving Deep Neural Networks
│   └── Course_4 Convolutional Neural Networks
├── Create_Dataset
│   ├── PreProcessingData.py
│   └── detect.py
├── GestureDetection
│   └── BgEliminationAndMotionDetection.py
├── Hand Detection Using OpenCV
│   ├── Background_subtractor_hand_detection.py
│   └── Skin_Segmentation.py
├── Keras_Models
│   ├── 3DCNN_LSTM.ipynb
│   ├── 3DCNN_LSTM_Pytorch.ipynb
│   ├── GestureWiseMaverick_Masking.ipynb
│   ├── GestureWiseMaverick_NoMasking.ipynb
│   └── Yolo_MobileNet.ipynb
├── MNIST From Scratch Using Jax and Numpy
│   ├── JAX_4L_Autodiff_MNIST_IMPLEMENTATION.ipynb
│   ├── JAX_4L_Without_Autodiff.ipynb
│   ├── NumPy_2L.ipynb
│   └── NumPy_4L.ipynb
├── README.md
├── ResNet-34
│   ├── Assets
│   ├── ResNets_34.ipynb
│   └── Residual model paper.pdf
├── Saved_Models
│   ├── 1
│   ├── 2
│   ├── 3
│   ├── 4
│   └── 5
└── environment.yml



Dataset

For the Running Averages Model approach, we created our own dataset, which consists of 14,000 images of 11 different hand gestures. We have uploaded the dataset to Kaggle along with a sample notebook.


Results

Results of Running Averages BgSubtraction Model With VGG-7 Architecture

Running.Averages.Results.mp4

Results of the YOLO Object Detection Model with MobileNet Pretrained Weights (trained on only 60 images)

Yolo.Detection.Results.mp4



Applications

Demo.mp4
  • Gaming: Implement gesture-based controls in gaming applications to provide a more immersive and interactive gaming experience, allowing players to control in-game actions through hand movements.

  • Accessibility Tools: The project has the potential to create accessibility tools that empower individuals with disabilities to control computers, mobile devices, and applications using hand gestures, enhancing their digital independence.

  • Educational Platforms: The project could lead to the development of interactive educational platforms where teachers and students can engage with digital content, presentations, and simulations using gestures, fostering more engaging and immersive learning experiences.

  • Human-Robot Interaction: The project has the potential to improve human-robot interactions by enabling robots to understand and respond to human gestures, making collaborative tasks more intuitive and efficient.


Getting Started

Prerequisites

  1. Ubuntu 18.04 or above
  2. TensorFlow Object Detection API
  3. Conda installed on system

Project Setup

Start by cloning the repo in a directory of your choice

git clone https://github.com/AryanNanda17/GestureSense

Navigate to the project directory. For example, if you cloned it to your Desktop:

cd Desktop/GestureSense

Create a virtual environment with the required dependencies

conda env create --name envname -f environment.yml

Switch environment

conda activate envname

Create a Dataset of Your choice

cd Create_Dataset

Here you can set the path where the dataset will be collected and the name of your gesture label.

To see the available options, run:

python3 detect.py -h

Screenshot 2023-11-06 at 3 40 49 AM
For example:
  • Choose an image path:

    python3 detect.py -p "Your Path Here"

  • Choose a label:

    python3 detect.py -l GestureName
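For readers who want to adapt the collection script, a command-line interface with these two options could be sketched with argparse as follows. Only the short flags -p and -l come from this README; the long option names, defaults, and the per-label subfolder are assumptions and may differ from the actual script.

```python
import argparse
from pathlib import Path


def build_parser():
    parser = argparse.ArgumentParser(description="Collect images for one gesture label.")
    # -p and -l come from the README; long names and defaults are assumptions.
    parser.add_argument("-p", "--path", default="dataset",
                        help="folder where captured images are saved")
    parser.add_argument("-l", "--label", default="gesture",
                        help="name of the gesture being recorded")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-p", "data/raw", "-l", "pinky"])

# Assumed layout: one subfolder per gesture label inside the chosen path.
target_dir = Path(args.path) / args.label
print(target_dir.as_posix())
```

Keeping one folder per label makes it straightforward to load the images later with the folder name as the class.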

Next, pass this dataset through a second script that applies masking. To do that, run:

python3 PreProcessingData.py "Image_Path"

Screenshot 2023-11-06 at 3 50 21 AM

Now you can train your own model on the selected gestures using the VGG-7 architecture.

After training, export your model.

Run the Running Averages Bg_Subtraction Model

First, navigate to the GestureDetection directory:

cd ~/Desktop/GestureSense/GestureDetection

Then run the following command:

python your_script.py -m /path/to/your/model -c 1finger 2finger 3finger C ThumbRight fingersclosein italydown kitli pinky spreadoutpalm yoyo

Hence the setup of the Running Averages BgSubtraction Model is completed.
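The -c option in the command above takes a list of gesture names. A plausible reading, sketched below, is that the list gives the label for each output index of the model, so it would need to be in the same order as during training. The long option names, the helper function, and the scores are assumptions for illustration only.

```python
import argparse

parser = argparse.ArgumentParser(description="Run gesture recognition on webcam frames.")
# -m and -c appear in the command above; the long names are assumptions.
parser.add_argument("-m", "--model", required=True, help="path to the exported model")
parser.add_argument("-c", "--classes", nargs="+", required=True,
                    help="gesture labels, assumed to be in training order")

# Parse an explicit argument list so the sketch runs without a real command line.
args = parser.parse_args(["-m", "/path/to/your/model", "-c", "1finger", "2finger", "C"])


def decode_prediction(scores, classes):
    """Return the label with the highest score, together with that score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return classes[best], scores[best]


# Made-up model output for one frame, one score per class.
label, confidence = decode_prediction([0.1, 0.7, 0.2], args.classes)
print(label, confidence)
```

If the labels are passed in a different order than the one used for training, the predictions would be silently mislabeled, so it is worth double-checking the list.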

Screenshot 2023-11-06 at 3 58 43 AM

Future Scope

  • Attention Mechanism Integration: Incorporate attention mechanisms into the CONV3D+LSTM model to improve its ability to focus on relevant features in gesture sequences, enhancing accuracy.
  • Mouse Control with YOLO Object Detection API: Implement mouse control functionality using the YOLO object detection API, allowing users to manipulate their computers using gesture-based control with high accuracy.
  • Interactive Web Platform Development: Create an interactive web platform that provides users with a user-friendly interface to access and utilize the gesture control system. This platform should be compatible with various browsers and operating systems.
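As a starting point for the mouse-control idea, the coordinate mapping from a detected hand box to a cursor position can be sketched without any detector or GUI library. Everything here is hypothetical: the function name, the mirroring choice, and the frame and screen sizes are assumptions, and actually moving the cursor would need a separate library.

```python
def box_to_cursor(box, frame_size, screen_size, mirror=True):
    """Map the center of a pixel bounding box to screen coordinates."""
    xmin, ymin, xmax, ymax = box
    frame_w, frame_h = frame_size
    screen_w, screen_h = screen_size
    # Normalize the box center to [0, 1] within the webcam frame.
    cx = (xmin + xmax) / 2 / frame_w
    cy = (ymin + ymax) / 2 / frame_h
    if mirror:
        # Webcam images are mirrored relative to the user, so flip horizontally.
        cx = 1.0 - cx
    return round(cx * (screen_w - 1)), round(cy * (screen_h - 1))


# Hypothetical detection in a 640x480 frame, mapped onto a 1920x1080 screen.
cursor = box_to_cursor((160, 120, 320, 240), (640, 480), (1920, 1080))
print(cursor)
```

In practice the cursor position would likely need smoothing across frames, since raw detections tend to jitter.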

Contributors

Acknowledgements

A special thanks to our mentors for this project:

License

The LICENSE used in this project.

