Convolutional Neural Network Digit Recognition and Bounding Box Prediction

This program recognizes sequences of 1-5 digits in an image and predicts their bounding boxes using a Convolutional Neural Network (CNN). The MNIST and SVHN datasets are preprocessed and normalized, then used to train a CNN consisting of convolutional, max pooling, dropout, fully connected, and output layers.

Install

Python 3.6
- I recommend installing Anaconda, as it is already set up with standard machine learning libraries
- If you are unfamiliar with the command line, there are graphical installers for macOS, Windows, and Linux
PIL
- pip install pillow for python 3
six
TensorFlow

Dataset

In this study, the MNIST and SVHN datasets were used to create a combined dataset of hand-drawn digits and house numbers in groupings of 1-5 digits. There are roughly 320k images in total: 280k training images, 15k validation images, and 23k testing images.

The images are in 32x32x1 grayscale format, with 32 representing the pixel width and height and 1 representing the single gray color channel. Each image has a corresponding label which lists the number of digits in the image and the digits themselves, including a label representing the absence of a digit in cases where there are fewer than 5 digits (the maximum number of digits in an image). The SVHN dataset also includes bounding box information, which is used in the second half of the project to determine digit location.
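The label layout described above can be sketched as follows. This is only an illustration, not the repository's code: the function name `encode_label`, the blank class value of 10, and the ordering (digit count first, then five digit slots) are assumptions made for the example.

```python
MAX_DIGITS = 5  # maximum number of digits in an image (from the description above)
BLANK = 10      # assumed class index for "no digit"; the project's actual value may differ


def encode_label(digits):
    """Return [count, d1, ..., d5], padding unused digit slots with BLANK."""
    if not 1 <= len(digits) <= MAX_DIGITS:
        raise ValueError("expected between 1 and 5 digits")
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError("each digit must be in the range 0-9")
    padding = [BLANK] * (MAX_DIGITS - len(digits))
    return [len(digits)] + list(digits) + padding


print(encode_label([4, 2]))  # [2, 4, 2, 10, 10, 10]
```

A fixed-length label like this lets every image share the same output shape regardless of how many digits it contains.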

Parameters

depth - alters the depths of the CNN layers, using common memory sizes
epochs - number of training iterations
batch_size - set to the highest value your machine has memory for, using common memory sizes
keep_probability - probability of keeping an activation node in the dropout layer
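To make keep_probability concrete, here is a minimal pure-Python sketch of inverted dropout, the usual scheme in which surviving activations are rescaled by 1/keep_probability. The function name `dropout` and the list-based representation are assumptions for illustration; the training scripts rely on TensorFlow's own dropout layer rather than code like this.

```python
import random


def dropout(activations, keep_probability, rng=random):
    """Keep each activation with probability keep_probability and rescale the survivors."""
    if not 0.0 < keep_probability <= 1.0:
        raise ValueError("keep_probability must be in (0, 1]")
    return [
        a / keep_probability if rng.random() < keep_probability else 0.0
        for a in activations
    ]


rng = random.Random(0)  # seeded generator so the sketch is repeatable
out = dropout([1.0] * 10, keep_probability=0.5, rng=rng)
print(out)  # every entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

Setting keep_probability to 1.0 disables dropout entirely, which is the usual choice when evaluating on validation or test data.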

Example Output

Run the files in the order specified below.

Command Line

python create_MNIST_multi-digit_dataset.py

Creates multi-digit MNIST 32x32 dataset

python create_bbox_SVHN_dataset.py

Creates SVHN 32x32 dataset with bounding boxes
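When an image is resized to 32x32, its bounding boxes have to be rescaled by the same factors. The sketch below shows that arithmetic under the assumption that the image is resized directly, with no cropping or padding; the function name `scale_bbox` is hypothetical and the script itself may handle this differently. The (left, top, width, height) fields follow the SVHN annotation format.

```python
def scale_bbox(left, top, width, height, orig_w, orig_h, target=32):
    """Map a box from an orig_w x orig_h image onto a target x target image."""
    sx = target / orig_w  # horizontal scale factor
    sy = target / orig_h  # vertical scale factor
    return (left * sx, top * sy, width * sx, height * sy)


# A 64x128 source image shrinks by 0.5 horizontally and 0.25 vertically.
print(scale_bbox(16, 8, 32, 48, orig_w=64, orig_h=128))  # (8.0, 2.0, 16.0, 12.0)
```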

python create_combined_dataset.py

Combines and randomizes the previous two datasets
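The key detail when combining and randomizing is that images and labels must be shuffled with the same permutation so each image stays paired with its label. A minimal sketch, assuming the data is held in plain Python sequences; the function name `combine_and_shuffle` and the `seed` parameter are hypothetical and not taken from the script.

```python
import random


def combine_and_shuffle(images_a, labels_a, images_b, labels_b, seed=0):
    """Concatenate two datasets and shuffle images and labels in the same order."""
    images = list(images_a) + list(images_b)
    labels = list(labels_a) + list(labels_b)
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # one permutation shared by both lists
    return [images[i] for i in order], [labels[i] for i in order]


# Strings stand in for image arrays here.
imgs, lbls = combine_and_shuffle(["m0", "m1"], [3, 7], ["s0"], [5])
print(list(zip(imgs, lbls)))
```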

python create_real_world_dataset.py

Creates grayscale images from real-world pictures
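The README does not state how the script converts color to gray. One common approach, sketched here as an assumption, is the ITU-R 601 luma weighting that Pillow documents for its "L" mode conversion. The function name `to_grayscale` is hypothetical, and the integer rounding may differ slightly from Pillow's.

```python
def to_grayscale(pixel):
    """Convert one (R, G, B) pixel with 0-255 channels to a single gray value."""
    r, g, b = pixel
    # ITU-R 601 luma weights: green contributes most, blue least.
    return (r * 299 + g * 587 + b * 114) // 1000


print(to_grayscale((255, 255, 255)))  # 255
print(to_grayscale((255, 0, 0)))      # 76
```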

python train_digit_recognition_CNN.py

Trains the network on the combined dataset and writes loss and accuracy data to TensorBoard files

To view the TensorBoard loss and accuracy outputs, follow these instructions from the TensorFlow website.

python train_bounding_box_CNN.py

Trains the network on the SVHN bounding box dataset and outputs predicted bounding box examples on the real world dataset

License

The image_classification program is a public domain work, dedicated using CC0 1.0. I encourage you to use it, and enhance your understanding of CNNs and the deep learning concepts therein. :)
