Classifying Digits and their corresponding bounding boxes with a CNN using TensorFlow.

BananuhBeatDown/digit_recognition

Convolutional Neural Network Digit Recognition and Bounding Box Prediction

This program performs digit recognition and bounding box prediction on sequences of 1-5 digits using a Convolutional Neural Network (CNN). The MNIST and SVHN datasets are preprocessed and normalized, then used to train a CNN consisting of convolutional, max pooling, dropout, fully connected, and output layers.

Install

  • Python 3.6
    • I recommend installing Anaconda, as it is already set up with standard machine learning libraries
    • If you are unfamiliar with the command line, there are graphical installers for macOS, Windows, and Linux
  • PIL
    • pip install pillow for Python 3
  • six
  • TensorFlow

Dataset

In this study, the MNIST and SVHN datasets were used to create a combined dataset of hand-drawn digits and house numbers in groupings of 1-5 digits. There are roughly 320k images in total: 280k training images, 15k validation images, and 23k testing images.

The images are in 32x32x1 grayscale format, with 32 representing the pixel width and height and 1 representing the single gray color channel. Each image has a corresponding label that lists the number of digits in the image and the digits themselves, including a label representing the absence of a digit in cases where there are fewer than 5 digits (the maximum number of digits in an image). The SVHN dataset also includes bounding box information, which is used in the second half of the project to determine digit location.
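The label layout described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `encode_label` and the choice of 10 as the "no digit" value are assumptions made for the example.

```python
# Assumption: 10 marks an empty digit slot; the repository may use another value.
BLANK = 10
MAX_DIGITS = 5


def encode_label(digits):
    """Return [length, d1, ..., d5], padding unused slots with BLANK."""
    if not 1 <= len(digits) <= MAX_DIGITS:
        raise ValueError("expected between 1 and 5 digits")
    if any(not 0 <= d <= 9 for d in digits):
        raise ValueError("each digit must be in the range 0-9")
    padding = [BLANK] * (MAX_DIGITS - len(digits))
    return [len(digits)] + list(digits) + padding


print(encode_label([4, 2]))  # [2, 4, 2, 10, 10, 10]
```

A fixed-length label like this lets every image share the same output shape regardless of how many digits it contains.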

Parameters

  • depth - depths of the CNN layers, chosen from common memory sizes
  • epochs - number of training iterations
  • batch_size - set to the highest value your machine has memory for, chosen from common memory sizes
  • keep_probability - probability of keeping an activation node in the dropout layer
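To make keep_probability concrete, here is a plain-Python sketch of inverted dropout, the scheme TensorFlow 1.x's `tf.nn.dropout` follows when given `keep_prob`: each activation survives with probability keep_probability and survivors are scaled by 1/keep_probability. The function name `dropout` and the sample values are illustrative assumptions, not taken from the repository.

```python
import random


def dropout(activations, keep_probability, rng):
    """Keep each activation with keep_probability and rescale the survivors."""
    if not 0.0 < keep_probability <= 1.0:
        raise ValueError("keep_probability must be in (0, 1]")
    scale = 1.0 / keep_probability
    return [a * scale if rng.random() < keep_probability else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so repeated runs match
print(dropout([1.0, 2.0, 3.0, 4.0], keep_probability=0.5, rng=rng))
```

A keep_probability of 1.0 disables dropout, which is the usual setting at evaluation time.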

Example Output

Run the files in the order specified below.

Command Line

python create_MNIST_multi-digit-dataset.py

  • Creates multi-digit MNIST 32x32 dataset

python create_bbox_SVHN_dataset.py

  • Creates SVHN 32x32 dataset with bounding boxes
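When an image is resized to 32x32, its bounding boxes have to be scaled by the same factors. The sketch below shows that arithmetic under stated assumptions: boxes are (left, top, width, height) in pixels, as in the SVHN annotations, and `rescale_bbox` is a hypothetical helper, not a function from this repository.

```python
def rescale_bbox(box, original_size, target_size=(32, 32)):
    """Scale a (left, top, width, height) box from original_size to target_size.

    Sizes are (width, height) tuples in pixels.
    """
    left, top, width, height = box
    scale_x = target_size[0] / original_size[0]
    scale_y = target_size[1] / original_size[1]
    return (left * scale_x, top * scale_y, width * scale_x, height * scale_y)


# A 128x64 image shrinks by 0.25 horizontally and 0.5 vertically.
print(rescale_bbox((20, 10, 40, 30), original_size=(128, 64)))
# (5.0, 5.0, 10.0, 15.0)
```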

python create_combined_dataset.py

  • Combines and randomizes the previous two datasets
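Combining and randomizing two datasets requires shuffling images and labels with the same permutation so that each pair stays aligned. A minimal sketch, assuming list-like inputs; `combine_and_shuffle` and the placeholder data are hypothetical and stand in for whatever the script does internally.

```python
import random


def combine_and_shuffle(images_a, labels_a, images_b, labels_b, seed=0):
    """Concatenate two datasets and shuffle images and labels together."""
    images = list(images_a) + list(images_b)
    labels = list(labels_a) + list(labels_b)
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # one permutation shared by both lists
    return [images[i] for i in order], [labels[i] for i in order]


# Placeholder strings stand in for image arrays.
images, labels = combine_and_shuffle(["m0", "m1"], [0, 1], ["s0", "s1"], [2, 3])
print(list(zip(images, labels)))
```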

python create_real_world_dataset.py

  • Creates grayscale images from real-world pictures
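One common way to convert RGB pixels to a single gray channel is a weighted sum with the ITU-R 601 luma weights, which Pillow's `convert("L")` also uses. The sketch below applies that formula in plain Python; whether the script uses exactly this conversion is an assumption, and `to_grayscale` is a hypothetical name.

```python
def to_grayscale(rgb_rows):
    """Convert rows of (r, g, b) pixels (0-255) to rows of gray values (0-255)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_rows
    ]


picture = [[(255, 255, 255), (0, 0, 0)],
           [(255, 0, 0), (0, 255, 0)]]
print(to_grayscale(picture))
```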

python train_digit_recognition_CNN.py

  • Trains the network on the combined dataset and writes loss and accuracy data to TensorBoard files

To view the TensorBoard loss and accuracy outputs, follow these instructions from the TensorFlow website.

python train_bounding_box_CNN.py

  • Trains the network on the SVHN bounding box dataset and outputs predicted bounding box examples on the real world dataset

License

The image_classification program is a public domain work, dedicated using CC0 1.0. I encourage you to use it, and enhance your understanding of CNNs and the deep learning concepts therein. :)
