Merge branch 'svhn'
christianversloot committed Jan 9, 2020
2 parents 1ef521f + e821904 commit 17d2f49
Showing 5 changed files with 87 additions and 1 deletion.
26 changes: 26 additions & 0 deletions README.md
@@ -20,6 +20,8 @@ Hi there, and welcome to the `extra-keras-datasets` module! This extension to th
* [EMNIST-MNIST](#emnist-mnist)
* [KMNIST-KMNIST](#kmnist-kmnist)
* [KMNIST-K49](#kmnist-k49)
* [SVHN-Normal](#svhn-normal)
* [SVHN-Extra](#svhn-extra)
- [Contributors and other references](#contributors-and-other-references)
- [License](#license)

@@ -120,12 +122,36 @@ from extra_keras_datasets import kmnist

---

### SVHN-Normal

```
from extra_keras_datasets import svhn
(input_train, target_train), (input_test, target_test) = svhn.load_data(type='normal')
```

<a href="./assets/svhn-normal.png"><img src="./assets/svhn-normal.png" width="500" style="border: 3px solid #f6f8fa;" /></a>

---

### SVHN-Extra

```
from extra_keras_datasets import svhn
(input_train, target_train), (input_test, target_test) = svhn.load_data(type='extra')
```

<a href="./assets/svhn-extra.png"><img src="./assets/svhn-extra.png" width="500" style="border: 3px solid #f6f8fa;" /></a>

---

## Contributors and other references
* **EMNIST dataset:**
* Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373
* [tlindbloom](https://stackoverflow.com/users/4008755/tlindbloom) on StackOverflow: [loading EMNIST-letters dataset](https://stackoverflow.com/questions/51125969/loading-emnist-letters-dataset/53547262#53547262) in [emnist.py](./emnist.py).
* **KMNIST dataset:**
* Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., & Ha, D. (2018). Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718. Retrieved from https://arxiv.org/abs/1812.01718
* **SVHN dataset:**
* Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., & Ng, A. Y. (2011). Reading digits in natural images with unsupervised feature learning. Retrieved from http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf / http://ufldl.stanford.edu/housenumbers/

## License
The licensable parts of this repository are licensed under an [MIT License](./LICENSE), so you're free to use this repo in your machine learning projects / blogs / exercises, and so on. Happy engineering! 🚀
Binary file added assets/svhn-extra.png
Binary file added assets/svhn-normal.png
3 changes: 2 additions & 1 deletion extra_keras_datasets/__init__.py
@@ -1,4 +1,5 @@
from __future__ import absolute_import

from . import emnist
from . import kmnist
from . import svhn
59 changes: 59 additions & 0 deletions extra_keras_datasets/svhn.py
@@ -0,0 +1,59 @@
'''
Import the SVHN dataset
Source: http://ufldl.stanford.edu/housenumbers/
Description: Street View House Numbers
~~~ Important note ~~~
Please cite the following paper when using or referencing the dataset:
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng. Reading Digits in Natural Images with Unsupervised Feature Learning. NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011. Retrieved from http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf
'''

from keras.utils.data_utils import get_file
import numpy as np
from zipfile import ZipFile
from scipy import io as sio
import os

def load_data(path='svhn_matlab.npz', type='normal'):
    """Loads the SVHN dataset.
    # Arguments
        path: path where to cache the dataset locally
            (relative to ~/.keras/datasets).
        type: any of normal, extra (extra appends ~530K extra images for training)
    # Returns
        Tuple of Numpy arrays: `(input_train, target_train), (input_test, target_test)`.
    """
    path_train = get_file(f'{path}_train',
                          origin='http://ufldl.stanford.edu/housenumbers/train_32x32.mat')
    path_test = get_file(f'{path}_test',
                         origin='http://ufldl.stanford.edu/housenumbers/test_32x32.mat')

    # Load data from Matlab file.
    # Source: https://stackoverflow.com/a/53547262
    mat_train = sio.loadmat(path_train)
    mat_test = sio.loadmat(path_test)

    # Prepare training data
    input_train = mat_train['X']
    input_train = np.rollaxis(input_train, 3, 0)
    target_train = mat_train['y'].flatten()

    # Prepare testing data
    input_test = mat_test['X']
    input_test = np.rollaxis(input_test, 3, 0)
    target_test = mat_test['y'].flatten()

    # Append extra data, if required
    if type == 'extra':
        path_extra = get_file(f'{path}_extra',
                              origin='http://ufldl.stanford.edu/housenumbers/extra_32x32.mat')
        mat_extra = sio.loadmat(path_extra)
        input_extra = mat_extra['X']
        input_extra = np.rollaxis(input_extra, 3, 0)
        target_extra = mat_extra['y'].flatten()
        input_train = np.concatenate((input_extra, input_train))
        target_train = np.concatenate((target_extra, target_train))

    # Return data
    return (input_train, target_train), (input_test, target_test)
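For context on the `np.rollaxis` calls above: `sio.loadmat` returns the SVHN images as a `(height, width, channels, samples)` tensor, and rolling axis 3 to position 0 yields the `(samples, height, width, channels)` layout that channels-last Keras models expect. A standalone sketch with a dummy array (no download involved):

```
import numpy as np

# SVHN's .mat files store images as (height, width, channels, samples)...
dummy = np.zeros((32, 32, 3, 5))

# ...and rolling the sample axis to the front gives the usual batch layout.
print(np.rollaxis(dummy, 3, 0).shape)  # (5, 32, 32, 3)
```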
