Slow image data generator #2394

Closed
chsasank opened this issue Apr 19, 2016 · 8 comments

Comments

@chsasank
Contributor

chsasank commented Apr 19, 2016

Hi,

I observed that image augmentation slows down the training process considerably.

When I used the following,

datagen = ImageDataGenerator(
        featurewise_center=True,
        featurewise_std_normalization=True,
        rotation_range=0,
        width_shift_range=0,
        height_shift_range=0,
        horizontal_flip=False)

it took me about 300 seconds for an epoch.

When I added data augmentation,

datagen = ImageDataGenerator(
        featurewise_center=True,
        featurewise_std_normalization=True,
        rotation_range=10,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.3,
        horizontal_flip=True) 

it takes 2900 seconds per epoch, a slowdown of nearly 10x!

I suspect this is because the augmentation runs on the CPU. I have gone through the source of ImageDataGenerator and, IMHO, random_transform can be improved.

If rotation, translation and shear are all enabled, each one is currently applied to the image in a separate pass, one after another.
We should be able to do all of this in one step by composing them into a single homography (for these three, an affine transformation) and warping the image once. An implementation is available in skimage.
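To make the "one step" idea concrete, here is a minimal NumPy sketch, not the code from the gist or from Keras. The helper names (`rotation`, `shear`, `translation`) and the simple shear-factor form of the shear matrix are assumptions chosen for illustration. It only shows that the product of the three 3x3 matrices maps points exactly as the three steps applied in sequence would.

```python
import numpy as np

def rotation(theta):
    """3x3 homogeneous rotation by theta radians (hypothetical helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def shear(k):
    """3x3 homogeneous shear along x with factor k (assumed form)."""
    return np.array([[1.0, k, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """3x3 homogeneous translation by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

R = rotation(np.deg2rad(10))
S = shear(0.3)
T = translation(5.0, -3.0)

# One matrix that rotates, then shears, then translates.
combined = T @ S @ R

# A few points as homogeneous column vectors (x, y, 1).
points = np.array([[0.0, 0.0, 1.0],
                   [10.0, 0.0, 1.0],
                   [0.0, 20.0, 1.0],
                   [7.0, 13.0, 1.0]]).T

step_by_step = T @ (S @ (R @ points))  # three separate applications
one_pass = combined @ points           # a single application
print(np.allclose(step_by_step, one_pass))
```

For an image, the saving comes from resampling the pixels once with `combined` instead of once per operation.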

Here's an example: https://gist.github.com/chsasank/4bda6a6dc7973ae206b09134b92d20f2

An added advantage is that random zoom is just another homography.
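As a sketch of why zoom comes almost for free: a zoom is a scaling matrix, so it can be multiplied into the same product as the other transforms. The `zoom_range` parameter and the uniform sampling below are assumptions for illustration, not necessarily how the fork implements it.

```python
import numpy as np

def zoom(zx, zy):
    """3x3 homogeneous scaling by zx along x and zy along y."""
    return np.array([[zx, 0.0, 0.0],
                     [0.0, zy, 0.0],
                     [0.0, 0.0, 1.0]])

def random_zoom_matrix(zoom_range, rng):
    """Sample independent x and y zoom factors uniformly from zoom_range.

    zoom_range is a hypothetical (low, high) tuple.
    """
    zx, zy = rng.uniform(zoom_range[0], zoom_range[1], size=2)
    return zoom(zx, zy)

rng = np.random.RandomState(0)
zoom_matrix = random_zoom_matrix((0.8, 1.2), rng)

# Some already-composed affine matrix (here just a shift by (5, -3)).
other = np.array([[1.0, 0.0, 5.0],
                  [0.0, 1.0, -3.0],
                  [0.0, 0.0, 1.0]])

# Folding zoom in is one more matrix product; the result is still affine.
with_zoom = other @ zoom_matrix
print(with_zoom)
```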

Have a look here for a visual overview of homographies.

Sasank.

@joelthchao
Contributor

@chsasank Hi, I wrote some tests for the speed issue.
Task: rotate, shear and shift an 800x1280x3 image
skimage.transform.AffineTransform: 0.13~0.19 sec, combining the three into one transform
Keras: 4.4~4.7 sec, applying them in sequence

Task: rotate, shear and shift a 227x227x3 image
skimage.transform.AffineTransform: 0.006~0.008 sec, combining the three into one transform
Keras: 0.18~0.2 sec, applying them in sequence

If possible, I would like to make a PR for the new random_transform.
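For anyone who wants to try a comparison of this kind without skimage or Keras, here is a rough NumPy-only harness. It is a sketch under stated assumptions: `warp_nearest` is a hypothetical nearest-neighbour warp written for this example, the matrix is assumed to map output coordinates to input coordinates, and the timings it prints say nothing about the numbers reported above. It only illustrates one warp with a combined matrix versus three warps in sequence.

```python
import timeit
import numpy as np

def warp_nearest(image, matrix):
    """Warp an HxWxC image with a 3x3 affine matrix (hypothetical helper).

    The matrix maps output (x, y, 1) to input (x, y, 1). Sampling is
    nearest-neighbour; pixels that map outside the input are set to 0.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = matrix @ coords
    sx = np.rint(src[0]).astype(int)
    sy = np.rint(src[1]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    flat = image.reshape(h * w, -1)
    out = np.zeros_like(image).reshape(h * w, -1)
    out[valid] = flat[sy[valid] * w + sx[valid]]
    return out.reshape(image.shape)

theta = np.deg2rad(10)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
S = np.array([[1.0, 0.3, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
T = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]])

image = np.random.RandomState(0).rand(227, 227, 3)

def sequential():
    # Three resampling passes, one per operation.
    return warp_nearest(warp_nearest(warp_nearest(image, R), S), T)

def combined():
    # One resampling pass with the composed matrix.
    return warp_nearest(image, R @ S @ T)

t_seq = timeit.timeit(sequential, number=5)
t_one = timeit.timeit(combined, number=5)
print("sequential: %.4f s, combined: %.4f s" % (t_seq, t_one))
```

Because each pass rounds and clips separately, the two outputs can differ slightly near the borders; the single pass also avoids that repeated resampling.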

@chsasank
Contributor Author

chsasank commented Apr 21, 2016

Hi,
Meanwhile, I rewrote parts of ImageDataGenerator here: https://github.com/chsasank/keras
I also implemented random zoom.

With my implementation,

datagen = ImageDataGenerator(
        featurewise_center=True,
        featurewise_std_normalization=True,
        rotation_range=10,
        width_shift_range=0.1,
        height_shift_range=0.1,
        shear_range=0.3,
        horizontal_flip=True) 

it takes almost the same time as training without any augmentation.

I also found that the Keras implementation distorts the colours when augmentation is used (even with featurewise_center=False and featurewise_std_normalization=False).

I have tested my implementation with the data I have, but I am not sure if that will suffice.
@joelthchao Before I raise a PR, can you please confirm the speedup?

Thanks,
Sasank.

@joelthchao
Contributor

For rotation, it seems your implementation uses the top-left corner as the center. You will need to do something like:

o_w = float(w)/2 + 0.5
o_h = float(h)/2 + 0.5
offset_matrix = [[1, 0,  o_w], [0, 1,  o_h], [0, 0, 1]]
reset_matrix  = [[1, 0, -o_w], [0, 1, -o_h], [0, 0, 1]]

rotation_matrix = np.dot(np.dot(offset_matrix, rotation_matrix), reset_matrix)
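A self-contained version of the snippet above, as a sketch: `center_transform` is a hypothetical wrapper name, and the `+ 0.5` offset is copied from the comment as-is (the exact center depends on the pixel-coordinate convention in use, which is not settled here). It checks that the chosen center stays fixed under the wrapped rotation, which is not true of the raw rotation about (0, 0).

```python
import numpy as np

def center_transform(matrix, w, h):
    """Wrap a 3x3 transform so it acts about the image center.

    Hypothetical helper; the offsets follow the snippet above.
    """
    o_w = float(w) / 2 + 0.5
    o_h = float(h) / 2 + 0.5
    offset_matrix = np.array([[1, 0, o_w], [0, 1, o_h], [0, 0, 1]])
    reset_matrix = np.array([[1, 0, -o_w], [0, 1, -o_h], [0, 0, 1]])
    # Move the center to the origin, transform, then move it back.
    return np.dot(np.dot(offset_matrix, matrix), reset_matrix)

w, h = 227, 227
theta = np.deg2rad(10)
rotation_matrix = np.array([[np.cos(theta), -np.sin(theta), 0],
                            [np.sin(theta),  np.cos(theta), 0],
                            [0, 0, 1]])

centered = center_transform(rotation_matrix, w, h)
center = np.array([float(w) / 2 + 0.5, float(h) / 2 + 0.5, 1.0])

print(np.allclose(np.dot(centered, center), center))         # center is fixed
print(np.allclose(np.dot(rotation_matrix, center), center))  # raw rotation moves it
```

The same wrapper applies to shear and zoom matrices, since they also act about (0, 0) by default.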

@chsasank
Contributor Author

That's correct. Indeed, all the operations seem to use the top-left corner (0,0) as the centre.
I have applied the offset and reset to all the operations.
Thanks for the code.

Check now?

@joelthchao
Contributor

OK, I will give it a check and write some tests.

@chsasank
Contributor Author

chsasank commented Apr 21, 2016

That’d be great thanks! 👍

I’m not so good at writing tests. Thanks for your help!

I'm raising a PR and have added you as a collaborator on my fork.
Please add the tests and push to it.


@joelthchao
Contributor

By the way, I am on a busy server, so I will probably need someone's help to test the speed in a better environment.

@chsasank
Contributor Author

I can help. Send me the script I need to run, via a gist or somewhere similar.
