predict_generator cannot maintain data order #5048

Closed
iammarvelous opened this issue Jan 15, 2017 · 14 comments

Comments

@iammarvelous

It seems that predict_generator cannot maintain the data order when using multiprocessing. When feeding several batches of test data into predict_generator, the output array does not correspond to the input batch order, so there is no way to tell which output is the prediction for which input, which makes the function unusable. One possible remedy might be to use a priority queue rather than a normal queue to maintain the order.

Here is the detailed test code.

## mnist_cnn.py in examples
from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K

batch_size = 128
nb_classes = 10
nb_epoch = 8

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = (2, 2)
# convolution kernel size
kernel_size = (3, 3)

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

if K.image_dim_ordering() == 'th':
    X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
    X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
    X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          verbose=1, validation_data=(X_test, Y_test))


############# Core test code starts here #####################
def generator_from_array(X_test):
    # Yield the first 100 test samples one at a time, looping forever
    # because predict_generator expects a generator that never stops.
    while 1:
        for i in range(100):
            yield X_test[i:i+1]

print('Predict on batch:')
out = []
for i in range(100):
    out_tmp = model.predict_on_batch(X_test[i:i+1])
    out.append(out_tmp)
print(out[1])
print(out[50])
print(out[-1])
print("Predict generator")
output = model.predict_generator(generator_from_array(X_test), 100, max_q_size=10, nb_worker=4, pickle_safe=True)
print(output.shape)
print(output[1])
print(output[50])
print(output[-1])

And here are the results.

Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so.8.0 locally
X_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/8
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: Tesla P100-PCIE-16GB
major: 6 minor: 0 memoryClockRate (GHz) 0.405
pciBusID 0000:81:00.0
Total memory: 15.89GiB
Free memory: 15.61GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:81:00.0)
60000/60000 [==============================] - 7s - loss: 0.3829 - acc: 0.8815 - val_loss: 0.0859 - val_acc: 0.9743
Epoch 2/8
60000/60000 [==============================] - 5s - loss: 0.1336 - acc: 0.9603 - val_loss: 0.0606 - val_acc: 0.9806
Epoch 3/8
60000/60000 [==============================] - 5s - loss: 0.1041 - acc: 0.9690 - val_loss: 0.0533 - val_acc: 0.9833
Epoch 4/8
60000/60000 [==============================] - 5s - loss: 0.0861 - acc: 0.9735 - val_loss: 0.0441 - val_acc: 0.9852
Epoch 5/8
60000/60000 [==============================] - 5s - loss: 0.0781 - acc: 0.9763 - val_loss: 0.0409 - val_acc: 0.9861
Epoch 6/8
60000/60000 [==============================] - 5s - loss: 0.0702 - acc: 0.9793 - val_loss: 0.0387 - val_acc: 0.9870
Epoch 7/8
60000/60000 [==============================] - 5s - loss: 0.0626 - acc: 0.9815 - val_loss: 0.0379 - val_acc: 0.9867
Epoch 8/8
60000/60000 [==============================] - 5s - loss: 0.0605 - acc: 0.9817 - val_loss: 0.0352 - val_acc: 0.9891
Predict on batch:
[[  1.61985781e-07   9.81094581e-06   9.99989748e-01   2.69348943e-09
    1.97990360e-10   8.48836210e-11   4.53296529e-08   7.74509276e-11
    2.23150167e-07   2.99653670e-11]]
[[  9.75747753e-06   2.34337261e-09   5.09917042e-09   1.79785129e-08
    6.84200643e-08   6.34509252e-06   9.99983668e-01   4.00663530e-11
    1.21496996e-07   1.95249289e-10]]
[[  9.69054281e-10   1.05993847e-09   1.87508320e-09   1.94809417e-07
    1.49762297e-06   4.11489260e-08   8.54344595e-10   8.08601499e-07
    1.35151751e-07   9.99997377e-01]]
Predict generator
(100, 10)
[  1.61985781e-07   9.81094581e-06   9.99989748e-01   2.69348943e-09
   1.97990360e-10   8.48836210e-11   4.53296529e-08   7.74509276e-11
   2.23150167e-07   2.99653670e-11]
[  9.99998927e-01   6.84537635e-11   4.53024768e-07   9.15579487e-11
   1.19156296e-10   1.37983824e-09   6.24313543e-08   5.71949954e-09
   1.34752597e-07   4.58147241e-07]
[  6.04119035e-04   2.68195297e-08   1.23279997e-05   2.34821496e-10
   9.99363124e-01   1.72202430e-08   1.96394576e-05   6.58836768e-07
   1.14492806e-07   3.96185520e-08]

@patyork
Contributor

patyork commented Jan 15, 2017

If you want to compare predictions against known outputs, use evaluate_generator.

@iammarvelous
Author

@patyork I think evaluate_generator is more useful for validation. But for the purpose of 'real' prediction, it is important to know which exact input samples my predictions come from, right?

@patyork
Contributor

patyork commented Jan 15, 2017

That's a use case, of course. If setting nb_worker=1 won't work for you because it is too slow, and loading all of the inputs and calling predict at once is too much for your memory, you'd probably be better off writing your own generator + predict/predict_on_batch routine. That way you can queue the inputs however you'd like, save the predictions (along with a reference to the inputs that created them) on the fly, and unload them as you go to preserve memory.

That's a pretty niche/uncommon issue to need to solve (high speed, large dataset, prediction and saving); most likely too niche for inclusion in the Keras core.
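
For example, such a routine could be a plain loop over predict_on_batch. A minimal sketch, assuming the whole X_test array fits in memory; the predict_in_order name and the batch_size value are illustrative, not part of Keras:

import numpy as np

def predict_in_order(model, X, batch_size=128):
    # Iterate over the data in fixed-size batches on the main thread,
    # so the order of the predictions matches the order of the inputs
    # by construction.
    predictions = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        predictions.append(model.predict_on_batch(batch))
        # ...or write each batch of predictions to disk here, together with
        # the corresponding input range, and drop it from memory.
    return np.concatenate(predictions, axis=0)

# ordered = predict_in_order(model, X_test)  # ordered[i] belongs to X_test[i]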

@LamDang

LamDang commented Apr 17, 2017

@patyork @iammarvelous
Hello there, I just ran into the same issue, and it cost me quite some time of debugging.
I think this should be considered a bug: if the predict_generator() method cannot reconstruct the order, then the predictions are not usable and the method is effectively useless.

I suggest one of the following:

  • At least show a warning message to warn users of the risk of using predict_generator with workers > 1
  • Force workers=1 all the time
  • Somehow keep the batch index in the queue and reconstruct the right order

@avn3r
Contributor

avn3r commented Apr 18, 2017

@LamDang @patyork Agreed, they should force workers=1 for predict_generator and evaluate_generator.
I have been having the same problem. I need to aggregate the results of predict_generator but can't with workers > 1, and it took me a while to figure that out. The same problem occurs with fit_generator() when using a validation generator: I want to be able to set workers > 1 for training but keep workers=1 for evaluate_generator to ensure a consistent accuracy measure, and I can't do that without rewriting a lot of code.

I suggest always forcing workers=1 for evaluate_generator and predict_generator.

@stale

stale bot commented Jul 18, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@cjmielke

cjmielke commented Aug 8, 2017

Agreed. This has long worried me. predict_generator appears to be the most efficient way of evaluating a very large collection of images while keeping the card at full utilization, but this order ambiguity invites errors. Instead of fixing the ordering, though, the interface could be changed so that the input is a generator that yields (ID, data) pairs and the output is the corresponding (ID, prediction) pairs.
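
A user-level sketch of that idea against the existing API, rather than a change to Keras itself; the predict_with_ids name and the shape of the incoming generator are assumptions:

def predict_with_ids(model, id_batch_generator, steps):
    # Consume a generator that yields (id, batch) pairs and yield
    # (id, predictions) pairs, so every prediction stays explicitly
    # tied to the identifier of the input that produced it.
    for step, (batch_id, batch) in enumerate(id_batch_generator):
        if step >= steps:
            break
        yield batch_id, model.predict_on_batch(batch)

# for batch_id, preds in predict_with_ids(model, my_id_generator, steps=100):
#     ...  # save preds under batch_id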

@romainVala

I agree, I do not see how to use predict_generator.
I just changed the mnist_cnn example to get data from a generator, and even if I set workers=1 I do not get the predicted values in the same order as the inputs.

@fchollet this is a bug, isn't it?

(The same goes for evaluate_generator (#6499): I do not get the same results as with evaluate, and re-running it gives slightly different results...)

@saanasum

Could it be that this is caused by the following?

# the data, **shuffled** and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In rstudio/keras3#149 I had a similar problem (although using R Keras), and the reason was that shuffle was set to TRUE when batch-importing the images. Therefore, the image order was different on each run.

It might also be that I don't get the point of this issue (#5048) because I am a greenhorn in this topic; if so, please ignore my posting.

@glrs

glrs commented Dec 17, 2018

I recently ran into the predict_generator inconsistency too. It looks like quite an old issue.
@romainVala same for me: even if I use workers=1 I don't get the data in the same order. It does work when I use workers=0, but that forces the generator to execute on the main thread, which is probably not the best idea for bigger datasets (the actual reason we use a generator in the first place).
Does anyone know if this has been fixed? I've seen some alternatives like #6891 with the Dataset API, but are there any workarounds/fixes for predict_generator?

@rahullak

Has there been a fix for this, please? I have the exact same issue, as I am trying to create bottlenecks using a model with the top removed. I need to store the bottlenecks in separate files on disk, but without knowing which input file produced which output bottleneck, the whole thing would end up a mess. Or does anyone know a better way to create bottleneck files for individual inputs?
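
One possible workaround, sketched rather than taken from Keras itself: run the prediction one file at a time with predict_on_batch and save the features under the name of the file that produced them. The load_input helper and the output path layout are hypothetical:

import os
import numpy as np

def save_bottlenecks(model, filenames, load_input, out_dir):
    # Process one input file at a time so each saved bottleneck is
    # unambiguously tied to the file that produced it.
    for name in filenames:
        x = load_input(name)  # load and preprocess a single sample
        features = model.predict_on_batch(x[np.newaxis, ...])
        out_path = os.path.join(out_dir, os.path.basename(name) + '.npy')
        np.save(out_path, features[0])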

@Attila94

Attila94 commented Mar 5, 2019

I bumped into this issue when I used the same generator for both evaluate_generator and predict_generator, in that exact order. Re-initialising the generator, or simply calling predict_generator first and evaluate_generator second, solved it for me (although it cost me an hour to figure that out).
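
If the generator is a Keras image iterator (for example one returned by flow_from_directory), it can also be rewound explicitly with its reset() method between the two calls. A sketch, assuming test_generator and steps are already defined:

# Evaluate first, then rewind the iterator so that prediction starts
# from the first batch again instead of wherever evaluation stopped.
score = model.evaluate_generator(test_generator, steps=steps)
test_generator.reset()
predictions = model.predict_generator(test_generator, steps=steps)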

@popew

popew commented May 4, 2019

I had a similar issue where the predictions got slightly worse each time I called predict_generator. Setting workers=0 somehow helped.

predictions = model.predict_generator(testGenerator, steps = np.ceil(testGenerator.samples / testGenerator.batch_size), verbose=1, workers=0)

@HSaidaoui

When loading your test set with .flow_from_dataframe or .flow_from_directory, make sure to disable shuffling (shuffle=False); that keeps the data in its original order.
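
For example, with flow_from_directory; the directory path, target size, and batch size below are placeholders:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

test_datagen = ImageDataGenerator(rescale=1. / 255)
test_generator = test_datagen.flow_from_directory(
    'data/test',                 # placeholder directory
    target_size=(28, 28),
    batch_size=128,
    class_mode=None,             # no labels needed for prediction
    shuffle=False)               # keep files in a fixed, known order

steps = int(np.ceil(test_generator.samples / float(test_generator.batch_size)))
predictions = model.predict_generator(test_generator, steps=steps, workers=0)

# test_generator.filenames[i] is the file behind predictions[i]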
