sparse_categorical_crossentropy doesn't seem to be working #2444
Comments
If you're using that objective, you have to manually add an extra dimension to the targets. I think this was documented somewhere in Keras 0.3; it may have been lost in the shuffle. This was a workaround for cases where Keras was inferring things about temporal outputs based on shape. Let me know whether that works.
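For illustration, here is a minimal sketch of what adding the extra dimension might look like. It assumes the targets are a NumPy integer array of shape (batch, timesteps); the values and the names `y` and `y_expanded` are hypothetical, and the shape (1, 2) is the one reported in this issue's traceback. This is one possible reading of the workaround, not necessarily the exact call the comment originally showed.

```python
import numpy as np

# Hypothetical integer class labels for one sequence of two timesteps.
# Shape (1, 2) matches the shape reported in the error message.
y = np.array([[3, 1]], dtype="int32")

# Add a trailing axis so the targets have three dimensions,
# the rank the traceback says is expected.
y_expanded = np.expand_dims(y, -1)

print(y.shape)           # (1, 2)
print(y_expanded.shape)  # (1, 2, 1)
```

The expanded array would then be passed to `model.fit` as the targets in place of the 2D array.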
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
Here is an example language model that doesn't work with sparse_categorical_crossentropy. If one uncomments lines 25-26, running the script results in the following error:
Traceback (most recent call last):
  File "./sparse_softmax_example.py", line 34, in <module>
    main()
  File "./sparse_softmax_example.py", line 26, in main
    model.fit(input, output, batch_size=1, nb_epoch=1)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/keras/models.py", line 402, in fit
    sample_weight=sample_weight)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/keras/engine/training.py", line 1036, in fit
    callback_metrics=callback_metrics)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/keras/engine/training.py", line 774, in _fit_loop
    outs = f(ins_batch)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/keras/backend/theano_backend.py", line 499, in __call__
    return self.function(*inputs)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/theano/compile/function_module.py", line 815, in __call__
    allow_downcast=s.allow_downcast)
  File "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/theano/tensor/type.py", line 178, in filter
    data.shape))
TypeError: ('Bad input argument to theano function with name "/place/home/hr0nix/src/text_features/env2/lib/python2.7/site-packages/keras/backend/theano_backend.py:495" at index 1(0-based)', 'Wrong number of dimensions: expected 3, got 2 with shape (1, 2).')
Lines 28-30 work just fine. Am I doing something wrong, or is this a bug in the Keras codebase?