This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

implement sparse categorical crossentropy, enable unitests #145

Merged
merged 10 commits into from
Jul 29, 2018

Conversation

@roywei roywei commented Jul 25, 2018

This PR completes the implementation of sparse categorical crossentropy.

  1. Implement sparse categorical crossentropy.

  2. Fix Bug in K.equal operator #146, found while fixing sparse categorical accuracy; fixed all element-wise operators.
    K.equal() can now handle two variables with different shapes using broadcast operations.

  3. Enabled unit tests:

  • test_training.py: test_model_with_crossentropy_losses_channels_first
  • losses_test.py: test_sparse_categorical_crossentropy
  • losses_test.py: test_cce_one_hot
  • losses_test.py: test_sparse_categorical_crossentropy_4d
  • metrics_test.py: test_sparse_metrics: sparse_categorical_crossentropy
  • metrics_test.py: test_sparse_metrics: sparse_categorical_accuracy

  4. Enabled/tested end-to-end examples:

  • examples/mnist_acgan.py

  5. Performance tests:

  • Comparison with categorical_crossentropy: sparse categorical crossentropy is faster when dealing with a large number of exclusive labels, 1.78x faster with 1000 classes and 10000 samples. See detailed results in the comments.
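The sparse loss itself boils down to integer indexing plus a negative log; a minimal NumPy sketch (the helper name is hypothetical; the actual MXNet backend uses mx.sym.pick on symbols):

```python
import numpy as np

def sparse_categorical_crossentropy_np(y_true, y_pred, eps=1e-7):
    """NumPy sketch: pick each sample's predicted probability at its
    integer label, then take the negative log."""
    # scale predictions so each sample's class probabilities sum to 1
    y_pred = y_pred / y_pred.sum(axis=-1, keepdims=True)
    # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # integer indexing replaces the one-hot multiply-and-sum
    picked = y_pred[np.arange(len(y_true)), y_true]
    return -np.log(picked)

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = sparse_categorical_crossentropy_np(labels, probs)
```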

@roywei roywei mentioned this pull request Jul 27, 2018
@sandeep-krishnamurthy
Super cool work @roywei

@@ -1614,11 +1614,14 @@ def equal(x, y):
if isinstance(y, KerasSymbol):
y = y.symbol
scalar = True
- if scalar:
+ if scalar and x.infer_shape() == y.infer_shape():


There is no need for this if..else. broadcast_equal will work for both same shape and shapes that require broadcasting.


Same for all other operators below.
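The same observation holds in NumPy: a single broadcasting comparison covers both same-shape and broadcastable inputs, so no shape check is needed:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

# same shape: plain element-wise comparison
same = np.equal(a, np.array([[1, 0],
                             [3, 4]]))

# different shapes: (2, 2) vs (2,) broadcasts across rows
bcast = np.equal(a, np.array([1, 4]))
```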

@@ -2902,7 +2927,29 @@ def sparse_categorical_crossentropy(target, output, from_logits=False, axis=-1):
# Returns
Output tensor.
"""
raise NotImplementedError('MXNet Backend: Sparse operations are not supported yet.')
output_dimensions = list(range(len(int_shape(output))))


len(int_shape(output)) => ndim(output)?

if isinstance(x, mx.sym.Symbol) and isinstance(y, mx.sym.Symbol):
# use broadcasting to do element-wise comparison if both x and y are mxnet symbol


Usually it is best practice not to have multiple returns in the same function (hard to maintain and debug). Can you restructure? Same in other places.
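A minimal sketch of the single-return restructure being suggested (illustrative only, not the backend's actual code): compute the result in each branch and return once at the end:

```python
import numpy as np

def greater(x, y):
    """Element-wise greater-than with a single exit point."""
    if isinstance(x, np.ndarray) or isinstance(y, np.ndarray):
        # array path: NumPy broadcasts mismatched shapes
        out = np.greater(x, y)
    else:
        # scalar path
        out = x > y
    return out
```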

@roywei roywei changed the title [WIP]implement sparse categorical crossentropy, enable unitests implement sparse categorical crossentropy, enable unitests Jul 27, 2018
roywei commented Jul 27, 2018

@sandeep-krishnamurthy @kalyc
PR is ready for review; comments addressed. I ran a performance benchmark using the following script: sparse categorical crossentropy is 1.78x faster.

import time

import numpy as np
from keras import backend as K
from keras import losses

num_classes = 1000
num_samples = 10000
pred = np.random.dirichlet(np.ones(num_classes), size=num_samples)
# note: labels here are drawn from the first 10 classes only
true = np.random.randint(0, 10, (num_samples,))
y_pred = K.variable(pred)
y_true = K.variable(true)

# sparse path: integer labels, no one-hot conversion needed
start = time.time()
loss = K.eval(losses.sparse_categorical_crossentropy(y_true, y_pred))
end = time.time()
print("sparse_categorical_crossentropy result:", np.mean(loss))
print("sparse_categorical_crossentropy time:", end - start)

# dense path: convert labels to one-hot first
start = time.time()
y_true = K.one_hot(y_true, num_classes)
loss = K.eval(losses.categorical_crossentropy(y_true, y_pred))
end = time.time()
print("categorical_crossentropy result:", np.mean(loss))
print("categorical_crossentropy time:", end - start)

Output:

sparse_categorical_crossentropy result: 7.4769287
sparse_categorical_crossentropy time: 0.14634299278259277
categorical_crossentropy result: 7.4769287
categorical_crossentropy time: 0.2528080940246582
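The identical loss values are expected: picking the probability at the integer label is mathematically the same as the one-hot multiply-and-sum; it just skips materializing the one-hot matrix. A NumPy sanity check (illustrative only, not the backend code):

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, num_samples = 100, 1000
pred = rng.dirichlet(np.ones(num_classes), size=num_samples)
true = rng.integers(0, num_classes, num_samples)

# sparse path: pick each row's probability at its integer label
sparse_loss = -np.log(pred[np.arange(num_samples), true])

# dense path: one-hot multiply-and-sum
one_hot = np.eye(num_classes)[true]
dense_loss = -np.log((pred * one_hot).sum(axis=-1))

# both paths recover the same per-sample losses
assert np.allclose(sparse_loss, dense_loss)
```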

@roywei roywei requested a review from kalyc July 27, 2018 21:15
@kalyc kalyc left a comment


Thanks for the contributions, added comments inline

try:
out = np.greater(x, y)
except:
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.')

modify error message to reflect operator name - 'The inputs are not valid for greater operation'

try:
out = np.less(x, y)
except:
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.')

The inputs are not valid for less operation

try:
out = np.less(x, y)
except:
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.')

less_equal operation*


@kalyc kalyc left a comment


Added a few more comments

out = KerasSymbol(mx.sym.Cast(mx.sym.broadcast_lesser_equal(lhs=x, rhs=y), dtype='uint8'))
elif scalar:
# directly use '<=' operator for element-wise comparison
out = KerasSymbol(mx.sym.Cast(x <= y, dtype='uint8'))

[minor] check style


[adjust spaces around '=']

Author

@kalyc usually there is no space around '=' for optional parameter values; refer to any of the other operators.

# directly use '<' operator for element-wise comparison
out = KerasSymbol(mx.sym.Cast(x < y, dtype='uint8'))
else:
# use numpy if x and x are all numbers or numpy arrays

if x & y*

else:
raise TypeError('MXNet Backend: The inputs are not valid for greater_equal operation.')
# use numpy if x and x are all numbers or numpy arrays

if x & y*

else:
raise TypeError('MXNet Backend: The inputs are not valid for greater operation.')
# use numpy if x and x are all numbers or numpy arrays

if x and y*

else:
raise TypeError('MXNet Backend: The inputs are not valid for equal operation.')
# use numpy if x and x are all numbers or numpy arrays

if x and y*

else:
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.')
# use numpy if x and x are all numbers or numpy arrays

if x and y*

if isinstance(x, mx.sym.Symbol) and isinstance(y, mx.sym.Symbol):
return KerasSymbol(mx.sym.Cast(mx.sym.broadcast_greater(lhs=x, rhs=y), dtype='uint8'))
# use broadcasting to do element-wise comparison if both x and y are mxnet symbol

Please follow the docstring conventions described here - https://www.python.org/dev/peps/pep-0257/#one-line-docstrings

Also, the comments are not necessary per operator; consider adding a blurb about how the operators are implemented for each data type at the beginning of the file.

'which has {} dimensions.'.format(len(int_shape(output)))))

mx_output = output.symbol
# scale predictions so that the class probas of each sample sum to 1

see comment above about writing the docstring - https://www.python.org/dev/peps/pep-0257/#one-line-docstrings

Author

@kalyc compared to the element-wise comparison operators (e.g. K.equal), sparse_categorical_crossentropy and categorical_crossentropy are more complicated, and each backend needs different processing logic. It's important to keep the inline comments; moving them into the docstring would confuse users. See tensorflow_backend.py for reference.


Thanks for the explanation!
[minor] change probas to probabilities

mx_output = mx.sym.clip(mx_output, a_min=epsilon(), a_max=1.0 - epsilon())
# For this operation, the probability of a given label is considered exclusive.
mx_output = mx.sym.pick(mx_output, target.symbol, axis=axis, keepdims=True)
mx_output = - mx.sym.log(mx_output, axis=axis)

what does this do? Why are you making this value negative? Why not use mx.sym here instead of -?


@kalyc kalyc left a comment


Why not commit the benchmarking script in this dir? - https://github.com/awslabs/keras-apache-mxnet/tree/master/benchmark/scripts


mx_output = mx.sym.broadcast_div(mx_output, mx.sym.sum(mx_output,
axis=axis,
keepdims=True))
# clip to prevent NaN's and Inf's

'infinities'

Author

'nan' and 'inf' are Python float values
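The clipping in question matters because log(0) is negative infinity, which would poison the mean loss; a small NumPy illustration:

```python
import numpy as np

eps = 1e-7
probs = np.array([0.0, 0.5, 1.0])

with np.errstate(divide="ignore"):
    raw = -np.log(probs)  # -log(0) is inf and would dominate any mean
# clipping into [eps, 1 - eps] keeps every loss term finite
clipped = -np.log(np.clip(probs, eps, 1.0 - eps))
```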


roywei commented Jul 27, 2018

@kalyc addressed comments.
The performance script is just a simple sanity check of the operator.
Only complete, useful models should be added to the benchmark scripts.


@kalyc kalyc left a comment


LGTM!

@sandeep-krishnamurthy sandeep-krishnamurthy merged commit 10e7087 into awslabs:dev2 Jul 29, 2018
roywei added a commit that referenced this pull request Aug 8, 2018
* Add saving mxnet model checkpoint callback (#132)

* Add saving mxnet model checkpoint callback

* Add tests for MXNetModelCheckpoint callback

* Fix MXNetModelCheckpoint test case

* Fix MXNetModelCheckpoint tests. Add dependency on keras_applications and keras_preprocessing

* Fixed CR comments. Split tests into multiple independent tests

* Fix CR comments on the code documentation

* Add additional test to verify only one model is saved

* Add examples of monitors

* update pr and nightly buildspec,add into source control (#141)

* Fix batchnorm gamma (#137)

* fix gamma and beta equal to None

* fix style

* fix initializer, enable unit test

* update comments

* remove +, remove repeated install, add clear message (#142)

* fix conv1d channels first (#143)

* fix conv1d channels first

* update data format for causal test

* fix style

* Adding get_mxnet_model_info API to allow users to query underlying MXNet model info (#144)

* Adding get_mxnet_model_info API to allow users to query underlying MXNet model info

* resolve merge conflicts in conv1d

* Add more tests - functional model, compare with return values of save_mxnet_model API

* update save mxnet model API document. (#147)

Co-authored-by: Sandeep Krishnamurthy <[email protected]>

* implement sparse categorical crossentropy, enable unitests (#145)

* implement sparse categorical crossentropy, enable unitests

* fix elementwise operators, fix sparse categorical accuract, enabled unitests

* fix element wise opreators

* simplify using ndim

* fix style

* reduce number of returns

* fix operator name in error message

* update comments and doc string

* update comment spelling

* update documentation, improve CI test commands (#151)

* update documentation, improve CI test commands

* fix conv1d initialization, fix conv1d unit test

* fix documentation hyperlink

* update buildspec

* remove official keras installed with dependencies

* update ci commands
sandeep-krishnamurthy pushed a commit that referenced this pull request Aug 14, 2018