-
Notifications
You must be signed in to change notification settings - Fork 65
implement sparse categorical crossentropy, enable unitests #145
implement sparse categorical crossentropy, enable unitests #145
Conversation
Super cool work @roywei |
keras/backend/mxnet_backend.py
Outdated
@@ -1614,11 +1614,14 @@ def equal(x, y): | |||
if isinstance(y, KerasSymbol): | |||
y = y.symbol | |||
scalar = True | |||
if scalar: | |||
if scalar and x.infer_shape() == y.infer_shape(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need for this if..else. broadcast_equal will work for both same shape and shapes that require broadcasting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for all other operators below.
keras/backend/mxnet_backend.py
Outdated
@@ -2902,7 +2927,29 @@ def sparse_categorical_crossentropy(target, output, from_logits=False, axis=-1): | |||
# Returns | |||
Output tensor. | |||
""" | |||
raise NotImplementedError('MXNet Backend: Sparse operations are not supported yet.') | |||
output_dimensions = list(range(len(int_shape(output)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(int_shape(output)) => ndim(output)?
keras/backend/mxnet_backend.py
Outdated
if isinstance(x, mx.sym.Symbol) and isinstance(y, mx.sym.Symbol): | ||
# use broadcasting to do element-wise comparison if both x and y are mxnet symbol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually it is best practice not to have multiple returns in the same function. (hard to maintain and debug). Can you restructure? Same in other places.
@sandeep-krishnamurthy @kalyc
output
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contributions, added comments inline
keras/backend/mxnet_backend.py
Outdated
try: | ||
out = np.greater(x, y) | ||
except: | ||
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modify error message to reflect operator name - 'The inputs are not valid for greater operation'
keras/backend/mxnet_backend.py
Outdated
try: | ||
out = np.less(x, y) | ||
except: | ||
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inputs are not valid for less operation
keras/backend/mxnet_backend.py
Outdated
try: | ||
out = np.less(x, y) | ||
except: | ||
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less_equal operation*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few more comments
keras/backend/mxnet_backend.py
Outdated
out = KerasSymbol(mx.sym.Cast(mx.sym.broadcast_lesser_equal(lhs=x, rhs=y), dtype='uint8')) | ||
elif scalar: | ||
# directly use '<=' operator for element-wise comparison | ||
out = KerasSymbol(mx.sym.Cast(x <= y, dtype='uint8')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] check style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[adjust spaces around '=']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kalyc usually there is no space for optional params values, refer to any other operators.
keras/backend/mxnet_backend.py
Outdated
# directly use '<' operator for element-wise comparison | ||
out = KerasSymbol(mx.sym.Cast(x < y, dtype='uint8')) | ||
else: | ||
# use numpy if x and x are all numbers or numpy arrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if x & y*
keras/backend/mxnet_backend.py
Outdated
else: | ||
raise TypeError('MXNet Backend: The inputs are not valid for greater_equal operation.') | ||
# use numpy if x and x are all numbers or numpy arrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if x & y*
keras/backend/mxnet_backend.py
Outdated
else: | ||
raise TypeError('MXNet Backend: The inputs are not valid for greater operation.') | ||
# use numpy if x and x are all numbers or numpy arrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if x and y*
keras/backend/mxnet_backend.py
Outdated
else: | ||
raise TypeError('MXNet Backend: The inputs are not valid for equal operation.') | ||
# use numpy if x and x are all numbers or numpy arrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if x and y*
keras/backend/mxnet_backend.py
Outdated
else: | ||
raise TypeError('MXNet Backend: The inputs are not valid for not_equal operation.') | ||
# use numpy if x and x are all numbers or numpy arrays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if x and y*
keras/backend/mxnet_backend.py
Outdated
if isinstance(x, mx.sym.Symbol) and isinstance(y, mx.sym.Symbol): | ||
return KerasSymbol(mx.sym.Cast(mx.sym.broadcast_greater(lhs=x, rhs=y), dtype='uint8')) | ||
# use broadcasting to do element-wise comparison if both x and y are mxnet symbol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please follow docstring conventions as mentioned here - https://www.python.org/dev/peps/pep-0257/#one-line-docstrings
Also the comments are not necessary per operator, you can consider adding this blurb about implementation of operators at the beginning of the file depending on data type
keras/backend/mxnet_backend.py
Outdated
'which has {} dimensions.'.format(len(int_shape(output))))) | ||
|
||
mx_output = output.symbol | ||
# scale predictions so that the class probas of each sample sum to 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment above about writing the docstring - https://www.python.org/dev/peps/pep-0257/#one-line-docstrings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kalyc compare to element-wise comparison operators (e.g. K.equal), sparse_categorical_crossentropy and categorical_crossentropy are more complicated, and it requires different backend to do different processing logic. It's important to place the inline comments. Placing these comments in doc string will confuse users. See tensorflow_backend.py for reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
[minor] change probas to probabilities
mx_output = mx.sym.clip(mx_output, a_min=epsilon(), a_max=1.0 - epsilon()) | ||
# For this operation, the probability of a given label is considered exclusive. | ||
mx_output = mx.sym.pick(mx_output, target.symbol, axis=axis, keepdims=True) | ||
mx_output = - mx.sym.log(mx_output, axis=axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this do? Why are you making this value negative? Why not use mx.sym here instead of -
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refer to loss function equations: http://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html#cross-entropy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not commit the benchmarking script in this dir? - https://github.com/awslabs/keras-apache-mxnet/tree/master/benchmark/scripts
keras/backend/mxnet_backend.py
Outdated
out = KerasSymbol(mx.sym.Cast(mx.sym.broadcast_lesser_equal(lhs=x, rhs=y), dtype='uint8')) | ||
elif scalar: | ||
# directly use '<=' operator for element-wise comparison | ||
out = KerasSymbol(mx.sym.Cast(x <= y, dtype='uint8')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[adjust spaces around '=']
keras/backend/mxnet_backend.py
Outdated
'which has {} dimensions.'.format(len(int_shape(output))))) | ||
|
||
mx_output = output.symbol | ||
# scale predictions so that the class probas of each sample sum to 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
[minor] change probas to probabilities
mx_output = mx.sym.broadcast_div(mx_output, mx.sym.sum(mx_output, | ||
axis=axis, | ||
keepdims=True)) | ||
# clip to prevent NaN's and Inf's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'infinities'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'nan' and 'inf' are python values
@kalyc addressed comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
* Add saving mxnet model checkpoint callback (#132) * Add saving mxnet model checkpoint callback * Add tests for MXNetModelCheckpoint callback * Fix MXNetModelCheckpoint test case * Fix MXNetModelCheckpoint tests. Add dependency on keras_applications and keras_preprocessing * Fixed CR comments. Split tests into multiple independent tests * Fix CR comments on the code documentation * Add additional test to verify only one model is saved * Add examples of monitors * update pr and nightly buildspec,add into source control (#141) * Fix batchnorm gamma (#137) * fix gamma and beta equal to None * fix style * fix initializer, enable unit test * update comments * remove +, remove repeated install, add clear message (#142) * fix conv1d channels first (#143) * fix conv1d channels first * update data format for causal test * fix style * Adding get_mxnet_model_info API to allow users to query underlying MXNet model info (#144) * Adding get_mxnet_model_info API to allow users to query underlying MXNet model info * resolve merge conflicts in conv1d * Add more tests - functional model, compare with return values of save_mxnet_model API * update save mxnet model API document. (#147) Co-authored-by: Sandeep Krishnamurthy <[email protected]> * implement sparse categorical crossentropy, enable unitests (#145) * implement sparse categorical crossentropy, enable unitests * fix elementwise operators, fix sparse categorical accuract, enabled unitests * fix element wise opreators * simplify using ndim * fix style * reduce number of returns * fix operator name in error message * update comments and doc string * update comment spelling * update documentation, improve CI test commands (#151) * update documentation, improve CI test commands * fix conv1d initialization, fix conv1d unit test * fix documentation hyperlink * update buildspec * remove official keras installed with dependencies * update ci commands
* Add saving mxnet model checkpoint callback (#132) * Add saving mxnet model checkpoint callback * Add tests for MXNetModelCheckpoint callback * Fix MXNetModelCheckpoint test case * Fix MXNetModelCheckpoint tests. Add dependency on keras_applications and keras_preprocessing * Fixed CR comments. Split tests into multiple independent tests * Fix CR comments on the code documentation * Add additional test to verify only one model is saved * Add examples of monitors * update pr and nightly buildspec,add into source control (#141) * Fix batchnorm gamma (#137) * fix gamma and beta equal to None * fix style * fix initializer, enable unit test * update comments * remove +, remove repeated install, add clear message (#142) * fix conv1d channels first (#143) * fix conv1d channels first * update data format for causal test * fix style * Adding get_mxnet_model_info API to allow users to query underlying MXNet model info (#144) * Adding get_mxnet_model_info API to allow users to query underlying MXNet model info * resolve merge conflicts in conv1d * Add more tests - functional model, compare with return values of save_mxnet_model API * update save mxnet model API document. (#147) Co-authored-by: Sandeep Krishnamurthy <[email protected]> * implement sparse categorical crossentropy, enable unitests (#145) * implement sparse categorical crossentropy, enable unitests * fix elementwise operators, fix sparse categorical accuract, enabled unitests * fix element wise opreators * simplify using ndim * fix style * reduce number of returns * fix operator name in error message * update comments and doc string * update comment spelling * update documentation, improve CI test commands (#151) * update documentation, improve CI test commands * fix conv1d initialization, fix conv1d unit test * fix documentation hyperlink * update buildspec * remove official keras installed with dependencies * update ci commands
This PR completes the implementation of sparse categorical crossentropy
Implement sparse categorical crossentropy
Fix Bug in K.equal operator #146 found when fixing sparse categorical accuracy, fixed all element wise operators
Now K.equal() can handle two variables with different shapes using broadcast operations
Enabled unit tests:
Sparse categorical crossentropy is faster than categorical crossentropy when dealing with large number of exclusive labels. X1.78 faster with 1000 classes and 10000 samples. See detailed results in comment.