
Wrong gradient of gather_nd when the indices have duplicates #9172

Closed
sxjscience opened this issue Dec 21, 2017 · 3 comments

Comments

@sxjscience
Member

sxjscience commented Dec 21, 2017

This issue is borrowed from https://discuss.gluon.ai/t/topic/3389.

It's caused by a bug in the gradient computation of gather_nd. Currently, the gradient of gather_nd is computed with scatter_nd, which does not handle the case where the indices contain duplicates (see https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/indexing_op.h#L1156). The correct way to implement it is to use the same backward logic as take, i.e. accumulate the gradients of all elements that share the same index.

import mxnet as mx
import mxnet.ndarray as nd
import numpy as np

data = mx.nd.array([[0, 1], [2, 3]])
indices = mx.nd.array([[1, 1, 0], [1, 1, 0]])  # element (1, 1) is gathered twice
data.attach_grad()
with mx.autograd.record():
    ret = mx.nd.gather_nd(data, indices)
    loss = mx.nd.sum(ret)
loss.backward()
print(data.grad)
[[ 1.  0.]
 [ 0.  1.]]
<NDArray 2x2 @cpu(0)>

The correct result should be

[[ 1.  0.]
 [ 0.  2.]]
<NDArray 2x2 @cpu(0)>
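
For reference, the accumulation the backward pass should perform can be sketched with NumPy's np.add.at, which adds into repeated index positions instead of overwriting them. This is only a minimal illustration of the expected semantics, not MXNet's actual kernel:

import numpy as np

data = np.array([[0., 1.], [2., 3.]])
indices = np.array([[1, 1, 0], [1, 1, 0]])   # same indices as above
ograd = np.ones(indices.shape[1])            # gradient of sum() w.r.t. each gathered element

grad = np.zeros_like(data)
# scatter-add: repeated (row, col) pairs accumulate instead of overwriting
np.add.at(grad, (indices[0], indices[1]), ograd)
print(grad)
# [[ 1.  0.]
#  [ 0.  2.]]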

@piiswrong @szha

@sxjscience
Member Author

We could either use atomicAdd or call the backward of take. atomicAdd seems to be simpler.

@reminisce
Contributor

What is the CPU version of atomicAdd?

@sxjscience
Member Author

sxjscience commented Dec 21, 2017

Directly call += if OpenMP is not used. If OpenMP is used, we can use OpenMP's atomic support: #pragma omp atomic.
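
A minimal standalone sketch of how that could look on the CPU, using the same 2x2 example as above. This is hypothetical illustration code; the names dgrad, ograd, rows and cols are made up and this is not the actual indexing_op.h kernel:

#include <cstdio>

int main() {
  // Same example as the Python repro: element (1, 1) is gathered twice.
  const int N = 3, ncols = 2;
  const int rows[N] = {1, 1, 0};
  const int cols[N] = {1, 1, 0};
  const float ograd[N] = {1.f, 1.f, 1.f};  // incoming gradient of sum()
  float dgrad[4] = {0.f, 0.f, 0.f, 0.f};   // gradient w.r.t. data

  #pragma omp parallel for
  for (int i = 0; i < N; ++i) {
    // Plain += is enough single-threaded; under OpenMP the atomic pragma
    // keeps concurrent updates to the same cell from being lost.
    #pragma omp atomic
    dgrad[rows[i] * ncols + cols[i]] += ograd[i];
  }

  printf("[[ %g  %g]\n [ %g  %g]]\n", dgrad[0], dgrad[1], dgrad[2], dgrad[3]);
  // [[ 1  0]
  //  [ 0  2]]
  return 0;
}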
