
Wrong gradient of gather_nd when the indices have duplicates #9172

Closed
sxjscience opened this issue Dec 21, 2017 · 3 comments

Comments

@sxjscience
Member

sxjscience commented Dec 21, 2017

This issue is borrowed from https://discuss.gluon.ai/t/topic/3389.

It's caused by a bug in the gradient computation of gather_nd. Currently, the gradient of gather_nd is computed with scatter_nd, which does not handle the case where the indices contain duplicates (see https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/indexing_op.h#L1156). The correct way to implement it is to use the same backward logic as take, i.e. accumulate the gradients of all elements that share the same index.

import mxnet as mx
import mxnet.ndarray as nd
import numpy as np

data = mx.nd.array([[0, 1], [2, 3]])
indices = mx.nd.array([[1, 1, 0], [1, 1, 0]])  # element (1, 1) is gathered twice
data.attach_grad()
with mx.autograd.record():
    ret = mx.nd.gather_nd(data, indices)
    loss = mx.nd.sum(ret)
loss.backward()
print(data.grad)
[[ 1.  0.]
 [ 0.  1.]]
<NDArray 2x2 @cpu(0)>

The correct result should be

[[ 1.  0.]
 [ 0.  2.]]
<NDArray 2x2 @cpu(0)>
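
For reference, the accumulation the backward pass should perform can be sketched with NumPy's np.add.at, which adds into repeated index positions instead of overwriting them. This is only a minimal illustration of the expected semantics, not MXNet's actual kernel:

import numpy as np

data = np.array([[0., 1.], [2., 3.]])
indices = np.array([[1, 1, 0], [1, 1, 0]])   # same indices as above
ograd = np.ones(indices.shape[1])            # gradient of sum() w.r.t. each gathered element

grad = np.zeros_like(data)
# scatter-add: repeated (row, col) pairs accumulate instead of overwriting
np.add.at(grad, (indices[0], indices[1]), ograd)
print(grad)
# [[ 1.  0.]
#  [ 0.  2.]]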

@piiswrong @szha

@sxjscience
Member Author

We could either use atomicAdd or call the backward of take. atomicAdd seems to be simpler.

@reminisce
Contributor

What is the CPU version of atomicAdd?

@sxjscience
Member Author

sxjscience commented Dec 21, 2017

Directly call += if OpenMP is not used. If OpenMP is used, we can use OpenMP's atomic support: #pragma omp atomic.
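
A minimal standalone sketch of how that could look on the CPU, using the same 2x2 example as above. This is hypothetical illustration code; the names dgrad, ograd, rows and cols are made up and this is not the actual indexing_op.h kernel:

#include <cstdio>

int main() {
  // Same example as the Python repro: element (1, 1) is gathered twice.
  const int N = 3, ncols = 2;
  const int rows[N] = {1, 1, 0};
  const int cols[N] = {1, 1, 0};
  const float ograd[N] = {1.f, 1.f, 1.f};  // incoming gradient of sum()
  float dgrad[4] = {0.f, 0.f, 0.f, 0.f};   // gradient w.r.t. data

  #pragma omp parallel for
  for (int i = 0; i < N; ++i) {
    // Plain += is enough single-threaded; under OpenMP the atomic pragma
    // keeps concurrent updates to the same cell from being lost.
    #pragma omp atomic
    dgrad[rows[i] * ncols + cols[i]] += ograd[i];
  }

  printf("[[ %g  %g]\n [ %g  %g]]\n", dgrad[0], dgrad[1], dgrad[2], dgrad[3]);
  // [[ 1  0]
  //  [ 0  2]]
  return 0;
}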
