[amd] convolution kernel didn't reuse the algorithm found. #11203

Closed
dzhwinter opened this issue Jun 5, 2018 · 4 comments

Comments

@dzhwinter
Contributor

My PR fixes the issue above: https://github.com/dzhwinter/Paddle/tree/review_conv2d_1
The cudnn op runs on the CUDA device, so its inputs and outputs must stay on the CUDA device. ROCm#16 uses a CPU Tensor to store the selected algorithm, but our framework automatically transforms it into a temporary GPU Tensor. As a result, the cudnn op cannot access the real persistent Tensor.

If we allocate the input and output on the GPU and copy the result to the CPU, we get the correct result.
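
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTensor`, `ToyGpuKernel`, `RunWithAutoTransform`, and `CopyToCpu` are hypothetical names invented for illustration, not Paddle or cuDNN APIs, and the "GPU" is simulated with a place tag.

```cpp
#include <cassert>

// Toy stand-ins, NOT Paddle types: a "tensor" holding one int plus a place tag.
enum class Place { kCPU, kGPU };

struct ToyTensor {
  Place place;
  int value;  // selected algorithm id; -1 means "not selected yet"
};

// Hypothetical GPU kernel: records the algorithm it picked into `algo`.
void ToyGpuKernel(ToyTensor* algo, int picked) { algo->value = picked; }

// Hypothetical dispatch step (assumption): a GPU kernel may only see GPU
// tensors, so a CPU tensor is first copied into a temporary GPU tensor.
// The kernel then writes into the temporary, and the write is lost when
// the temporary is destroyed.
void RunWithAutoTransform(ToyTensor* var, int picked) {
  if (var->place == Place::kCPU) {
    ToyTensor temp{Place::kGPU, var->value};
    ToyGpuKernel(&temp, picked);
    return;  // `var` itself was never updated
  }
  ToyGpuKernel(var, picked);  // already on the GPU: the kernel sees the real tensor
}

// Explicit device-to-host copy of the result.
ToyTensor CopyToCpu(const ToyTensor& src) {
  return ToyTensor{Place::kCPU, src.value};
}
```

In this model, a variable that starts on the CPU still holds -1 after the kernel runs, while a variable allocated on the GPU keeps the selected id and can then be copied to the CPU explicitly.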

@dzhwinter
Contributor Author

The result

I0605 10:56:09.376890 34846 tensor_util.cu:22] TensorCopy 3 from CUDAPlace(0) to CPUPlace
I0605 10:56:09.376969 34846 conv_cudnn_op.cu.cc:151] Find Kernel:  load  0x7f0044659080 kernel :3

@dzhwinter
Contributor Author

dzhwinter commented Jun 21, 2018

In the backward op, all of the forward op's inputs and outputs are available. You need to customize the GradOpMaker (which the Python side uses to create the OpDesc).

// Builds the OpDesc for conv2d_grad from the forward conv2d op.
class Conv2DGradMaker : public framework::SingleGradOpDescMaker {
 public:
  using framework::SingleGradOpDescMaker::SingleGradOpDescMaker;

 protected:
  std::unique_ptr<framework::OpDesc> Apply() const override {
    // Own the OpDesc from the start so it is released if a setter throws.
    std::unique_ptr<framework::OpDesc> op(new framework::OpDesc());
    op->SetType("conv2d_grad");

    // Forward inputs, including the stored algorithm, plus the output gradient.
    op->SetInput("Input", Input("Input"));
    op->SetInput("Filter", Input("Filter"));
    op->SetInput("Algorithm", Input("Algorithm"));
    op->SetInput(framework::GradVarName("Output"), OutputGrad("Output"));

    op->SetAttrMap(Attrs());

    // The forward op's AlgorithmOut is passed through; the gradients are new outputs.
    op->SetOutput("AlgorithmOut", Output("AlgorithmOut"));
    op->SetOutput(framework::GradVarName("Input"), InputGrad("Input"));
    op->SetOutput(framework::GradVarName("Filter"), InputGrad("Filter"));

    return op;
  }
};
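
To show why wiring the forward op's `Algorithm` and `AlgorithmOut` slots into the grad op lets the backward pass skip a second search, here is a toy model. It is a sketch only: `ToyScope`, `ToyOpDesc`, `ToyRunConv`, and `MakeToyGradOp` are hypothetical names, not Paddle APIs, and it assumes (this is not stated in the thread) that both slots name the same persistent variable.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-ins, NOT Paddle types.
using ToyScope = std::map<std::string, int>;  // variable name -> algorithm id

struct ToyOpDesc {
  std::map<std::string, std::string> inputs;   // slot name -> variable name
  std::map<std::string, std::string> outputs;  // slot name -> variable name
};

// Hypothetical op body: reads the algorithm from the "Algorithm" slot and
// searches only when none is stored yet (-1). The chosen id is written to
// the "AlgorithmOut" slot. `search_count` counts how often a search ran.
int ToyRunConv(const ToyOpDesc& op, ToyScope* scope, int* search_count) {
  int algo = scope->at(op.inputs.at("Algorithm"));
  if (algo < 0) {
    ++(*search_count);
    algo = 3;  // placeholder for an expensive algorithm search
  }
  (*scope)[op.outputs.at("AlgorithmOut")] = algo;
  return algo;
}

// Mirrors the idea of the grad maker: the grad op reuses the forward op's
// variable names for the algorithm slots instead of creating new ones.
ToyOpDesc MakeToyGradOp(const ToyOpDesc& forward) {
  ToyOpDesc grad;
  grad.inputs["Algorithm"] = forward.inputs.at("Algorithm");
  grad.outputs["AlgorithmOut"] = forward.outputs.at("AlgorithmOut");
  return grad;
}
```

With both slots bound to one variable, the forward run performs the search once and the grad op built from it finds the stored id, so the search counter stays at one.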

@shanyi15
Collaborator

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up after it is closed, feel free to reopen it and we will reply within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
