[ROCM] update fluid platform for rocm35 (part1), test=develop #30639

qili93 · 2021-01-21T12:43:28Z

PR types

New features

PR changes

Others

Describe

Update paddle fluid platform for rocm35 - part1

paddle-bot-old · 2021-01-21T12:43:39Z

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

chenwhql

LGTM for PADDLE_ENFORCE change

chenwhql · 2021-01-26T02:41:50Z

paddle/fluid/platform/enforce.h

+
+inline const char* rocblasGetErrorString(rocblas_status stat) {
+  switch (stat) {
+    case rocblas_status_invalid_handle:


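Users have reported that errors from our third-party libraries show only a status code with no specific cause. If they then cannot quickly find the official explanation through a search engine, the experience is poor. @zhouwei25 will be making further improvements in this area, so keep an eye on it.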

zhwesky2010

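A simplified error message is fine for now. Later, if the official site documents these errors, we may consolidate the AMD error types into cudaerrormessage.pb as well. That file currently contains only NVIDIA GPU error content.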
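
For illustration, the "status code plus short reason" approach discussed above can be sketched as follows. This is a minimal sketch, not the code in this PR: `DemoBlasStatus`, `DemoBlasGetErrorString`, and `DemoBuildBlasErrorMsg` are hypothetical stand-ins so the snippet compiles without ROCm installed, and the message strings are placeholders rather than official rocBLAS wording.

```cpp
#include <string>

// Hypothetical stand-in for a subset of rocblas_status, so the sketch
// builds without the rocBLAS headers. Real code would use the library enum.
enum class DemoBlasStatus {
  kSuccess,
  kInvalidHandle,
  kNotImplemented,
  kInvalidPointer,
  kInvalidSize,
  kMemoryError,
  kInternalError,
};

// Map a status code to a short human-readable reason (placeholder wording).
inline const char* DemoBlasGetErrorString(DemoBlasStatus stat) {
  switch (stat) {
    case DemoBlasStatus::kSuccess:
      return "success";
    case DemoBlasStatus::kInvalidHandle:
      return "handle not initialized, invalid, or null";
    case DemoBlasStatus::kNotImplemented:
      return "function is not implemented";
    case DemoBlasStatus::kInvalidPointer:
      return "invalid pointer argument";
    case DemoBlasStatus::kInvalidSize:
      return "invalid size argument";
    case DemoBlasStatus::kMemoryError:
      return "memory allocation, copy, or dealloc failed";
    case DemoBlasStatus::kInternalError:
      return "internal library failure";
  }
  // Reached only for values outside the enumerators listed above.
  return "unknown status";
}

// Combine the numeric code with the reason so the user sees both.
inline std::string DemoBuildBlasErrorMsg(DemoBlasStatus stat) {
  return "BLAS error(" + std::to_string(static_cast<int>(stat)) + "), " +
         DemoBlasGetErrorString(stat) + ".";
}
```

Keeping the mapping in a plain `switch` means nothing needs to be looked up in cudaerrormessage.pb until AMD entries exist there.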

zhwesky2010 · 2021-01-26T03:00:07Z

paddle/fluid/platform/enforce.h

+  return webstr.str();
+}
+
+inline std::string build_nvidia_error_msg(hipError_t e) {


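This is the unified interface for the five NVIDIA error types. It maps the information from the official site into "error code + error message" pairs and compresses them into a single cudaerrormessage.pb file. The AMD GPU counterpart could be named build_amd_error_msg. Right now cudaerrormessage.pb contains only the NVIDIA part and nothing for AMD, so this lookup logic can be skipped for now, since the lookup is certain to find nothing.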

Renamed to build_rocm_error_msg.

zhwesky2010 · 2021-01-26T03:00:28Z

paddle/fluid/platform/enforce.h

+/***** HIP ERROR *****/
+inline bool is_error(hipError_t e) { return e != hipSuccess; }
+
+inline std::string GetCudaErrorWebsite(int32_t cuda_version) {


This one can also be left out for now.

Removed GetCudaErrorWebsite.

zhwesky2010 · 2021-01-26T03:20:14Z

paddle/fluid/platform/enforce.h

+  int32_t cuda_version = -1;
+#endif
+  std::ostringstream sout;
+  sout << " Hip error(" << e << "), " << hipGetErrorString(e) << ".";


Just print the hipGetErrorString(e) part for now. The logic after it cannot be triggered at present, so it does not need to be written yet.

Removed the error-string logic that came after hipGetErrorString(e).
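
For reference, the simplified shape suggested in this thread (print the error code and the runtime's own error string, with no website or .pb lookup) might look roughly like the sketch below. It is an assumption-laden illustration, not the merged code: `DemoHipError` and `DemoHipGetErrorString` are hypothetical stand-ins for `hipError_t` and `hipGetErrorString` so the snippet compiles without the HIP runtime, and the strings are placeholders.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for hipError_t (real code includes the HIP runtime).
enum DemoHipError { kDemoHipSuccess = 0, kDemoHipErrorInvalidValue = 1 };

// Hypothetical stand-in for hipGetErrorString; strings are placeholders.
inline const char* DemoHipGetErrorString(DemoHipError e) {
  switch (e) {
    case kDemoHipSuccess:
      return "no error";
    case kDemoHipErrorInvalidValue:
      return "invalid value";
  }
  return "unknown error";
}

// Same predicate shape as the is_error overload shown in the diff.
inline bool is_error(DemoHipError e) { return e != kDemoHipSuccess; }

// Build the message from the code and the runtime string only.
inline std::string build_rocm_error_msg(DemoHipError e) {
  std::ostringstream sout;
  // An unscoped enum streams as its integer value.
  sout << " Hip error(" << e << "), " << DemoHipGetErrorString(e) << ".";
  return sout.str();
}
```

With these stand-ins, `build_rocm_error_msg(kDemoHipErrorInvalidValue)` yields ` Hip error(1), invalid value.`.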

zhwesky2010

LGTM

qili93 requested review from luotao1 and phlrain January 21, 2021 12:49

qili93 force-pushed the rocm_platform_part1 branch from 38f330b to b708405 Compare January 25, 2021 08:13

paddle-bot-old bot referenced this pull request Jan 25, 2021

[ROCM] update fluid platform for rocm35 (part1), test=develop

b708405

qili93 force-pushed the rocm_platform_part1 branch from 2a5a3e6 to 50b5659 Compare January 25, 2021 13:22

[ROCM] update fluid platform for rocm35 (part1), test=develop

6710055

qili93 force-pushed the rocm_platform_part1 branch from 50b5659 to 6710055 Compare January 25, 2021 13:27

qili93 requested review from chenwhql and removed request for luotao1 January 26, 2021 02:15

chenwhql previously approved these changes Jan 26, 2021


zhwesky2010 reviewed Jan 26, 2021


address review comments, test=develop

8c18de9

qili93 dismissed chenwhql’s stale review via 8c18de9 January 26, 2021 03:34

qili93 requested review from zhwesky2010 and chenwhql January 26, 2021 03:39

zhwesky2010 approved these changes Jan 26, 2021


chenwhql approved these changes Jan 26, 2021


qili93 merged commit f89da4a into PaddlePaddle:develop Jan 28, 2021

qili93 deleted the rocm_platform_part1 branch January 28, 2021 12:32
