system freezes when running this kaggle kernel #297

Closed
witeko opened this issue Jan 24, 2019 · 25 comments
Labels: bug (Something isn't working), gfx803 (issue specific to gfx803 GPUs)

Comments

@witeko

witeko commented Jan 24, 2019

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): NO
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.10
  • TensorFlow installed from (source or binary): rocm_tensorflow 1.12
  • GPU model and memory: Radeon RX 580, 8 GB

Describe the current behavior
When I run this exact Kaggle kernel (code and data provided), https://www.kaggle.com/martinpiotte/bounding-box-model/notebook, my system always freezes: I can still move the mouse cursor, but nothing else responds.
I tried batch sizes from 32 all the way down to 2.
The problem occurs during the first epoch, but not immediately; the time before the freeze varies from run to run.
I also tried GPU options such as limiting memory to a given fraction and allowing memory growth.
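
For readers unfamiliar with the two GPU options mentioned above, here is a minimal sketch of how they are typically set in TensorFlow 1.x. It is not taken from the reporter's setup: the helper names `make_session_config` and `install_keras_session` are hypothetical, and the sketch assumes a TF 1.x build (such as 1.12) where `tf.ConfigProto` and `tf.Session` are available.

```python
def make_session_config(memory_fraction=None, allow_growth=False):
    """Build a TF 1.x ConfigProto with optional GPU memory limits (hypothetical helper)."""
    # Imported lazily so this file can be loaded on machines without TensorFlow.
    import tensorflow as tf  # assumes a TF 1.x build

    config = tf.ConfigProto()
    if memory_fraction is not None:
        # Cap this process at a fraction of each visible GPU's memory.
        config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    # Allocate GPU memory on demand instead of reserving it all up front.
    config.gpu_options.allow_growth = allow_growth
    return config


def install_keras_session(config):
    """Create a session with the given config and hand it to tf.keras (hypothetical helper)."""
    import tensorflow as tf  # assumes a TF 1.x build

    session = tf.Session(config=config)
    tf.keras.backend.set_session(session)
    return session
```

A call such as `install_keras_session(make_session_config(memory_fraction=0.5))` would have to run before the model is built. If the notebook uses the standalone `keras` package rather than `tf.keras`, the analogous call would be `keras.backend.set_session`.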

Edit: I switched to my other Linux distro, Ubuntu 18.04 (with the newest ROCm, TensorFlow, etc.). There the system does not freeze, but I get this error message:
"Memory access fault by GPU node-1 (Agent handle: 0x5557c91b8950) on address 0x12dba01000. Reason: Page not present or supervisor privilege."

@parallelo

Thanks for the report, @witeko. Looks like we've got a few similar failures. We'll look at this too.

@parallelo

Hi @witeko - Would you be able to follow these steps on the system that gets the memory access fault?

Please gather the logs for this run:

export HCC_SERIALIZE_KERNEL=0x3
export HCC_SERIALIZE_COPY=0x3
export HIP_TRACE_API=0x2
export MIOPEN_ENABLE_LOGGING_CMD=1
[then re-run your application]

That should help us understand where the issue is. Thanks!
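
A minimal sketch of how such a log-gathering run could be scripted, assuming Python 3 is available. The four variable names and values are copied from the request above; `run_with_debug_env`, `summarize_launches`, and any file names are hypothetical helpers, not part of ROCm or HIP. The regular expression assumes trace lines of the form `hipLaunchKernel '<name>' gridDim:{x,y,z} groupDim:{x,y,z}`.

```python
import os
import re
import subprocess
import sys
from collections import Counter

# Debug variables from the request above, passed through unchanged.
DEBUG_ENV = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
    "MIOPEN_ENABLE_LOGGING_CMD": "1",
}

# Assumed trace format: hipLaunchKernel '<name>' gridDim:{x,y,z} groupDim:{x,y,z}
LAUNCH_RE = re.compile(
    r"hipLaunchKernel '(?P<name>[^']+)' "
    r"gridDim:\{(?P<grid>[\d,]+)\} groupDim:\{(?P<group>[\d,]+)\}"
)


def run_with_debug_env(cmd, log_path):
    """Run cmd with DEBUG_ENV set, writing stdout and stderr to log_path."""
    env = dict(os.environ, **DEBUG_ENV)
    with open(log_path, "w") as log:
        return subprocess.call(cmd, env=env, stdout=log, stderr=subprocess.STDOUT)


def summarize_launches(lines):
    """Count kernel launches by name and remember the last one seen.

    Returns (Counter of kernel names, last launch as (name, grid, group) or None).
    """
    counts = Counter()
    last = None
    for line in lines:
        match = LAUNCH_RE.search(line)
        if match is None:
            continue  # skip MIOpen command lines and other output
        grid = tuple(int(v) for v in match.group("grid").split(","))
        group = tuple(int(v) for v in match.group("group").split(","))
        counts[match.group("name")] += 1
        last = (match.group("name"), grid, group)
    return counts, last
```

For example, `run_with_debug_env([sys.executable, "train.py"], "hip_trace.log")` (where `train.py` stands in for the application) would capture the log, and `summarize_launches(open("hip_trace.log"))` would then report how often each kernel was launched and which launch came last. With kernel serialization enabled, the last launch before a fault is a reasonable first suspect, though that is a heuristic rather than a guarantee.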

@witeko

witeko commented Jan 30, 2019

@parallelo Here is the log from Ubuntu 18.04:
2019-01-30 23:20:16.922161: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
<<hip-api pid:46867 tid:7.4909 46867 7.4909 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171490931520
<<hip-api pid:46867 tid:7.4926 46867 7.4926 hipLaunchKernel 'miog_betac_alphaab' gridDim:{65536,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @171557544688
<<hip-api pid:46867 tid:7.4935 46867 7.4935 hipLaunchKernel 'miog_betac_alphaab' gridDim:{65536,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @171557633558
<<hip-api pid:46867 tid:7.4945 46867 7.4945 hipLaunchKernel 'Col2Im' gridDim:{262144,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171557710239
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 2 -x 2 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.4961 46867 7.4961 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171558084251
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 2 -x 2 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.4988 46867 7.4988 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171558691334
<<hip-api pid:46867 tid:7.4994 46867 7.4994 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559230266
<<hip-api pid:46867 tid:7.4996 46867 7.4996 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559271377
<<hip-api pid:46867 tid:7.4998 46867 7.4998 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559302717
<<hip-api pid:46867 tid:7.5000 46867 7.5000 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559334917
<<hip-api pid:46867 tid:7.5004 46867 7.5004 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559401607
<<hip-api pid:46867 tid:7.5006 46867 7.5006 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559437497
<<hip-api pid:46867 tid:7.5010 46867 7.5010 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559497658
<<hip-api pid:46867 tid:7.5012 46867 7.5012 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559526738
<<hip-api pid:46867 tid:7.5014 46867 7.5014 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559558878
<<hip-api pid:46867 tid:7.5016 46867 7.5016 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559590578
<<hip-api pid:46867 tid:7.5018 46867 7.5018 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559622108
<<hip-api pid:46867 tid:7.5020 46867 7.5020 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559651119
<<hip-api pid:46867 tid:7.5023 46867 7.5023 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559683669
<<hip-api pid:46867 tid:7.5025 46867 7.5025 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559714899
<<hip-api pid:46867 tid:7.5027 46867 7.5027 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559747339
<<hip-api pid:46867 tid:7.5029 46867 7.5029 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559781519
<<hip-api pid:46867 tid:7.5033 46867 7.5033 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559838119
<<hip-api pid:46867 tid:7.5037 46867 7.5037 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559893380
<<hip-api pid:46867 tid:7.5039 46867 7.5039 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559921400
<<hip-api pid:46867 tid:7.5041 46867 7.5041 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559952630
<<hip-api pid:46867 tid:7.5043 46867 7.5043 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171559982850
<<hip-api pid:46867 tid:7.5045 46867 7.5045 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560013320
<<hip-api pid:46867 tid:7.5047 46867 7.5047 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi2ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIffEEKNS4_INS5_IKfLi2ELi1EiEELi16ES7_EEKNS_20TensorBroadcastingOpIKNS_5arrayIlLm2EEESF_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560045361
<<hip-api pid:46867 tid:7.5049 46867 7.5049 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560394832
<<hip-api pid:46867 tid:7.5051 46867 7.5051 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560435212
<<hip-api pid:46867 tid:7.5053 46867 7.5053 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560464753
<<hip-api pid:46867 tid:7.5055 46867 7.5055 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560495283
<<hip-api pid:46867 tid:7.5057 46867 7.5057 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560524903
<<hip-api pid:46867 tid:7.5059 46867 7.5059 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560554723
<<hip-api pid:46867 tid:7.5063 46867 7.5063 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560609703
<<hip-api pid:46867 tid:7.5065 46867 7.5065 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560638904
<<hip-api pid:46867 tid:7.5069 46867 7.5069 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560693974
<<hip-api pid:46867 tid:7.5071 46867 7.5071 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171560722574
<<hip-api pid:46867 tid:7.5073 46867 7.5073 hipLaunchKernel 'ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2' gridDim:{786432,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171560759894
<<hip-api pid:46867 tid:7.5079 46867 7.5079 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561112586
<<hip-api pid:46867 tid:7.5081 46867 7.5081 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561147616
<<hip-api pid:46867 tid:7.5083 46867 7.5083 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561178216
<<hip-api pid:46867 tid:7.5085 46867 7.5085 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561211296
<<hip-api pid:46867 tid:7.5089 46867 7.5089 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561266267
<<hip-api pid:46867 tid:7.5093 46867 7.5093 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561319237
<<hip-api pid:46867 tid:7.5096 46867 7.5096 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561351867
<<hip-api pid:46867 tid:7.5098 46867 7.5098 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561384777
<<hip-api pid:46867 tid:7.5100 46867 7.5100 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561414367
<<hip-api pid:46867 tid:7.5102 46867 7.5102 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561444648
<<hip-api pid:46867 tid:7.5104 46867 7.5104 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561474378
<<hip-api pid:46867 tid:7.5106 46867 7.5106 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561504028
<<hip-api pid:46867 tid:7.5113 46867 7.5113 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561577918
<<hip-api pid:46867 tid:7.5115 46867 7.5115 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561612318
<<hip-api pid:46867 tid:7.5117 46867 7.5117 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561644599
<<hip-api pid:46867 tid:7.5119 46867 7.5119 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561677459
<<hip-api pid:46867 tid:7.5121 46867 7.5121 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171561708419
<<hip-api pid:46867 tid:7.5123 46867 7.5123 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562002140
<<hip-api pid:46867 tid:7.5125 46867 7.5125 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562044641
<<hip-api pid:46867 tid:7.5127 46867 7.5127 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562074921
<<hip-api pid:46867 tid:7.5129 46867 7.5129 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562103631
<<hip-api pid:46867 tid:7.5136 46867 7.5136 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562174501
<<hip-api pid:46867 tid:7.5138 46867 7.5138 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562206261
<<hip-api pid:46867 tid:7.5141 46867 7.5141 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562239132
<<hip-api pid:46867 tid:7.5143 46867 7.5143 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562269712
<<hip-api pid:46867 tid:7.5153 46867 7.5153 hipLaunchKernel 'ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2' gridDim:{786432,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171562353232
<<hip-api pid:46867 tid:7.5155 46867 7.5155 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171562653504
<<hip-api pid:46867 tid:7.5159 46867 7.5159 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opIfEEKS8_EEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171563078036
<<hip-api pid:46867 tid:7.5160 46867 7.5160 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opIfEEKS8_EEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171563113436
miopenBatchNormalizationBackward: ./bin/MIOpenDriver bnorm
<<hip-api pid:46867 tid:7.5175 46867 7.5175 hipLaunchKernel 'MIOpenBatchNormBwdSpatial' gridDim:{65536,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171563563908
<<hip-api pid:46867 tid:7.5179 46867 7.5179 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564241612
<<hip-api pid:46867 tid:7.5181 46867 7.5181 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564281412
<<hip-api pid:46867 tid:7.5183 46867 7.5183 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564310752
<<hip-api pid:46867 tid:7.5185 46867 7.5185 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564716244
<<hip-api pid:46867 tid:7.5187 46867 7.5187 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564753474
<<hip-api pid:46867 tid:7.5189 46867 7.5189 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564783445
<<hip-api pid:46867 tid:7.5191 46867 7.5191 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564811685
<<hip-api pid:46867 tid:7.5193 46867 7.5193 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EEKNS_18TensorConversionOpIfKNS9_INS0_13scalar_cmp_opISB_SB_LNS0_14ComparisonNameE5EEESF_KNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opISB_EESF_EEEEEEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171564842675
<<hip-api pid:46867 tid:7.5195 46867 7.5195 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171565255877
<<hip-api pid:46867 tid:7.5197 46867 7.5197 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171565292737
<<hip-api pid:46867 tid:7.5199 46867 7.5199 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171565321297
<<hip-api pid:46867 tid:7.5201 46867 7.5201 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171565350557
2019-01-30 23:20:16.996714: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5238 46867 7.5238 hipLaunchKernel 'Im2Col' gridDim:{262144,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171569050256
<<hip-api pid:46867 tid:7.5258 46867 7.5258 hipLaunchKernel 'miog_alphaab' gridDim:{46080,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171630366736
<<hip-api pid:46867 tid:7.5267 46867 7.5267 hipLaunchKernel 'miog_alphaab' gridDim:{46080,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171630614698
<<hip-api pid:46867 tid:7.5290 46867 7.5290 hipLaunchKernel 'gcnAsmConv3x3WrW' gridDim:{256,16,32} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171634451669
<<hip-api pid:46867 tid:7.5308 46867 7.5308 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171639067925
<<hip-api pid:46867 tid:7.5315 46867 7.5315 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171644735416
<<hip-api pid:46867 tid:7.5333 46867 7.5333 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,4,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171645176259
<<hip-api pid:46867 tid:7.5340 46867 7.5340 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171646692887
<<hip-api pid:46867 tid:7.5361 46867 7.5361 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,4,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171646770707
<<hip-api pid:46867 tid:7.5364 46867 7.5364 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171648286486
<<hip-api pid:46867 tid:7.5390 46867 7.5390 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,4,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171648392816
<<hip-api pid:46867 tid:7.5393 46867 7.5393 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171649908855
<<hip-api pid:46867 tid:7.5398 46867 7.5398 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171649956815
<<hip-api pid:46867 tid:7.5402 46867 7.5402 hipLaunchKernel 'ZN12_GLOBAL__N_110hip_fill_nILj256EPjmjEEvT0_T1_T2' gridDim:{256,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171650014716
<<hip-api pid:46867 tid:7.5409 46867 7.5409 hipLaunchKernel '_ZN10tensorflow26BiasGradNCHW_SharedAtomicsIfEEvPKT_PS1_iiii' gridDim:{65536,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171650052316
<<hip-api pid:46867 tid:7.5417 46867 7.5417 hipLaunchKernel 'ZN7rocprim6detail23segmented_reduce_kernelINS0_21default_reduce_configILj0EfEEPKfPfNS_18transform_iteratorINS_17counting_iteratorIilEEN10tensorflow7functor9RowOffsetEiEEfN6hipcub3SumEEEvT0_T1_T2_SI_T4_T3' gridDim:{196608,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171650220817
<<hip-api pid:46867 tid:7.5419 46867 7.5419 hipLaunchKernel 'ZN10tensorflow7functor24ColumnReduceSimpleKernelIPKfPfN6hipcub3SumEEEvT_T0_iiiT1' gridDim:{128,1,1} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @171650308347
<<hip-api pid:46867 tid:7.5430 46867 7.5430 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171650367297
2019-01-30 23:20:17.081726: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
<<hip-api pid:46867 tid:7.5474 46867 7.5474 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171656510292
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5492 46867 7.5492 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171657064075
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5519 46867 7.5519 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171657604458
<<hip-api pid:46867 tid:7.5527 46867 7.5527 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658145170
<<hip-api pid:46867 tid:7.5529 46867 7.5529 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658225961
<<hip-api pid:46867 tid:7.5533 46867 7.5533 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658317962
<<hip-api pid:46867 tid:7.5535 46867 7.5535 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658347152
<<hip-api pid:46867 tid:7.5537 46867 7.5537 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658377832
<<hip-api pid:46867 tid:7.5539 46867 7.5539 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658408852
<<hip-api pid:46867 tid:7.5541 46867 7.5541 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658440322
<<hip-api pid:46867 tid:7.5543 46867 7.5543 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658470392
<<hip-api pid:46867 tid:7.5545 46867 7.5545 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EEKNS_18TensorConversionOpIfKNS9_INS0_13scalar_cmp_opISB_SB_LNS0_14ComparisonNameE5EEESF_KNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opISB_EESF_EEEEEEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658501563
<<hip-api pid:46867 tid:7.5549 46867 7.5549 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658764974
<<hip-api pid:46867 tid:7.5553 46867 7.5553 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658817344
<<hip-api pid:46867 tid:7.5555 46867 7.5555 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658845724
<<hip-api pid:46867 tid:7.5557 46867 7.5557 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658876885
<<hip-api pid:46867 tid:7.5559 46867 7.5559 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658906825
<<hip-api pid:46867 tid:7.5561 46867 7.5561 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171658936235
<<hip-api pid:46867 tid:7.5575 46867 7.5575 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,4,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171658987415
<<hip-api pid:46867 tid:7.5578 46867 7.5578 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171660480694
<<hip-api pid:46867 tid:7.5583 46867 7.5583 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171660523194
<<hip-api pid:46867 tid:7.5587 46867 7.5587 hipLaunchKernel 'ZN12_GLOBAL__N_110hip_fill_nILj256EPjmjEEvT0_T1_T2' gridDim:{256,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171660564264
<<hip-api pid:46867 tid:7.5588 46867 7.5588 hipLaunchKernel 'ZN7rocprim6detail23segmented_reduce_kernelINS0_21default_reduce_configILj0EfEEPKfPfNS_18transform_iteratorINS_17counting_iteratorIilEEN10tensorflow7functor9RowOffsetEiEEfN6hipcub3SumEEEvT0_T1_T2_SI_T4_T3' gridDim:{196608,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171660595084
<<hip-api pid:46867 tid:7.5590 46867 7.5590 hipLaunchKernel 'ZN10tensorflow7functor24ColumnReduceSimpleKernelIPKfPfN6hipcub3SumEEEvT_T0_iiiT1' gridDim:{128,1,1} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @171660679535
<<hip-api pid:46867 tid:7.5592 46867 7.5592 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171660715035
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 64 -W 64 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5606 46867 7.5606 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171660796955
<<hip-api pid:46867 tid:7.5612 46867 7.5612 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661287088
<<hip-api pid:46867 tid:7.5614 46867 7.5614 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661320828
<<hip-api pid:46867 tid:7.5618 46867 7.5618 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661377968
<<hip-api pid:46867 tid:7.5620 46867 7.5620 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661412259
<<hip-api pid:46867 tid:7.5624 46867 7.5624 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661470979
<<hip-api pid:46867 tid:7.5626 46867 7.5626 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661500009
<<hip-api pid:46867 tid:7.5628 46867 7.5628 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661530429
<<hip-api pid:46867 tid:7.5630 46867 7.5630 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661561469
<<hip-api pid:46867 tid:7.5632 46867 7.5632 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661592830
<<hip-api pid:46867 tid:7.5634 46867 7.5634 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661621870
<<hip-api pid:46867 tid:7.5636 46867 7.5636 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EEKNS_18TensorConversionOpIfKNS9_INS0_13scalar_cmp_opISB_SB_LNS0_14ComparisonNameE5EEESF_KNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opISB_EESF_EEEEEEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661650270
<<hip-api pid:46867 tid:7.5638 46867 7.5638 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661902161
<<hip-api pid:46867 tid:7.5640 46867 7.5640 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661936331
<<hip-api pid:46867 tid:7.5644 46867 7.5644 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171661994042
<<hip-api pid:46867 tid:7.5648 46867 7.5648 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171662047262
<<hip-api pid:46867 tid:7.5650 46867 7.5650 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171662075322
<<hip-api pid:46867 tid:7.5652 46867 7.5652 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171662106792
<<hip-api pid:46867 tid:7.5654 46867 7.5654 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171662136663
<<hip-api pid:46867 tid:7.5656 46867 7.5656 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171662165583
2019-01-30 23:20:17.093521: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 2 -x 2 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5686 46867 7.5686 hipLaunchKernel 'Im2Col' gridDim:{262144,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171662349504
<<hip-api pid:46867 tid:7.5711 46867 7.5711 hipLaunchKernel 'miog_alphaab' gridDim:{32768,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @171724120106
<<hip-api pid:46867 tid:7.5720 46867 7.5720 hipLaunchKernel 'miog_alphaab' gridDim:{32768,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @171724611219
<<hip-api pid:46867 tid:7.5744 46867 7.5744 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171725534454
<<hip-api pid:46867 tid:7.5751 46867 7.5751 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{4096,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171729797298
<<hip-api pid:46867 tid:7.5772 46867 7.5772 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171729869228
<<hip-api pid:46867 tid:7.5775 46867 7.5775 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{4096,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171734040031
<<hip-api pid:46867 tid:7.5801 46867 7.5801 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171734141302
<<hip-api pid:46867 tid:7.5804 46867 7.5804 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{4096,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171738333575
<<hip-api pid:46867 tid:7.5809 46867 7.5809 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171738377455
<<hip-api pid:46867 tid:7.5813 46867 7.5813 hipLaunchKernel 'ZN12_GLOBAL__N_110hip_fill_nILj256EPjmjEEvT0_T1_T2' gridDim:{256,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171738434026
<<hip-api pid:46867 tid:7.5814 46867 7.5814 hipLaunchKernel 'ZN7rocprim6detail23segmented_reduce_kernelINS0_21default_reduce_configILj0EfEEPKfPfNS_18transform_iteratorINS_17counting_iteratorIilEEN10tensorflow7functor9RowOffsetEiEEfN6hipcub3SumEEEvT0_T1_T2_SI_T4_T3' gridDim:{196608,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171738470446
<<hip-api pid:46867 tid:7.5816 46867 7.5816 hipLaunchKernel 'ZN10tensorflow7functor24ColumnReduceSimpleKernelIPKfPfN6hipcub3SumEEEvT_T0_iiiT1' gridDim:{128,1,1} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @171738558116
<<hip-api pid:46867 tid:7.5818 46867 7.5818 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171738601787
2019-01-30 23:20:17.169959: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
<<hip-api pid:46867 tid:7.5855 46867 7.5855 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171738716117
<<hip-api pid:46867 tid:7.5872 46867 7.5872 hipLaunchKernel 'miog_betac_alphaab' gridDim:{65536,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171806746124
<<hip-api pid:46867 tid:7.5881 46867 7.5881 hipLaunchKernel 'miog_betac_alphaab' gridDim:{65536,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171806898115
<<hip-api pid:46867 tid:7.5891 46867 7.5891 hipLaunchKernel 'Col2Im' gridDim:{1048576,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171807033706
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 2 -x 2 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5907 46867 7.5907 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171807309937
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 2 -x 2 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.5934 46867 7.5934 hipLaunchKernel 'sp3AsmConvRxSU' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171809054467
<<hip-api pid:46867 tid:7.5940 46867 7.5940 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810712426
<<hip-api pid:46867 tid:7.5942 46867 7.5942 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810752607
<<hip-api pid:46867 tid:7.5944 46867 7.5944 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810784627
<<hip-api pid:46867 tid:7.5946 46867 7.5946 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810815767
<<hip-api pid:46867 tid:7.5950 46867 7.5950 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810880467
<<hip-api pid:46867 tid:7.5952 46867 7.5952 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810914807
<<hip-api pid:46867 tid:7.5956 46867 7.5956 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810970898
<<hip-api pid:46867 tid:7.5958 46867 7.5958 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171810998518
<<hip-api pid:46867 tid:7.5960 46867 7.5960 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811030408
<<hip-api pid:46867 tid:7.5962 46867 7.5962 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811060908
<<hip-api pid:46867 tid:7.5964 46867 7.5964 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811091378
<<hip-api pid:46867 tid:7.5966 46867 7.5966 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811119889
<<hip-api pid:46867 tid:7.5969 46867 7.5969 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811151569
<<hip-api pid:46867 tid:7.5971 46867 7.5971 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811181499
<<hip-api pid:46867 tid:7.5973 46867 7.5973 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811214069
<<hip-api pid:46867 tid:7.5975 46867 7.5975 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811246359
<<hip-api pid:46867 tid:7.5979 46867 7.5979 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811302079
<<hip-api pid:46867 tid:7.5983 46867 7.5983 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811355960
<<hip-api pid:46867 tid:7.5985 46867 7.5985 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811383750
<<hip-api pid:46867 tid:7.5987 46867 7.5987 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811413360
<<hip-api pid:46867 tid:7.5989 46867 7.5989 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811441800
<<hip-api pid:46867 tid:7.5991 46867 7.5991 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811470471
<<hip-api pid:46867 tid:7.5993 46867 7.5993 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi2ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIffEEKNS4_INS5_IKfLi2ELi1EiEELi16ES7_EEKNS_20TensorBroadcastingOpIKNS_5arrayIlLm2EEESF_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171811501731
<<hip-api pid:46867 tid:7.5995 46867 7.5995 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812528656
<<hip-api pid:46867 tid:7.5997 46867 7.5997 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812564877
<<hip-api pid:46867 tid:7.5999 46867 7.5999 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812594147
<<hip-api pid:46867 tid:7.6001 46867 7.6001 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812624427
<<hip-api pid:46867 tid:7.6003 46867 7.6003 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812652707
<<hip-api pid:46867 tid:7.6005 46867 7.6005 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812681757
<<hip-api pid:46867 tid:7.6009 46867 7.6009 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812736058
<<hip-api pid:46867 tid:7.6011 46867 7.6011 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812766128
<<hip-api pid:46867 tid:7.6015 46867 7.6015 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812818938
<<hip-api pid:46867 tid:7.6017 46867 7.6017 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171812847018
<<hip-api pid:46867 tid:7.6019 46867 7.6019 hipLaunchKernel 'ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2' gridDim:{3145728,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171812883258
<<hip-api pid:46867 tid:7.6025 46867 7.6025 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813492622
<<hip-api pid:46867 tid:7.6027 46867 7.6027 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813525102
<<hip-api pid:46867 tid:7.6029 46867 7.6029 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813554622
<<hip-api pid:46867 tid:7.6031 46867 7.6031 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813585622
<<hip-api pid:46867 tid:7.6035 46867 7.6035 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813638552
<<hip-api pid:46867 tid:7.6039 46867 7.6039 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813689363
<<hip-api pid:46867 tid:7.6042 46867 7.6042 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813720883
<<hip-api pid:46867 tid:7.6044 46867 7.6044 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813751133
<<hip-api pid:46867 tid:7.6046 46867 7.6046 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813779273
<<hip-api pid:46867 tid:7.6048 46867 7.6048 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813807653
<<hip-api pid:46867 tid:7.6050 46867 7.6050 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813835904
<<hip-api pid:46867 tid:7.6052 46867 7.6052 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_max_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813863424
<<hip-api pid:46867 tid:7.6059 46867 7.6059 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813934674
<<hip-api pid:46867 tid:7.6061 46867 7.6061 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813965804
<<hip-api pid:46867 tid:7.6063 46867 7.6063 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171813995965
<<hip-api pid:46867 tid:7.6065 46867 7.6065 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_14scalar_sqrt_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814028185
<<hip-api pid:46867 tid:7.6067 46867 7.6067 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814058055
<<hip-api pid:46867 tid:7.6069 46867 7.6069 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814607718
<<hip-api pid:46867 tid:7.6071 46867 7.6071 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814643188
<<hip-api pid:46867 tid:7.6073 46867 7.6073 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814671408
<<hip-api pid:46867 tid:7.6075 46867 7.6075 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_sum_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814700138
<<hip-api pid:46867 tid:7.6082 46867 7.6082 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814770819
<<hip-api pid:46867 tid:7.6084 46867 7.6084 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_18scalar_quotient_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814802029
<<hip-api pid:46867 tid:7.6087 46867 7.6087 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{16384,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814833509
<<hip-api pid:46867 tid:7.6089 46867 7.6089 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_20scalar_difference_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171814863079
<<hip-api pid:46867 tid:7.6099 46867 7.6099 hipLaunchKernel 'ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2' gridDim:{3145728,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171814962430
<<hip-api pid:46867 tid:7.6101 46867 7.6101 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171815644324
<<hip-api pid:46867 tid:7.6105 46867 7.6105 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opIfEEKS8_EEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171816461438
<<hip-api pid:46867 tid:7.6106 46867 7.6106 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opIfEEKS8_EEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171816493148
miopenBatchNormalizationBackward: ./bin/MIOpenDriver bnorm
<<hip-api pid:46867 tid:7.6121 46867 7.6121 hipLaunchKernel 'MIOpenBatchNormBwdSpatial' gridDim:{65536,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171816956771
<<hip-api pid:46867 tid:7.6125 46867 7.6125 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171818367599
<<hip-api pid:46867 tid:7.6127 46867 7.6127 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171818403879
<<hip-api pid:46867 tid:7.6129 46867 7.6129 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EESF_EEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171818431639
<<hip-api pid:46867 tid:7.6131 46867 7.6131 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171819239803
<<hip-api pid:46867 tid:7.6133 46867 7.6133 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171819272344
<<hip-api pid:46867 tid:7.6135 46867 7.6135 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171819299944
<<hip-api pid:46867 tid:7.6137 46867 7.6137 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171819326894
<<hip-api pid:46867 tid:7.6139 46867 7.6139 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EEKNS_18TensorConversionOpIfKNS9_INS0_13scalar_cmp_opISB_SB_LNS0_14ComparisonNameE5EEESF_KNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opISB_EESF_EEEEEEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171819356964
<<hip-api pid:46867 tid:7.6141 46867 7.6141 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171820175749
<<hip-api pid:46867 tid:7.6143 46867 7.6143 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171820212289
<<hip-api pid:46867 tid:7.6145 46867 7.6145 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171820240519
<<hip-api pid:46867 tid:7.6147 46867 7.6147 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171820270029
<<hip-api pid:46867 tid:7.6149 46867 7.6149 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171820311409
2019-01-30 23:20:17.251671: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
<<hip-api pid:46867 tid:7.6193 46867 7.6193 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171826429433
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.6211 46867 7.6211 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171828259164
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.6238 46867 7.6238 hipLaunchKernel 'sp3AsmConv3x3F' gridDim:{18432,1,1} groupDim:{512,1,1} sharedMem:+0 stream:0.1 @171830129324
2019-01-30 23:20:17.263185: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 12 -c 64 -H 128 -W 128 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.6279 46867 7.6279 hipLaunchKernel 'Im2Col' gridDim:{1048576,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171835884626
<<hip-api pid:46867 tid:7.6299 46867 7.6299 hipLaunchKernel 'miog_alphaab' gridDim:{46080,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171897875909
<<hip-api pid:46867 tid:7.6308 46867 7.6308 hipLaunchKernel 'miog_alphaab' gridDim:{46080,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171901023117
<<hip-api pid:46867 tid:7.6331 46867 7.6331 hipLaunchKernel 'gcnAsmConv3x3WrW' gridDim:{256,16,32} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171907424562
<<hip-api pid:46867 tid:7.6349 46867 7.6349 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171917820000
<<hip-api pid:46867 tid:7.6356 46867 7.6356 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171931641137
<<hip-api pid:46867 tid:7.6374 46867 7.6374 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,8,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171932152709
<<hip-api pid:46867 tid:7.6381 46867 7.6381 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171935709039
<<hip-api pid:46867 tid:7.6402 46867 7.6402 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,8,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171935790550
<<hip-api pid:46867 tid:7.6405 46867 7.6405 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171939311339
<<hip-api pid:46867 tid:7.6431 46867 7.6431 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{16384,8,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171939423150
<<hip-api pid:46867 tid:7.6434 46867 7.6434 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{36864,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171942975889
<<hip-api pid:46867 tid:7.6439 46867 7.6439 hipLaunchKernel 'ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171943024960
<<hip-api pid:46867 tid:7.6443 46867 7.6443 hipLaunchKernel 'ZN12_GLOBAL__N_110hip_fill_nILj256EPjmjEEvT0_T1_T2' gridDim:{256,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171943081530
<<hip-api pid:46867 tid:7.6450 46867 7.6450 hipLaunchKernel '_ZN10tensorflow26BiasGradNCHW_SharedAtomicsIfEEvPKT_PS1_iiii' gridDim:{65536,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171943118100
<<hip-api pid:46867 tid:7.6458 46867 7.6458 hipLaunchKernel 'ZN7rocprim6detail23segmented_reduce_kernelINS0_21default_reduce_configILj0EfEEPKfPfNS_18transform_iteratorINS_17counting_iteratorIilEEN10tensorflow7functor9RowOffsetEiEEfN6hipcub3SumEEEvT0_T1_T2_SI_T4_T3' gridDim:{196608,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171943647463
<<hip-api pid:46867 tid:7.6460 46867 7.6460 hipLaunchKernel 'ZN10tensorflow7functor24ColumnReduceSimpleKernelIPKfPfN6hipcub3SumEEEvT_T0_iiiT1' gridDim:{128,1,1} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @171943917905
<<hip-api pid:46867 tid:7.6473 46867 7.6473 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944009335
<<hip-api pid:46867 tid:7.6475 46867 7.6475 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944045315
<<hip-api pid:46867 tid:7.6479 46867 7.6479 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944099586
<<hip-api pid:46867 tid:7.6481 46867 7.6481 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_13scalar_sum_opIffEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEKNS4_INS5_ISC_Li1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944127326
<<hip-api pid:46867 tid:7.6483 46867 7.6483 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_19TensorCwiseBinaryOpINS0_17scalar_product_opIKfSB_EEKNS4_INS5_ISB_Li1ELi1ElEELi16ES7_EEKNS_18TensorConversionOpIfKNS9_INS0_13scalar_cmp_opISB_SB_LNS0_14ComparisonNameE5EEESF_KNS_20TensorCwiseNullaryOpINS0_18scalar_constant_opISB_EESF_EEEEEEEEEENS_9GpuDeviceEEElEEvT_T0' gridDim:{92160,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944157646
<<hip-api pid:46867 tid:7.6485 46867 7.6485 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171944967171
<<hip-api pid:46867 tid:7.6487 46867 7.6487 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171945005221
<<hip-api pid:46867 tid:7.6489 46867 7.6489 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_16scalar_square_opIfEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171945036601
<<hip-api pid:46867 tid:7.6491 46867 7.6491 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_11scalar_leftIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171945064621
<<hip-api pid:46867 tid:7.6495 46867 7.6495 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171945118361
<<hip-api pid:46867 tid:7.6499 46867 7.6499 hipLaunchKernel 'ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_13scalar_min_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @171945169492
2019-01-30 23:20:17.376535: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 12 -c 1 -H 128 -W 128 -k 64 -y 9 -x 9 -p 4 -q 4 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
<<hip-api pid:46867 tid:7.6529 46867 7.6529 hipLaunchKernel 'Im2Col' gridDim:{16384,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @171945414083
<<hip-api pid:46867 tid:7.6554 46867 7.6554 hipLaunchKernel 'miog_alphaab' gridDim:{19968,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @172006728753
<<hip-api pid:46867 tid:7.6563 46867 7.6563 hipLaunchKernel 'miog_alphaab' gridDim:{19968,1,1} groupDim:{64,1,1} sharedMem:+0 stream:0.1 @172007199075
<<hip-api pid:46867 tid:7.6587 46867 7.6587 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{256,16,12} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @172008140811
<<hip-api pid:46867 tid:7.6594 46867 7.6594 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{5184,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @172009498868
<<hip-api pid:46867 tid:7.6612 46867 7.6612 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{128,16,12} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @172010308893
[I 23:21:45.113 NotebookApp] Saving file at /boxes.ipynb
Memory access fault by GPU node-1 (Agent handle: 0x55cc5441a000) on address 0x12dba00000. Reason: Page not present or supervisor privilege.
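
To narrow down which kernels were in flight, a trace like the one above can be scanned for the launches that precede the fault. The following is a minimal sketch that assumes the `hipLaunchKernel` line layout shown in this log; the helper name `launches_before_fault` is hypothetical. Launches are asynchronous, so the last kernel listed is only a candidate, not necessarily the one that faulted.

```python
import re

# Matches one hipLaunchKernel trace line in the layout seen in the log above.
LAUNCH_RE = re.compile(
    r"hipLaunchKernel '(?P<name>[^']+)' "
    r"gridDim:\{(?P<grid>[\d,]+)\} "
    r"groupDim:\{(?P<group>[\d,]+)\}"
)


def launches_before_fault(lines):
    """Collect (kernel name, gridDim, groupDim) for launches logged before the first fault line."""
    launches = []
    for line in lines:
        if line.startswith("Memory access fault"):
            break  # stop at the first fault report
        match = LAUNCH_RE.search(line)
        if match:
            grid = tuple(int(x) for x in match.group("grid").split(","))
            group = tuple(int(x) for x in match.group("group").split(","))
            launches.append((match.group("name"), grid, group))
    return launches


# The last few lines of the trace in this comment.
sample = [
    "<<hip-api pid:46867 tid:7.6594 46867 7.6594 hipLaunchKernel 'MIOpenCvBwdWrW_rdc' gridDim:{5184,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @172009498868",
    "<<hip-api pid:46867 tid:7.6612 46867 7.6612 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{128,16,12} groupDim:{128,1,1} sharedMem:+0 stream:0.1 @172010308893",
    "[I 23:21:45.113 NotebookApp] Saving file at /boxes.ipynb",
    "Memory access fault by GPU node-1 (Agent handle: 0x55cc5441a000) on address 0x12dba00000. Reason: Page not present or supervisor privilege.",
]

found = launches_before_fault(sample)
last = found[-1]  # most recent launch logged before the fault
print(last)
```

In this trace the most recent launch before the fault is one of the MIOpen backward-weights kernels from the Backward-Filter auto-tune step.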

@sunway513

Hi @witeko , could you try the following workaround?

wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb && 
wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && 
sudo dpkg -i rocm-opencl*.deb && rm -rf ~/.cache

@sunway513 sunway513 self-assigned this Feb 7, 2019
@sunway513 sunway513 added the bug Something isn't working label Feb 7, 2019
@witeko
Author

witeko commented Feb 7, 2019

@sunway513, the end result is still that I can't fit the model.
What changed:

  • previously I got an error, now it runs;
  • previously only 1 CPU thread was used, now 12 (as it should be)

What's still wrong:

My results (not full):
cropping-1.h5
Epoch 1/50
500/500 [==============================] - 92s 184ms/step - loss: 678.9909 - val_loss: 233.5896
Epoch 2/50
500/500 [==============================] - 71s 142ms/step - loss: 596.3531 - val_loss: 135170635751369495095979007803392.0000
Epoch 3/50
500/500 [==============================] - 73s 145ms/step - loss: 510.6091 - val_loss: 1242771570705906405631030040985600.0000
Epoch 4/50
500/500 [==============================] - 89s 178ms/step - loss: 483.7078 - val_loss: 2552272269007826933778212045455360.0000

Epoch 00004: ReduceLROnPlateau reducing learning rate to 0.00800000037997961.
Epoch 5/50
500/500 [==============================] - 84s 167ms/step - loss: 378.6286 - val_loss: 9561524027973968298368100133240832.0000
Epoch 6/50
500/500 [==============================] - 82s 164ms/step - loss: 353.6618 - val_loss: 11074247501713549773766656.0000
Epoch 7/50
500/500 [==============================] - 75s 150ms/step - loss: 338.2722 - val_loss: 30703144039333830472103066844790784.0000

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.0020000000949949026.
Epoch 8/50
500/500 [==============================] - 74s 148ms/step - loss: 311.1208 - val_loss: 8138100560581569138785225493446656.0000
Epoch 9/50
500/500 [==============================] - 75s 151ms/step - loss: 307.1484 - val_loss: 15032916350034544070250299654144000.0000
Epoch 10/50
500/500 [==============================] - 76s 151ms/step - loss: 307.3004 - val_loss: 11827829771061998205869711652028416.0000

Epoch 00010: ReduceLROnPlateau reducing learning rate to 0.002.
Epoch 00010: early stopping
200/200 [==============================] - 0s 1ms/step
cropping-2.h5
Epoch 1/50
499/500 [============================>.] - ETA: 0s - loss: 571.6973
500/500 [==============================] - 78s 156ms/step - loss: 571.7524 - val_loss: 4811547701095151819784343347265536.0000
Epoch 2/50
500/500 [==============================] - 74s 148ms/step - loss: 584.9026 - val_loss: 197963809546687656600706595757752320.0000
Epoch 3/50
500/500 [==============================] - 76s 152ms/step - loss: 589.8450 - val_loss: inf

Perfect run (as on kaggle: https://www.kaggle.com/martinpiotte/bounding-box-model/output)
cropping-1.h5
Epoch 1/50
500/500 [==============================] - 163s 325ms/step - loss: 529.4243 - val_loss: 136.6158
Epoch 2/50
500/500 [==============================] - 154s 307ms/step - loss: 359.8277 - val_loss: 154.1979
Epoch 3/50
500/500 [==============================] - 158s 317ms/step - loss: 279.5038 - val_loss: 116.9022
Epoch 4/50
500/500 [==============================] - 157s 313ms/step - loss: 247.4618 - val_loss: 98.2346
Epoch 5/50
500/500 [==============================] - 156s 312ms/step - loss: 217.0696 - val_loss: 81.2200
Epoch 6/50
500/500 [==============================] - 155s 309ms/step - loss: 219.4908 - val_loss: 137.4634
Epoch 7/50
500/500 [==============================] - 156s 312ms/step - loss: 209.9637 - val_loss: 122.9328
Epoch 8/50
500/500 [==============================] - 155s 309ms/step - loss: 190.6924 - val_loss: 118.5650

Epoch 00008: ReduceLROnPlateau reducing learning rate to 0.00800000037997961.
Epoch 9/50
500/500 [==============================] - 151s 303ms/step - loss: 167.1313 - val_loss: 111.3087
Epoch 10/50
500/500 [==============================] - 152s 304ms/step - loss: 159.7143 - val_loss: 105.5996
Epoch 11/50
500/500 [==============================] - 151s 301ms/step - loss: 142.6363 - val_loss: 93.6883

Epoch 00011: ReduceLROnPlateau reducing learning rate to 0.0020000000949949026.
Epoch 12/50
500/500 [==============================] - 149s 299ms/step - loss: 126.6333 - val_loss: 67.8425
Epoch 13/50
500/500 [==============================] - 149s 297ms/step - loss: 127.7577 - val_loss: 71.6821
Epoch 14/50
500/500 [==============================] - 154s 308ms/step - loss: 119.6318 - val_loss: 66.8878
Epoch 15/50
500/500 [==============================] - 154s 308ms/step - loss: 118.1301 - val_loss: 63.0346
Epoch 16/50
500/500 [==============================] - 168s 336ms/step - loss: 115.3690 - val_loss: 57.7408
Epoch 17/50
500/500 [==============================] - 170s 339ms/step - loss: 112.3749 - val_loss: 56.7045
Epoch 18/50
500/500 [==============================] - 155s 309ms/step - loss: 111.0754 - val_loss: 54.0247
Epoch 19/50
500/500 [==============================] - 157s 313ms/step - loss: 106.8020 - val_loss: 52.3795
Epoch 20/50
500/500 [==============================] - 155s 309ms/step - loss: 105.8095 - val_loss: 52.4786
Epoch 21/50
500/500 [==============================] - 154s 307ms/step - loss: 103.8766 - val_loss: 47.3362
Epoch 22/50
500/500 [==============================] - 156s 312ms/step - loss: 101.9328 - val_loss: 57.4360
Epoch 23/50
500/500 [==============================] - 156s 313ms/step - loss: 100.0003 - val_loss: 59.1942
Epoch 24/50
500/500 [==============================] - 157s 315ms/step - loss: 96.0528 - val_loss: 47.8618

Epoch 00024: ReduceLROnPlateau reducing learning rate to 0.002.
Epoch 25/50
386/500 [======================>.......] - ETA: 35s - loss: 94.7542
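
The difference between the two runs can be made explicit by scanning the Keras progress lines for epochs where val_loss blows up while the training loss stays moderate. This is a minimal sketch assuming the `loss: ... - val_loss: ...` line layout shown above; the function name `diverging_epochs` and the `ratio` threshold are hypothetical choices for illustration, not part of Keras.

```python
import math
import re

# Matches the tail of a completed-epoch line in the Keras output above.
EPOCH_RE = re.compile(r"loss: (?P<loss>\S+) - val_loss: (?P<val>\S+)")


def diverging_epochs(log_lines, ratio=1e3):
    """Return 1-based positions of completed-epoch lines whose val_loss is
    non-finite or more than `ratio` times the training loss."""
    flagged = []
    position = 0
    for line in log_lines:
        match = EPOCH_RE.search(line)
        if not match:
            continue  # skip "Epoch N/50" headers and in-progress bars
        position += 1
        loss = float(match.group("loss"))
        val = float(match.group("val"))  # float("inf") parses to infinity
        if not math.isfinite(val) or val > ratio * loss:
            flagged.append(position)
    return flagged


# Three lines copied from the failing run above.
sample = [
    "500/500 [==============================] - 92s 184ms/step - loss: 678.9909 - val_loss: 233.5896",
    "500/500 [==============================] - 71s 142ms/step - loss: 596.3531 - val_loss: 135170635751369495095979007803392.0000",
    "500/500 [==============================] - 76s 152ms/step - loss: 589.8450 - val_loss: inf",
]

print(diverging_epochs(sample))
```

On the three sample lines, the second and third are flagged while the first is not, which mirrors the pattern in the failing run: the training loss decreases but the validation loss is many orders of magnitude larger.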

@sunway513

@witeko thanks for confirming the vmem fault is gone with the workaround.
Let me repro the convergence issue locally.

@sunway513 sunway513 added the gfx803 issue specific to gfx803 GPUs label Feb 7, 2019
@jerryyin
Member

@witeko I'm trying to reproduce the issue now. I noticed that the Kaggle website doesn't provide a pre-built GPU-enabled docker image. Just want to confirm whether you started with a fresh ROCm-enabled docker image or not. If you started with an image, it would be helpful for me to start with the same one so that we are on the same page.

@witeko
Author

witeko commented Feb 12, 2019

@jerryyin I don't use any images, it would be impractical for me.
I bet (99% sure) that:

  • if You download the training data
  • and start with a fresh image + install the newest versions of keras and whatever is required (You will know what's required once You start to run the code)
  • and run the code in jupyter notebook

then You will get the same issue as I did.
This shouldn't take too long. :)

@witeko
Author

witeko commented Feb 15, 2019

@sunway513 , @jerryyin , c'mon guys. :) People using TensorFlow for creating deep learning models are not developers, we don't use docker/images/... for a living.
The way I see it: a real problem has been reported, and a real fix is needed, in real-life conditions, not laboratory conditions (like docker...).
And this real fix is needed not for me, but for all the users and for AMD as well...

@jerryyin
Member

@witeko Sure, we've been working on it closely. As an update, we've been able to reproduce it on gfx803, but not on gfx900. I'm still trying my best to investigate the root cause. Thank you for your patience.

@witeko
Author

witeko commented Feb 16, 2019

@jerryyin thx, "being impatient" is actually what I do for a living. :)

@jerryyin
Member

jerryyin commented Mar 7, 2019

Giving an update on the things I have tried so far, with hints from #337 and #251, the non-converging kernel issues on gfx803. The claim in issue 251 is that keras.optimizers.Adam malfunctions when used together with MaxPooling2D, which is exactly the use case in this issue. The claim in issue 337 relates only to tf.train.AdamOptimizer.

Given this, my debugging so far has focused on narrowing the problem down to a specific operator. The things I have tried:

  • Run with export MIOPEN_ENABLE_LOGGING_CMD=1. Nothing in particular from MIOpen stands out.
  • Manually place MaxPooling2D on CPU, confirmed with log_device_placement=True. The non-converging issue persists. A further attempt at placing every operator other than Conv2D and batchnorm on CPU didn't work either.
  • Manually place keras.optimizers.Adam on CPU. It does not work exactly as I expected: most of the training/Adam/* operators are still placed on GPU, and only a minimal subset is on CPU. The non-converging issue persists.
  • Swap keras.optimizers.Adam for tf.train.AdamOptimizer() / tf.contrib.opt.AdamOptimizer() / tf.contrib.opt.AdamWOptimizer(). The attempts failed for two reasons: the interface is incompatible between the regular tf domain and the tf.keras domain, and the keras domain is incompatible with the tf.keras domain.

Looking at the debugging attempts, there is a strong indication of a malfunction in the Adam optimizer, in both the tf domain and the keras domain. I suggest revisiting this issue once we are able to isolate the problem in the Adam optimizer.

@sunway513

@witeko
Author

witeko commented Mar 8, 2019

@jerryyin thanks for the update :)

@jerryyin
Member

jerryyin commented Mar 8, 2019

Giving another update. After several trial runs, the operator most likely malfunctioning is one of the training/Adam/gradients/conv2d* operators. This is based on me manually putting all Conv2D operators, both inference and training, on CPU, after which the model run so far is converging. I will be working on compiling a complete list of the operators that get switched to CPU. This can help us narrow down the problem scope greatly.

@whchung
Collaborator

whchung commented Mar 8, 2019

@sunway513 don’t we have other tickets where the Adam optimizer is behaving funky on gfx803?

@sunway513

@whchung are you referring to #337?

@whchung
Collaborator

whchung commented Mar 8, 2019

Yes. From the investigation of @jerryyin they seem to be similar issues?

@sunway513

Right, I suspect those two issues have the same root cause. Besides, #325 may be related as well.

@jerryyin
Member

jerryyin commented Mar 8, 2019

I just took a look at the operators put on CPU. It is a rather long list. A summary:

  • conv2d* operators
  • batch_normalization* operators
  • training/Adam/gradients/* operators
  • training/Adam/gradients/conv2d* operators

In total there are ~2k differences according to log_device_placement. I will need to use a smaller problem to continue.
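
A comparison like this can be automated by diffing the placement logs of two runs. The sketch below assumes each log_device_placement line has the form `op_name: (OpType): /job:.../device:GPU:0`, which may differ between TensorFlow versions; the helper names and the sample operator names are hypothetical.

```python
import re

# Assumed layout of one placement line: "<op name>: (<op type>): <device>".
# The ": " before the device is optional because some log variants omit it.
PLACEMENT_RE = re.compile(
    r"(?P<op>[\w/.\-]+): \((?P<type>\w+)\)(?:: )?(?P<device>/job:\S+)"
)


def parse_placements(lines):
    """Map each operator name to the device string it was placed on."""
    placements = {}
    for line in lines:
        match = PLACEMENT_RE.search(line)
        if match:
            placements[match.group("op")] = match.group("device")
    return placements


def moved_ops(before_lines, after_lines):
    """Return the sorted names of operators whose device differs between two runs."""
    before = parse_placements(before_lines)
    after = parse_placements(after_lines)
    return sorted(op for op in before if op in after and before[op] != after[op])


# Hypothetical placement lines for a default run and a run with Conv2D pinned to CPU.
before = [
    "conv2d_1/convolution: (Conv2D): /job:localhost/replica:0/task:0/device:GPU:0",
    "training/Adam/mul_1: (Mul): /job:localhost/replica:0/task:0/device:GPU:0",
]
after = [
    "conv2d_1/convolution: (Conv2D): /job:localhost/replica:0/task:0/device:CPU:0",
    "training/Adam/mul_1: (Mul): /job:localhost/replica:0/task:0/device:GPU:0",
]

print(moved_ops(before, after))
```

Grouping the resulting names by prefix (for example `training/Adam/gradients/`) would give the kind of summary listed above.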

@jerryyin
Member

jerryyin commented Mar 11, 2019

Confirmed that the following patch makes the model converge. However, please note that it will make the model run 10x ~ 100x slower.

@sunway513

@jerryyin could you help re-validate this issue with ROCm2.5 docker containers?

@jerryyin
Member

@sunway513 I did a re-validation just now and the run crashed straight away. Looking at the TensorFlow VLOG context, I don't think it is even related to TensorFlow. The complaint is:

Memory access fault by GPU node-8 (Agent handle: 0x7b997e0) on address 0x21dfc00000. Reason: Page not present or supervisor privilege.

@sunway513

@jerryyin , thanks Jerry.
Is the driver on rock-dkms 2.5 as well? It's strange that we still see VM fault with the fix delivered in ROCm2.5.

@jerryyin
Member

Providing an update here: an internal ticket has been opened with additional details for reproducing the problem: SWDEV-193136 Tensorflow report GPUVM fault on gfx803. I will update once I receive further information from the ticket.

@ROCmSupport

Thanks for reaching out.
gfx8 is not a supported config now.
We are not supporting gfx8 devices officially with ROCm and request you to follow our supported hardware section @ ROCm docs: https://github.com/RadeonOpenCompute/ROCm#Hardware-and-Software-Support
