Memory access fault by GPU node-1 (Agent handle: 0x2e0dbf0) on address 0x6dccc0000. Reason: Page not present or supervisor privilege. #302

Closed
fendiwira opened this issue Jan 29, 2019 · 44 comments
Labels: bug (Something isn't working), gfx803 (issue specific to gfx803 GPUs)

Comments

@fendiwira

Hello,

I am having trouble running ROCm TensorFlow. Details follow:

System information

  • Have I written custom code: No
    I am trying to run these Keras/TensorFlow projects:
    Keras Mask RCNN: https://github.com/matterport/Mask_RCNN
    Keras SSD: https://github.com/pierluigiferrari/ssd_keras
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04.1 LTS
  • TensorFlow installed from whl package : pip3 install --user tensorflow-rocm
  • TensorFlow version (use command below): 1.12
  • Python version: 3.6.7
  • ROCM version : 2.0
  • CPU Memory: 16GB
  • GPU model and memory: RADEON RX 580 8 GB
    recognized as:
    name: Ellesmere [Radeon RX 470/480]
    AMDGPU ISA: gfx803
    memoryClockRate (GHz) 1.34
    pciBusID 0000:01:00.0
    Total memory: 8.00GiB
    Free memory: 7.75GiB

Describe the current behavior
Epoch 1/30
2019-01-29 22:25:46.392668: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-01-29 22:25:46.446704: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-1 (Agent handle: 0x2e0dbf0) on address 0x6dccc0000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

Describe the expected behavior
Running normally until epoch 30/30

Code to reproduce the issue
Keras Mask RCNN
python3 platno.py train --dataset=/home/path/to/dataset --weights=coco
This always fails with a core dump and the message above.

Keras SSD
python3 ssd300_training.py
This runs normally after lowering the batch size from 32 to 8.

python3 ssd7_training.py
This core dumps even with the batch size lowered to 1.

Other info / logs
I have tried setting the following environment variables for debugging, but still get the error:
HSA_ENABLE_SDMA=0
HSA_ENABLE_INTERRUPT=0
HSA_SVM_GUARD_PAGES=0
HSA_DISABLE_CACHE=1

Please advise on how to resolve this problem.

Thanks and Regards

@parallelo

Thanks for reporting the issue, @fendiwira. We'll take a look.

@whchung
Collaborator

whchung commented Jan 29, 2019

@parallelo / @sunway513 it seems quite a few recent issues raised are based on gfx803 ISA.

@parallelo

Yep, was just looking at that too. At least two other recent gfx803 mem fault issues, right?

#282
#300

@witeko

witeko commented Jan 29, 2019

@parallelo and mine #297 (the neglected one) :)

@sunway513

The issue has been identified as a regression in the ROCm 2.0 user bits, affecting only Polaris; we will post further updates here.

@parallelo

For future users who hit similar Memory access fault errors, just wanted to mention the typical triage process for this type of error.

This error typically indicates an out-of-bounds memory access on the GPU. The first step is to serialize all GPU kernels and copies, then print the names of the kernels as they launch.

export HCC_SERIALIZE_KERNEL=0x3
export HCC_SERIALIZE_COPY=0x3
export HIP_TRACE_API=0x2
[then re-run your application]

Often (but not always) the last printed kernel will be the one to further investigate -- it might point to a numerical library or something else that can potentially be triaged with a smaller test case.

More tips are listed here: https://rocm-documentation.readthedocs.io/en/latest/Other_Solutions/Other-Solutions.html
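
As a rough illustration of the triage step above, the sketch below runs a command with the three variables set and returns the tail of its combined output. The names `triage_env` and `run_serialized` are made up for this example, and it assumes the trace is printed to stdout or stderr of the process being run.

```python
import os
import subprocess

# The three variables from the triage steps above, kept as strings.
TRIAGE_VARS = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
}


def triage_env(base=None):
    """Return a copy of the environment with the triage variables added."""
    env = dict(os.environ if base is None else base)
    env.update(TRIAGE_VARS)
    return env


def run_serialized(cmd, tail=20):
    """Run cmd with the triage variables set; return its last `tail` output lines.

    Assumption: the trace is written to stdout or stderr, which are merged here.
    """
    proc = subprocess.run(
        cmd,
        env=triage_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.stdout.splitlines()[-tail:]
```

For example, `run_serialized(["python3", "ssd7_training.py"])` would rerun one of the scripts from the report. The whole trace is held in memory, so for long runs redirecting the output to a file may be more practical.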

@fendiwira
Author

Thanks for the prompt response.

Here is the last part of the kernel trace:

<<hip-api pid:2395 tid:4.13829 2395 4.13829 hipLaunchKernel '_ZN10tensorflow14GatherOpKernelIfxLb1EEEvPKT_PKT0_PS1_xxxx' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302701896114
<<hip-api pid:2395 tid:5.11 2395 5.11 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702164059
<<hip-api pid:2395 tid:5.14 2395 5.14 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702223749
<<hip-api pid:2395 tid:5.17 2395 5.17 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702277692
<<hip-api pid:2395 tid:5.20 2395 5.20 hipLaunchKernel '_ZN10tensorflow12_GLOBAL__N_119CropAndResizeKernelIfEEviPKT_PKfPKiiiiiiiiifPf' gridDim:{13312,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302702317537
<<hip-api pid:2395 tid:4.13893 2395 4.13893 hipLaunchKernel '_ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIxLi1ELi1ElEELi16ENS_11MakePointerEEEKNS_18TensorConversionOpIxKNS4_INS5_IKiLi1ELi1ElEELi16ES7_EEEEEENS_9GpuDeviceEEElEEvT_T0_' gridDim:{1024,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302703556961
<<hip-api pid:2395 tid:4.14003 2395 4.14003 hipLaunchKernel '_ZN10tensorflow7functor37SwapDimension1And2InTensor3UsingTilesIjLi256ELi32ELi32ELb0EEEvPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{2867200,1,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302759947163
<<hip-api pid:2395 tid:4.14005 2395 4.14005 hipLaunchKernel '_ZN10tensorflow7functor22ShuffleInTensor3SimpleIfLi2ELi1ELi0ELb0EEEviPKT_NS0_9DimensionILi3EEEPS2_' gridDim:{36864,1,1} groupDim:{1024,1,1} sharedMem:+0 stream:0.1 @302761067898
<<hip-api pid:2395 tid:2.31434 2395 2.31434 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302772118882
<<hip-api pid:2395 tid:2.31443 2395 2.31443 hipLaunchKernel 'miog_alphaab' gridDim:{43008,1,1} groupDim:{16,1,1} sharedMem:+0 stream:0.1 @302775748480
<<hip-api pid:2395 tid:2.31464 2395 2.31464 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302779837406
<<hip-api pid:2395 tid:2.31479 2395 2.31479 hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} sharedMem:+0 stream:0.1 @302783683721
Memory access fault by GPU node-1 (Agent handle: 0x1bff4b0) on address 0x6ed57a000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
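
A trace like the one above can be reduced to its last launch with a small parser. This is only a sketch: the regular expression is inferred from the lines printed in this thread, not from any documented trace format, and the function names are hypothetical.

```python
import re

# Matches the launch lines shown above, e.g.
# ... hipLaunchKernel 'MIOpenCvBwdWrW' gridDim:{131072,2,1} groupDim:{256,1,1} ...
LAUNCH_RE = re.compile(
    r"hipLaunchKernel '(?P<name>[^']+)' "
    r"gridDim:\{(?P<grid>[\d,]+)\} groupDim:\{(?P<group>[\d,]+)\}"
)


def parse_launches(lines):
    """Yield (kernel_name, grid_dim, group_dim) for every launch line."""
    for line in lines:
        match = LAUNCH_RE.search(line)
        if match:
            grid = tuple(int(x) for x in match.group("grid").split(","))
            group = tuple(int(x) for x in match.group("group").split(","))
            yield match.group("name"), grid, group


def last_launch(lines):
    """Return the last launch seen before the trace ends, or None."""
    last = None
    for launch in parse_launches(lines):
        last = launch
    return last
```

Applied to the trace above, `last_launch` returns the `MIOpenCvBwdWrW` launch with gridDim `(131072, 2, 1)` and groupDim `(256, 1, 1)`, the final kernel printed before the fault.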

@parallelo

Thanks, that's helpful. Next, would you be able to additionally run with the following:

export MIOPEN_ENABLE_LOGGING_CMD=1

Then, please send us the last section of the log.

@fendiwira
Author

OK, here is the result:


Epoch 1/30
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 3 -H 1030 -W 1030 -k 64 -y 7 -x 7 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 64 -H 512 -W 512 -y 3 -x 3 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 64 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 64 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 64 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 128 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 128 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 128 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 128 -H 128 -W 128 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 256 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 1024 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 512 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 1024 -H 64 -W 64 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 512 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 2048 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 2048 -H 32 -W 32 -k 256 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 32 -W 32 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenPoolingForward: ./bin/MIOpenDriver pool -n 1 -c 256 -H 32 -W 32 -y 1 -x 1 -p 0 -q 0 -u 2 -v 2 -m max -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 16 -W 16 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 32 -W 32 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 16 -W 16 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 64 -W 64 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 64 -W 64 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 128 -W 128 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 128 -W 128 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 256 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 256 -H 256 -W 256 -k 512 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 12 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionForward: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.566019: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
miopenConvolutionBackwardData: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
2019-01-30 07:52:27.652974: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
miopenFindConvolutionBackwardWeightsAlgorithm: ./bin/MIOpenDriver conv -n 1 -c 512 -H 256 -W 256 -k 6 -y 1 -x 1 -p 0 -q 0 -u 1 -v 1 -l 1 -j 1 -m conv -g 1 -t 1
Memory access fault by GPU node-1 (Agent handle: 0x3346900) on address 0x6dd90f000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)
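
To turn a log like this into a smaller test case, the last `MIOpenDriver` command before the fault can be pulled out mechanically. The sketch below assumes every command line has the `api: ./bin/MIOpenDriver mode -flag value ...` shape seen above; the function names are hypothetical, and the flags are kept as raw strings without interpreting them.

```python
import shlex


def parse_driver_line(line):
    """Split 'api: ./bin/MIOpenDriver conv -n 1 ...' into (api, mode, flags).

    Returns None for lines that are not MIOpen command logs.
    """
    api, sep, command = line.partition(": ")
    if not sep or "MIOpenDriver" not in command:
        return None
    tokens = shlex.split(command)
    if len(tokens) < 2:
        return None
    # tokens[0] is the driver path, tokens[1] the mode (conv, pool, ...);
    # the rest is assumed to be alternating flag/value pairs.
    mode, rest = tokens[1], tokens[2:]
    flags = dict(zip(rest[0::2], rest[1::2]))
    return api, mode, flags


def last_driver_command(lines):
    """Return the last parsed MIOpenDriver command in the log, or None."""
    last = None
    for line in lines:
        parsed = parse_driver_line(line)
        if parsed is not None:
            last = parsed
    return last
```

On the log above this returns the `miopenFindConvolutionBackwardWeightsAlgorithm` entry with `-c 512 -H 256 -W 256 -k 6`, the last command logged before the fault.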

@whchung
Collaborator

whchung commented Jan 30, 2019

@daniellowell @zjing14, please refer to the logs above. It appears one configuration of MIOpen backward weights is failing on gfx803.

@sunway513

Hi @fendiwira, can you try the following steps and see if they fix your issue:

cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl &&
wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb && 
wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && 
sudo dpkg -i rocm-opencl*.deb && rm -rf ~/.cache

@fendiwira
Author

Hi @sunway513, thanks.
I'll try it and get back to you.

@arsenm

arsenm commented Feb 1, 2019

The most suspicious thing I've found is this:

BB0_10:
       v_add_u32_e32 v19, vcc, 4, v15
       v_add_u32_e32 v19, vcc, 16, v19
       buffer_load_dword v19, v19, s[0:3], s11 offen
       v_add_u32_e32 v15, vcc, 4, v15

The total allocated private size is 20 bytes, and this is accessing 20 bytes off the scratch wave offset. In principle the base pointer here could be negative, but as far as I can tell that can't actually happen.

@arsenm

arsenm commented Feb 1, 2019

Never mind, this only appears in the mangled version I built while trying to find the fault point.

@yet-another-account

yet-another-account commented Feb 2, 2019

I am getting this error on my RX580 too. I have pared down my code to isolate the problem:

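import numpy as np
import tensorflow as tf

# TensorFlow 1.x API: enable eager mode so ops run immediately.
tf.enable_eager_execution()
print(tf.executing_eagerly())

# A single 3x3 convolution layer is enough to trigger the fault.
model = tf.keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')

# Random 128x128 single-channel image, resized to 128x256.
img = tf.random_uniform((1, 128, 128, 1), dtype=tf.float32)
img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)

print(img.shape)
# The numbered prints mark progress, so the last number shown tells where the run stopped.
with tf.GradientTape() as tape:
    print(1)
    img_hat = model(img)  # forward pass
    print(2)
    loss = tf.reduce_mean(img_hat)
    print(3)
# Backward pass through the convolution.
grads = tape.gradient(loss, model.variables)
print(4)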

This fails with Memory access fault by GPU node-1 (Agent handle: 0x21eefd0) on address 0xadc658000. Reason: Page not present or supervisor privilege.

However, interestingly, when

img = tf.image.resize_images(img, [128, 256], align_corners=True, preserve_aspect_ratio=False)

is replaced with

img = tf.image.resize_images(img, [128, 128], align_corners=True, preserve_aspect_ratio=False)

it succeeds without problems.

EDIT: after downgrading rocm-opencl to 1.2.0-2018111340, it works perfectly.
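Because the fault aborts the whole Python process, comparing several input shapes in one script is awkward. One possible approach, sketched here under the assumption of a POSIX system, is to run each attempt in a child process and inspect how it ended. `run_isolated` is a hypothetical helper, and the child commands below are stand-ins rather than the real TensorFlow test.

```python
import signal
import subprocess
import sys


def run_isolated(argv, timeout=600):
    """Run argv in a child process and describe how it ended."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # POSIX only: a negative return code is the number of the signal
        # that killed the child (SIGABRT for an abort).
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit code {}".format(proc.returncode)


if __name__ == "__main__":
    # Stand-in children: one that aborts like the faulting run, one that is fine.
    # A real check would pass something like [sys.executable, "test.py"] instead.
    print(run_isolated([sys.executable, "-c", "import os; os.abort()"]))
    print(run_isolated([sys.executable, "-c", "print('ok')"]))
```

With the reproduction script parametrized on the resize shape (an assumption, the script above hard-codes it), the 128x128 and 128x256 cases could be run back to back without the first crash ending the comparison.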

@fendiwira
Author

Hi @sunway513, it works, thank you.

@sunway513

@fendiwira thanks for the feedback! Will update when there's an official fix available.

@sunway513 sunway513 added the bug Something isn't working label Feb 7, 2019
@sunway513 sunway513 self-assigned this Feb 7, 2019
@sunway513 sunway513 added the gfx803 issue specific to gfx803 GPUs label Feb 7, 2019
@johnneijzen

It also works here. I had a similar problem while training an object detection model using Faster R-CNN Inception v2, but after that downgrade it worked again.

@leosarra

leosarra commented May 5, 2019

Same problem here on my RX480 when training a VGG16 network.
Downgrading to an older rocm-opencl release (1.2.0-2018111340) prevented the issue from showing up.

@Bengt

Bengt commented May 14, 2019

I put @eukaryote31's test on gist for easier reproduction:

https://gist.github.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb

I am reproducing using ROCm 2.1 and TensorFlow 1.12:

$ docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/tensorflow:rocm2.1-tf1.12-python3
$ wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/34e0426e10e665df0f66c298bb07f879bb2abe79/test.py

The test completes without error on CPU (Threadripper 1950X):

# env HIP_VISIBLE_DEVICES= python3 test.py
[...]
True
2019-05-14 11:40:41.038653: E tensorflow/stream_executor/rocm/rocm_driver.cc:965] could not retrieve ROCM device count: HIP_ERROR_NoDevice
(1, 128, 256, 1)
1
2
3
4

The test fails with the aforementioned Memory access fault on GPU (gfx803, Fiji, Fury X):

# python3 test.py           
[...]
True
2019-05-14 11:34:23.980338: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:34:23.980710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:34:23.980750: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:34:23.980765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057]      0 1 2 
2019-05-14 11:34:23.980775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0:   N N N 
2019-05-14 11:34:23.980784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1:   N N N 
2019-05-14 11:34:23.980792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2:   N N N 
2019-05-14 11:34:23.980860: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:34:23.997726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:34:24.014538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:34:28.159093: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:34:28.755163: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
Memory access fault by GPU node-2 (Agent handle: 0x2c5aee0) on address 0xbe1c00000. Reason: Page not present or supervisor privilege.
Aborted (core dumped)

The downgrade suggested by @sunway513 works for me too:

# cd ~ && mkdir rocm1.9.2-opencl && cd rocm1.9.2-opencl && wget https://www.dropbox.com/s/rtwe1zrpuphbyqm/rocm-opencl-1.2.0-2018111340_amd64.deb &&  wget https://www.dropbox.com/s/6gp2g5zju66i4e9/rocm-opencl-dev-1.2.0-2018111340_amd64.deb && dpkg -i rocm-opencl*.deb && rm -rf ~/.cache && cd -
# python3 test.py 
[...]
True
2019-05-14 11:47:45.049593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 0 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049735: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 1 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1
pciBusID 0000:42:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1530] Found device 2 with properties: 
name: Device 7300
AMDGPU ISA: gfx803
memoryClockRate (GHz) 1.05
pciBusID 0000:43:00.0
Total memory: 4.00GiB
Free memory: 3.75GiB
2019-05-14 11:47:45.049912: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Adding visible gpu devices: 0, 1, 2
2019-05-14 11:47:45.049944: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:47:45.049956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1057]      0 1 2 
2019-05-14 11:47:45.049966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 0:   N N N 
2019-05-14 11:47:45.049976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 1:   N N N 
2019-05-14 11:47:45.049987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1070] 2:   N N N 
2019-05-14 11:47:45.050053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3540 MB memory) -> physical GPU (device: 0, name: Device 7300, pci bus id: 0000:09:00.0)
2019-05-14 11:47:45.066602: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 3540 MB memory) -> physical GPU (device: 1, name: Device 7300, pci bus id: 0000:42:00.0)
2019-05-14 11:47:45.084008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1189] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 3540 MB memory) -> physical GPU (device: 2, name: Device 7300, pci bus id: 0000:43:00.0)
(1, 128, 256, 1)
1
2
3
2019-05-14 11:47:49.123642: I tensorflow/core/kernels/conv_grad_input_ops.cc:1023] running auto-tune for Backward-Data
2019-05-14 11:47:49.890313: I tensorflow/core/kernels/conv_grad_filter_ops.cc:975] running auto-tune for Backward-Filter
4
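When collecting reports like the ones in this thread, it can help to pull the node, address, and reason out of the fault line automatically. The following is a minimal sketch: the regular expression is inferred only from the fault messages quoted in this issue, and `parse_fault` is a hypothetical helper, not part of ROCm.

```python
import re

# Pattern inferred from the fault lines quoted in this thread.
FAULT_RE = re.compile(
    r"Memory access fault by GPU node-(?P<node>\d+) "
    r"\(Agent handle: (?P<agent>0x[0-9a-fA-F]+)\) "
    r"on address (?P<address>0x[0-9a-fA-F]+)\. "
    r"Reason: (?P<reason>.+?)\.?$"
)


def parse_fault(line):
    """Return the fields of a fault line as a dict, or None if it does not match."""
    match = FAULT_RE.search(line.strip())
    if match is None:
        return None
    return {
        "node": int(match.group("node")),
        "agent": match.group("agent"),
        "address": int(match.group("address"), 16),
        "reason": match.group("reason"),
    }


fault = parse_fault(
    "Memory access fault by GPU node-2 (Agent handle: 0x2c5aee0) "
    "on address 0xbe1c00000. Reason: Page not present or supervisor privilege."
)
print(fault)
```

Lines that are not fault messages, such as the TensorFlow device listings, simply return `None`, so the helper can be applied to a whole log line by line.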

@Bengt

Bengt commented May 14, 2019

The issue persists and the downgrade still fixes it with today's rocm2.3-tf1.13-imagenet-training.

@Bengt

Bengt commented May 17, 2019

This issue persists with rocm2.4-tf2.0-alpha0-config-v2 and the downgrade still fixes it.

@gaetanbahl

I have the same issue using an R9 Fury card, following the installation guide https://rocm.github.io/tensorflow.html

The downgrade indeed fixed the issue.

A "true" fix would be preferable. Let me know if you need anything (config details, tests...).

@sunway513

Hi all, we have included a set of OpenCL toolchain fixes for GFX803 targets in ROCm2.5. On my local GFX803 setup with the ROCm2.5 docker image, the VM fault is no longer reproducible using the reduced test from @Bengt.
Please try the following docker image:
rocm/tensorflow:rocm2.5-tf1.13-python3

@gaetanbahl

Hello @sunway513, I tried the new image on R9 Fury (non X) and am still getting this issue when running the following command:

python3 benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --num_gpus=1 --batch_size=4 --model=vgg16

BTW, I had to copy /opt/rocm/miopen/share/miopen/db/gfx803_64.cd.pdb.txt to /opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt in order to avoid annoying MIOpen(HIP): Warning [FindRecordUnsafe] File is unreadable:/opt/rocm/miopen/share/miopen/db/gfx803_56.cd.pdb.txt messages.

@sunway513

Hi @gaetanbahl, VGG16 runs correctly on my local GFX803 setup using the ROCm2.5 docker image.
Could you provide the output of the following commands:
uname -a
apt --installed list | grep rock-dkms
It would also help if you could confirm that the HIP unit tests pass:
https://github.com/ROCm-Developer-Tools/HIP/tree/master/tests

Regarding the gfx803 MIOpen perfDB: by default, MIOpen provides the following performance databases:
gfx803_36.cd.pdb.txt gfx803_64.cd.pdb.txt gfx900_56.cd.pdb.txt gfx900_64.cd.pdb.txt gfx906_60.cd.pdb.txt gfx906_64.cd.pdb.txt
It seems your R9 Fury board spec is not on the list.
@daniellowell, could you comment on this issue?
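The mismatch can be checked mechanically. The sketch below assumes the file names follow the pattern `<ISA>_<number>.cd.pdb.txt` and that the number is the compute unit count; both are inferences from the names listed above and from the gfx803_56 warning, not documented MIOpen behaviour. `perf_db_name` and `has_perf_db` are hypothetical helpers.

```python
# File names copied from the list of default performance databases above.
SHIPPED_PERF_DBS = [
    "gfx803_36.cd.pdb.txt",
    "gfx803_64.cd.pdb.txt",
    "gfx900_56.cd.pdb.txt",
    "gfx900_64.cd.pdb.txt",
    "gfx906_60.cd.pdb.txt",
    "gfx906_64.cd.pdb.txt",
]


def perf_db_name(isa, compute_units):
    # Assumed naming scheme: <ISA>_<compute unit count>.cd.pdb.txt
    return "{}_{}.cd.pdb.txt".format(isa, compute_units)


def has_perf_db(isa, compute_units, shipped=SHIPPED_PERF_DBS):
    """Return True if a database for this ISA/CU combination is in the list."""
    return perf_db_name(isa, compute_units) in shipped


print(perf_db_name("gfx803", 56), has_perf_db("gfx803", 56))
print(perf_db_name("gfx803", 64), has_perf_db("gfx803", 64))
```

Under those assumptions, a gfx803 board reporting 56 has no matching file, which is consistent with the warning reported for the R9 Fury.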

@gaetanbahl

I am using the docker image you mentioned.

root@epsilon:/dockerx# uname -a
Linux epsilon 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

$ apt --installed list | grep rock-dkms
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
rock-dkms/now 2.4-25 all [installed,upgradable to: 2.5-27]

Oh, I guess I should upgrade rock-dkms, sorry... I will upgrade and try again.
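A stale host driver like this is easy to miss when the user-space stack lives in a container. As a small sketch, the relevant fields can be read out of the `apt` line shown above. The pattern is inferred from that single line, and `parse_apt_line` is a hypothetical helper.

```python
import re

# Pattern inferred from the single "apt list" line quoted above.
APT_LINE_RE = re.compile(
    r"^(?P<name>[^/\s]+)/\S+\s+(?P<version>\S+)\s+(?P<arch>\S+)\s+\[(?P<status>[^\]]*)\]"
)


def parse_apt_line(line):
    """Return name, installed version, and pending upgrade (or None) for one line."""
    match = APT_LINE_RE.match(line.strip())
    if match is None:
        return None
    upgrade = re.search(r"upgradable to: (\S+)", match.group("status"))
    return {
        "name": match.group("name"),
        "version": match.group("version"),
        "upgradable_to": upgrade.group(1) if upgrade else None,
    }


info = parse_apt_line("rock-dkms/now 2.4-25 all [installed,upgradable to: 2.5-27]")
print(info)
```

A non-empty `upgradable_to` for rock-dkms is a hint that the kernel driver on the host is older than the ROCm release inside the docker image.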

@gaetanbahl

@sunway513 Indeed, I don't get the memory error anymore, only the warning about the unreadable .txt file.

Thanks for your help!

Can you confirm that simply copying the gfx803_64.cd.pdb.txt file to gfx803_56.cd.pdb.txt will not give me problems?

@sunway513

@gaetanbahl , thanks for the update :-)
Copying the MIOpen performance database won't cause any functional issues.

@leosarra

leosarra commented Jun 8, 2019

Can confirm that the crash doesn't occur anymore on my RX 480. Thank you for your hard work

@sunway513

Thank you @LithiumSR for confirming it!

@Bengt

Bengt commented Jun 20, 2019

I can confirm the test working under rocm2.5-tf1.13-python3 with R9 Fury X and Nano. Thanks for fixing!

@urugn

urugn commented Nov 19, 2019

I'm not sure whether to open a new issue, because I'm having the same issue but with gfx900 (Vega 64).
Sometimes it runs, but over 70% of the time this error occurs.
In my case, I installed ROCm on Ubuntu 18.04 and compiled MIVisionX from source.

@sunway513

@urugn can you try the docker container:
https://hub.docker.com/repository/docker/rocm/tensorflow

@minzak

minzak commented Dec 2, 2019

Same problem with miner on gfx900 (Vega FE)
xmrig/xmrig#1340

@ranisalt

ranisalt commented Feb 3, 2020

Same problem on Vega M GH. Setting HCC_SERIALIZE_KERNEL=0x3 HCC_SERIALIZE_COPY=0x3 HIP_TRACE_API=0x2 MIOPEN_ENABLE_LOGGING_CMD=1 produces no further output.
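For repeated runs, the same variables can be applied from Python instead of being typed on the command line. This is only a sketch: the variable names and values are exactly the ones quoted above, while `debug_environment` is a hypothetical helper.

```python
import os

# Debug settings exactly as quoted in the comment above.
DEBUG_ENV = {
    "HCC_SERIALIZE_KERNEL": "0x3",
    "HCC_SERIALIZE_COPY": "0x3",
    "HIP_TRACE_API": "0x2",
    "MIOPEN_ENABLE_LOGGING_CMD": "1",
}


def debug_environment(base=None):
    """Return a copy of the base environment with the debug variables added."""
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_ENV)
    return env


env = debug_environment({"PATH": "/usr/bin"})
print(sorted(env))
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when launching the failing script, which leaves the parent shell's environment untouched.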

@Bengt

Bengt commented Feb 3, 2020

I ported the test to TensorFlow 2:

wget https://gist.githubusercontent.com/Bengt/2d4b8535c781ded2b9ce653cfe7b0eeb/raw/c1ba1169aebdc980a144ac1672c6402235a470aa/test_tf2.py

It still works with image rocm/tensorflow:rocm3.0-tf2.1-rc1-python3 on 4 x Vega 64 8 GB Liquid Edition.

@Dan-RAI

Dan-RAI commented Feb 4, 2020

Same problem on a Radeon VII running custom HIP-ported code distributed via Ray. The code runs flawlessly without Ray. On NVIDIA there are no problems with the non-ported code and Ray.

@FormulasT

This problem still exists when I use the latest rocm/tensorflow docker image. I have been trying since yesterday.

@Soddentrough

Another Radeon VII with the same issue (on AI Benchmark):

MIOpen Error: /root/driver/MLOpen/src/gemm_v2.cpp:523: rocBlas error encountered
Memory access fault by GPU node-1 (Agent handle: 0x5600f99cb850) on address 0x19000. Reason: Unknown.

ROCm: 3.5.0
TF Version: 2.2.0

@spades1404

Can somebody rehost the Dropbox files from the fix that @sunway513 posted? They are no longer available, so I cannot run the commands. Thanks!

@Extarys

Extarys commented Oct 15, 2020

I also tried to install AMDGPU-PRO, but OpenCL wasn't available. I was able to install ROCm, and OpenCL is now detected, but I also get this error.
My guess is that even if the Dropbox links above worked, the files might be outdated for the current version, @spades1404.

@spades1404

Yeah, I eventually figured that out. It turns out 3.8 is broken (at least for me), and after many hours trying to configure a docker container with the "apparently" working 2.5 downgrade, I ran into more compatibility issues with Python, since it uses Python 3.5. If apt-get hosted the lower versions, I could have just downgraded the version on my local machine. Anyway, I've decided to just use Colab now!

@sunway513

The OpenCL packages I posted last year can be found here:
http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl/rocm-opencl_1.2.0-2018111340_amd64.deb
http://repo.radeon.com/rocm/apt/1.9.3/pool/main/r/rocm-opencl-dev/rocm-opencl-dev_1.2.0-2018111340_amd64.deb
However, the newly reported issue should be different, and most likely would not benefit from the old OpenCL packages.

@Extarys @spades1404 Can you help create a new issue and provide the following information:

  • ROCm version
  • GPU model
  • Steps to reproduce the issue

cc @jerryyin @deven-amd
