PyTorch Hugging Face Models do not have ACL calls for Docker versions > 23.05 #200

abhishek-rn · 2023-09-19T05:24:11Z

Hi,

Docker Tags:
r23.09-torch-2.0.0-onednn-acl
r23.05-torch-2.0.0-onednn-acl

I am unable to get ACL calls in Docker versions higher than 23.05 for PyTorch Hugging Face models.

Attaching the oneDNN verbose logs for the BERT model here:
23.05_Bert_Verbose.txt
23.09_Bert_Verbose.txt

The code to reproduce this is attached below:
PyT_Bert_Training.txt --> Use this for the first run to generate the necessary inference checkpoints and files.
PyT_Bert_Inf.txt --> Use this for subsequent runs to generate the oneDNN logs.
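For anyone reproducing this, oneDNN verbose output is normally switched on through an environment variable set before the process starts. The sketch below is one way to do that from Python. The helper name `run_with_onednn_verbose` is hypothetical, and setting `ONEDNN_VERBOSE` alongside `DNNL_VERBOSE` is an assumption meant to cover both older and newer oneDNN builds; it is not taken from the attached scripts.

```python
import os
import subprocess
import sys


def run_with_onednn_verbose(cmd, level="1"):
    """Run `cmd` with oneDNN verbose logging enabled and return its stdout.

    Hypothetical helper for illustration only.
    """
    # DNNL_VERBOSE is the long-standing variable name; ONEDNN_VERBOSE is set
    # as well on the assumption that newer builds may prefer that spelling.
    env = dict(os.environ, DNNL_VERBOSE=level, ONEDNN_VERBOSE=level)
    result = subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in command that only echoes the variable, so the sketch runs
    # without PyTorch installed.
    echo = [sys.executable, "-c", "import os; print(os.environ['DNNL_VERBOSE'])"]
    print(run_with_onednn_verbose(echo))
```

To capture a real log, the stand-in command would be replaced with the inference script, for example `[sys.executable, "PyT_Bert_Inf.py"]`, assuming the attached PyT_Bert_Inf.txt has been saved under that (hypothetical) file name.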

As a result, the oneDNN verbose log from the later image (23.09) shows gemm:jit calls for matmuls instead of gemm:acl calls, which gives poorer inference performance.
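A quick way to compare two verbose logs is to count how often each implementation is used per primitive. The sketch below assumes the comma-separated layout in which `exec` is followed by the engine, the primitive name, and the implementation name. The function name `count_implementations` and the sample lines are made up for illustration and are not copied from the attached logs.

```python
from collections import Counter


def count_implementations(lines):
    """Count (primitive, implementation) pairs in oneDNN verbose output."""
    counts = Counter()
    for line in lines:
        fields = line.strip().split(",")
        # Verbose lines start with "dnnl_verbose" or "onednn_verbose".
        if not fields[0].endswith("_verbose") or "exec" not in fields:
            continue
        i = fields.index("exec")
        if len(fields) < i + 4:
            continue
        # Assumed layout: exec, engine, primitive, implementation, ...
        primitive, impl = fields[i + 2], fields[i + 3]
        counts[(primitive, impl)] += 1
    return counts


# Illustrative lines only; shapes and timings are invented.
SAMPLE = [
    "dnnl_verbose,info,cpu,runtime:threadpool",
    "dnnl_verbose,exec,cpu,matmul,gemm:acl,undef,src_f32 wei_f32 dst_f32,,,8x128x768:8x768x768:8x128x768,0.5",
    "dnnl_verbose,exec,cpu,matmul,gemm:jit,undef,src_f32 wei_f32 dst_f32,,,8x128x768:8x768x768:8x128x768,0.9",
    "dnnl_verbose,exec,cpu,matmul,gemm:jit,undef,src_f32 wei_f32 dst_f32,,,8x128x768:8x768x768:8x128x768,0.9",
]

if __name__ == "__main__":
    for (primitive, impl), n in sorted(count_implementations(SAMPLE).items()):
        print(primitive, impl, n)
```

Running this over each log file (for example with `count_implementations(open(path))`) would show whether matmuls are dispatched to gemm:acl or gemm:jit in each image.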

Thanks

nSircombe · 2023-09-19T08:48:55Z

Hi @abhishek-rn
Thanks for the report. This transition from 23.05 to 23.06 marks the move from PyTorch 1.x to 2.x, so it looks like we may have lost some functionality at this stage.
Would you be able to confirm whether the same behaviour is present if you use the pip-installed PyTorch packages for 1.13 and 2.0 on AArch64, and also on x86?

abhishek-rn · 2023-09-19T09:33:47Z

Hi @nSircombe
The Docker tag reads r23.05-torch-2.0.0-onednn-acl, so I assumed it meant torch 2.0.0.
However, I ran the pip-installed PyTorch 2.0.0 and 1.13; please find the logs below:
ARM_PyT_1.13_Bert_Verbose.txt
ARM_PyT_2.0.0_Bert_Verbose.txt

The results show that PyTorch 1.13 makes no ACL calls, but PyTorch 2.0.0 does.

x86_PyT_1.13_Bert_Verbose.txt
x86_PyT_2.0.0_Bert_Verbose.txt

Also, PyTorch on x86 does not make oneDNN calls for matmuls, as seen in the logs above.

nSircombe · 2023-09-19T11:05:50Z

Yes, you're right, the version is 2.0. The tag is correct and matches the version in the Dockerfile. The mistake is in the README for the 23.05 increment, which still lists 1.13.
