
disable gemm f16 on CPU #19744

Merged
merged 2 commits into main from yufeng/disable_gemm_f16 on Mar 1, 2024
Conversation

yufenglee (Member)

Description

Temporarily disable fp16 Gemm on CPU because it usually needs a following Cast, which offsets the gain. More fp16 operator implementations and performance tuning are needed.

Also fix a fusion error in LayerNormalization.
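For context, a minimal sketch (not taken from this PR, with illustrative names and shapes) of the graph pattern the description refers to: when downstream CPU kernels only support fp32, an fp16 Gemm ends up followed by a Cast back to fp32, and that Cast offsets the fp16 speedup.

```python
# Sketch of the fp16 Gemm + Cast pattern; names and shapes are illustrative.
import onnx
from onnx import TensorProto, helper

gemm = helper.make_node("Gemm", ["A", "B"], ["Y_fp16"])                   # fp16 matmul
cast = helper.make_node("Cast", ["Y_fp16"], ["Y"], to=TensorProto.FLOAT)  # back to fp32

graph = helper.make_graph(
    [gemm, cast],
    "fp16_gemm_then_cast",
    inputs=[
        helper.make_tensor_value_info("A", TensorProto.FLOAT16, [4, 8]),
        helper.make_tensor_value_info("B", TensorProto.FLOAT16, [8, 16]),
    ],
    outputs=[helper.make_tensor_value_info("Y", TensorProto.FLOAT, [4, 16])],
)
model = helper.make_model(graph)
onnx.checker.check_model(model)  # valid graph; the trailing Cast is the overhead in question
```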

Motivation and Context

yufenglee marked this pull request as ready for review Mar 1, 2024 17:43
snnn previously approved these changes Mar 1, 2024
pranavsharma previously approved these changes Mar 1, 2024
yufenglee dismissed stale reviews from pranavsharma and snnn via dc6fa4e Mar 1, 2024 18:19
yufenglee merged commit 22176a5 into main Mar 1, 2024
93 of 95 checks passed
yufenglee deleted the yufeng/disable_gemm_f16 branch Mar 1, 2024 21:44
maggie1059 pushed a commit that referenced this pull request Mar 5, 2024
zz002 pushed a commit to zz002/onnxruntime that referenced this pull request Mar 7, 2024
Labels: none yet
Projects: none yet
3 participants