
disable gemm f16 on CPU #19744

Merged
merged 2 commits into main from yufeng/disable_gemm_f16 on Mar 1, 2024
Conversation

yufenglee (Member)

Description

Temporarily disable fp16 Gemm on CPU because it usually needs a following Cast, which offsets the gain. More fp16 operator implementations and performance tuning are needed.

Also fix a fusion error in LayerNormalization.
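For context, a minimal sketch (not taken from this PR, with illustrative names and shapes) of the graph pattern the description refers to: when downstream CPU kernels only support fp32, an fp16 Gemm ends up followed by a Cast back to fp32, and that Cast offsets the fp16 speedup.

```python
# Sketch of the fp16 Gemm + Cast pattern; names and shapes are illustrative.
import onnx
from onnx import TensorProto, helper

gemm = helper.make_node("Gemm", ["A", "B"], ["Y_fp16"])                   # fp16 matmul
cast = helper.make_node("Cast", ["Y_fp16"], ["Y"], to=TensorProto.FLOAT)  # back to fp32

graph = helper.make_graph(
    [gemm, cast],
    "fp16_gemm_then_cast",
    inputs=[
        helper.make_tensor_value_info("A", TensorProto.FLOAT16, [4, 8]),
        helper.make_tensor_value_info("B", TensorProto.FLOAT16, [8, 16]),
    ],
    outputs=[helper.make_tensor_value_info("Y", TensorProto.FLOAT, [4, 16])],
)
model = helper.make_model(graph)
onnx.checker.check_model(model)  # valid graph; the trailing Cast is the overhead in question
```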

Motivation and Context

yufenglee marked this pull request as ready for review Mar 1, 2024 17:43
snnn previously approved these changes Mar 1, 2024
pranavsharma previously approved these changes Mar 1, 2024
yufenglee dismissed stale reviews from pranavsharma and snnn via dc6fa4e Mar 1, 2024 18:19
yufenglee merged commit 22176a5 into main Mar 1, 2024
93 of 95 checks passed
yufenglee deleted the yufeng/disable_gemm_f16 branch Mar 1, 2024 21:44
maggie1059 pushed a commit that referenced this pull request Mar 5, 2024
zz002 pushed a commit to zz002/onnxruntime that referenced this pull request Mar 7, 2024
Labels: none yet
Projects: none yet
3 participants