Sign CUDA Kernel #17293

baijumeswani · 2023-08-25T04:56:37Z

l1_loss is defined as mean(abs(y1 - y2)).

If y = abs(x), then dy/dx = sign(x), so the backward pass of l1_loss needs the Sign operator.
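To make the relationship concrete, here is a minimal C++ sketch of the forward and backward computation for an L1 loss. It is an illustration only, not code from this PR: the names `SignOf`, `L1Loss`, and `L1LossGradY1` are hypothetical, and the sketch assumes equal-length, non-empty float inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// sign(x): +1 for positive, -1 for negative, 0 for zero.
inline float SignOf(float x) {
  return static_cast<float>((x > 0.0f) - (x < 0.0f));
}

// Forward pass: mean(abs(y1 - y2)). Assumes equal-length, non-empty inputs.
inline float L1Loss(const std::vector<float>& y1, const std::vector<float>& y2) {
  assert(!y1.empty() && y1.size() == y2.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < y1.size(); ++i) {
    sum += std::fabs(y1[i] - y2[i]);
  }
  return sum / static_cast<float>(y1.size());
}

// Backward pass w.r.t. y1: d(loss)/d(y1[i]) = sign(y1[i] - y2[i]) / N.
inline std::vector<float> L1LossGradY1(const std::vector<float>& y1,
                                       const std::vector<float>& y2) {
  assert(!y1.empty() && y1.size() == y2.size());
  const float scale = 1.0f / static_cast<float>(y1.size());
  std::vector<float> grad(y1.size());
  for (std::size_t i = 0; i < y1.size(); ++i) {
    grad[i] = SignOf(y1[i] - y2[i]) * scale;
  }
  return grad;
}
```

The gradient is just the sign of each difference scaled by 1/N, which is why a Sign node shows up in the backward graph.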

In onnxruntime, Sign does not have a CUDA kernel, so the node falls back to the CPU and the execution graph looks like: MemcpyToHost -> Sign -> MemcpyFromHost

This PR implements the Sign CUDA kernel to avoid the memcpy nodes.
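As a rough illustration of what an elementwise Sign kernel computes, the following plain C++ sketch applies the operation to a buffer. It is a CPU stand-in written for this explanation, not the onnxruntime implementation: `SignElement` and `SignElementwise` are hypothetical names, and in a CUDA kernel the loop body would typically run once per thread rather than in a host loop.

```cpp
#include <cassert>
#include <cstddef>

// Per-element operation: 1 if x > 0, -1 if x < 0, 0 if x == 0.
// Intended for signed arithmetic types in this sketch.
template <typename T>
inline T SignElement(T x) {
  return static_cast<T>((T(0) < x) - (x < T(0)));
}

// Host-side stand-in for an elementwise kernel launch over `count` elements.
// On a GPU, each iteration of this loop would map to one thread.
template <typename T>
void SignElementwise(const T* input, T* output, std::size_t count) {
  assert(count == 0 || (input != nullptr && output != nullptr));
  for (std::size_t i = 0; i < count; ++i) {
    output[i] = SignElement(input[i]);
  }
}
```

Because each output element depends only on the matching input element, the operation needs no communication between elements, which is what makes it a natural fit for a device-side unary elementwise kernel.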

centwang · 2023-08-25T08:15:23Z

If the percentage of kernel time in the profile result is minor, I actually think that adding a CUDA kernel for Sign is much simpler, since it only requires changing a few lines in the unary elementwise code, and it also helps ORT run inference or forward graphs with Sign on CUDA in the future...

baijumeswani · 2023-08-25T21:17:06Z

> If the percentage of kernel time in the profile result is minor, I actually think that adding a CUDA kernel for Sign is much simpler, since it only requires changing a few lines in the unary elementwise code, and it also helps ORT run inference or forward graphs with Sign on CUDA in the future...

Makes sense. I was initially debating whether to add a Sign CUDA kernel or an AbsGrad CUDA kernel.

I have now changed the PR to add the Sign CUDA kernel.

baijumeswani · 2023-08-29T04:04:38Z

Thank you for the review @er3x3 @hariharans29

Cherry-pick PRs: #18026 #17912 #17901 ("2 lines added whitespace errors when cherry-picking") #17293 #17364 #17505 #17885

This PR contains all the cherry-picks for the patch release except:
1. The PRs marked with sdxl_llama
2. #17772, which has a merge conflict.

---------

Co-authored-by: Chi Lo
Co-authored-by: Scott McKay
Co-authored-by: Baiju Meswani
Co-authored-by: Kaz Nishimura

baijumeswani added the training label Aug 25, 2023

baijumeswani requested review from askhade, pengwa and centwang August 25, 2023 04:56

Address pull request review comment

9220980

baijumeswani force-pushed the baijumeswani/abs-grad branch from 2ea9aa9 to 9220980 August 25, 2023 21:15

Merge branch 'main' of https://github.com/microsoft/onnxruntime into baijumeswani/abs-grad

c53e9c2

baijumeswani changed the title ~~AbsGrad CPU and CUDA Kernels~~ Sign CUDA Kernel Aug 25, 2023

baijumeswani requested a review from hariharans29 August 25, 2023 21:29

centwang previously approved these changes Aug 28, 2023

baijumeswani added 4 commits August 28, 2023 16:33

Add rocm execution provider for Sign

b5ba287

Merge branch 'main' of https://github.com/microsoft/onnxruntime into baijumeswani/abs-grad

41aea97

Update operator kernels doc

4a0bfaa

Merge branch 'main' of https://github.com/microsoft/onnxruntime into baijumeswani/abs-grad

17a58d8

hariharans29 previously approved these changes Aug 28, 2023

baijumeswani dismissed stale reviews from hariharans29 and centwang via 17a58d8 August 28, 2023 17:21

Add definitions to rocm ep

61ae117

centwang approved these changes Aug 29, 2023

baijumeswani merged commit 5d2c573 into main Aug 29, 2023

baijumeswani deleted the baijumeswani/abs-grad branch August 29, 2023 04:03

baijumeswani added the release:1.16.2 label Oct 10, 2023

snnn mentioned this pull request Nov 1, 2023

Some cherry-picks for the 1.16.2 release #18218

Merged

tianleiwu removed the release:1.16.2 label Nov 7, 2023

kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024

Sign CUDA Kernel (microsoft#17293)

e676def
