
bfloat16 support for quickgelugrad #18336

Merged: 3 commits into main from prathikrao/quickgelugrad-bfloat16 on Nov 8, 2023
Conversation

prathikr (Contributor) commented on Nov 7, 2023

Description

Registers the BFloat16 data type as a valid input type for the CUDA QuickGeluGrad kernel.

Motivation and Context

Enables meta-llama/Llama-2-70b to be fine-tuned with ONNX Runtime training.
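For readers unfamiliar with how ONNX Runtime enables an additional data type on a CUDA kernel, the change most likely follows the project's usual typed-registration pattern, sketched below. This is a hedged illustration, not the actual diff: the macros shown (`ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME`, `ONNX_OPERATOR_TYPED_KERNEL_EX`, `BuildKernelCreateInfo`) are ONNX Runtime's real registration machinery, but the exact file, opset version, and builder flags used in this PR are assumptions.

```cpp
// Sketch only: enabling BFloat16 for the CUDA QuickGeluGrad kernel, following
// ONNX Runtime's typed kernel registration conventions. This compiles only
// inside the onnxruntime source tree; the real PR's arguments may differ.

// Declare the BFloat16 specialization of the kernel class
// (QuickGeluGrad is a contrib/training op in the Microsoft domain).
class ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(
    kCudaExecutionProvider, kMSDomain, 1, BFloat16, QuickGeluGrad);

// Register it with a type constraint so the CUDA execution provider
// accepts BFloat16 tensors for the "T" type parameter.
ONNX_OPERATOR_TYPED_KERNEL_EX(
    QuickGeluGrad,
    kMSDomain,
    1,
    BFloat16,
    kCudaExecutionProvider,
    (*KernelDefBuilder::Create())
        .TypeConstraint("T", DataTypeImpl::GetTensorType<BFloat16>()),
    QuickGeluGrad<BFloat16>);

// Finally, the new specialization is listed in the execution provider's
// kernel registry table alongside the existing float/MLFloat16 entries:
//   BuildKernelCreateInfo<ONNX_OPERATOR_TYPED_KERNEL_CLASS_NAME(
//       kCudaExecutionProvider, kMSDomain, 1, BFloat16, QuickGeluGrad)>,
```

In practice the device code also needs a BFloat16 instantiation of the elementwise gradient implementation; kernels like this are typically specialized through explicit template instantiation lists, so adding a type is largely a matter of extending those macro lists rather than writing new CUDA code.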

@prathikr prathikr requested a review from hanbitmyths November 8, 2023 05:04
@prathikr prathikr merged commit 34f77ea into main Nov 8, 2023
@prathikr prathikr deleted the prathikrao/quickgelugrad-bfloat16 branch November 8, 2023 16:40
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
Co-authored-by: Prathik Rao <[email protected]@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>