This repository has been archived by the owner on Nov 4, 2024. It is now read-only.
avik-pal force-pushed the ap/fused_dense branch 12 times, most recently from 04b13d6 to 5a31de5 (April 18, 2024 15:15)
avik-pal force-pushed the ap/fused_dense branch 6 times, most recently from 8271ed8 to c910e11 (April 19, 2024 19:43)
avik-pal force-pushed the ap/fused_dense branch from c910e11 to 2fde902 (April 19, 2024 20:22)
avik-pal force-pushed the ap/fused_dense branch 2 times, most recently from 0a77c67 to b14870e (April 19, 2024 22:06)
avik-pal force-pushed the ap/fused_dense branch 3 times, most recently from 6feb0c2 to 7d58f05 (April 23, 2024 03:35)
avik-pal force-pushed the ap/fused_dense branch 4 times, most recently from 2c743ca to 8040ad1 (April 23, 2024 13:07)
avik-pal force-pushed the ap/fused_dense branch 15 times, most recently from a649e37 to 921a393 (April 24, 2024 02:05)
avik-pal force-pushed the ap/fused_dense branch from 921a393 to 6a08a48 (April 24, 2024 02:25)
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #53      +/-   ##
==========================================
+ Coverage   75.97%   82.81%   +6.83%
==========================================
  Files          16       23       +7
  Lines         437      704     +267
==========================================
+ Hits          332      583     +251
- Misses        105      121      +16
- Fused CUDA Kernels
  - Will be handled in a later PR
  - CUBLASLtGemmKernels: currently a bit slow with the default configuration compared to CUBLAS with an additional kernel launch. There is some tuning on their end, so it might be worth revisiting this later.
- GroupNorm: Fuse the activation function into the kernel
  - Will be handled in a later PR
- FastBroadcast for faster broadcasts on CPU
- ForwardDiff over Zygote Tests
  - Tests will be part of Native Nested AD support for Lux Models Lux.jl#598
- fused_conv_bias_attention
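
For readers unfamiliar with what "fusing" the bias and activation into a dense layer means, here is a minimal CPU-only sketch in plain Julia. It is not the implementation in this PR: the function names `unfused_dense` and `fused_dense` are hypothetical, and the sketch only assumes that `W` is a matrix, `x` is a matrix of column inputs, and `b` is a bias vector. The unfused version allocates a new array at every step, while the fused version reuses the matmul output buffer and applies the bias and activation in a single broadcast pass.

```julia
# Hypothetical reference: three separate steps, each allocating a new array.
function unfused_dense(act, W, x, b)
    y = W * x          # matrix multiply
    z = y .+ b         # bias add (allocates)
    return act.(z)     # activation (allocates again)
end

# Hypothetical fused variant: bias add and activation run as one in-place
# broadcast over the matmul result, so no extra intermediates are allocated.
function fused_dense(act, W, x, b)
    y = W * x
    @. y = act(y + b)  # expands to y .= act.(y .+ b)
    return y
end

W = [1.0 2.0; 3.0 4.0]
x = [1.0 0.0; 0.0 1.0]
b = [0.5, -0.5]

# Both variants should agree up to floating-point tolerance.
@assert fused_dense(tanh, W, x, b) ≈ unfused_dense(tanh, W, x, b)
```

On GPU the same idea goes further by folding the bias and activation into the matrix-multiply kernel itself, which is the kind of fusion the CUBLASLt item above refers to.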
Is Anything Broken?
Zygote.gradient over Zygote.gradient for dense layers will be broken once this is merged. But fear not, we will have a faster nested AD package merged soon!