Dropout erroring on latest CUDA #1960
Comments
This is expected since the model wasn't moved to the GPU. I don't think we've ever guaranteed that layers on the CPU will work on the GPU.
ah, sorry, silly mistake
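For context, a minimal sketch of the usage the first reply describes as supported (the model and input here are purely illustrative): moving the model to the GPU along with the data, rather than feeding CuArray inputs to a CPU-side Dropout layer.

using Flux, CUDA

# Illustrative model containing a Dropout layer.
model = Chain(Dense(4 => 4, relu), Dropout(0.5))
x = rand(Float32, 4, 8)

# Move both the model and the input to the GPU so the dropout mask and the
# activations live on the same device.
gpu_model = gpu(model)
y = gpu_model(gpu(x))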
This seems to break models whose functor ignores the dropout layer.
Do you have an example? I don't understand your comment.
Do you mean a layer that wraps a dropout layer and does not include the dropout layer as part of the functor leaves? That seems like an improper use of functor when implementing the wrapper layer (though I can believe there's some good reason for doing this). Alternatively, as a workaround, the RNG can be set when constructing the dropout layer: Dropout(p; rng = Flux.rng_from_array(CuArray))
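A sketch of that workaround (values illustrative; CUDA.default_rng() is used here as one concrete GPU RNG, in the same spirit as the rng_from_array call above):

using Flux, CUDA

# Give the Dropout layer a GPU RNG at construction time so the dropout mask
# is generated on the same device as the CuArray activations.
drop = Dropout(0.1; rng = CUDA.default_rng())
Flux.trainmode!(drop)   # make dropout active outside of a gradient call

x = CUDA.rand(Float32, 4, 8)
y = drop(x)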
Like this one:

struct MultiheadAttention{Q<:Dense, K<:Dense, V<:Dense, O<:Dense, DP<:Dropout} <: AbstractAttention
head::Int
future::Bool
iqproj::Q
ikproj::K
ivproj::V
oproj::O
drop::DP
end
Flux.functor(mh::MultiheadAttention) = (mh.iqproj, mh.ikproj, mh.ivproj, mh.oproj), m -> MultiheadAttention(mh.head, mh.future, m..., mh.drop)

This is a really old code snippet from Transformers, which should definitely be updated. But I think it is potentially breaking, because functor was used to extract the parameters, and Dropout was not treated as a layer with parameters before. So maybe the broader question is: how do we know whether a Flux layer needs to be treated as one that carries parameters?
I would say always assume a sublayer could have parameters. There is basically no performance loss from doing so.
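A sketch of that suggestion applied to the snippet above (the layer definition is simplified; the AbstractAttention supertype and the concrete field types are omitted): let @functor treat every field as a child, including the dropout layer, instead of hand-writing a functor that skips it.

using Flux

struct MultiheadAttention{Q, K, V, O, DP}
    head::Int
    future::Bool
    iqproj::Q
    ikproj::K
    ivproj::V
    oproj::O
    drop::DP
end

# All fields become children.  Integer and Bool fields are plain leaves, so
# gpu/fmap leave them untouched, while the Dropout layer (and its RNG) is now
# carried along with the rest of the model.
Flux.@functor MultiheadAttention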
Can you post a full stack trace?
As well as a full list of packages in your environment. If I had to guess, something is holding essential Flux dependencies back.
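For reference, one way (among others) to capture that information from the Julia REPL:

using Pkg, InteractiveUtils

Pkg.status()     # packages in the active environment
versioninfo()    # Julia and platform details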
Thank you all. The issue was resolved when I updated
On
I get the following error: