Two errors on Julia 1.6 #897
Maybe worth adding: the following work fine, with either explicit broadcasting or with map:
gradient(x -> sum(norm.(collect(eachcol(x)))), rand(3,400))[1]
gradient(x -> sum(map(norm, collect(eachcol(x)))), rand(3,400))[1]
gradient(x -> sum(sin.(Diagonal(x))), rand(2))[1]
gradient(x -> sum(map(sin, Diagonal(x))), rand(2))[1] |
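As a sanity check on the broadcasting workaround above, here is a minimal sketch that compares its gradient with the analytic one: the gradient of a sum of column norms is each column divided by its own norm. The names `g` and `expected` are illustrative and not from the thread, and the sketch assumes a Zygote version on which the broadcasting form works, as reported above.

```julia
using Zygote, LinearAlgebra

x = rand(3, 400)

# Broadcasting form, reported in this thread to work on both 1.5 and 1.6.
g = gradient(x -> sum(norm.(collect(eachcol(x)))), x)[1]

# Analytic gradient: every column of x divided by its Euclidean norm.
expected = x ./ sqrt.(sum(abs2, x; dims=1))

@assert g ≈ expected
```

If the assertion passes, the workaround is differentiating the same function as the failing `sum(norm, ...)` form is meant to.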
I'm not sure whether this is a Julia issue or a Zygote issue. I checked all the Julia files in the above stack trace; the only relevant commit on the Julia side seems to be JuliaLang/julia#36188, but I didn't find anything obvious in it that would break Zygote.
Also […]
On the other hand, the following expression […]
Another weird issue is that the expression […] |
The code posted at #157 shows the same error message on Julia 1.6 but not on Julia 1.5. |
Thanks for investigating. The reason […] Reversing that change appears to fix it on 1.6. I think the version which takes a keyword isn't being run at all, and somehow doesn't break on 1.5.
julia> @eval Zygote begin
         function _pullback(cx::AContext, ::typeof(sum), f, xs::AbstractArray)
           y, back = pullback(cx, ((f, xs) -> sum(f.(xs))), f, xs)
           @show y  # just to check this gets run
           y, ȳ -> (nothing, back(ȳ)...)
         end
       end
_pullback (generic function with 492 methods)
julia> gradient(x -> sum(norm, collect(eachcol(x))), rand(3,400))[1] # only size changed
y = 386.0308068393151
3×400 Matrix{Float64}:
0.913379 0.269357 0.55777 0.666639 0.354963 … 0.0933578 0.740294 0.230064 0.236234
0.167705 0.501518 0.10804 0.687187 0.189529 0.674529 0.487126 0.901705 0.178678
0.370963 0.822148 0.822934 0.288732 0.915467 0.732322 0.463328 0.366059 0.955127
julia> gradient(x -> sum(sin, Diagonal(x)), rand(2))[1]
y = 1.3417542054446077
2-element Vector{Float64}:
0.8518952755965761
0.5751584407885143 |
Yes, the rule for […]
Within […]
And on Julia 1.6 it is as follows:
[…]
I do not see any difference between the two except the first line (the line with "About to run..."). In particular, the second argument is displayed as […] |
Is there a separate function to hit instead? |
I don't think so. The lowered code becomes the following on 1.6:
[…]
The line […]
The line […]
On 1.5, we have the following, which is a lot simpler:
[…]
And the line […] |
Right, thanks. I meant that it is following a different code path, which this seems to confirm. Julia seems to be playing around with kwarg handling much earlier now. Maybe we can get away with wrapping the function as a […] |
A bit of a shot in the dark, but are you sure the only difference is the Julia version? I don't really know about any kwargs changes in Base, is there any chance this could be due to JuliaDiff/ChainRulesCore.jl#308? Otherwise, I might be able to take a look at this tomorrow. |
I was testing on two almost identical Julia environments. The only difference between the two environments is the Julia version itself. Also both environments have the same Zygote version (0.6.8) as well as the same ChainRulesCore version (0.9.37). Thus I think the different Julia version is the most likely cause here. |
addresses the second part of FluxML#897
addresses the first part of FluxML#897
956: fix differentiation of loopinfo exprs r=DhairyaLGandhi a=simeonschaub
addresses the first part of #897
Co-authored-by: Simeon Schaub
955: fix adjoint for sum r=DhairyaLGandhi a=simeonschaub
addresses the second part of #897
Co-authored-by: Simeon Schaub
Yes, both should be fixed now. |
These two probably unrelated things work fine on 1.5, but give errors on 1.6 or master.
First, only for long enough generators:
And second, seemingly for any structured matrix: