This repository has been archived by the owner on Feb 21, 2022. It is now read-only.

WIP: Simple Regression with ADAM #5

Draft: wants to merge 1 commit into master

Conversation

DhairyaLGandhi
Member

Using FluxML/Optimisers.jl#3 and a simple sketch of a regression task with ADAM.

This currently results in:

ERROR: Tracing Error: No IR for Tuple{typeof(getfield),Tuple{Int64,Int64},Int64}
in Tuple{typeof(step),Chain{Tuple{Dense{typeof(relu),Array{Float64,2},Array{Float64,1}},Dense{typeof(identity),Array{Float64,2},Array{Float64,1}},typeof(softmax)}}}
in Tuple{typeof(gradient),var"#7#8",Chain{Tuple{Dense{typeof(relu),Array{Float64,2},Array{Float64,1}},Dense{typeof(identity),Array{Float64,2},Array{Float64,1}},typeof(softmax)}}}
in Tuple{Zygote.var"#38#39"{typeof(∂(#7))},Float64}
in Tuple{typeof(∂(#7)),Float64}
in Tuple{typeof(∂(mse)),Float64}
in Tuple{Zygote.var"#3121#back#1219"{Zygote.var"#1215#1217"{Array{Float64,2}}},Float64}
in Tuple{Zygote.var"#1215#1217"{Array{Float64,2}},Float64}
in Tuple{typeof(fill),Float64,Tuple{Int64,Int64}}
in Tuple{Type{Array{Float64,2}},UndefInitializer,Tuple{Int64,Int64}}
in Tuple{typeof(getfield),Tuple{Int64,Int64},Int64}
Stacktrace:
 [1] trace(::Any, ::Any, ::Vararg{Any,N} where N) at /Users/dhairyagandhi/.julia/packages/Mjolnir/eyPSM/src/trace.jl:266
 [2] trace(::Any, ::Vararg{Any,N} where N) at /Users/dhairyagandhi/Downloads/new_clones/XLATools.jl/src/compile/rt.jl:39
 [3] (::XLA.XFunction)(::Any) at /Users/dhairyagandhi/Downloads/new_clones/XLATools.jl/src/compile/rt.jl:55
 [4] train2(::Chain{Tuple{Dense{typeof(relu),Array{Float64,2},Array{Float64,1}},Dense{typeof(identity),Array{Float64,2},Array{Float64,1}},typeof(softmax)}}) at ./REPL[31]:3
 [5] top-level scope at REPL[33]:1
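
For context, a rough reconstruction of the kind of script that hits this, inferred only from the types in the trace above (the model sizes and the xla compile entry point are guesses, not the actual PR code):

using Flux, XLA

m = Chain(Dense(10, 32, relu), Dense(32, 2), softmax)
x, y = rand(10, 100), rand(2, 100)

function train2(m)
  step(m) = gradient(m -> Flux.mse(m(x), y), m)
  xla(step)(m)   # tracing fails inside the fill call used by the mse adjoint
end

train2(m)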

@MikeInnes
Member

I think you may as well just edit the original mnist.jl for this.

Does this script work if you avoid the xla conversion? Presumably it needs to be modified to pass the optimiser state around.

@DhairyaLGandhi
Member Author

It does work without the call to xla

@MikeInnes
Member

Ok, this needs to change to pass the optimiser state around explicitly though, rather than using a stateful IdDict internally; otherwise it can't work with XLA (or other immutable objects like StaticArrays) in principle.
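
Roughly the distinction, as a hedged sketch of an ADAM rule in the functional style (the apply name, the eta/beta field names, and the state layout are illustrative assumptions, not the final Optimisers.jl API):

# Stateful style (what's being replaced): apply!(o, x, dx) looks up the
# per-parameter moments in an internal IdDict and mutates dx in place.
# Functional style: the state is an explicit input and output, so the rule
# can return fresh arrays and also works for immutable parameters.
function apply(o::ADAM, x, dx, (mt, vt, βp))
  β, η, ϵ = o.beta, o.eta, 1e-8             # field names assumed for illustration
  mt = β[1] .* mt .+ (1 - β[1]) .* dx       # first-moment estimate
  vt = β[2] .* vt .+ (1 - β[2]) .* dx .^ 2  # second-moment estimate
  dx′ = η .* mt ./ (1 - βp[1]) ./ (sqrt.(vt ./ (1 - βp[2])) .+ ϵ)
  return dx′, (mt, vt, βp .* β)             # step to take, plus the new state
end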

@DhairyaLGandhi
Member Author

Yeah, I have removed the abstract parts from the PR and am reworking that bit to remove the IdDict. In this case, is there any preference for what init(::ADAM, x) should return?

@MikeInnes
Member

MikeInnes commented Jun 17, 2020

init should just return whatever state would normally be stored in the IdDict, e.g. a zero array of some kind.

Our functional training loop has the broad structure of

opt = ADAM(...)
st = state(opt, m)
for _ in _
  dm = gradient(...)
  m, st = update(opt, m, dm, st)
end

so ideally the state variable st is type stable, which might make it clearer what init should do.
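
A minimal sketch along those lines (the init/state helpers and field names here are assumptions for illustration, not the settled API):

using Functors: fmap

# Per-array state mirrors what the stateful ADAM keeps in its IdDict:
# first/second moment buffers plus the running powers of β, so the tuple
# has a fixed, inferable type.
init(o::ADAM, x::AbstractArray) = (zero(x), zero(x), o.beta)
init(o::ADAM, x) = nothing   # non-trainable leaves, e.g. activation functions

# One way to build st for a whole model: walk the leaves with fmap so the
# state tree mirrors the structure of m.
state(o::ADAM, m) = fmap(x -> init(o, x), m)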

@DhairyaLGandhi
Member Author

I've made some changes in FluxML/Optimisers.jl#3 to reflect the initialisation conditions. We still need to fix the call to (o::ADAM)(m, m̄) from its current form.

@MikeInnes
Member

While you're looking at it, it may make sense to try Optimisers.jl on the rnn example. Zygote returns Refs for mutable struct gradients, which I don't think the optimisers support yet.
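
For concreteness, a small illustration of the Ref issue (the Cell type is just a stand-in for an RNN cell; the exact wrapper may vary by Zygote version):

using Zygote

mutable struct Cell
  W::Matrix{Float64}
end
(c::Cell)(x) = c.W * x

c = Cell(randn(2, 2))
ḡ, = gradient(c -> sum(c(ones(2))), c)
# For the mutable struct, Zygote returns the gradient wrapped in a Ref,
# roughly Base.RefValue{Any}((W = ...,)), so an optimiser walking the model
# structurally needs to unwrap ḡ[] before pairing fields with gradients.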

@DhairyaLGandhi
Member Author

Yes, taking a look
