Remove hard-coded Vector fields and add vector constructors #299

Comments
Or we could just not make it a vector at all but a scalar 😛
If we stick with vectors (e.g. because of Flux/Zygote), in the example I guess
But then how do I do my in-place updates! 😆 Also nobody likes scalars! Especially GPUs!
One could e.g. use ParameterHandling and perform in-place updates of the resulting vector 😉 But then you would still have to go back and reconstruct the kernel.
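Something like this, I guess (just a sketch, assuming ParameterHandling's `positive`/`value_flatten` API and a placeholder kernel constructor):

```julia
using ParameterHandling, KernelFunctions

# structured parameters with a positivity constraint
θ = (lengthscale = positive(1.5),)
v, unflatten = ParameterHandling.value_flatten(θ)  # v is a plain Vector{Float64}

v .-= 0.1  # in-place update of the flat vector

# ...but the kernel still has to be rebuilt from the updated parameters
θ′ = unflatten(v)
k = with_lengthscale(SqExponentialKernel(), θ′.lengthscale)
```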
Did we decide that
I still think one could consider dropping support for `Flux.params` and switching to scalar parameters instead. Both `Ref` and `Vector` seem a bit unnatural.
:(
I can definitely see the appeal of this, but I would really like this ecosystem to be compatible with the Flux one, e.g. so that you can map the inputs to a GP through a DNN defined using Flux. I'm not saying that this is necessarily a good idea in general, see [1], but I'd at least like to avoid ruling it out. If we didn't use the Functors API, is there another way to interop nicely with Flux?
You mentioned ParameterHandling and it looks very nice, and it seems the only requirement on our side would be to implement the
I haven't thought this through but maybe it is possible to write a custom Flux layer that contains only the parameter vector and a function that builds the kernel:

```julia
struct KernelLayer{P,F}
    params::P
    model::F
end

Functors.@functor KernelLayer (params,)

(f::KernelLayer)(x, y) = f.model(f.params)(x, y)

...
```
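In case it helps, here is how I imagine it would be used (a sketch; `build_kernel` is a hypothetical model function, not an existing API):

```julia
using Flux, Functors, KernelFunctions

# hypothetical model function: rebuilds a kernel from the flat parameter vector
build_kernel(p) = with_lengthscale(SqExponentialKernel(), exp(p[1]))

layer = KernelLayer([log(1.5)], build_kernel)

layer(0.3, 0.7)     # evaluates the wrapped kernel at (x, y)
Flux.params(layer)  # collects only `layer.params`, as declared via `@functor`
```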
I think I got this part. However I wanted to say that I am also happy with the
My point is that if we can have
Interesting -- how are you imagining this interacting with Flux, @devmotion? I don't think I'm seeing where you're going with this yet.
As I said, I didn't spend much thought on it. But the main idea would be that instead of forcing all kernels to be mutable, we just have one kernel that wraps the parameters of a possibly complicated kernel in a Flux-compatible way (e.g. as a
Ok so that's like a complementary idea, right? We would need a conversion from any kernel to a
The main idea is that it only requires the output of
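Roughly something like this is what I have in mind (just a sketch of the idea; the name `KernelModel` and its fields are placeholders, assuming `ParameterHandling.flatten(k)` returns `(v, unflatten)`):

```julia
using Functors, KernelFunctions

# one Flux-compatible kernel that owns the flat parameter vector and the
# `unflatten` closure that rebuilds the underlying (possibly complicated) kernel
struct KernelModel{T<:Real,F} <: KernelFunctions.Kernel
    params::Vector{T}  # the only trainable state, a plain vector
    unflatten::F       # rebuilds the wrapped kernel from `params`
end

Functors.@functor KernelModel (params,)

(k::KernelModel)(x, y) = k.unflatten(k.params)(x, y)
```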
I came back to this issue some days ago, thought a bit more about the problems and the suggestion above, and started to work on a draft. Currently my suggestion would be the following breaking changes:
My main reasoning is:
E.g., in https://juliagaussianprocesses.github.io/KernelFunctions.jl/stable/examples/deep-kernel-learning/ one approach would be to replace

```julia
neuralnet = Chain(Dense(1, 3), Dense(3, 2))
k = SqExponentialKernel() ∘ FunctionTransform(neuralnet)
```

with

```julia
neuralnet = Chain(Dense(1, 3), Dense(3, 2))
k = KernelModel(flatten(SqExponentialKernel())...) ∘ FunctionTransform(neuralnet)
```

Then the rest of the example would work in the same way as now and it would be as efficient as now. Clearly, this is a stupid example since
Ahhh I think I understand now! We could default

```julia
k = compose(SqExponentialKernel(), FunctionTransform(neuralnet))
model = KernelModel(flatten(k)...)
```
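So then I guess training stays exactly as in the current example, roughly like this (a sketch in Flux's implicit-parameter style; `loss`, `xtrain`, `ytrain` are placeholders):

```julia
using Flux

ps = Flux.params(model)  # only the flat parameter vector(s)
opt = ADAM(0.01)
for _ in 1:100
    gs = gradient(() -> loss(model, xtrain, ytrain), ps)
    Flux.Optimise.update!(opt, ps, gs)
end
```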
I was worried that such generic fallbacks introduce a) type piracy and b) surprising behaviour. I also became more and more convinced that it would be better to error if someone calls

I also think it can be more efficient for larger models to not only use
Thanks for the explanation, not having a general fallback seems safer and faster indeed.
How do you envision this? Should we still allow to pass

Implementation-wise just using PH types sounds easier, but it also sounds like an unnecessary extra cost if one does not do any optimization. I am particularly thinking about what the constructors should look like if we allow both... Would using a keyword argument like

Here is a quick try to see how it would go:

```julia
julia> struct Foo{T}
           a::T
           function Foo(x; fixed=true)
               if fixed
                   return new{typeof(float(x))}(float(x)) # I am just putting this as a dummy example
               else
                   return new{typeof(x)}(x)
               end
           end
       end

julia> Foo(2)
Foo{Float64}(2.0)

julia> Foo(2; fixed=false)
Foo{Int64}(2)

julia> @code_warntype Foo(2)
Variables
  #self#::Type{Foo}
  x::Int64

Body::Foo{Float64}
1 ─ %1 = $(QuoteNode(var"#Foo#3#4"()))
│   %2 = (%1)(true, #self#, x)::Foo{Float64}
└──      return %2

julia> @code_warntype Foo(2; fixed=false)
Variables
  #unused#::Core.Const(Core.var"#Type##kw"())
  @_2::NamedTuple{(:fixed,), Tuple{Bool}}
  @_3::Type{Foo}
  x::Int64
  fixed::Bool
  @_6::Bool

Body::Union{Foo{Float64}, Foo{Int64}}
1 ─ %1  = Base.haskey(@_2, :fixed)::Core.Const(true)
│         Core.typeassert(%1, Core.Bool)
│         (@_6 = Base.getindex(@_2, :fixed))
└──       goto #3
2 ─       Core.Const(:(@_6 = true))
3 ┄ %6  = @_6::Bool
│         (fixed = %6)
│   %8  = (:fixed,)::Core.Const((:fixed,))
│   %9  = Core.apply_type(Core.NamedTuple, %8)::Core.Const(NamedTuple{(:fixed,), T} where T<:Tuple)
│   %10 = Base.structdiff(@_2, %9)::Core.Const(NamedTuple())
│   %11 = Base.pairs(%10)::Core.Const(Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}())
│   %12 = Base.isempty(%11)::Core.Const(true)
│         Core.typeassert(%12, Core.Bool)
└──       goto #5
4 ─       Core.Const(:(Base.kwerr(@_2, @_3, x)))
5 ┄ %16 = $(QuoteNode(var"#Foo#3#4"()))
│   %17 = fixed::Bool
│   %18 = (%16)(%17, @_3, x)::Union{Foo{Float64}, Foo{Int64}}
└──      return %18
```
No, this was not what I had in mind and wanted to suggest. I do not want to allow ParameterHandling types in kernels, they are not even

I just wanted to suggest using them in the

```julia
struct GaussianKernelWithLengthscale{T<:Real} <: SimpleKernel
    l::T
end

function ParameterHandling.flatten(::Type{T}, k::GaussianKernelWithLengthscale) where {T<:Real}
    # uses `exp` by default - if one wants to use e.g. `log1pexp` one has to implement the model explicitly
    lvec, unflatten_to_l = value_flatten(T, positive(k.l))
    function unflatten_to_gaussiankernelwithlengthscale(x::Vector{T})
        return GaussianKernelWithLengthscale(unflatten_to_l(x))
    end
    return lvec, unflatten_to_gaussiankernelwithlengthscale
end

# implementation of the kernel
...
```
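And then one would use it like this (a quick sketch, given the `flatten` method above):

```julia
k = GaussianKernelWithLengthscale(2.0)

v, unflatten = ParameterHandling.flatten(Float64, k)  # v is an unconstrained Vector{Float64}
v .+= 0.1                                             # in-place update of the flat parameters

k2 = unflatten(v)  # rebuilds the kernel; the lengthscale stays positive
```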
Ah ok! But then if we start to define a default
One doesn't. That's the whole point of

If you want something more custom or complex, you just have to write the function that builds the model.
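E.g., for the `log1pexp` behaviour mentioned above one would just write the builder by hand, something like (a sketch; `θ` is the flat parameter vector):

```julia
using LogExpFunctions: log1pexp

# hand-written model: softplus instead of `exp` for the positive lengthscale
build_kernel(θ::AbstractVector{<:Real}) = GaussianKernelWithLengthscale(log1pexp(θ[1]))
```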
If that's okay with you, I would be happy to build the PR! I think I got all the elements, but since I am obviously not the most qualified person to do so, I would understand if you (@devmotion) or @willtebbutt would like to do it!
I already have something in a local branch; I had to play around a bit before I ended up with the suggestion above. But before finishing and polishing it I wanted to discuss the ideas with you 🙂 If you think we should do this, I will clean up the local branch and open a PR.
I am definitely in favor, we just need input from @st-- and @willtebbutt.
Right now kernels like `LinearKernel` have two problems: they can only take `Real` arguments (no way to pass `kernel.c` as an argument) and they also do not allow for different `AbstractVector` types to be stored. The use case of interest is GPUs: one cannot call `kernelmatrix` on `CuArrays` without this.
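To illustrate the restriction (a simplified sketch, not the actual KernelFunctions source):

```julia
using KernelFunctions

# a hard-coded field pins the parameter to Float64 CPU storage
struct HardCodedKernel <: KernelFunctions.Kernel
    c::Vector{Float64}  # no other Real element types, no GPU arrays
end

# parameterizing the field type would lift both restrictions
struct ParametricKernel{T<:Real} <: KernelFunctions.Kernel
    c::T
end
```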