Higher order derivative products? #1102
Comments
I modified the example, setting η = 1:

using Flux  # provides `gradient` and `params`

η = 1.0

function innerloss(a, x)
    sum(a .* x .^ 3)
end

function outerloss(a, b, x)
    g = gradient(x -> innerloss(a, x), x)[1]   # inner gradient ∇ₓ innerloss(a, x)
    adaptedx = x - η * g                       # one gradient step on the inner loss
    innerloss(b, adaptedx)
end

a = [1.]; b = [2.]; x = [1.];

gs = gradient(params(x)) do
    outerloss(a, b, x)
end
println(gs[x]) # [-264.0]
@CarloLucibello: thanks! I am afraid this is not exactly what we are looking for. Given the example above, the computation one would like to do is, I guess, the full derivative of the outer loss, differentiating through the inner gradient as well. Another way to tackle this is to write the chain rule out explicitly, but then my code would have to compute the Hessian of the inner loss. Any thoughts on how I can solve this elegantly? Thanks!
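For reference, a minimal sketch of what the explicit chain-rule route could look like for the example above. This is only an illustration, not a confirmed recommendation: it assumes Zygote.hessian (or, equivalently, ForwardDiff.hessian) is available, and outer_grad_manual is a name made up for this sketch.

using Zygote, LinearAlgebra

η = 1.0
innerloss(a, x) = sum(a .* x .^ 3)

# Chain rule: with adaptedx = x − η ∇ₓ innerloss(a, x),
#   d outerloss / dx = (I − η H)ᵀ ∇ innerloss(b, adaptedx),
# where H is the Hessian of x ↦ innerloss(a, x).
function outer_grad_manual(a, b, x)
    g = Zygote.gradient(z -> innerloss(a, z), x)[1]      # ∇ₓ innerloss(a, x)
    H = Zygote.hessian(z -> innerloss(a, z), x)          # assumes Zygote.hessian is available
    adaptedx = x - η * g
    gouter = Zygote.gradient(z -> innerloss(b, z), adaptedx)[1]
    (I - η * H)' * gouter
end

a = [1.]; b = [2.]; x = [1.];
println(outer_grad_manual(a, b, x))   # by hand: (1 − 6) · 3·2·(−2)² = −120.0

Whether this agrees with what gradient(params(x)) returns above is exactly the question under discussion here.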
@rdangovs would perhaps
@lssimoes Thanks! Will give it a go!
Suppose I have a loss function that takes data and parameters. Can I compute higher-order derivative products, i.e. can I differentiate through the gradient call itself, with respect to both the data and the parameters? Please find below a PyTorch example of how to achieve that. I wonder whether this behavior could be reproduced in Flux for a large class of loss functions in an easy way, say starting from this one? Here is one unsuccessful attempt of mine.

It seems to me that here gradient does not differentiate through innergs properly. I am afraid my understanding of gradient is currently too limited to make this work right now. Could you help me? Any advice is appreciated. Thanks!
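One way to tell whether a given gradient call really differentiates through the inner step is to compare it against a finite-difference approximation of the outer loss. Below is a minimal sketch of such a check for the example discussed in this thread; fd_grad is a hypothetical helper written just for this comparison.

using Flux  # provides `gradient`

η = 1.0
innerloss(a, x) = sum(a .* x .^ 3)

function outerloss(a, b, x)
    g = gradient(z -> innerloss(a, z), x)[1]   # inner gradient
    innerloss(b, x .- η .* g)                  # evaluate at the adapted point
end

# Central finite differences of a scalar function f at x (checking helper only).
function fd_grad(f, x; ϵ = 1e-6)
    map(eachindex(x)) do i
        xp = copy(x); xp[i] += ϵ
        xm = copy(x); xm[i] -= ϵ
        (f(xp) - f(xm)) / (2ϵ)
    end
end

a = [1.0]; b = [2.0]; x = [1.0];
println(fd_grad(z -> outerloss(a, b, z), x))      # full derivative, ≈ [-120.0]
println(gradient(z -> outerloss(a, b, z), x)[1])  # what reverse-mode AD returns

If the two printed lines disagree, the second-order terms (the Hessian of the inner loss) are not being propagated through the inner gradient call.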