Fix ADAMW, and track the loss #46

mcabbott · 2022-01-30T15:50:20Z

This makes the same changes as FluxML/Flux.jl#1612, xref #38.

It also alters the tests to keep track of the loss, although it does not fix any of the convergence problems.

ToucheSir · 2022-01-30T22:44:50Z

src/rules.jl

@@ -405,7 +405,7 @@ weight decay regularization.
                         (no need to change default)
 """
 ADAMW(η = 1f-3, β = (9f-1, 9.99f-1), γ = 0, ϵ = eps(typeof(η))) =
-  OptimiserChain(ADAM{typeof(η)}(η, β, ϵ), WeightDecay(γ))
+  OptimiserChain(ADAM{typeof(η)}(1, β, ϵ), WeightDecay{typeof(η)}(γ), Descent(η))


Sorry, I just remembered a previous comment (maybe on Slack? Can't find it now) which noted that our implementation did not match the paper:
TL;DR: the "fixed" version is wrong and the original is correct. The confusion possibly comes from an odd choice of variable names in the paper: they use α instead of η for the learning rate, and η for something else. Apologies for the noise.
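
For reference, here is a rough sketch of how the two versions of the chain in the diff compare with the decoupled weight decay update in the paper. It assumes that rules in an `OptimiserChain` are applied to the gradient from left to right, and that the result is then subtracted from the parameters. The symbol $u_t$ (the Adam direction at unit learning rate) and the subscript "code" are notation introduced here for illustration only; they do not appear in the paper or in `src/rules.jl`.

```latex
\begin{align*}
% Paper: \alpha is the learning rate, \eta_t a schedule multiplier, \lambda the decay
\theta_t &= \theta_{t-1} - \eta_t \left( \alpha\, u_t + \lambda\, \theta_{t-1} \right),
  \qquad u_t = \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \\[4pt]
% Original chain: ADAM(\eta) then WeightDecay(\gamma)
\theta_t &= \theta_{t-1} - \left( \eta_{\text{code}}\, u_t + \gamma\, \theta_{t-1} \right) \\[4pt]
% "Fixed" chain: ADAM(1), WeightDecay(\gamma), Descent(\eta)
\theta_t &= \theta_{t-1} - \eta_{\text{code}} \left( u_t + \gamma\, \theta_{t-1} \right)
\end{align*}
```

Read this way, the original chain is the paper's rule with $\alpha = \eta_{\text{code}}$, $\lambda = \gamma$ and $\eta_t = 1$, so the decay term is not scaled by the learning rate. The "fixed" chain instead treats the code's η as the paper's schedule multiplier $\eta_t$ with $\alpha = 1$, which scales the decay term by the learning rate as well.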

CarloLucibello · 2022-02-03

Oh damn, Brian is right: FluxML/Flux.jl#1612 should be reverted. I got confused by the nomenclature.

I'll file a PR in Flux

mcabbott added 2 commits January 30, 2022 10:30

log the loss during tests

28c1506

fixup ADAMW

9293965

mcabbott requested a review from CarloLucibello January 30, 2022 15:50

ToucheSir reviewed Jan 30, 2022

mcabbott closed this Feb 3, 2022

CarloLucibello mentioned this pull request Feb 10, 2022

fix adamw FluxML/Flux.jl#1868

Merged

mcabbott mentioned this pull request Mar 7, 2022

Improve WeightDecay docstring #60

Merged

ToucheSir mentioned this pull request Sep 12, 2023

write >> as infix notation for OptimiserChain #139

Open

CarloLucibello mentioned this pull request May 3, 2024

Implementation of AdamW differs from PyTorch FluxML/Flux.jl#2433

Closed
