Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add implementation of PANOC+ #57

Merged
merged 10 commits into from
Jan 5, 2022
Merged

Add implementation of PANOC+ #57

merged 10 commits into from
Jan 5, 2022

Conversation

aldma
Copy link
Collaborator

@aldma aldma commented Jan 5, 2022

Hello!
First of all, thanks for all the work you put into this repo.

@AndThem and I would like to contribute with a variant of PANOC we recently investigated (arXiv:2112:13000).

The algorithm consists of two entangled backtracking steps: before a tentative update is accepted along some search direction, there is a check on the forward-backward stepsize gamma that should satisfy a Lipschitz bound. In contrast with the original PANOC, this refined linesearch procedure allows to easily reject poor steps and to handle smooth terms f that have only locally Lipschitz continuous gradient.

In the preprint we denoted this variant by PANOC+. However, to avoid issues with symbols and names, the algorithm has been implemented as a solver called NOLIP. This PR is to add NOLIP to the repo.

@aldma aldma marked this pull request as ready for review January 5, 2022 09:47
@codecov
Copy link

codecov bot commented Jan 5, 2022

Codecov Report

Merging #57 (19a0030) into master (de2260e) will increase coverage by 0.73%.
The diff coverage is 93.82%.

❗ Current head 19a0030 differs from pull request most recent head 765f12d. Consider uploading reports for the commit 765f12d to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master      #57      +/-   ##
==========================================
+ Coverage   89.03%   89.76%   +0.73%     
==========================================
  Files          20       21       +1     
  Lines         857      938      +81     
==========================================
+ Hits          763      842      +79     
- Misses         94       96       +2     
Impacted Files Coverage Δ
src/algorithms/panocplus.jl 93.82% <93.82%> (ø)
src/utilities/iteration_tools.jl 84.00% <0.00%> (+1.33%) ⬆️
src/algorithms/panoc.jl 97.82% <0.00%> (+2.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update de2260e...765f12d. Read the comment docs.

Copy link
Member

@lostella lostella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the work here! This looks pretty good, I have left some comments in the code.

It’s interesting that the code is overall shorter than the original implementation, I guess this is because no special treatment for quadratic f is there (which makes sense given the focus on non-Lipschitz differentiable cases).

Some tests failed but unrelated to this change (we should get rid of some randomized tests, or make assertions more permissive) so you can disregard them.

README.md Outdated
@@ -31,6 +31,7 @@ Vũ-Condat primal-dual algorithm[^chambolle_2011][^vu_2013][^condat_2013] | [`Vu
Davis-Yin splitting[^davis_2017] | [`DavisYin`](src/algorithms/davis_yin.jl)
Asymmetric forward-backward-adjoint splitting[^latafat_2017] | [`AFBA`](src/algorithms/primal_dual.jl)
PANOC (L-BFGS)[^stella_2017] | [`PANOC`](src/algorithms/panoc.jl)
PANOC+ (L-BFGS)[^demarchi_2021] | [`NOLIP`](src/algorithms/nolip.jl)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: should the name be NOLIP here? Although I agree it would not match the paper… I’m anyway preparing more extensive documentation to replace this README, where a clearer pairing (literature, implementation) is there, so I can also take care of it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather change NOLIP to, say, PANOCplus, as it is a variant of PANOC. But I don't know your naming conventions or Julia constraints. What do you suggest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that would be clearer: let’s go for PANOCplus (and …Iteration and …State)

Comment on lines 160 to 173
if (iter.gamma === nothing || iter.adaptive == true)
mul!(state.Az, iter.A, state.z)
f_Az = gradient!(state.grad_f_Az, iter.f, state.Az)
tol = 10 * eps(R) * (1 + abs(f_Az))
if f_Az > f_Az_upp + tol && state.gamma >= iter.minimum_gamma
state.gamma *= 0.5
if state.gamma < iter.minimum_gamma
@warn "stepsize `gamma` became too small ($(state.gamma))"
end
can_update_direction = true
reset_direction_state!(iter, state)
continue
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be done using backtrack_stepsize just like at the start of the iterations (lines 98-105)? One way to do that would be to let it compute a new local gamma value, and then compare it to state.gamma to see if backtracked

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be much better. However, backtrack_stepsize can potentially take several iterations before returning, whereas here we backtrack only once before restarting.

gamma=gamma, y=y, z=z, g_z=g_z, res=x-z, H=initialize(iter.directions, x),
)
if (iter.gamma === nothing || iter.adaptive == true)
state.gamma, state.g_z, f_Az, f_Az_upp = backtrack_stepsize!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this initial backtracking is not there in the pseudocode in the paper, do I understand it right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pseudocode gives special treatment to the case k=0. Considering the first iteration, then, the initial backtracking on gamma is there indeed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True! Didn’t realize it. Partially related to this: do you think the pseudocode in the manuscript could be rearranged so as to avoid “start with step xyz” and just start with step 1? That just struck me as potentially confusing when I first saw it.


while true

if can_update_direction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: this is more of a should_ than a can_?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this gives more emphasis on the fact that d cannot be updated sometimes. Conversely, as argued in the preprint, the search direction is updated whenever it can_ be, but this is not necessary.

lostella
lostella previously approved these changes Jan 5, 2022
Copy link
Member

@lostella lostella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks @aldma!

If I had one wish it would be for an equivalence test between NOLIP and PANOC, in what I believe (correct me if I’m wrong) is the only case they should be expected to work exactly the same: when gamma or Lf is given, and consequently adaptive=false. There are a couple of such tests in https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl/blob/master/test/problems/test_equivalence.jl

Testing for the same iterates should be sufficient, as that indirectly tests that also directions are computed the same.

@lostella
Copy link
Member

lostella commented Jan 5, 2022

If I had one wish it would be for an equivalence test between NOLIP and PANOC

That would be very useful to have in case anything changes in the implementation of either algorithm, it could help catch any mistake early

@aldma
Copy link
Collaborator Author

aldma commented Jan 5, 2022

I've added an equivalence test between NOLIP and PANOC and then changed the name from NOLIP to PANOCplus. Thanks for your hints!

Copy link
Member

@lostella lostella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@lostella lostella merged commit 41cb172 into JuliaFirstOrder:master Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants