Add implementation of S-FISTA #48
Conversation
Hey @wwkong thanks for contributing this! I have some general comments before I look into the algorithm.
One question on the algorithm: I’m assuming this reduces to standard FISTA when mu=0 (which appears to be the default). How do the iterations differ otherwise? Is it just about the sequence of extrapolation parameters? There are better choices when the strong convexity parameter is known, see e.g. https://www.seas.ucla.edu/~vandenbe/236C/lectures/fgrad.pdf. I’m not sure whether this is related (haven’t read your paper).
The reason I’m asking is, I’m thinking of significantly improving the fast gradient method which is already implemented in the package, so that one can inject whatever extrapolation sequence they want. So I’d like to make sure there’s not too much duplication :-)
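To make the "injectable extrapolation sequence" idea concrete, here is a minimal Python sketch. All names (`fista`, `fista_momentum`, `strongly_convex_momentum`) are hypothetical, invented for illustration; this is not the package's actual interface.

```python
import numpy as np

def fista(grad_f, prox_g, x0, L, extrapolation, num_iters=500):
    """Proximal gradient method with an injectable extrapolation sequence.

    `extrapolation` is a callable k -> beta_k giving the momentum
    coefficient used to combine successive iterates.
    """
    x_prev = np.asarray(x0, dtype=float).copy()
    y = x_prev.copy()
    for k in range(num_iters):
        x = prox_g(y - grad_f(y) / L, 1.0 / L)    # forward-backward step
        y = x + extrapolation(k) * (x - x_prev)   # extrapolation step
        x_prev = x
    return x_prev

def fista_momentum():
    """Standard FISTA momentum via the t_k recursion (valid for mu = 0)."""
    state = {"t": 1.0}
    def beta(k):
        t = state["t"]
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        state["t"] = t_next
        return (t - 1.0) / t_next
    return beta

def strongly_convex_momentum(L, mu):
    """Constant momentum for mu-strongly-convex problems (requires mu > 0)."""
    q = np.sqrt(mu / L)
    return lambda k: (1.0 - q) / (1.0 + q)
```

With this shape, `fista(grad, prox, x0, L, fista_momentum())` recovers standard FISTA, while swapping in `strongly_convex_momentum(L, mu)` gives the constant-momentum variant for the strongly convex case, with no duplication of the main loop.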
Indeed. A good reference to see why this is true is in one of my more recent preprints [1, Section 3].
For the case of mu > 0, this algorithm is one of the (many!) generalizations of FISTA for the strongly convex case (and not just because of how the extrapolation is done). Note that the scheme itself is based on Nesterov's 3-sequence fast gradient method [2, Eq. (2.2.7)], which itself is based on his fairly complicated theory of estimating functions [2, Section 2.2.1]. Some references that might help in understanding his theory are [1,3,4].
I am familiar with the strongly convex method mentioned in Vandenberghe's notes above. If I recall correctly, that method is a special instance of one of Nesterov's fast gradient methods [2, Eq. (2.2.22)] which only works if mu > 0, i.e., it fails to converge if mu=0. My method, on the other hand, converges no matter what mu the user may choose (due to being based on Nesterov's more general framework).
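To make the degeneracy concrete, the constant momentum coefficient of that strongly convex scheme (writing it from the standard presentation, with L the smoothness constant and mu the strong convexity constant) is

```latex
\beta \;=\; \frac{\sqrt{L}-\sqrt{\mu}}{\sqrt{L}+\sqrt{\mu}}
      \;=\; \frac{1-\sqrt{\mu/L}}{1+\sqrt{\mu/L}} ,
```

so at mu = 0 it gives beta = 1, and the linear contraction factor 1 - sqrt(mu/L) in its rate bound becomes 1, i.e. the bound carries no information.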
Gotcha. For some more added info, I implemented the more general framework of Nesterov because my nonconvex proximal gradient method does not work without several of the auxiliary iterate sequences. There are, of course, many ways to simplify this framework (set mu=0, eliminate a variable, etc.) into one of the more popular (and shorter) versions of FISTA like you currently have in your code.

[1] Kong, W., Melo, J. G., & Monteiro, R. D. (2021). FISTA and Extensions--Review and New Insights. arXiv preprint arXiv:2107.01267.
[2] Nesterov, Y. (2018). Lectures on convex optimization (Vol. 137). Berlin, Germany: Springer International Publishing.
[3] Florea, M. I., & Vorobyov, S. A. (2018). An accelerated composite gradient method for large-scale composite objective problems. IEEE Transactions on Signal Processing, 67(2), 444-459.
[4] Monteiro, R. D., Ortiz, C., & Svaiter, B. F. (2016). An adaptive accelerated first-order method for convex optimization. Computational Optimization and Applications, 64(1), 31-73.
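For readers following along, here is a rough Python transcription of the three-sequence scheme in the spirit of [2, Eq. (2.2.7)], written from the standard presentation in the book rather than from this PR's code; the function name and the `gamma` initialization are my own choices for illustration.

```python
import numpy as np

def nesterov_three_sequence(grad_f, x0, L, mu=0.0, num_iters=200):
    """Sketch of Nesterov's three-sequence fast gradient scheme.

    Maintains iterates x_k, extrapolation points y_k, and auxiliary
    points v_k, coupled through a scalar sequence gamma_k. Valid for
    any mu >= 0; mu = 0 recovers the non-strongly-convex setting.
    """
    x = np.asarray(x0, dtype=float).copy()
    v = x.copy()
    gamma = L  # a common initialization choice (assumption)
    for _ in range(num_iters):
        # alpha solves L * alpha^2 = (1 - alpha) * gamma + alpha * mu
        alpha = (mu - gamma + np.sqrt((gamma - mu) ** 2 + 4.0 * L * gamma)) / (2.0 * L)
        gamma_next = (1.0 - alpha) * gamma + alpha * mu
        y = (alpha * gamma * v + gamma_next * x) / (gamma + alpha * mu)
        g = grad_f(y)
        x = y - g / L                                     # gradient step
        v = ((1.0 - alpha) * gamma * v + alpha * mu * y - alpha * g) / gamma_next
        gamma = gamma_next
    return x
```

Setting mu=0 and eliminating `v` (it becomes an affine combination of successive x's) is one way to collapse this into the familiar two-sequence FISTA loop.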
Codecov Report
```diff
@@           Coverage Diff            @@
##           master      #48      +/- ##
========================================
- Coverage   87.96%   87.91%   -0.06%
========================================
  Files          18       19       +1
  Lines         748      786      +38
========================================
+ Hits          658      691      +33
- Misses         90       95       +5
```
Continue to review full report at Codecov.
@wwkong thanks for the detailed explanation and references! So the algorithm is named S-FISTA in the relevant reference, right? Maybe it makes sense to use that name here. By the way, I merged some fixes to the benchmark script that should make it work in this PR too.
That's a good point. I'll rename the functions accordingly.
Cool. I'll do a rebase in the next push. EDIT: The latest push should have the renames and rebase.
Hey @wwkong I have some final comments/questions. Once these few last things are clear, this can be merged!
Looks good! Thanks @wwkong!
Hello!
I saw in your description that you were open to contributions, and I would like to add one of my proximal gradient algorithms (AIPP) to the list. The algorithm itself relies on a specialized implementation of the classic FISTA method by Beck & Teboulle, so this PR is to have this FISTA variant added to the repo.