Improvements to Manifolds #448

Open · 7 tasks
antoine-levitt opened this issue Jul 23, 2017 · 18 comments

Comments
@antoine-levitt
Contributor

See #435

  • Second order algorithms
    Never used it, so I'm not the best person to do this
  • Compare performance with other libraries
    Competitors include ROPTLIB, ManOpt, https://github.com/NickMcNutt/ManifoldOptim.jl (abandoned)
  • Vector transport
    The proper way to implement algorithms like CG and BFGS is to use vector transport to carry information from one point to another. Right now this is done with projections, which might not be the most efficient approach
  • More manifolds and variants of existing manifolds (e.g. different retractions)
    See e.g. the list in http://www.math.fsu.edu/~whuang2/Indices/index_ROPTLIB.html
    Also {x : Ax = b}, or the intersection of two manifolds (just project onto both manifolds alternately and hope it converges)
  • Optimize number of retractions and projections
    I have been pretty liberal with the use of retractions and projections in the optimizers, maybe some of them are unnecessary
  • A better way to do product manifolds?
    Right now, the two components are stored in a flat 1D array, which might be suboptimal (see the tuple-based sketch after this list)
  • Arbitrary inner product
    The Sphere and Stiefel manifolds could take a more general inner product
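
As a concrete illustration of the product-manifold item above, here is a hypothetical sketch that stores a point as a tuple of arrays, one per component, instead of one flat 1D vector. The names (ToySphere, TupleProduct, retract_product!, project_product!) are illustrative only and are not part of Optim's API.

using LinearAlgebra

# A toy unit sphere stands in for the component manifolds.
struct ToySphere end
retract_component!(::ToySphere, x) = (x ./= norm(x); x)
project_component!(::ToySphere, g, x) = (g .-= x .* dot(x, g); g)

# Product manifold whose point is a tuple of arrays, one per component.
struct TupleProduct{T<:Tuple}
    components::T
end

function retract_product!(P::TupleProduct, xs::Tuple)
    foreach(retract_component!, P.components, xs)
    return xs
end

function project_product!(P::TupleProduct, gs::Tuple, xs::Tuple)
    foreach(project_component!, P.components, gs, xs)
    return gs
end

# Usage: a point on Sphere × Sphere kept as two separate vectors.
P = TupleProduct((ToySphere(), ToySphere()))
xs = (randn(3), randn(5))
retract_product!(P, xs)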
@simonbyrne

Any chance of support for positive semi-definite matrices?

@pkofod
Member

pkofod commented Sep 25, 2017

Any chance of support for positive semi-definite matrices?

According to the docs, Manopt has this, but their code is GPL-3 licensed, so I think it's best if we avoid links to their code base in this thread, as reading it could preclude readers from contributing here.

@antoine-levitt
Contributor Author

Coincidentally, @whuang08 pointed out to me just yesterday that a retraction for the manifold of SPD matrices is R_X(eta) = X + eta + 0.5*eta*X^{-1}*eta, which breaks the current API (we only have a retract(x) function, whereas this would need a retract(x, eta)). This should not be too bad to adapt, though. That retraction seems fishy to me because it's singular at zero eigenvalues, but apparently it's useful. I don't have any experience with SPD matrices; do you have a use case/paper/reference implementation?
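
For concreteness, a minimal sketch of that retraction as a standalone Julia function, in the retract(x, eta) form it would need; this is not part of Optim's current Manifold interface, and the name is illustrative.

# R_X(eta) = X + eta + 0.5*eta*X^{-1}*eta, for X SPD and eta symmetric;
# the backslash solves X \ eta instead of forming inv(X) explicitly.
function retract_spd(X, eta)
    Y = X + eta + 0.5 * (eta * (X \ eta))
    (Y + Y') / 2   # re-symmetrize against round-off
end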

@antoine-levitt
Contributor Author

Ah, good point about the licence; I'll avoid looking at ROPTLIB too, then. The algorithms are described in papers, however.

@simonbyrne

It probably doesn't answer your question directly (it's been a while since I've worked with manifolds), but the easiest option is usually to parametrise SPD matrices via a Cholesky factor constrained to have non-negative diagonal.
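
A minimal sketch of that reparametrisation (the log-Cholesky variant, storing the diagonal as logs so it stays positive; the names spd_from_theta and obj are illustrative, not from any library):

# Map an unconstrained vector theta of length n*(n+1)/2 to an SPD matrix
# X = L*L', where L is lower triangular with positive diagonal.
function spd_from_theta(theta, n)
    L = zeros(n, n)
    k = 1
    for j in 1:n, i in j:n
        L[i, j] = i == j ? exp(theta[k]) : theta[k]
        k += 1
    end
    L * L'
end

# Usage sketch: minimize an objective obj over SPD matrices without constraints,
# e.g. Optim.optimize(th -> obj(spd_from_theta(th, n)), zeros(div(n*(n+1), 2)))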

@antoine-levitt
Contributor Author

That looks substantially different from using a manifold-type algorithm, and is probably best treated by a solver with inequality constraints (JuMP & friends).

@simonbyrne

Ah, okay. Thanks.

@simonbyrne

Hmm, my problem is that I have quite a complicated objective function, which doesn't seem to be supported by JuMP. I guess I can write the reparametrisation myself, but it would be nice if there were some tools for this.

@whuang08

whuang08 commented Sep 25, 2017 via email

@antoine-levitt
Contributor Author

I guess it depends on whether you expect your minimum to hit the boundary (matrices with zero eigenvalues) or not. If you do then I would imagine it's hard to bypass an inequality constrained solver. If you don't then I guess manifold algorithms are efficient.

@antoine-levitt
Contributor Author

Just putting this here for reference: I benchmarked finding the first few eigenvalues of a large random matrix in Optim and in Manopt. The result is that the implementations are comparable and take about the same number of iterations/matvecs. Code for Manopt:

% Generate random problem data.
% rng(0);
n = 1000;
m = 10;
A = randn(n);
A = .5*(A+A.');
x0 = randn(n,m);
 
% Create the problem structure.
manifold = stiefelfactory(n,m);
problem.M = manifold;
 
% Define the problem cost function and its Euclidean gradient.
problem.cost  = @(x) -trace(x'*(A*x));
problem.egrad = @(x) -2*A*x;

% Solve.
[x, xcost, info, options] = rlbfgs(problem, x0);
 
% Display some statistics.
figure;
semilogy([info.iter], [info.gradnorm], '.-');
xlabel('Iteration number');
ylabel('Norm of the gradient of f');

and Optim:

using Optim

# Generate random problem data (Julia 0.6 syntax: srand/vecdot/copy!).
# srand(0)
const n = 1000
const m = 10
M = randn(n,n)
const A = (M+M')/2

# Cost function and its Euclidean gradient.
f(x) = -vecdot(x, A*x)
g(x) = -2*A*x
g!(stor, x) = copy!(stor, g(x))
x0 = randn(n,m)

# Solve on the Stiefel manifold.
manif = Optim.Stiefel()
Optim.optimize(f, g!, x0,
               Optim.LBFGS(m=30, manifold=manif, linesearch=Optim.HagerZhang()),
               Optim.Options(show_trace=true, allow_f_increases=true, g_tol=1e-6))
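
For reference (not in the original comment), a sketch of recovering the computed eigenvalues from Optim's result, assuming the optimize call above is assigned to res: since the cost is -tr(x'Ax), the columns of the minimizer span the dominant invariant subspace of A, so the eigenvalues of the small m×m matrix x'Ax approximate the m largest eigenvalues of A.

res = Optim.optimize(f, g!, x0,
                     Optim.LBFGS(m=30, manifold=manif, linesearch=Optim.HagerZhang()),
                     Optim.Options(g_tol=1e-6))
x = Optim.minimizer(res)
eigvals(Symmetric(x'*(A*x)))   # ≈ the m largest eigenvalues of A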

@pkofod
Member

pkofod commented Jun 19, 2018

Cool, good to have some examples like this.

I tried swapping out HagerZhang for MoreThuente. The former gives

Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [0.15317647539137902,0.42376052107832024, ...]
 * Minimizer: [-0.04295817425645339,-0.018684703787118326, ...]
 * Minimum: -4.311236e+02
 * Iterations: 136
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 6.80e-08 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = -1.45e-14 |f(x)|
   * |g(x)| ≤ 1.0e-06: true 
     |g(x)| = 9.52e-07 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 410
 * Gradient Calls: 411

while MoreThuente gives

Results of Optimization Algorithm
 * Algorithm: L-BFGS
 * Starting Point: [0.05351779586314921,-0.7229077835364466, ...]
 * Minimizer: [-0.04590592560787643,-0.03998550452894138, ...]
 * Minimum: -4.301612e+02
 * Iterations: 137
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 6.96e-08 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: false
     |f(x) - f(x')| = -2.51e-14 |f(x)|
   * |g(x)| ≤ 1.0e-06: true 
     |g(x)| = 8.44e-07 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 147
 * Gradient Calls: 148

Interesting to see the significant reduction in objective calls here. I know this is known: HZ needs to be conservative, but if MT is suited to the example at hand, there's a pretty neat gain to be had in terms of calls to f(x)/g(x).

(using A_mul_B! in g! can also decrease runtime by 25%, but that obviously has nothing to do with how well the optimizer works :))
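
For reference, the two changes mentioned above are one-liners against the earlier snippet (assuming the line searches are reachable as in that snippet, i.e. Optim.MoreThuente alongside Optim.HagerZhang; A_mul_B! is the Julia 0.6 spelling, mul! from 0.7 on):

# Same call as before, MoreThuente instead of HagerZhang:
Optim.optimize(f, g!, x0,
               Optim.LBFGS(m=30, manifold=manif, linesearch=Optim.MoreThuente()),
               Optim.Options(show_trace=true, allow_f_increases=true, g_tol=1e-6))

# In-place gradient avoiding the temporary allocated by g(x) = -2*A*x:
g!(stor, x) = (A_mul_B!(stor, A, x); scale!(stor, -2.0); stor)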

@anriseth
Contributor

I tried swapping out HagerZhang for MoreThuente.

Hmm, and BackTracking stagnates

 * Algorithm: L-BFGS
 * Starting Point: [0.5494801640402754,1.2759422464582777, ...]
 * Minimizer: [-0.003418669419144482,0.00966606972837302, ...]
 * Minimum: -3.174551e+02
 * Iterations: 8
 * Convergence: true
   * |x - x'| ≤ 0.0e+00: false 
     |x - x'| = 1.05e-09 
   * |f(x) - f(x')| ≤ 0.0e+00 |f(x)|: true
     |f(x) - f(x')| = 0.00e+00 |f(x)|
   * |g(x)| ≤ 1.0e-06: false 
     |g(x)| = 8.45e+00 
   * Stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 52
 * Gradient Calls: 10

@antoine-levitt
Contributor Author

@anriseth yes, I reported this on gitter. This is strange, because I used backtracking on similar problems before and it worked very well. I reduced it further here: #626

@pkofod yes, LBFGS usually produces pretty good initial steps, so HZ is counterproductive in requiring more evaluations per iteration. In my previous tests backtracking was a much better default choice; @anriseth suggested making it the default for LBFGS and I agree: HZ should really only be the default for CG, where it is more important to have a good stepsize.

@jagot
Contributor

jagot commented Sep 19, 2019

I am particularly interested in the last two items in the list. For the custom metric, I implemented the Löwdin transform, which symmetrically orthogonalizes a non-orthogonal basis so that the SVDs etc. that assume an orthogonal basis work. The results are then transformed back to the non-orthogonal basis.

Please see my implementation here:
https://github.com/JuliaAtoms/SCF.jl/blob/b25be8df29efb21ce573a4b9041709ee78573353/src/manifolds.jl
I am well aware that it is not particularly efficient and there may be a mathematically more beautiful approach, but it works (TM).
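
For readers, a minimal sketch of the idea (not taken from the linked code), assuming a symmetric positive-definite overlap matrix S:

using LinearAlgebra

# Löwdin (symmetric) orthogonalization: S^(-1/2) maps into the
# orthonormal basis, S^(+1/2) maps results back to the original basis.
function loewdin_transforms(S::AbstractMatrix)
    lam, U = eigen(Symmetric(Matrix(S)))
    forward  = U * Diagonal(lam .^ -0.5) * U'   # S^(-1/2)
    backward = U * Diagonal(lam .^ +0.5) * U'   # S^(+1/2)
    return forward, backward
end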

Would there be any interest in including these extended manifolds into Optim.jl? Any improvement suggestions?

@antoine-levitt
Contributor Author

For sure, please do put it in there. The longer-term plan is to split these things off into Manifolds.jl, see JuliaManifolds/Manifolds.jl#35, but right now it can live here.

I think you can avoid doing the Löwdin transform explicitly and just eigendecompose overlap matrices. Will take a closer look when I'm at a computer.

@jagot
Contributor

jagot commented Sep 19, 2019

Well, I implemented the Löwdin transform by eigendecomposing the overlap matrix (when it is not simply a UniformScaling) and caching the forward and backward transforms. I was wondering whether one could implement something via the SVD, but the transform matrices will end up dense anyway (with the exception of diagonal overlap matrices, of course).

@antoine-levitt
Contributor Author

I mean computing the overlap as X'SX, factoring that, and using that transform to modify X. That's more efficient when the first dimension of X is large, especially when S can be applied in a matrix-free way.
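
A minimal sketch of that suggestion, assuming X is n×m with n large and S can be applied matrix-free (the function name is illustrative):

using LinearAlgebra

# Form the small m×m overlap X'SX, take its inverse square root, and
# apply it on the right so the columns of X become S-orthonormal:
# X <- X * (X'SX)^(-1/2).
function s_orthonormalize(X, S)
    ov = Symmetric(X' * (S * X))    # m×m, cheap to factor
    lam, U = eigen(ov)
    return X * (U * Diagonal(lam .^ -0.5) * U')
end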
