Improvements to Manifolds #448
Any chance of support for positive semi-definite matrices?
According to the docs Manopt has this, but their code is GPL-3 licensed, so I think it's best if we avoid links to their code base in this thread, as it will probably exclude the reader from contributing.
Coincidentally, @whuang08 pointed out to me just yesterday that a retraction for the manifold of SPD matrices is R_X(eta) = X + eta + 0.5 * eta * X^{-1} * eta, which breaks the current API (which only has a retract(x) function, whereas this would need a retract(x, eta)). This should not be too bad to adapt, though. That retraction seems fishy to me because it is singular at zero eigenvalues, but apparently it is useful. I don't have any experience with SPD matrices; do you have a use case/paper/reference implementation?
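For concreteness, here is a minimal sketch of that retraction in plain Julia. `retract_spd` is an illustrative name, not part of Optim.jl's API; the only assumption is that `X` is SPD and `eta` is symmetric.

```julia
using LinearAlgebra

# Second-order retraction on the SPD manifold:
#   R_X(eta) = X + eta + 0.5 * eta * X^{-1} * eta.
# For symmetric eta this equals X^{1/2} (I + U + U^2/2) X^{1/2} with
# U = X^{-1/2} eta X^{-1/2}, and I + U + U^2/2 = ((I + U)^2 + I)/2 is
# positive definite, so the result stays SPD for any symmetric eta.
function retract_spd(X, eta)
    E = X + eta + 0.5 * (eta * (X \ Matrix(eta)))
    return Symmetric((E + E') / 2)  # re-symmetrize against round-off
end
```

Note that the result is SPD for arbitrarily large tangent steps, even though the formula involves X^{-1}; the singularity only appears when X itself has zero eigenvalues, i.e. on the boundary of the cone.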
Ah, good point about the licence; will avoid looking at ROPTLIB also, then. It is, however, described in papers.
It probably doesn't answer your question directly (it's been a while since I've worked with manifolds), but the easiest option is usually to parametrise SPD matrices via a Cholesky factor which is constrained to have non-negative diagonals.
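A minimal sketch of that reparametrisation, assuming plain Julia and hypothetical helper names (not an Optim.jl API): the n(n+1)/2 free parameters fill a lower-triangular factor, with the diagonal passed through exp so positivity is automatic and the parameters are unconstrained.

```julia
using LinearAlgebra

# Unconstrained parametrization of an n×n SPD matrix by the n(n+1)/2
# entries of a Cholesky factor; the diagonal goes through exp to keep
# it strictly positive. `spd_from_params` is an illustrative name.
function spd_from_params(theta::AbstractVector, n::Integer)
    L = zeros(n, n)
    k = 1
    for j in 1:n, i in j:n           # fill lower triangle, column by column
        L[i, j] = (i == j) ? exp(theta[k]) : theta[k]
        k += 1
    end
    return L * L'                    # SPD by construction
end
```

An unconstrained optimizer can then minimize `theta -> f(spd_from_params(theta, n))` directly, which is the "write the reparametrisation myself" route mentioned below.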
That looks substantially different from using a manifold-type algorithm, and is probably best treated by a solver with inequality constraints (JuMP & friends).
Ah, okay. Thanks.
Hmm, my problem is that I have quite a complicated objective function, which doesn't seem to be supported by JuMP. I guess I can write the reparametrisation myself, but it would be nice if there were some tools for this.
Two references for SPD matrices are:
https://www.math.fsu.edu/~whuang2/papers/RMKMSPDM.htm
and
https://arxiv.org/abs/1507.02772
Of course, there are more references, since computation on SPD matrices is an active topic as far as I know. (Many people discussed this when I was at a linear algebra conference.)
Best,
Wen
I guess it depends on whether you expect your minimum to hit the boundary (matrices with zero eigenvalues) or not. If you do, then I imagine it's hard to bypass an inequality-constrained solver. If you don't, then I guess manifold algorithms are efficient.
Just putting this here for reference: I benchmarked finding the first few eigenvalues of a large random matrix in Optim and in Manopt. The result is that the implementations are comparable and take about the same number of iterations/matvecs. Code for Manopt:
and Optim:
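The benchmark snippets themselves are not reproduced here; as a minimal self-contained sketch of the underlying computation (smallest eigenvalue of a symmetric matrix by Riemannian gradient descent on the unit sphere), assuming nothing beyond the LinearAlgebra standard library and not using the Optim.jl or Manopt APIs:

```julia
using LinearAlgebra

# Minimize the Rayleigh quotient x'Ax over the unit sphere: project the
# Euclidean gradient onto the tangent space, take a step, and retract by
# normalization. A toy stand-in for the benchmarked solvers.
function min_eig_sphere(A; iters = 2000)
    step = 0.25 / opnorm(A)          # conservative fixed stepsize
    x = normalize(randn(size(A, 1)))
    for _ in 1:iters
        g = 2 * (A * x)              # Euclidean gradient of x'Ax
        g -= dot(x, g) * x           # tangent-space projection at x
        x = normalize(x - step * g)  # retraction back onto the sphere
    end
    return dot(x, A * x), x
end
```

In Optim.jl the same problem is what the manifold support is for (a `manifold` option on first-order optimizers); the sketch above just makes the projection/retraction steps explicit.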
Cool, good to have some examples like this. I tried swapping out HagerZhang for MoreThuente. The former gives
while MoreThuente gives
Interesting to see the significant reduction in objective calls here. I know this is known: HZ needs to be conservative, but if MT is suited to the example at hand, there's a pretty neat gain to be found in terms of calls to f(x)/g(x). (using
Hmm, and BackTracking stagnates
@anriseth yes, I reported this on gitter. This is strange, because I have used backtracking on similar problems before and it worked very well. I reduced it further here: #626. @pkofod yes, LBFGS usually produces pretty good initial steps, so HZ is counterproductive in requiring more evals per iteration. In my previous tests backtracking was a much better default choice; @anriseth suggested making it the default for LBFGS and I agree: HZ should really only be the default for CG, where it is more important to have a good stepsize.
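To make the eval-count argument concrete, here is a toy Armijo backtracking line search that counts objective calls. When the initial step is already good (as LBFGS steps tend to be), it accepts after a single extra f-call, whereas a strong-Wolfe search must bracket and so pays several f/g calls per iteration. This is an illustrative sketch under those assumptions, not LineSearches.jl's implementation.

```julia
using LinearAlgebra

# Armijo backtracking: shrink alpha until sufficient decrease holds,
# counting how many times the objective f is evaluated along the way.
function backtrack(f, x, g, d; alpha0 = 1.0, c = 1e-4, rho = 0.5)
    fx = f(x)
    ncalls = 1
    alpha = alpha0
    for _ in 1:60                    # cap iterations to avoid an infinite loop
        fnew = f(x + alpha * d)
        ncalls += 1
        fnew <= fx + c * alpha * dot(g, d) && return alpha, ncalls
        alpha *= rho
    end
    error("line search failed to satisfy the Armijo condition")
end
```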
I am particularly interested in the last two items in the list. For the custom metric, I implemented the Löwdin transform that symmetrically orthogonalizes a non-orthogonal basis, such that the SVDs etc. that assume an orthonormal basis work. The results are then transformed back to the non-orthogonal basis. Please see my implementation here: Would there be any interest in including these extended manifolds in Optim.jl? Any improvement suggestions?
For sure, please do put it in there. The longer-term plan is to split those things off into Manifolds.jl, see JuliaManifolds/Manifolds.jl#35, but right now it can live here. I think you can avoid doing the Löwdin transform explicitly, and just eigendecompose overlap matrices. Will take a closer look when I'm at a computer.
Well, I implemented the Löwdin transform by eigendecomposing the overlap matrix (when it is not simply an
I mean computing the overlap as X'SX, factoring that, and using that transform to modify X. That's more efficient when the first dimension of X is large, especially when S can be applied in a matrix-free way.
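A sketch of that suggestion, assuming plain Julia (`loewdin` is an illustrative name, not an existing Optim.jl function): only the small k×k overlap M = X'SX is factored, and M^{-1/2} is applied on the right, so S never needs to be formed or inverted on the full space.

```julia
using LinearAlgebra

# Metric-orthogonalize the columns of X against the inner product S:
# factor M = X'SX = U diag(λ) U' and right-multiply by
# M^{-1/2} = U diag(λ^{-1/2}) U', so that (XM^{-1/2})' S (XM^{-1/2}) = I.
# S only ever acts on tall matrices, so it can be matrix-free.
function loewdin(X, S)
    M = Symmetric(X' * (S * X))
    λ, U = eigen(M)                          # small k×k eigendecomposition
    return X * (U * Diagonal(1.0 ./ sqrt.(λ)) * U')
end
```

This is cheap when size(X, 1) >> size(X, 2): the only dense factorization is on the k×k overlap, matching the "first dimension of X is large" remark above.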
See #435
Never used it, so I'm not the best person to do this
Competitors include ROPTLIB, ManOpt, https://github.com/NickMcNutt/ManifoldOptim.jl (abandoned)
The proper way to implement algorithms like CG and BFGS is to use vector transport to transport the information from one point to another. Right now this is done with projections, which might not be the most efficient
See e.g. the list in http://www.math.fsu.edu/~whuang2/Indices/index_ROPTLIB.html
Also {x : Ax = b}, or intersection manifolds (just do the projection onto both manifolds alternately and hope it converges)
I have been pretty liberal with the use of retractions and projections in the optimizers, maybe some of them are unnecessary
Right now, the two components are stored in a flat 1D array, which might be suboptimal
The Sphere and Stiefel manifolds could take a more general inner product
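The intersection-manifold item in the list above can be sketched concretely. This is a heuristic (the sphere is nonconvex, so alternating projections are not a guaranteed exact projection); the example sets, the unit sphere and the affine set {x : a'x = b}, are illustrative choices.

```julia
using LinearAlgebra

# Alternating projections onto two constraint sets, hoping the iterates
# settle onto their intersection. Ends with the affine projection, so
# the linear constraint holds exactly and the sphere approximately.
proj_sphere(x) = x / norm(x)
proj_affine(x, a, b) = x - a .* ((dot(a, x) - b) / dot(a, a))

function project_intersection(x, a, b; iters = 200)
    for _ in 1:iters
        x = proj_affine(proj_sphere(x), a, b)
    end
    return x
end
```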