
More Preconditioning #202

Closed

cortner opened this issue May 20, 2016 · 12 comments

@cortner
Contributor

cortner commented May 20, 2016

@pkofod : This follows on from JuliaLang/julia#188

I promised to report back here once there was some sort of consensus on what happens with preconditioning in iterative linear algebra. In the end, the outcome is very simple: any potential preconditioner type must overload

    Base.A_ldiv_B!
    Base.A_mul_B!

or alternatively,

    Base.\
    Base.*

Base.A_ldiv_B! would replace precondfwd!. Further, I would remove specialised implementations for inner products and simply use vecdot applied to preconditioned vectors. The additional overhead is likely negligible.
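
For illustration, here is a minimal sketch of what such a preconditioner type could look like; DiagonalPreconditioner is a hypothetical example rather than anything in Optim.jl, and the three-argument (out, P, b) signatures follow the Base conventions of the time for A_ldiv_B! and A_mul_B!:

    # Hypothetical diagonal preconditioner supplying the two overloads named above.
    immutable DiagonalPreconditioner
        d::Vector{Float64}   # diagonal entries of P
    end

    # out = P \ b: divide element-wise by the diagonal
    function Base.A_ldiv_B!(out, P::DiagonalPreconditioner, b)
        for i in eachindex(b)
            out[i] = b[i] / P.d[i]
        end
        return out
    end

    # out = P * b: multiply element-wise by the diagonal
    function Base.A_mul_B!(out, P::DiagonalPreconditioner, b)
        for i in eachindex(b)
            out[i] = b[i] * P.d[i]
        end
        return out
    end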

If there is no resistance, then I can make the modifications.

@pkofod
Member

pkofod commented May 21, 2016

I think ldiv dispatch based on a preconditioner type is the best way to go 👍 I should make a note of it in the (dev) documentation (which is being made at the moment).

@cortner
Contributor Author

cortner commented May 21, 2016

Great, expect a pull request soon

@cortner
Contributor Author

cortner commented May 25, 2016

One of the discussions I had with the linear algebra people was whether P * x is even required or whether P \ x is sufficient. Going through cg, steepest descent and lbfgs, I find that the only place it is used is in cg.jl:

    dPd = precondinvdot(s, cg.P, s)

which computes the length of the step-direction with respect to the metric defined by the preconditioner.

In principle, this can be avoided by computing dPd via an iteration formula (a sketch of this in code follows the lists below),

    dPd = beta^2 * dPd_old - 2 * beta * vecdot(s_old, gr_old) + vecdot(gr_old, pgr_old)

Advantages:

  • we can remove all preconditioner functionality except for A_ldiv_B! and precondprep!
  • avoids some additional computations (though I think the gain will be negligible)

Disadvantages

  • If the preconditioner is not constant, but updated at each step, then this can cause dPd to become close to zero, so the CG update might become strange; I think this is the most significant issue with that change!
  • a few more variables to track, so a little more code complexity
  • could there be a potential round-off issue?
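
For reference, a minimal sketch of how that recurrence could be carried through the CG update; this is not actual Optim.jl code, and the names simply mirror the formula above, with pgr_old = P \ gr_old the preconditioned gradient:

    # Hypothetical helper: update dPd = <s, P*s> without applying P, given the
    # previous value dPd_old, the CG coefficient beta, the previous direction
    # s_old, the gradient gr_old and its preconditioned copy pgr_old = P \ gr_old.
    update_dPd(dPd_old, beta, s_old, gr_old, pgr_old) =
        beta^2 * dPd_old - 2 * beta * vecdot(s_old, gr_old) + vecdot(gr_old, pgr_old)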

@cortner
Contributor Author

cortner commented May 25, 2016

I've kind of come around to the opinion that

> If the preconditioner is not constant, but updated at each step, then this can cause dPd to become close to zero, so the CG update might become strange; I think this is the most significant issue with that change!

is a big problem, so I won't make that change. Instead, I will let the user overload dot for potential fast application of the P-inner product.

@pkofod
Member

pkofod commented May 25, 2016

I'm a big fan of simplicity, but as you state, we clearly don't want an implementation that could converge for the given inputs but doesn't, for reasons that could have been avoided in the implementation.

@cortner
Contributor Author

cortner commented May 25, 2016

Ok, so let me briefly update you on what I am doing now: minimally, the preconditioner type must support

    A_ldiv_B!
    dot

The advantage is that both are in Base and can be overloaded. I then write a little alias function

    invdot(a, P, b) = vecdot(a, A_ldiv_B!(similar(a), P, b))

which creates a temporary array, solves P \ b, and then applies the inner product. The advantage of this approach is that invdot can in principle be overloaded, but does not need to be.

The disadvantage is the temporary array. If you have a problem with that, I can simply remove this alias and allocate a temporary array in cg.jl for this purpose.
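
To make the trade-off concrete, here is a minimal sketch (not actual Optim.jl code) of the alias together with an optional specialisation for the hypothetical DiagonalPreconditioner from the earlier sketch, which avoids the temporary array in that particular case:

    # Generic fallback, as above: allocate a temporary, solve P \ b, then take the dot product.
    invdot(a, P, b) = vecdot(a, A_ldiv_B!(similar(a), P, b))

    # Hypothetical specialisation (assumes the DiagonalPreconditioner type from the
    # earlier sketch): a' * (P \ b) = sum_i a[i] * b[i] / d[i], with no temporary.
    function invdot(a, P::DiagonalPreconditioner, b)
        s = zero(eltype(a))
        for i in eachindex(a)
            s += a[i] * b[i] / P.d[i]
        end
        return s
    end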
Let me know what you think.

Thanks for the feedback.

@cortner
Contributor Author

cortner commented May 25, 2016

Sorry for the continuing updates... I think I can now avoid invdot altogether.

@pkofod
Member

pkofod commented May 25, 2016

I am not going to complain about work going into Optim :) Looking forward to seeing what you come up with.

@cortner
Contributor Author

cortner commented May 25, 2016

Done for now; see the new pull request. I am happy to discuss and adjust.

@pkofod
Member

pkofod commented May 31, 2016

👍 Do you have more preconditioning tricks up your sleeve, or would you prefer to close this issue?

@cortner
Contributor Author

cortner commented May 31, 2016

Thank you. This is great for now; when constrained optimisation is better developed, it will be worth revisiting all of this.

@cortner cortner closed this as completed May 31, 2016