Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add adjoints for functions on Hermitian matrices (take 2) #355

Merged
merged 30 commits into from
Nov 9, 2019

Conversation

sethaxen
Copy link
Contributor

@sethaxen sethaxen commented Oct 3, 2019

Note: this is a redo of #352 because of a wayward rebase

This PR adds adjoints for the following functions of Hermitian and real symmetric matrices:

  • exp
  • cos
  • sin
  • tan
  • cosh
  • sinh
  • tanh
  • log
  • acos
  • asin
  • atan
  • acosh
  • asinh
  • atanh
  • sqrt
  • sincos
  • ^

Each of these can be written as a matrix power series, which for a diagonalizable matrix simplifies to element-wise action of the function upon the eigenvalues of the matrix. Consequently, their computation in LinearAlgebra and their adjoints take a similar form to each other.

Along the way, I fixed #340 for generic matrices, since the current adjoint for exp using the eigendecomposition shares a similar form to the Hermitian cases.

@sethaxen
Copy link
Contributor Author

sethaxen commented Oct 7, 2019

The test is currently failing because DiffRule's complex derivative of acosh is wrong. Looks like this PR fixes it but hasn't been worked on in a while. Should I just add a custom acosh adjoint to Zygote?

@MikeInnes
Copy link
Member

Should I just add a custom acosh adjoint to Zygote?

Sure, please do.

@sethaxen sethaxen force-pushed the sdaxen/hermitian_funcs branch from 97b4471 to 29db72f Compare October 8, 2019 16:30
Post-processing just enforces type constraints, but pulling back through it significantly changes gradients wrt args. It turns out this isn't necessary.
Enables overloading for specific functions
Separating real and imag makes gradients discontinuous when output type is determined by input value.
@sethaxen
Copy link
Contributor Author

The ^(A, p::Integer) adjoints are still erroring due to #344, but otherwise, this is ready for review.

src/lib/array.jl Outdated Show resolved Hide resolved
When 2nd derivative is not available, use finite difference approximation of 2nd deriv for greater accuracy. When it is, relax the check.
@sethaxen
Copy link
Contributor Author

Just a reminder that the only reason this doesn't pass is because Array(::Union{Symmetric,Hermitian}) mutates. I'd like to push this PR over the finish line, which I can do by either adding an adjoint for this case or if a solution like those discussed in (#344 and #377) is available.

@MikeInnes MikeInnes requested a review from willtebbutt October 24, 2019 11:10
Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not amazingly well-equipped to review this / am currently overloaded to don't have time properly dive into this. That being said, from what I can tell fairly quickly it LGTM -- everything is tested as far as I can tell. As such, I don't want to hold this up, and I can't see any reason not to merge.

@MikeInnes
Copy link
Member

Since this is still marked WIP I'll give @sethaxen the ability to merge, rather than merging immediately. I'm not sure if any other help is needed from our side but lmk if so.

bors d+

@bors
Copy link
Contributor

bors bot commented Nov 4, 2019

✌️ sethaxen can now approve this pull request. To approve and merge a pull request, simply reply with bors r+. More detailed instructions are available here.

@sethaxen
Copy link
Contributor Author

sethaxen commented Nov 4, 2019

bors try

@bors
Copy link
Contributor

bors bot commented Nov 4, 2019

try

Merge conflict

@sethaxen
Copy link
Contributor Author

sethaxen commented Nov 4, 2019

@MikeInnes what's missing is this:

Just a reminder that the only reason this doesn't pass is because Array(::Union{Symmetric,Hermitian}) mutates. I'd like to push this PR over the finish line, which I can do by either adding an adjoint for this case or if a solution like those discussed in (#344 and #377) is available.

I'd appreciate feedback. I'd rather not merge something that fails the tests.

@MikeInnes
Copy link
Member

Adding an adjoint for that case seems fine. We can remove it as part of #344 as needed, but that doesn't need to block this work.

@sethaxen
Copy link
Contributor Author

sethaxen commented Nov 9, 2019

bors try

bors bot added a commit that referenced this pull request Nov 9, 2019
@bors
Copy link
Contributor

bors bot commented Nov 9, 2019

try

Build succeeded

@sethaxen
Copy link
Contributor Author

sethaxen commented Nov 9, 2019

bors r+

bors bot added a commit that referenced this pull request Nov 9, 2019
355: [WIP] Add adjoints for functions on Hermitian matrices (take 2) r=sethaxen a=sethaxen

_Note: this is a redo of #352 because of a wayward rebase_

This PR adds adjoints for the following functions of Hermitian and real symmetric matrices:
- [x] `exp`
- [x] `cos`
- [x] `sin`
- [x] `tan`
- [x] `cosh`
- [x] `sinh`
- [x] `tanh`
- [x] `log`
- [x] `acos`
- [x] `asin`
- [x] `atan`
- [x] `acosh`
- [x] `asinh`
- [x] `atanh`
- [x] `sqrt`
- [x] `sincos`
- [x] `^`

Each of these can be written as a matrix power series, which for a diagonalizable matrix simplifies to element-wise action of the function upon the eigenvalues of the matrix. Consequently, their computation in `LinearAlgebra` and their adjoints take a similar form to each other.

Along the way, I fixed #340 for generic matrices, since the current adjoint for `exp` using the eigendecomposition shares a similar form to the Hermitian cases.

Co-authored-by: Seth Axen <[email protected]>
@bors
Copy link
Contributor

bors bot commented Nov 9, 2019

Build succeeded

@bors bors bot merged commit fb95ec7 into FluxML:master Nov 9, 2019
@sethaxen sethaxen deleted the sdaxen/hermitian_funcs branch November 9, 2019 23:33
@sethaxen sethaxen changed the title [WIP] Add adjoints for functions on Hermitian matrices (take 2) Add adjoints for functions on Hermitian matrices (take 2) Nov 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

exp broken for defective matrices
4 participants