Fix `mode` of `LKJCholesky` #1938
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1938   +/-   ##
=======================================
  Coverage   86.02%   86.02%
=======================================
  Files         144      144
  Lines        8699     8700    +1
=======================================
+ Hits         7483     7484    +1
  Misses       1216     1216
I don't think this is a bug or should be fixed. The `logpdf` of `LKJ` is defined wrt the Lebesgue measure on the strict upper/lower triangle of the matrix, while the `logpdf` of `LKJCholesky` is defined wrt the Lebesgue measure on the strict upper/lower triangle of the triangular factor:
julia> X = rand(LKJCholesky(10, 2.0));
julia> logpdf(LKJCholesky(10, 2.0), X)
-5.741870506362201
julia> logpdf(LKJ(10, 2.0), Matrix(X))
1.3678551789839233
This is important for e.g. Bijectors' `logdetjac` to work correctly. As a result, they have different modes. Also, while `LKJ` has a reasonable notion of mean (sample infinitely many matrices, average them), `LKJCholesky` has no such reasonable notion (averaging the triangular factor produces an invalid and useless `Cholesky` object). One could interpret the mean as what one gets from calling `Matrix` on the `Cholesky` objects and then averaging, but just like `Factorization` is not a subtype of `AbstractMatrix` and so doesn't support e.g. addition, I think it makes more sense to leave it unimplemented. If we think one will really want the `mean` of `LKJ` from `LKJCholesky`, maybe it makes more sense to implement `convert` methods for switching between these distributions.
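The gap between the two `logpdf` values in the REPL example above is what a change of variables would predict. As a sketch only, following the parameterisation in the Stan reference linked in this PR's description, with normalising constants omitted (the symbols $K$ for the dimension, $R$ for the correlation matrix, $L$ for its lower Cholesky factor and $J$ for the Jacobian are notation assumed here, not taken from the Distributions.jl source):

```latex
\begin{align*}
p_{\mathrm{LKJ}}(R \mid \eta) &\propto \det(R)^{\eta - 1}, \qquad R = L L^\top, \\
\det(R) &= \prod_{k=1}^{K} L_{kk}^{2} = \prod_{k=2}^{K} L_{kk}^{2}
  \quad \text{(since } L_{11} = 1\text{)}, \\
|J| &= \prod_{k=2}^{K} L_{kk}^{K - k}, \\
p_{\mathrm{LKJCholesky}}(L \mid \eta) &\propto |J| \, \det(L L^\top)^{\eta - 1}
  = \prod_{k=2}^{K} L_{kk}^{K - k + 2\eta - 2}.
\end{align*}
```

Under these assumptions the two log densities at the same point differ by $\log |J| = \sum_{k=2}^{K} (K - k) \log L_{kk}$, which is non-positive because $0 < L_{kk} \le 1$. That matches the sign of the difference between the two values printed above.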
Hm, but actually, I think the |
Yeah, math is fine. But it's true that |
Hmm... My interpretation and use of Variates or samples of the distribution are `LinearAlgebra.Cholesky` objects, as might
be returned by `F = LinearAlgebra.cholesky(R)`, so that `Matrix(F) ≈ R` is a variate or
sample of [`LKJ`](@ref).
Sampling `LKJCholesky` is faster than sampling `LKJ`, and often having the correlation
matrix in factorized form makes subsequent computations cheaper as well. Based on this interpretation (variates of type
I haven't been aware of this (or at least I don't remember it), and it still seems unclear (surprising?) given the docstring of |
I'd originally proposed in #1336 adding a Distributions.jl/test/cholesky/lkjcholesky.jl Lines 44 to 75 in 957f0c0
I agree this would be better. But since this would be breaking, is this likely to happen anytime soon? FWIW, I've had a branch for some time with implementations of a |
Title changed: Fix `mode` of `LKJCholesky` and define `mean(::LKJCholesky)` → Fix `mode` of `LKJCholesky`
I removed the definition of
My understanding was that Distributions.jl/src/Distributions.jl Line 239 in 957f0c0
|
As a counterpoint, for a Distributions.jl/src/univariates.jl Lines 191 to 203 in 957f0c0
There's no corresponding docstring for |
LGTM, just a minor suggestion.
Co-authored-by: Seth Axen <[email protected]>
It would be great to have this as its own issue so it doesn't get lost.
I opened #1939.
For `LKJ` the mode is only defined (and then identical to the identity matrix) if η > 1: if 0 < η < 1, the identity matrix is a trough of the density, and if η = 1 then it's a uniform distribution of correlation matrices (see e.g. https://mc-stan.org/docs/functions-reference/correlation_matrix_distributions.html#probability-density-function). Therefore, for `LKJ`, `mode` is defined only if `η > 1`:
For `LKJCholesky`, however, on the master branch `mode` always returns the identity matrix, irrespective of the parameters:
This PR fixes the problem, and defines `mean` for `LKJCholesky`, which is currently missing and should always return the identity matrix: