Skip to content

Commit

Permalink
BACKPORT: condVar + GLMM bootstrap lowerbound compat (#519)
Browse files Browse the repository at this point in the history
* Remove DataFramesMeta from the docs (#515)

* fix reflink

* remove DataFramesMeta from the docs

(cherry picked from commit 628cbf3)

* CompatHelper: bump compat for "Distributions" to "0.25" (#516)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 16bb7fb)

* Add sparseL extractor and use it in condVar (#492)

* Add sparseL extractor and use it in condVar

* Add and export condVarTables.  Comment out failing tests.

* update version bound for condVar

* make cvtbl internal

* remove some unused type restrictions

* Remove guards on code for v1.7.0-DEV

* Add tests of convVartable

Co-authored-by: Phillip Alday <[email protected]>
(cherry picked from commit cbcc1e1)

* omit lowerbound on beta for GLMM bootstrap (#518)

* news update

(cherry picked from commit a3c33de)

* NEWS links

* minor version bump

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Douglas Bates <[email protected]>
  • Loading branch information
3 people authored May 10, 2021
1 parent d5db680 commit 7272052
Show file tree
Hide file tree
Showing 12 changed files with 226 additions and 39 deletions.
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,8 @@
MixedModels v3.7.0 Release Notes
========================
* Add `condVar` and `condVartables` for computing the conditional variance on the random effects [#492]
* Bugfix: store the correct lower bound for GLMM bootstrap, when the original model was fit with `fast=false` [#518]

MixedModels v3.6.0 Release Notes
========================
* Add `likelihoodratiotest` method for comparing non-mixed (generalized) linear models to (generalized) linear mixed models [#508].
Expand Down Expand Up @@ -181,9 +186,11 @@ Package dependencies
[#480]: https://github.com/JuliaStats/MixedModels.jl/issues/480
[#482]: https://github.com/JuliaStats/MixedModels.jl/issues/482
[#484]: https://github.com/JuliaStats/MixedModels.jl/issues/484
[#492]: https://github.com/JuliaStats/MixedModels.jl/issues/492
[#493]: https://github.com/JuliaStats/MixedModels.jl/issues/493
[#495]: https://github.com/JuliaStats/MixedModels.jl/issues/495
[#501]: https://github.com/JuliaStats/MixedModels.jl/issues/501
[#507]: https://github.com/JuliaStats/MixedModels.jl/issues/507
[#508]: https://github.com/JuliaStats/MixedModels.jl/issues/508
[#510]: https://github.com/JuliaStats/MixedModels.jl/issues/510
[#518]: https://github.com/JuliaStats/MixedModels.jl/issues/518
4 changes: 2 additions & 2 deletions Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name = "MixedModels"
uuid = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
author = ["Phillip Alday <[email protected]>", "Douglas Bates <[email protected]>", "Jose Bayoan Santiago Calderon <[email protected]>"]
version = "3.6.0"
version = "3.7.0"

[deps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
Expand All @@ -28,7 +28,7 @@ Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
Arrow = "1"
BlockArrays = "0.11, 0.12, 0.13"
DataAPI = "1"
Distributions = "0.21, 0.22, 0.23, 0.24"
Distributions = "0.21, 0.22, 0.23, 0.24, 0.25"
GLM = "1"
LazyArtifacts = "1"
NLopt = "0.5, 0.6"
Expand Down
1 change: 0 additions & 1 deletion docs/Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
[deps]
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
DisplayAs = "0b91fe84-8a4c-11e9-3e1d-67c38462b6d6"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FreqTables = "da1fdf0e-e0ff-5433-a45f-9bb5ff651cb1"
Expand Down
38 changes: 23 additions & 15 deletions docs/src/bootstrap.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,6 @@ using DisplayAs

```@example Main
using DataFrames
using DataFramesMeta # dplyr-like operations
using Gadfly # plotting package
using MixedModels
using Random
Expand All @@ -54,21 +53,26 @@ first(df, 10)
Especially for those with a background in [`R`](https://www.R-project.org/) or [`pandas`](https://pandas.pydata.org),
the simplest way of accessing the parameter estimates in the parametric bootstrap object is to create a `DataFrame` from the `allpars` property as shown above.

The [`DataFramesMeta`](https://github.com/JuliaData/DataFramesMeta.jl) package provides macros for extracting rows or columns of a dataframe.
We can use `filter` to filter out relevant rows of a dataframe.
A density plot of the estimates of `σ`, the residual standard deviation, can be created as
```@example Main
σres = @where(df, :type .== "σ", :group .== "residual").value
plot(x = σres, Geom.density, Guide.xlabel("Parametric bootstrap estimates of σ"))
σres = filter(df) do row # create a thunk that operates on rows
row.type == "σ" && row.group == "residual" # our filtering rule
end
plot(x = σres.value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of σ"))
```
For the estimates of the intercept parameter, the `getproperty` extractor must be used
```@example Main
plot(@where(df, :type .== "β"), x = :value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of β₁"))
plot(filter(:type => ==("β"), df), x = :value, Geom.density, Guide.xlabel("Parametric bootstrap estimates of β₁"))
```

A density plot of the estimates of the standard deviation of the random effects is obtained as
```@example Main
σbatch = @where(df, :type .== "σ", :group .== "batch").value
plot(x = σbatch, Geom.density,
σbatch = filter(df) do row # create a thunk that operates on rows
row.type == "σ" && row.group == "batch" # our filtering rule
end
plot(x = σbatch.value, Geom.density,
Guide.xlabel("Parametric bootstrap estimates of σ₁"))
```

Expand All @@ -77,7 +81,7 @@ Although this mode appears to be diffuse, this is an artifact of the way that de
In fact, it is a pulse, as can be seen from a histogram.

```@example Main
plot(x = σbatch, Geom.histogram,
plot(x = σbatch.value, Geom.histogram,
Guide.xlabel("Parametric bootstrap estimates of σ₁"))
```

Expand Down Expand Up @@ -126,24 +130,28 @@ DataFrame(shortestcovint(samp2))

A histogram of the estimated correlations from the bootstrap sample has a spike at `+1`.
```@example Main
ρs = @where(df2, :type .== "ρ", :group .== "subj").value
plot(x = ρs, Geom.histogram,
ρs = filter(df2) do row
row.type == "ρ" && row.group == "subj"
end
plot(x = ρs.value, Geom.histogram,
Guide.xlabel("Parametric bootstrap samples of correlation of random effects"))
```
or, as a count,
```@example Main
sum(ρs .≈ 1)
count(ρs.value .≈ 1)
```

Close examination of the histogram shows a few values of `-1`.
```@example Main
sum(ρs .≈ -1)
count(ρs.value .≈ -1)
```

Furthermore there are even a few cases where the estimate of the standard deviation of the random effect for the intercept is zero.
```@example Main
σs = @where(df2, :type .== "σ", :group .== "subj", :names .== "(Intercept)").value
sum(σs .≈ 0)
σs = filter(df2) do row
row.type == "σ" && row.group == "subj" && row.names == "(Intercept)"
end
count(σs.value .≈ 0)
```

There is a general condition to check for singularity of an estimated covariance matrix or matrices in a bootstrap sample.
Expand All @@ -154,5 +162,5 @@ The `issingular` method for a `MixedModel` object that tests if a parameter vect

This operation is encapsulated in a method for the `issingular` function.
```@example Main
sum(issingular(samp2))
count(issingular(samp2))
```
2 changes: 2 additions & 0 deletions src/MixedModels.jl
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,7 @@ export @formula,
coeftable,
cond,
condVar,
condVartables,
describeblocks,
deviance,
dispersion,
Expand Down Expand Up @@ -116,6 +117,7 @@ export @formula,
setθ!,
simulate!,
sparse,
sparseL,
std,
stderror,
updateL!,
Expand Down
6 changes: 3 additions & 3 deletions src/bootstrap.jl
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ bar is automatically disabled for non-interactive (i.e. logging) contexts.
Julia- and BLAS-level threads in the future.
!!! warning
The PRNG shared between threads is locked using [`Threads.SpinLock`](@ref), which
The PRNG shared between threads is locked using `Threads.SpinLock`, which
should not be used recursively. Do not wrap `parametricbootstrap` in an outer `SpinLock`.
"""
function parametricbootstrap(
Expand Down Expand Up @@ -115,7 +115,7 @@ function parametricbootstrap(
samp,
deepcopy(morig.λ),
getfield.(morig.reterms, :inds),
copy(morig.optsum.lowerbd),
morig.optsum.lowerbd[1:length(first(samp).θ)],
NamedTuple{Symbol.(fnames(morig))}(map(t -> (t.cnames...,), morig.reterms)),
)
end
Expand Down Expand Up @@ -257,7 +257,7 @@ end
"""
shortestcovint(bsamp::MixedModelBootstrap, level = 0.95)
Return the shortest interval containing `level` proportion for each parameter from [`bsamp.allpars`](@ref)
Return the shortest interval containing `level` proportion for each parameter from `bsamp.allpars`
"""
function shortestcovint(bsamp::MixedModelBootstrap{T}, level = 0.95) where {T}
allpars = bsamp.allpars
Expand Down
117 changes: 109 additions & 8 deletions src/linearmixedmodel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -290,15 +290,48 @@ diagonal blocks from the conditional variance-covariance matrix,
s² Λ(Λ'Z'ZΛ + I)⁻¹Λ'
"""
function condVar(m::LinearMixedModel{T}) where {T}
retrms = m.reterms
t1 = first(retrms)
L11 = getblock(m.L, 1, 1)
if !isone(length(retrms)) || !isa(L11, Diagonal{T,Vector{T}})
throw(ArgumentError("code for multiple or vector-valued r.e. not yet written"))
L = m.L
s = sdest(m)
@static if VERSION < v"1.6.1"
spL = LowerTriangular(SparseMatrixCSC{T, Int}(sparseL(m)))
else
spL = LowerTriangular(sparseL(m))
end
nre = size(spL, 1)
val = Array{T,3}[]
offset = 0
for (i, re) in enumerate(m.reterms)
λt = s * transpose(re.λ)
vi = size(λt, 2)
ℓi = length(re.levels)
vali = Array{T}(undef, (vi, vi, ℓi))
scratch = Matrix{T}(undef, (size(spL, 1), vi))
for b in 1:ℓi
fill!(scratch, zero(T))
copyto!(view(scratch, (offset + (b - 1) * vi) .+ (1:vi), :), λt)
ldiv!(spL, scratch)
mul!(view(vali, :, :, b), scratch', scratch)
end
push!(val, vali)
offset += vi * ℓi
end
ll = first(t1.λ)
Ld = L11.diag
Array{T,3}[reshape(abs2.(ll ./ Ld) .* varest(m), (1, 1, length(Ld)))]
val
end

function _cvtbl(arr::Array{T,3}, trm) where {T}
merge(
NamedTuple{(fname(trm),)}((trm.levels,)),
columntable([NamedTuple{(:σ, :ρ)}(sdcorr(view(arr, :, :, i))) for i in axes(arr, 3)]),
)
end

"""
condVartables(m::LinearMixedModel)
Return the conditional covariance matrices of the random effects as a `NamedTuple` of columntables
"""
function condVartables(m::MixedModel{T}) where {T}
NamedTuple{fnames(m)}((map(_cvtbl, condVar(m), m.reterms)...,))
end

function createAL(allterms::Vector{Union{AbstractReMat{T},FeMat{T}}}) where {T}
Expand Down Expand Up @@ -848,6 +881,74 @@ end

Base.show(io::IO, m::LinearMixedModel) = Base.show(io, MIME"text/plain"(), m)

"""
_coord(A::AbstractMatrix)
Return the positions and values of the nonzeros in `A` as a
`NamedTuple{(:i, :j, :v), Tuple{Vector{Int32}, Vector{Int32}, Vector{Float64}}}`
"""
function _coord(A::Diagonal)
(i = Int32.(axes(A,1)), j = Int32.(axes(A,2)), v = A.diag)
end

function _coord(A::UniformBlockDiagonal)
dat = A.data
r, c, k = size(dat)
blk = repeat(r .* (0:k-1), inner=r*c)
(
i = Int32.(repeat(1:r, outer=c*k) .+ blk),
j = Int32.(repeat(1:c, inner=r, outer=k) .+ blk),
v = vec(dat)
)
end

function _coord(A::SparseMatrixCSC{T,Int32}) where {T}
rv = rowvals(A)
cv = similar(rv)
for j in axes(A, 2), k in nzrange(A, j)
cv[k] = j
end
(i = rv, j = cv, v = nonzeros(A), )
end

function _coord(A::Matrix)
m, n = size(A)
(
i = Int32.(repeat(axes(A, 1), outer=n)),
j = Int32.(repeat(axes(A, 2), inner=m)),
v = vec(A),
)
end

"""
sparseL(m::LinearMixedModel{T}; full::Bool=false) where {T}
Return the lower Cholesky factor `L` as a `SparseMatrix{T,Int32}`.
`full` indicates whether the parts of `L` associated with the fixed-effects and response
are to be included.
"""
function sparseL(m::LinearMixedModel{T}; full::Bool=false) where {T}
L, reterms = m.L, m.reterms
nt = length(reterms) + full
rowoffset, coloffset = 0, 0
val = (i = Int32[], j = Int32[], v = T[])
for i in 1:nt, j in 1:i
Lblk = L[Block(i, j)]
cblk = _coord(Lblk)
append!(val.i, cblk.i .+ Int32(rowoffset))
append!(val.j, cblk.j .+ Int32(coloffset))
append!(val.v, cblk.v)
if i == j
coloffset = 0
rowoffset += size(Lblk, 1)
else
coloffset += size(Lblk, 2)
end
end
dropzeros!(tril!(sparse(val...,)))
end


#=
"""
Expand Down
35 changes: 34 additions & 1 deletion src/utilities.jl
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ end
Return the average of `a` and `b`
"""
average(a::T, b::T) where {T<:AbstractFloat} = (a + b) / 2
average(a::T, b::T) where {T<:AbstractFloat} = (a + b) / T(2)

"""
cpad(s::AbstractString, n::Integer)
Expand Down Expand Up @@ -285,3 +285,36 @@ function Base.show(io::IO, pca::PCA;

nothing
end

"""
sdcorr(A::AbstractMatrix{T}) where {T}
Transform a square matrix `A` with positive diagonals into an `NTuple{size(A,1), T}` of
standard deviations and a tuple of correlations.
`A` is assumed to be symmetric and only the lower triangle is used. The order of the
correlations is row-major ordering of the lower triangle (or, equivalently, column-major
in the upper triangle).
"""
function sdcorr(A::AbstractMatrix{T}) where {T}
m,n = size(A)
m == n || throw(ArgumentError("matrix A must be square"))
indpairs = checkindprsk(m)
rtdiag = sqrt.(NTuple{m,T}(diag(A)))
(
rtdiag,
ntuple(kchoose2(m)) do k
i,j = indpairs[k]
A[i,j]/(rtdiag[i] * rtdiag[j])
end,
)
end

"""
kchoose2(k)
The binomial coefficient `k` choose `2` which is the number of elements
in the packed form of the strict lower triangle of a matrix.
"""
function kchoose2(k) # will be inlined
(k * (k - 1)) >> 1
end
Loading

2 comments on commit 7272052

@palday
Copy link
Member Author

@palday palday commented on 7272052 May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/36465

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v3.7.0 -m "<description of version>" 7272052c8438932ce31d330213eb353e9fff0b8a
git push origin v3.7.0

Please sign in to comment.