FORWARDPORT: Prettier Markdown (and thus better Documenter Support) (#478)

* Prettier Markdown (and thus better Documenter Support) (#476)

* REPL-like output for our objects

* describe show functionality

* model and var corr now as pretty markdown

* use Markdown stdlib for LRT

* nicer optsum markdown

* remove dead comments

* fix test

* fix doc build

* more use of Markdown stdlib

* append! version compat

* example of raw markdown

* The Zen of Alex

* woops

* fix exponent alignment

* update documenter ci

* remove unused option

* one more attempt at correct rendering

* maybe it's 1.6?

* The Zen Continues

Co-authored-by: Alex Arslan <[email protected]>

* slightly more complicated example (and bugfix!)

* patch bump

Co-authored-by: Alex Arslan <[email protected]>
(cherry picked from commit c7ff101)

* small edits

* Update allowable versions of dependencies

* Remove unnecessary using statement

* contrast coding

* statsmodels for statsmodels example

Co-authored-by: Douglas Bates <[email protected]>
palday and dmbates authored Mar 2, 2021
1 parent 26a84fc commit 8d92489
Showing 14 changed files with 293 additions and 199 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/documenter.yml
@@ -10,9 +10,12 @@ on:
jobs:
Documenter:
name: Documentation
runs-on: ubuntu-latest
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
with:
version: 1.6-nightly
- uses: julia-actions/julia-buildpkg@latest
- uses: julia-actions/julia-docdeploy@latest
env:
3 changes: 2 additions & 1 deletion Project.toml
@@ -9,6 +9,7 @@ DataAPI = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
GLM = "38e38edf-8417-5370-95a0-9cbb8c7f171a"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
NLopt = "76087f3c-5699-56af-9a33-bf431cd00edd"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
PooledArrays = "2dfb63ee-cc39-5dd5-95bd-886bf059d720"
@@ -28,7 +29,7 @@ DataAPI = "1"
Distributions = "0.21, 0.22, 0.23, 0.24"
GLM = "1"
NLopt = "0.5, 0.6"
PooledArrays = "0.5, 1.0"
PooledArrays = "0.5, 1"
ProgressMeter = "1"
StaticArrays = "0.11, 0.12, 1"
StatsBase = "0.31, 0.32, 0.33"
6 changes: 4 additions & 2 deletions docs/Project.toml
@@ -2,17 +2,19 @@
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
DisplayAs = "0b91fe84-8a4c-11e9-3e1d-67c38462b6d6"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
FreqTables = "da1fdf0e-e0ff-5433-a45f-9bb5ff651cb1"
Gadfly = "c91e804a-d5a3-530f-b6f0-dfbca275c004"
MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d"

[compat]
BenchmarkTools = "0.5"
DataFrames = "0.21"
Documenter = "0.25"
DataFrames = "0.22"
Documenter = "0.26"
FreqTables = "0.4"
Gadfly = "1"
StatsBase = "0.33"
1 change: 1 addition & 0 deletions docs/make.jl
@@ -13,6 +13,7 @@ makedocs(
"GaussHermite.md",
"bootstrap.md",
"rankdeficiency.md",
"mime.md",
# "SimpleLMM.md",
# "MultipleTerms.md",
# "SingularCovariance.md",
4 changes: 4 additions & 0 deletions docs/src/GaussHermite.md
@@ -29,6 +29,9 @@ More formally, a `k`th order rule is exact when `f` is a `k-1` order polynomial.

In the [*Golub-Welsch algorithm*](https://en.wikipedia.org/wiki/Gaussian_quadrature#The_Golub-Welsch_algorithm) the abscissae for a particular Gaussian quadrature rule are determined as the eigenvalues of a symmetric tri-diagonal matrix and the weights are derived from the squares of the first row of the matrix of eigenvectors.
For a `k`th order normalized Gauss-Hermite rule the tridiagonal matrix has zeros on the diagonal and the square roots of `1:k-1` on the super- and sub-diagonal, e.g.
```@setup Main
using DisplayAs
```
```@example Main
using DataFrames, LinearAlgebra, Gadfly
sym3 = SymTridiagonal(zeros(3), sqrt.(1:2))
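# A hedged sketch of the Golub-Welsch step described above (the names `ev`,
# `abscissae`, and `weights` are illustrative additions, not part of the original
# example): the eigenvalues of `sym3` are the abscissae of the 3-point rule, and
# the squared first row of the eigenvector matrix gives the normalized weights.
ev = eigen(sym3)
abscissae = ev.values
weights = abs2.(ev.vectors[1, :])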
@@ -112,6 +115,7 @@ A model with fixed-effects for age, age squared, number of live children and urban
```@example Main
const form1 = @formula use ~ 1 + age + abs2(age) + livch + urban + (1|dist);
m1 = fit(MixedModel, form1, contra, Bernoulli(), fast=true)
DisplayAs.Text(ans) # hide
```

For a model such as `m1`, which has a single, scalar random-effects term, the unscaled conditional density of the spherical random effects variable, $\mathcal{U}$,
6 changes: 6 additions & 0 deletions docs/src/bootstrap.md
@@ -24,6 +24,10 @@ parameter, `θ`, that defines the variance-covariance matrices of the random eff
For example, a simple linear mixed-effects model for the `Dyestuff` data in the [`lme4`](http://github.com/lme4/lme4)
package for [`R`](https://www.r-project.org) is fit by

```@setup Main
using DisplayAs
```

```@example Main
using DataFrames
using DataFramesMeta # dplyr-like operations
@@ -35,6 +39,7 @@ using Random
```@example Main
dyestuff = MixedModels.dataset(:dyestuff)
m1 = fit(MixedModel, @formula(yield ~ 1 + (1 | batch)), dyestuff)
DisplayAs.Text(ans) # hide
```

To bootstrap the model parameters, first initialize a random number generator, then create a bootstrap sample
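A minimal sketch of that step (the seed and replicate count are illustrative; `parametricbootstrap` is the MixedModels function used later on this page):

```julia
using Random

rng = MersenneTwister(1234)                   # illustrative seed
samp = parametricbootstrap(rng, 10_000, m1)   # 10_000 parametric bootstrap replicates of m1
```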
@@ -101,6 +106,7 @@ m2 = fit(
@formula(reaction ~ 1+days+(1+days|subj)),
sleepstudy,
)
DisplayAs.Text(ans) # hide
```
```@example Main
samp2 = parametricbootstrap(rng, 10_000, m2, use_threads=true);
34 changes: 28 additions & 6 deletions docs/src/constructors.md
@@ -18,11 +18,11 @@ MixedModels.dataset

```@setup Main
using Test
using DisplayAs
```

```@example Main
using DataFrames, MixedModels
using StatsBase: describe
using DataFrames, MixedModels, StatsModels
dyestuff = DataFrame(MixedModels.dataset(:dyestuff))
describe(dyestuff)
```
@@ -37,7 +37,7 @@ MixedModels.jl builds on the *Julia* formula language provided by [StatsMode
The second way is to combine `Term`s with operators like `+`, `&`, `~`, and others at "run time". This is especially useful if you wish to create a formula from a list of variable names. For instance, the following are equivalent:

```@example Main
@formula(y ~ 1 + a + b + a & b) == term(:y) ~ term(1) + term(:a) + term(:b) + term(:a) & term(:b)
@formula(y ~ 1 + a + b + a & b) == (term(:y) ~ term(1) + term(:a) + term(:b) + term(:a) & term(:b))
```

MixedModels.jl provides additional formula syntax for representing *random-effects terms*. Most importantly, `|` separates random effects and their grouping factors (as in the formula extension used by the *R* package [`lme4`](https://cran.r-project.org/web/packages/lme4/index.html)). Much like with the base formula language, `|` can be used within the `@formula` macro and to construct a formula programmatically:
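A sketch of that run-time construction (the variable names are illustrative, and it relies on the programmatic `|` support for terms just described):

```julia
using MixedModels, StatsModels

# The macro form and a programmatic form built from `term`s combined with
# `+`, `~`, and `|` describe the same model specification.
f_macro = @formula(reaction ~ 1 + days + (1 + days | subj))
f_prog  = term(:reaction) ~ term(1) + term(:days) + (term(1) + term(:days) | term(:subj))
```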
@@ -60,13 +60,14 @@ A basic model with simple, scalar random effects for the levels of `batch` (the
```@example Main
fm = @formula(yield ~ 1 + (1|batch))
fm1 = fit(MixedModel, fm, dyestuff)
DisplayAs.Text(ans) # hide
```

```@setup Main
@testset "fm1" begin
@test deviance(fm1) ≈ 327.32706
@test varest(fm1) ≈ 2451.2500
@test VarCorr(fm1).σρ.batch.σ[1] ≈ 37.260345
@test isapprox(deviance(fm1), 327.32706, atol=1.e-3)
@test isapprox(varest(fm1), 2451.250, atol=1.e-3)
@test isapprox(VarCorr(fm1).σρ.batch.σ[1], 37.260, atol=1.e-3)
end
```

@@ -81,6 +82,7 @@ dyestuff2 = MixedModels.dataset(:dyestuff2)
By default, the model is fit by maximum likelihood. To use the `REML` criterion instead, add the optional named argument `REML=true` to the call to `fit`
```@example Main
fm1reml = fit(MixedModel, fm, dyestuff, REML=true)
DisplayAs.Text(ans) # hide
```
```@setup Main
@testset "fm1reml" begin
@@ -120,6 +122,7 @@ A model with random intercepts and random slopes for each subject, allowing for
```@example Main
sleepstudy = MixedModels.dataset(:sleepstudy)
fm2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy)
DisplayAs.Text(ans) # hide
```
```@setup Main
@testset "fm2" begin
@@ -137,6 +140,7 @@ As every sample is used on every plate these two factors are *crossed*.
```@example Main
penicillin = MixedModels.dataset(:penicillin)
fm3 = fit(MixedModel, @formula(diameter ~ 1 + (1|plate) + (1|sample)), penicillin)
DisplayAs.Text(ans) # hide
```
```@setup Main
@testset "fm3" begin
@@ -155,12 +159,14 @@ describe(pastes)
This can be expressed using the solidus (the "`/`" character) to separate grouping factors, read "`cask` nested within `batch`":
```@example Main
fm4a = fit(MixedModel, @formula(strength ~ 1 + (1|batch/cask)), pastes)
DisplayAs.Text(ans) # hide
```

If the levels of the inner grouping factor are unique across the levels of the outer grouping factor, then this nesting does not need to be expressed explicitly in the model syntax. For example, defining `sample` to be the combination of `batch` and `cask` yields a naming scheme where the nesting is apparent from the data even if not expressed in the formula. (That is, each level of `sample` occurs in conjunction with only one level of `batch`.) As such, this model is equivalent to the previous one.
```@example Main
pastes.sample = (string.(pastes.cask, "&", pastes.batch))
fm4b = fit(MixedModel, @formula(strength ~ 1 + (1|sample) + (1|batch)), pastes)
DisplayAs.Text(ans) # hide
```
```@setup Main
@testset "implicit and explicit nesting" begin
@@ -176,6 +182,7 @@ Additional covariates include the academic department, `dept`, in which the cour
```@example Main
insteval = MixedModels.dataset(:insteval)
fm5 = fit(MixedModel, @formula(y ~ 1 + service * dept + (1|s) + (1|d)), insteval)
DisplayAs.Text(ans) # hide
```
```@setup Main
@testset "fm5" begin
@@ -196,24 +203,28 @@ This often arises when there are many random effects you want to estimate (as is
The special syntax `zerocorr` can be applied to individual random effects terms inside the `@formula`:
```@example Main
fm2zerocorr_fm = fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy)
DisplayAs.Text(ans) # hide
```

Alternatively, correlations between parameters can be removed by including them as separate random effects terms:
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy)
DisplayAs.Text(ans) # hide
```

Finally, for predictors that are categorical, MixedModels.jl will estimate correlations between each level.
Notice the large number of correlation parameters if we treat `days` as a categorical variable by giving it contrasts:
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```

Separating the `1` and `days` random effects into separate terms removes the correlations between the intercept and the levels of `days`, but not between the levels themselves:
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + (days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```
(Notice that the variance component for `days: 1` is estimated as zero, so the correlations for this component are undefined and expressed as `NaN`, not a number.)

@@ -224,6 +235,7 @@ fulldummy
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + (1 + fulldummy(days)|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```
This model produces a better fit as measured by the objective (negative twice the log-likelihood is 1610.8), but at the expense of adding many more parameters.
As a result, model comparison criteria, such as `AIC` and `BIC`, are inflated.
@@ -232,14 +244,17 @@ But using `zerocorr` on the individual terms does remove the correlations betwee
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj) + zerocorr(days|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```
```@example Main
fit(MixedModel, @formula(reaction ~ 1 + days + zerocorr(1 + fulldummy(days)|subj)), sleepstudy,
contrasts = Dict(:days => DummyCoding()))
DisplayAs.Text(ans) # hide
```

## Fitting generalized linear mixed models
@@ -254,6 +269,7 @@ the distribution family for the response, and possibly the link function, must b
verbagg = MixedModels.dataset(:verbagg)
verbaggform = @formula(r2 ~ 1 + anger + gender + btype + situ + mode + (1|subj) + (1|item));
gm1 = fit(MixedModel, verbaggform, verbagg, Bernoulli())
DisplayAs.Text(ans) # hide
```

The canonical link, which is `LogitLink` for the `Bernoulli` distribution, is used if no explicit link is specified.
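To use a non-canonical link, it can be passed after the distribution; a hedged sketch (the probit choice here is purely illustrative):

```julia
using GLM: ProbitLink   # link types are defined in GLM.jl

# Same Bernoulli model as gm1, but with an explicitly chosen probit link
# rather than the canonical logit link.
gm1_probit = fit(MixedModel, verbaggform, verbagg, Bernoulli(), ProbitLink())
```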
@@ -422,6 +438,7 @@ coeftable
```
```@example Main
coeftable(fm2)
DisplayAs.Text(ans) # hide
```

## Covariance parameter estimates
@@ -432,9 +449,11 @@ VarCorr
```
```@example Main
VarCorr(fm2)
DisplayAs.Text(ans) # hide
```
```@example Main
VarCorr(gm1)
DisplayAs.Text(ans) # hide
```

Individual components are returned by other extractors
@@ -533,6 +552,7 @@ sum(leverage(fm2))
When a model converges to a singular covariance, such as
```@example Main
fm3 = fit(MixedModel, @formula(yield ~ 1+(1|batch)), MixedModels.dataset(:dyestuff2))
DisplayAs.Text(ans) # hide
```
the effective degrees of freedom is the lower bound.
```@example Main
@@ -543,6 +563,7 @@ Models for which the estimates of the variances of the random effects are large
```@example Main
fm4 = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
MixedModels.dataset(:penicillin))
DisplayAs.Text(ans) # hide
```
```@example Main
sum(leverage(fm4))
@@ -552,6 +573,7 @@ Also, a model fit by the REML criterion generally has larger estimates of the va
```@example Main
fm4r = fit(MixedModel, @formula(diameter ~ 1+(1|plate)+(1|sample)),
MixedModels.dataset(:penicillin), REML=true)
DisplayAs.Text(ans) # hide
```
```@example Main
sum(leverage(fm4r))
1 change: 1 addition & 0 deletions docs/src/index.md
@@ -14,6 +14,7 @@ Pages = [
"GaussHermite.md",
"bootstrap.md",
"rankdeficiency.md",
"mime.md",
]
Depth = 2
```
74 changes: 74 additions & 0 deletions docs/src/mime.md
@@ -0,0 +1,74 @@
# Alternative display and output formats

In the documentation, we have presented the output from MixedModels.jl in the same format you will see when working in the REPL.
You may have noticed, however, that output from other packages receives pretty printing.
For example, DataFrames are converted into nice HTML tables.
In MixedModels, we recently (v3.2.0) introduced limited support for such pretty printing.
(For more details on how the print and display system in Julia works, check out [this NextJournal post](https://nextjournal.com/sdanisch/julias-display-system).)

In particular, we have defined Markdown output, i.e. `show` methods, for our types, which can be easily translated into HTML, LaTeX, or even an MS Word document using tools such as [pandoc](https://pandoc.org/).
Packages like `IJulia` and `Documenter` can often detect the presence of these display options and use them automatically.
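A small sketch of how you might check for and capture the Markdown rendering yourself, using only `Base` functions (the model below is illustrative; any fitted model works):

```julia
using MixedModels

# sleepstudy is one of the datasets bundled with MixedModels.jl
m = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days | subj)),
        MixedModels.dataset(:sleepstudy))

showable(MIME("text/markdown"), m)           # is a Markdown `show` method defined?
md = sprint(show, MIME("text/markdown"), m)  # capture it as a String, e.g. for pandoc
```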


```@example Main
using MixedModels
form = @formula(rt_trunc ~ 1 + spkr * prec * load +
(1 + load | item) +
(1 + spkr + prec + load | subj))
contr = Dict(:spkr => EffectsCoding(),
:prec => EffectsCoding(),
:load => EffectsCoding(),
:item => Grouping(),
:subj => Grouping())
kbm = fit(MixedModel, form, MixedModels.dataset(:kb07); contrasts=contr)
```

Note that the display here is more succinct than the standard REPL display:

```@example Main
using DisplayAs
kbm |> DisplayAs.Text
```

This brevity is intentional: we wanted these types to work well with traditional academic publishing constraints on tables.
The summary for a model fit presented in the REPL does not mesh well with being treated as a single table (with columns shared between the random and fixed effects).
In our experience, this leads to difficulties in typesetting the resulting tables.
We nonetheless encourage users to report fit statistics such as the log likelihood or AIC as part of the caption of their table.
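For example, a sketch of extracting those statistics from the fitted model (these extractors come from the StatsBase/StatsAPI interface that MixedModels implements; the exact set may vary by version):

```julia
using StatsBase: aic, bic, loglikelihood

# Quantities that fit naturally in a table caption rather than in the table itself.
(loglik = loglikelihood(kbm), AIC = aic(kbm), BIC = bic(kbm))
```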
If the correlation parameters in the random effects are of interest, then [`VarCorr`](@ref) can also be pretty printed:

```@example Main
VarCorr(kbm)
```

Similarly for [`BlockDescription`](@ref), `OptSummary` and `MixedModels.likelihoodratiotest`:

```@example Main
BlockDescription(kbm)
```

```@example Main
kbm.optsum
```

```@example Main
m0 = fit(MixedModel, @formula(reaction ~ 1 + (1|subj)), MixedModels.dataset(:sleepstudy))
m1 = fit(MixedModel, @formula(reaction ~ 1 + days + (1+days|subj)), MixedModels.dataset(:sleepstudy))
MixedModels.likelihoodratiotest(m0,m1)
```

To explicitly invoke this behavior, we must specify the right `show` method:
```julia
show(MIME("text/markdown"), m1)
```
```@example Main
println(sprint(show, MIME("text/markdown"), kbm)) # hide
```
(The raw, unrendered output is intentionally shown here.)

In the future, we may directly support HTML and LaTeX as MIME types.

This output can also be written directly to file:

```julia
show(open("model.md", "w"), MIME("text/markdown"), kbm)
```
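An equivalent variant with a `do` block, which closes the file handle automatically (the filename is illustrative):

```julia
open("model.md", "w") do io
    show(io, MIME("text/markdown"), kbm)
end
```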