attempt rescaling of poorly scaled problems #615
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main     #615      +/-   ##
==========================================
- Coverage   94.19%   93.94%   -0.25%
==========================================
  Files          29       29
  Lines        2757     2594     -163
==========================================
- Hits         2597     2437     -160
+ Misses        160      157       -3
Continue to review full report at Codecov.
Is it enough to change the
@dmbates The block it fails at is:

4×4 Matrix{Float64}:
 -2.09724e7   34383.4    29916.1   -78929.7
 -34369.8     -1334.4   -10224.6       -1.22374e5
 -29893.4     10223.1    -3655.35  -163610.0
  78803.2     -5724.27   -3884.24        2.10496e6

But I hadn't pursued it beyond that.
The initial step is just an empty vector -- we let NLopt do the hard work of figuring that out. I have given some thought to adjusting the initial step for GLMMs when we're initializing from a prior LMM/GLM, but I haven't really had a chance to experiment with what would be useful.
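To make the discussion concrete, here is a minimal, hedged sketch of supplying a nonempty initial step through a fitted model's `optsum`; the `initial_step` field does exist on `OptSummary`, but the particular value used here is purely illustrative, not a recommendation.

```julia
using MixedModels

# Illustrative only: seed an explicit first step instead of the empty vector
# that lets NLopt pick its own default step size.
m = LinearMixedModel(@formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff))
m.optsum.initial_step = [0.1]  # one entry per optimized parameter; 0.1 is a hypothetical value
fit!(m)
```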
Looks good. Thanks for doing this.
Is it possible to construct a test that will reliably trigger this condition?
@dmbates I came up with a test based on an error we occasionally get in shrinkage plots -- there's a limit to how much you can re-inflate data that requires some amount of shrinkage to fit. 😄 I also added a secondary check that returns
@dmbates Well, having tested this with OpenBLAS and MKL on x86-64 and OpenBLAS on Apple silicon, I've found one example that works for now (using
@palday This looks ready to go from my point of view.
@palday We probably should document in the code the reason for setting the value to
While we are looking at this section of code, do we still need what is now line 472?

push!(fitlog, (copy(optsum.initial), optsum.finitial))

This is the line that results in the initial parameter value and objective being recorded twice when

Could this line just be removed, or is it there to avoid fitlog being an empty vector?
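One possible shape for the change under discussion, sketched purely for illustration; whether the line should be guarded or removed outright is exactly the open question, and `thin` here is the same keyword argument shown in the REPL session later in the thread.

```julia
# Hypothetical guard, not the code in the PR: skip the explicit seed entry when
# thin == 1, since logging every evaluation already records the very first
# objective evaluation, which is what produces the duplicated initial entry.
if thin != 1
    push!(fitlog, (copy(optsum.initial), optsum.finitial))
end
```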
@dmbates At first I thought that the value shouldn't be recorded twice because the line before that is

The long-term idea was to unconditionally store the first and last steps in
@dmbates I just double-checked and the first entry isn't repeated for me with the current code.
@palday I think NLopt has to evaluate the objective at the initial parameter values because we only pass it the parameters, not the objective value. If we want to have fitlog contain the initial and final parameter and objective values and have the
if you set

julia> ds = MixedModels.dataset(:dyestuff)
Arrow.Table with 30 rows, 2 columns, and schema:
 :batch  String
 :yield  Int16

julia> contrasts = Dict{Symbol, AbstractContrasts}(:batch => Grouping())
Dict{Symbol, AbstractContrasts} with 1 entry:
  :batch => Grouping()

julia> thin = 1
1

julia> dsm01 = let
           form = @formula yield ~ 1 + (1|batch)
           fit(MixedModel, form, ds; contrasts, thin)
       end
Linear mixed model fit by maximum likelihood
 yield ~ 1 + (1 | batch)
   logLik   -2 logLik     AIC       AICc        BIC
 -163.6635   327.3271   333.3271   334.2501   337.5307

Variance components:
            Column    Variance  Std.Dev.
batch    (Intercept)  1388.3332  37.2603
Residual              2451.2501  49.5101
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
────────────────────────────────────────────────
                Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────
(Intercept)   1527.5     17.6946  86.33    <1e-99
────────────────────────────────────────────────

julia> first(dsm01.optsum.fitlog, 3)
3-element Vector{Tuple{Vector{Float64}, Float64}}:
 ([1.0], 327.76702162461663)
 ([1.0], 327.76702162461663)
 ([1.75], 331.03619322245146)
@dmbates Huh, that's weird. I think our tests don't catch that because they're doing
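For what it's worth, a hedged sketch of a regression test for the duplicated first entry; `dsm01` is the model from the session above, and both the check and its placement are illustrative rather than the project's actual test code.

```julia
using Test

# Illustrative check: with the duplicate push removed, the first two fitlog
# entries should no longer be identical.
fitlog = dsm01.optsum.fitlog
@test length(fitlog) < 2 || first(fitlog) != fitlog[2]
```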
Closes #611
This should help address some problematic models. The basic idea is to rescale the initial identity scaling by dividing by a large number, where "large" is the product of the largest weight and the largest response value. In other words, for poorly constrained problems, we change the initial guess to already have a good amount of shrinkage and see whether that helps.
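A minimal sketch of that rescaling idea, written out only for illustration; `rescale_initial` and its arguments are hypothetical names, not the function added in this PR.

```julia
# Hypothetical illustration of the rescaling described above: divide the
# identity-like initial values by "largest weight times largest response value"
# so the optimizer starts from a heavily shrunken initial guess.
function rescale_initial(initial::AbstractVector, wts::AbstractVector, y::AbstractVector)
    largest_weight = isempty(wts) ? 1.0 : maximum(wts)  # unweighted models fall back to 1
    scale = largest_weight * maximum(y)                  # "large" per the description above
    return initial ./ scale
end

# usage sketch with made-up numbers: rescale_initial([1.0], Float64[], [1527.5, 1640.0])
```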
I tried more naive ways of rescaling the weights or even standardizing all variables in the model, but that didn't help. @dmbates, if you're happy with this solution, I can add tests and update NEWS, etc., but I have rather mixed feelings about it myself.
Run docs/NEWS-update.jl to update the cross-references.