
ENH: add optim_kwds_prelim to count models in discrete_model #3928

Merged: 9 commits, Sep 25, 2017

Conversation

@josef-pkt (Member) commented Sep 23, 2017

see #3925 and #3578

Changes and additions implemented (a usage sketch follows the list):

  • new fit option in kwargs, optim_kwds_prelim, for all count models in discrete_model
  • Poisson.fit uses the null-model params as start_params
  • extra params in GPP and NBP use a moment estimator based on the prelim fit as start_params
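
A minimal usage sketch of the new option (the data and the NegativeBinomialP choice are illustrative; only the optim_kwds_prelim keyword and the mle_settings entries come from this PR and its unit tests):

    import numpy as np
    from statsmodels.discrete.discrete_model import NegativeBinomialP

    np.random.seed(123)
    nobs = 500
    exog = np.column_stack((np.ones(nobs), np.random.uniform(-1, 1, size=nobs)))
    # Poisson-generated counts, so the true dispersion is zero
    y = np.random.poisson(np.exp(exog.dot([1.0, 0.5])))

    model = NegativeBinomialP(y, exog, p=2)
    # keywords in optim_kwds_prelim are forwarded to the preliminary (Poisson)
    # fit whose estimate seeds the start_params of the full model
    res = model.fit(disp=0, maxiter=200,
                    optim_kwds_prelim=dict(method='nm', maxiter=500, disp=0))
    print(res.mle_settings['start_params'])
    print(res.mle_settings['optim_kwds_prelim'])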

Compared to master this changes one propagated fit method (in GPP, whose preliminary fit previously used the same method as the final model).
Some adjustments in unit tests, mainly to avoid problems with bfgs in older scipy.

Missing: improved start_params for zero-inflated models.

Aside, because it showed up in a test case:
if the start_params are changed as in this PR, then it is possible that the estimator converges to a different result within the specified convergence tolerance.
We can only replicate or compare results subject to the tolerance implied by the convergence criteria.

A remaining issue in unit tests, see the comment below
#3928 (comment)
Estimating zero-inflation has convergence problems when there is no zero-inflation in the data.
The unit test for ZINBP2 now "cheats" by using rounded params as start_params, as sketched below.
(Note there are also comments about this case in the ZI PRs, AFAIR.)
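
A hypothetical sketch of that "cheat"; params_reference stands in for whatever known-good estimate the test rounds, and the rounding precision is illustrative:

    import numpy as np

    # seed the optimizer with rounded reference parameters so the weakly
    # identified corner case converges reproducibly in the unit test
    start = np.round(params_reference, 3)
    res = model.fit(start_params=start, method='bfgs', maxiter=500, disp=0)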

closes #3578

@josef-pkt (Member, Author)

This will cause some test failures.
Locally I have one failure when running test_discrete; I didn't check the other test modules.

The only change in the default method is in GPP, which previously used the same method for the prelim fit as for the final model.
I will change the test cases to use an appropriate method on a case-by-case basis, if needed, at the end.

Next on the plan: use the null model for Poisson start_params, which might resolve some of the failures.

@josef-pkt (Member, Author) commented Sep 23, 2017

Only two failures, both in tough cases:

I also get the first one, in TestGeneralizedPoisson_underdispersion.test_newton, locally on my computer.
Note that this test actually uses explicit start_params and is only indirectly affected by the current changes here. (I'll add a comment at the top about non-replicability across versions when start_params change.)

Edit: the test failure is in the "nm" score; newton with fixed start_params from nm looks good.

The second one looks like a bfgs failure in older scipy; it is not a failure locally with scipy 0.18.

def test_optim_kwds_prelim():
    ...
        model = GeneralizedPoisson(y, exog, offset=offset, p=1.5)
        res = model.fit(disp=0, maxiter=200, optim_kwds_prelim=optim_kwds_prelim)

        assert_allclose(res.mle_settings['start_params'][:-1], res_poi.params, rtol=1e-4)
        assert_equal(res.mle_settings['optim_kwds_prelim'], optim_kwds_prelim)
        # rough check that convergence makes sense
>       assert_allclose(res.predict().mean(), y.mean(), rtol=0.1)
E       AssertionError: 
E       Not equal to tolerance rtol=0.1, atol=0
E       
E       (mismatch 100.0%)
E        x: array(50998.88303875233)
E        y: array(1167.7259259259258)

@josef-pkt (Member, Author)

Continuing with TestGeneralizedPoisson_underdispersion.test_newton:

This is a good test case for the next change: use proper start_params for the extra param, alpha.

In [49]: res.model._get_start_params_null()
Out[49]: array([ 0.35065687, -0.21486192])

The final "newton" estimate is

In [50]: res2.params
Out[50]: array([ 1.1136293 , -0.53195689, -0.11228002])

Because we have underdispersion (GPP with alpha < 0), the fixed default start_params value of 0.1 for alpha does not work for newton.
bfgs in scipy 0.18.1 already works, with only one warning in an intermediate computation:
res.model.fit(method='bfgs', optim_kwds_prelim=dict(method='bfgs', disp=1))
with params close to the successful newton run (with better start_params):
array([ 1.11363381, -0.53195835, -0.11228032])
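
A sketch of the implied two-stage workflow, assuming mod is the GeneralizedPoisson model from this test case (the maxiter values are illustrative):

    # stage 1: a robust optimizer gets into the right basin of attraction
    res_prelim = mod.fit(method='bfgs', maxiter=300, disp=0,
                         optim_kwds_prelim=dict(method='bfgs', disp=0))
    # stage 2: polish with newton from the improved start_params
    res = mod.fit(start_params=res_prelim.params, method='newton', disp=0)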

@josef-pkt (Member, Author)

Two new test precision failures that shouldn't have been affected by the test adjustments (commit "TST: adjust tests, precision, and prelim method in GPP"):
precision failures in TestCountDiagnostic.test_count and TestCountDiagnostic.test_probs.

More likely something changed in the travis environment.

(In the test adjustment I forgot to increase maxiter for nm, which I guess is the reason that one of the new tests still fails.)

@josef-pkt (Member, Author)

NBP also fails in the optim_kwds_prelim test with "bfgs" on scipy < 0.18.
It didn't show up before because the GPP test failed first and stopped evaluation of the test function. The test tolerance is smaller than what the convergence tolerance implies.
(It is not clear what triggered the different convergence at higher decimals, but it could be just noise if it depends on how the unit tests are run.)

And here are the 2 failing diagnostics tests.
I just realized that these are regression tests; see above about replicability when we change start_params.

________________________ TestCountDiagnostic.test_count ________________________
self = <statsmodels.discrete.tests.test_diagnostic.TestCountDiagnostic object at 0x7f52dcdc8c10>
    def test_count(self):
        # partially smoke
        tzi1 = dia.test_poisson_zeroinflation(self.res)
    
        tzi2 = dia.test_poisson_zeroinflation_brock(self.res)
        # compare two implementation in special case
        assert_allclose(tzi1[:2], (tzi2[0]**2, tzi2[1]), rtol=1e-5)
    
        tzi3 = dia.test_poisson_zeroinflation(self.res, self.exog)
    
        # regression test
        tzi3_1 = (0.79863597832443878, 0.67077736750318928, 2, 2)
>       assert_allclose(tzi3, tzi3_1)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       (mismatch 50.0%)
E        x: array([ 0.798819,  0.670716,  2.      ,  2.      ])
E        y: array([ 0.798636,  0.670777,  2.      ,  2.      ])
statsmodels/discrete/tests/test_diagnostic.py:58: AssertionError
---------------------------- Captured stdout setup -----------------------------
Optimization terminated successfully.
         Current function value: 1.231798
         Iterations: 6
         Function evaluations: 7
         Gradient evaluations: 7
________________________ TestCountDiagnostic.test_probs ________________________
self = <statsmodels.discrete.tests.test_diagnostic.TestCountDiagnostic object at 0x7f52dcc54290>
    def test_probs(self):
        nobs = self.nobs
        probs = self.res.predict_prob()
        freq = np.bincount(self.endog) / nobs
    
        tzi = dia.test_chisquare_prob(self.res, probs[:, :2])
        # regression numbers
        tzi1 = (0.387770845, 0.5334734738)
>       assert_allclose(tzi[:2], tzi1)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       (mismatch 100.0%)
E        x: array([ 0.387776,  0.533471])
E        y: array([ 0.387771,  0.533473])

@coveralls
coveralls commented Sep 24, 2017

Coverage Status

Coverage increased (+0.009%) to 81.553% when pulling f024c33 on josef-pkt:discrete_fit_start into 687d807 on statsmodels:master.

@josef-pkt (Member, Author)

The NBP test in test_optim_kwds_prelim still fails on older scipy.
I'm still using bfgs for the final model, and it doesn't work with the "nm" prelim estimator.

@josef-pkt (Member, Author)

After using the alpha prelim estimate:
5 new failures in zero-inflated models, but the test_optim_kwds_prelim failure seems to have gone away.
(win some, lose some)
https://travis-ci.org/statsmodels/statsmodels/jobs/279207195
The new test failures are all in test_count_model.TestZeroInflatedNegativeBinomialP, because the model didn't converge and the hessian is not invertible.

@josef-pkt (Member, Author)

The ZINBP test case is another nasty corner case: there is no zero-inflation in the data when negbin is used.
So the prediction of the inflation model is essentially zero probability of inflation, i.e. probability 1 for the main model.
Parameters close to the boundary are not well identified.
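
A sketch of a quick diagnostic for this situation; predict(which='prob-main') is the call used in the debugging session below, while the threshold is illustrative:

    # if the average main-model probability is ~1, the inflation part sits at
    # the boundary and its parameters are weakly identified
    prob_main = res.predict(which='prob-main').mean()
    if prob_main > 1 - 1e-6:
        print("no zero-inflation detected: inflation params near boundary")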

M:\..\tmp>nosetests -v --pdb --pdb-failures "M:\...\statsmodels\discrete\tests\test_count_model.py:TestZeroInflatedNegativeBinomialP.test_minimize"
M:\...\statsmodels\base\model.py:511: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
M:\...\statsmodels\base\model.py:488: HessianInversionWarning: Inverting hessian failed, no bse or cov_params available
  'available', HessianInversionWarning)
statsmodels.discrete.tests.test_count_model.TestZeroInflatedNegativeBinomialP.test_minimize ...
> c:\...\winpython-64bit-3.4.4.5qt5\python-3.4.4.amd64\lib\site-packages\numpy\testing\utils.py(739)assert_array_compare()
-> raise AssertionError(msg)

(Pdb) x
array([  1.86303811, -10.1851057 ,  -0.20491449,   1.13812905,   1.34412812])
(Pdb) y
array([  1.883859, -10.280888,  -0.204769,   1.137985,   1.344457])
(Pdb) u
> c:\...\winpython-64bit-3.4.4.5qt5\python-3.4.4.amd64\lib\site-packages\numpy\testing\utils.py(1392)assert_allclose()
-> verbose=verbose, header=header)
(Pdb) u
> m:\...\statsmodels\discrete\tests\test_count_model.py(395)test_minimize()
-> atol=1e-3, rtol=3e-3)
(Pdb) self.res1.predict(which='prob-main').mean()
0.999999999999998

bfgs results in different inflation parameters but has

(Pdb) self.res1.predict(which='prob-main').mean()
0.99999995516952411

'nm' is similar (but I don't have the number anymore on the shell screen).

I'm not getting into fixing zero-inflation corner solutions for now, so I will cheat in this test.
Relevant for this PR: the start_params for the main model look good, relatively close to the final estimate. They are obtained from the preliminary estimator that is being improved in this PR.

The inflation start_params are fixed at zeros, and I won't change them now.

And even when we improve the convergence behavior, we will need to work with an "identified" model, not this dataset.

@josef-pkt (Member, Author) commented Sep 24, 2017

On travis-ci, test_optim_kwds_prelim() for NBP still fails, but now on scipy <= 0.16 (I didn't check those logs before).
On appveyor there was no failure, not even with scipy 0.14.0.

However, the assert is for start_params, not for the final estimate of params:
assert_allclose(res.mle_settings['start_params'][:-1], res_poi.params, rtol=1e-4)

Maybe the test logic is incorrect if different methods don't converge to the same params.
The problem is that in these older scipy versions, bfgs just cannot handle Poisson/exp cases. I.e. the problem is still in the Poisson (prelim) estimator and not in NBP itself. A workaround is sketched below.
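
A sketch of that workaround, swapping only the preliminary optimizer to "nm" while keeping bfgs for the final fit (the keyword and the mle_settings entry are the ones exercised in the tests above; maxiter is illustrative):

    # derivative-free preliminary fit for platforms where old-scipy bfgs
    # struggles with the Poisson/exp mean function
    res = model.fit(method='bfgs', disp=0,
                    optim_kwds_prelim=dict(method='nm', maxiter=1000, disp=0))
    print(res.mle_settings['optim_kwds_prelim'])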

@coveralls
coveralls commented Sep 24, 2017

Coverage Status

Coverage increased (+0.01%) to 81.554% when pulling df8c9c2 on josef-pkt:discrete_fit_start into 687d807 on statsmodels:master.

@coveralls

Coverage Status

Coverage increased (+0.01%) to 81.554% when pulling 5ad60c6 on josef-pkt:discrete_fit_start into 687d807 on statsmodels:master.


@codecov-io

Codecov Report

Merging #3928 into master will increase coverage by <.01%.
The diff coverage is 93.84%.


@@            Coverage Diff             @@
##           master    #3928      +/-   ##
==========================================
+ Coverage   78.97%   78.98%   +<.01%     
==========================================
  Files         541      541              
  Lines       80818    80872      +54     
  Branches     9185     9188       +3     
==========================================
+ Hits        63829    63879      +50     
- Misses      14915    14917       +2     
- Partials     2074     2076       +2
Impacted Files                                    Coverage          Δ
statsmodels/discrete/tests/test_count_model.py    100%   <100%>     (ø) ⬆️
statsmodels/discrete/tests/test_discrete.py       98.36% <100%>     (+0.03%) ⬆️
statsmodels/discrete/tests/test_diagnostic.py     95.23% <100%>     (ø) ⬆️
statsmodels/discrete/discrete_model.py            87.59% <87.5%>    (-0.08%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 687d807...5ad60c6.

@josef-pkt (Member, Author) commented Sep 24, 2017

Finally everything is green.

Still to come: add alpha start_params also for NegativeBinomial
(and, if I'm patient enough, a few more unit tests).

After I make the change, (only) two tests fail, TestNegativeBinomialNBP1Null.test_start_null and TestNegativeBinomialNBP2Null.test_start_null, because the alpha start_params for the null model are not the same anymore.
Note the failures are in the "P" version.
The unit test compares null start_params between NB and NBP; something is different between the two NB implementations.
My mistake: NBP._estimate_dispersion used the mean (division by nobs) and not df_resid.
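
A sketch of the moment estimator in question, under the usual variance assumptions (var = mu + a*mu for NB1, var = mu + a*mu**2 for NB2); the function name and signature are simplified stand-ins, not the actual private method:

    import numpy as np

    def estimate_dispersion(mu, resid, df_resid, p=2):
        # method of moments: solve E[(y - mu)**2] = mu + a * mu**p for a,
        # summing over observations and dividing by df_resid (not nobs)
        if p == 2:   # NB2
            return ((resid**2 / mu - 1) / mu).sum() / df_resid
        else:        # NB1
            return (resid**2 / mu - 1).sum() / df_resid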

@josef-pkt (Member, Author)

Another test failure, this time in 3 TestNegbinClu test classes that all use bfgs (the default) for the same model.
It reports convergence after just 18 function evaluations; bfgs uses gtol as its convergence criterion.
The bfgs internal gradient is just below gtol=1e-5. However, when I calculate it at results.params, I get rtol around 1e-4.
Reducing gtol to 1e-7 resolves the test failures on my computer, as sketched below.
I.e. bfgs stops too early when it is given different start_params that are already good.
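
A sketch of that fix; in statsmodels, extra fit keywords such as gtol are forwarded to the scipy optimizer (fmin_bfgs here), and 1e-7 is the value quoted above:

    # tighter gradient tolerance so bfgs does not declare convergence
    # prematurely when the start_params are already close to the optimum
    res = model.fit(method='bfgs', gtol=1e-7, maxiter=300, disp=0)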

@coveralls
coveralls commented Sep 24, 2017

Coverage Status

Coverage increased (+0.01%) to 81.554% when pulling f5ef296 on josef-pkt:discrete_fit_start into 687d807 on statsmodels:master.

@josef-pkt (Member, Author)

As green as it gets (one statespace simulate test hung again on appveyor).
I'm finished here; will merge tomorrow.

What I haven't checked yet is how good the moment estimator for alpha is at the final mu and alpha. (But I don't see a use case yet for when we already have the final estimate. For the preliminary estimate it will also be useful as a diagnostic check, and it could be made into a staticmethod or classmethod.)
