
Robust adaption for NUTS #324

Closed · yebai opened this issue Jul 26, 2017 · 16 comments

@yebai (Member) commented Jul 26, 2017

This is an umbrella issue for adaptation issues in the NUTS algorithm.

Before these three

@xukai92 (Member) commented Aug 11, 2017

Notebook with a simple illustration of the adaptation issue: https://github.com/xukai92/TuringDemo/blob/master/look-into-adapt.ipynb
Helper file for generating LDA data: https://github.com/xukai92/TuringDemo/blob/master/video/lda-gen-data.jl

@xukai92 (Member) commented Sep 24, 2017

https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/var_adaptation.hpp shows how Stan uses the Phase II window: the preconditioning matrix, which is estimated on the fly, is only set when a new interval (window) is started.
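
For reference, a minimal Julia sketch of that windowing scheme, assuming a diagonal preconditioner; the names (`WindowedDiagPrecond`, `adapt!`) are made up for illustration and are not Stan's or Turing's actual internals:

```julia
# Hypothetical sketch of Stan's Phase II ("slow") window logic: samples are
# accumulated during a window, and the preconditioning matrix is only
# replaced when the window closes.
mutable struct WindowedDiagPrecond
    n::Int                 # samples seen in the current window
    window::Int            # window length
    s::Vector{Float64}     # running sum of samples
    s2::Vector{Float64}    # running sum of squared samples
    var::Vector{Float64}   # current (frozen) diagonal preconditioner
end

WindowedDiagPrecond(d::Int, window::Int) =
    WindowedDiagPrecond(0, window, zeros(d), zeros(d), ones(d))

function adapt!(p::WindowedDiagPrecond, θ::Vector{Float64})
    p.n += 1
    p.s  .+= θ
    p.s2 .+= θ .^ 2
    if p.n == p.window
        # Window closes: only now is the preconditioner overwritten.
        μ = p.s ./ p.n
        p.var = p.s2 ./ p.n .- μ .^ 2
        # Reset accumulators for the next (typically doubled) window.
        p.n = 0
        fill!(p.s, 0.0); fill!(p.s2, 0.0)
        p.window *= 2
    end
    return p
end
```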

@yebai (Member, Author) commented Feb 26, 2018

> Notebook with a simple illustration of the adaptation issue: https://github.com/xukai92/TuringDemo/blob/master/look-into-adapt.ipynb
> Helper file for generating LDA data: https://github.com/xukai92/TuringDemo/blob/master/video/lda-gen-data.jl

@xukai92 Let's try to isolate the step-size and covariance adaptation issues and solve them one at a time. I suggest we first disable step-size adaptation (e.g. use a small enough fixed step size) and make sure covariance adaptation is robust by using the Welford trick (see #289 (comment)); then we come back to see why step-size adaptation is fragile.
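
For concreteness, a minimal sketch of the Welford trick mentioned above (online mean and variance in a single pass); names such as `WelfordVar` and `add_sample!` are illustrative, not Turing's actual internals:

```julia
# Numerically stable online mean/variance accumulator (Welford's algorithm).
mutable struct WelfordVar
    n::Int
    μ::Vector{Float64}
    M::Vector{Float64}   # sum of squared deviations from the running mean
end

WelfordVar(d::Int) = WelfordVar(0, zeros(d), zeros(d))

function add_sample!(w::WelfordVar, θ::Vector{Float64})
    w.n += 1
    δ = θ .- w.μ
    w.μ .+= δ ./ w.n
    w.M .+= δ .* (θ .- w.μ)   # uses the updated mean, as in Welford's algorithm
    return w
end

# Unbiased sample variance; Stan additionally regularizes this towards the identity.
variance(w::WelfordVar) = w.n > 1 ? w.M ./ (w.n - 1) : ones(length(w.μ))
```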

@xukai92 (Member) commented Feb 26, 2018

I see, that makes sense.

@xukai92 (Member) commented Mar 4, 2018

Note: there is a related issue in DiffEqBayes.jl (SciML/DiffEqBayes.jl#30).

  • Use that as a test case when solving the stability issue.

@yebai (Member, Author) commented Apr 2, 2018

@xukai92 Any update on this issue?

yebai added a commit that referenced this issue Apr 3, 2018
yebai pushed a commit that referenced this issue Apr 4, 2018
* Fix dep log in lad

* Dont send opt res

* Fix VarInfo.show bug

* Fix auto tune

* Change * to .* in leapfrog

* temp fix type

* Disable @suppress_err temporarily

* Fix a dep

* Workable ReverseDiff v0.1 done

* Add RevDiff to REQUIRE

* Fix bug in R-AD

* Fix some bugs

* Fix bugs

* Update test

* ReversedDiff.jl mutable bug fixed

* Any to Real

* update benchmark

* Resolve mem alloc for simplex dist

* Fix bug and improve mem alloc

* Improve implementaion of transformations

* Don't include compile time in benchk

* Resolve slowness caused by use of vi.logp

* Update benchmark files

* Add line to load pickle

* Bugfix with reject

* Using ReverseDiff.jl and unsafe model as default

* Fix bug in test file

* Rename vi.rs to vi.rvs

* Add Naive Bayes model in Turing

* Add NB to travis

* DA works

* Tune init

* Better init

* NB MNIST Stan added

* Improve ad assignment

* Improve ad assignment

* Add Stan SV model

* Improve transform typing

* Finish HMM model

* High dim gauss done

* Benchmakr v2.0 done

* Modulize var estimator and fix transform.jl

* Run with ForwardDiff

* Enable Stan for LDA bench

* Fix a bug in adapt

* Improve some code

* Fix bug in NUTS MH step (#324)

* Add interface for optionally enabling adaption.

* Do not adapt step size when numerical error is caught.

* Fix initial epsilon_bar.

* Fix missing t_valid.

* Drop incorrectly adapted step size when necessary  (#324)

* Edit warning message.

* Small tweaks.

* reset_da ==> restart_da

* address suggested naming

* Samler type for WarmUpManager.paras and notation tweaks.

* Bugfix and adapt_step_size == > adapt_step_size!
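
Two of the commit messages above, "Do not adapt step size when numerical error is caught" and "Drop incorrectly adapted step size when necessary", concern how step-size adaptation interacts with numerical errors. A rough sketch of that idea on top of the dual-averaging update of Hoffman & Gelman (2014); `DAState` and `adapt_step_size!` here are hypothetical names, not the actual implementation:

```julia
# Illustrative dual-averaging state (Hoffman & Gelman, 2014, Algorithm 6).
mutable struct DAState
    mu::Float64            # shrinkage target, typically log(10 * eps0)
    log_eps::Float64
    log_eps_bar::Float64   # averaged iterate, used after warm-up
    H_bar::Float64
    m::Int
end

DAState(eps0) = DAState(log(10 * eps0), log(eps0), 0.0, 0.0, 0)

function adapt_step_size!(s::DAState, accept_ratio, numerical_error::Bool;
                          delta = 0.8, gamma = 0.05, t0 = 10, kappa = 0.75)
    # If leapfrog integration hit a numerical error, skip the update so a
    # divergent trajectory cannot corrupt the adapted step size.
    numerical_error && return exp(s.log_eps)
    s.m += 1
    eta = 1 / (s.m + t0)
    s.H_bar = (1 - eta) * s.H_bar + eta * (delta - accept_ratio)
    s.log_eps = s.mu - sqrt(s.m) / gamma * s.H_bar
    w = s.m^(-kappa)
    s.log_eps_bar = w * s.log_eps + (1 - w) * s.log_eps_bar
    return exp(s.log_eps)
end
```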
@xukai92 (Member) commented Jul 3, 2018

#324 (comment) works on my local machine.

@xukai92 (Member) commented Jul 3, 2018

@yebai I tried the notebook with the master branch again on my local machine and a remote Linux machine. It seems that the adaptation is working now as long as the initialization is fine, i.e. the sampling only keeps throwing numerical errors if the initialization is bad. Do you mind trying it again on your machine to see if you agree?

@xukai92 (Member) commented Jul 4, 2018

The downstream package DiffEqBayes.jl has a test relying on Turing.jl that previously suffered from adaptation issues; it now also passes with the current master (related issue: SciML/DiffEqBayes.jl#30, related PR: SciML/DiffEqBayes.jl#48).

@ChrisRackauckas (Collaborator) commented:

Could you describe what changed to fix it? We thought the issue was related to parameters going negative when they were supposed to be non-negative, but didn't have a nice way to do domain transforms (are these going to be added in Turing?)

@yebai (Member, Author) commented Jul 4, 2018

> @yebai I tried the notebook with the master branch again on my local machine and a remote Linux machine. It seems that the adaptation is working now as long as the initialization is fine, i.e. the sampling only keeps throwing numerical errors if the initialization is bad. Do you mind trying it again on your machine to see if you agree?

@xukai92 thanks, I will do another test and come back with my findings.

P.S. Can you clarify, or give an example of, what you mean by bad initializations?

@xukai92 (Member) commented Jul 4, 2018

@yebai What I observed is: 1) if numerical errors occur during sampling, our adaptation can recover from them (i.e. the numerical errors disappear after a few iterations, which was not the case before); 2) if there is a numerical error at the very beginning, the sampling keeps throwing numerical errors.

My current understanding is: theta is initialized in a place where, after invlink, the model throws a numerical error when evaluating the log-joint or its gradient. When this happens we cannot "rewind" theta to a numerically OK state, because the initial state itself is numerically flawed. I haven't looked into it intensively, but I suspect we need some mechanism to resample the initial state if this is really the cause.
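
A minimal sketch of such a resampling mechanism, assuming hypothetical `logjoint` and `gradient` functions for the model in the unconstrained space (not Turing's actual API):

```julia
# Keep drawing initial values until both the log-joint and its gradient
# are finite, erroring out after a fixed number of attempts.
function find_good_init(logjoint, gradient, dim; max_tries = 100)
    for _ in 1:max_tries
        θ = 4 .* rand(dim) .- 2    # e.g. uniform on (-2, 2) in the unconstrained space
        lp = logjoint(θ)
        g  = gradient(θ)
        if isfinite(lp) && all(isfinite, g)
            return θ
        end
    end
    error("Could not find a numerically valid initial state after $max_tries tries.")
end
```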

@xukai92 (Member) commented Jul 4, 2018

@ChrisRackauckas Turing.jl has always had domain transforms. I didn't really change Turing.jl's functionality in that PR; I refactored the core code in a way that ensures no unexpected side effects, which I now believe was the reason why in-sampling numerical errors were not correctly handled (either in rejection or in adaptation). As I posted in the comment above, there is still an issue with initialization, which is especially critical when the domain is heavily constrained. I think this is still a problem for the model in SciML/DiffEqBayes.jl#30, as I see a later Travis job fail on DiffEqBayes.
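
For readers following from DiffEqBayes: a minimal sketch of the kind of domain transform meant here, mapping a non-negative parameter to the unconstrained space via log/exp with the log-abs-det-Jacobian correction; the function names are illustrative, not Turing's actual transform interface:

```julia
link(x)    = log(x)    # constrained (0, ∞) -> unconstrained real line
invlink(y) = exp(y)    # unconstrained real line -> constrained (0, ∞)

# Log density in the unconstrained space for a target logpdf defined on (0, ∞):
function unconstrained_logpdf(logpdf_constrained, y)
    x = invlink(y)
    # dx/dy = exp(y), so log|J| = y
    return logpdf_constrained(x) + y
end
```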

yebai modified the milestones: Release 0.3.2, 0.5 (Sep 13, 2018)
@xukai92 (Member) commented Sep 13, 2018

DynamicHMC.jl has a good adaptation design: https://github.com/tpapp/DynamicHMC.jl/blob/master/src/sampler.jl

@yebai (Member, Author) commented Sep 13, 2018

DynamicHMC is a very well designed and tested NUTS implementation, together with adaptation of the preconditioning matrix. We could try to plug DynamicHMC into Turing and compare its results against our NUTS sampler. We could also try to refactor our NUTS sampler following DynamicHMC's design. Ideally, the sampler code should be testable and benchmarkable without depending on other parts of Turing.

This also echoes the discussion in #456.

cc @willtebbutt @wesselb @mohamed82008
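
A minimal sketch of the decoupled design being suggested, where the sampler only sees a log-density and its gradient and can therefore be tested without the rest of Turing; the names are illustrative, not DynamicHMC's or Turing's actual interfaces:

```julia
# The sampler is written against this target type only.
struct LogDensityTarget{F,G}
    dim::Int
    logdensity::F   # θ -> log π(θ)
    gradient::G     # θ -> ∇ log π(θ)
end

# A unit test can then exercise the sampler on a simple standard-normal target:
target = LogDensityTarget(2, θ -> -0.5 * sum(abs2, θ), θ -> -θ)
```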

yebai pushed a commit that referenced this issue Sep 18, 2018
@xukai92 (Member) commented Nov 22, 2018

NUTS bug fixed in d0dafa9

xukai92 closed this as completed on Nov 22, 2018