Robust adaptation for NUTS #324
Comments
Notebook with a simple illustration of the adaptation issue: https://github.com/xukai92/TuringDemo/blob/master/look-into-adapt.ipynb
https://github.com/stan-dev/stan/blob/develop/src/stan/mcmc/var_adaptation.hpp shows how Stan uses the Phase II window: the preconditioning matrix that is estimated on the fly is only installed when a new interval is set up.
@xukai92 Let's try to isolate the step-size and covariance adaptation issues and solve them one at a time. I suggest we first disable step-size adaptation (e.g. use a small enough fixed step size) and make sure covariance adaptation is robust by using the Welford trick (see #289 (comment)); then we come back to see why step-size adaptation is fragile.
I see, that makes sense.
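To make the two comments above concrete, here is a minimal sketch of a Welford-style variance estimator combined with Stan-style windowing, where the preconditioner used by the sampler is only replaced at window boundaries. All names (`WelfordVar`, `update!`, `variance`, `adapt_window!`) are invented for illustration; this is not Turing.jl or Stan API:

```julia
# A minimal sketch, not Turing.jl API: type and function names are
# invented for illustration.

mutable struct WelfordVar
    n::Int               # samples seen in the current window
    μ::Vector{Float64}   # running mean
    M::Vector{Float64}   # running sum of squared deviations (diagonal)
end

WelfordVar(d::Int) = WelfordVar(0, zeros(d), zeros(d))

function update!(w::WelfordVar, θ::AbstractVector)
    # Welford's online update: numerically stable, one pass, O(d) memory.
    w.n += 1
    δ = θ .- w.μ
    w.μ .+= δ ./ w.n
    w.M .+= δ .* (θ .- w.μ)
    return w
end

# Unbiased diagonal variance estimate from the current window.
variance(w::WelfordVar) = w.M ./ (w.n - 1)

# Stan-style windowing (cf. var_adaptation.hpp): the estimator runs
# every iteration, but the preconditioning matrix the sampler actually
# uses is only replaced when a window closes.
function adapt_window!(precond::Vector{Float64}, w::WelfordVar, window_size::Int)
    if w.n >= window_size
        precond .= variance(w)                    # install the new estimate
        w.n = 0; fill!(w.μ, 0.0); fill!(w.M, 0.0) # start a fresh window
    end
    return precond
end
```

With step-size adaptation disabled (a small fixed step size), running only this estimator would isolate whether the covariance adaptation itself is the fragile part.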
Note: there is a related issue in DiffEqBayes.jl (SciML/DiffEqBayes.jl#30).
@xukai92 Any update on this issue?
* Fix dep log in lad
* Dont send opt res
* Fix VarInfo.show bug
* Fix auto tune
* Change * to .* in leapfrog
* temp fix type
* Disable @suppress_err temporarily
* Fix a dep
* Workable ReverseDiff v0.1 done
* Add RevDiff to REQUIRE
* Fix bug in R-AD
* Fix some bugs
* Fix bugs
* Update test
* ReversedDiff.jl mutable bug fixed
* Any to Real
* update benchmark
* Resolve mem alloc for simplex dist
* Fix bug and improve mem alloc
* Improve implementaion of transformations
* Don't include compile time in benchk
* Resolve slowness caused by use of vi.logp
* Update benchmark files
* Add line to load pickle
* Bugfix with reject
* Using ReverseDiff.jl and unsafe model as default
* Fix bug in test file
* Rename vi.rs to vi.rvs
* Add Naive Bayes model in Turing
* Add NB to travis
* DA works
* Tune init
* Better init
* NB MNIST Stan added
* Improve ad assignment
* Improve ad assignment
* Add Stan SV model
* Improve transform typing
* Finish HMM model
* High dim gauss done
* Benchmakr v2.0 done
* Modulize var estimator and fix transform.jl
* Run with ForwardDiff
* Enable Stan for LDA bench
* Fix a bug in adapt
* Improve some code
* Fix bug in NUTS MH step (#324)
* Add interface for optionally enabling adaption.
* Do not adapt step size when numerical error is caught.
* Fix initial epsilon_bar.
* Fix missing t_valid.
* Drop incorrectly adapted step size when necessary (#324)
* Edit warning message.
* Small tweaks.
* reset_da ==> restart_da
* address suggested naming
* Samler type for WarmUpManager.paras and notation tweaks.
* Bugfix and adapt_step_size ==> adapt_step_size!
#324 (comment) works on my local machine.
@yebai I tried the notebook with the master branch again on my local and a remote Linux machine. It seems that the adaptation is working now as long as the initialization is fine, i.e. the sampling only keeps throwing numerical errors if the initialization is bad. Do you mind trying it again on your machine to see if you agree?
A downstream test in DiffEqBayes.jl that relies on Turing.jl, and previously suffered from the adaptation issues, also passes with the current master (related issue: SciML/DiffEqBayes.jl#30, related PR: SciML/DiffEqBayes.jl#48).
Could you describe what changed to fix it? We thought the issue was related to parameters going negative when they were supposed to be non-negative, but we didn't have a nice way to do domain transforms (are these going to be added in Turing?)
@xukai92 Thanks, I will do another test and come back with my findings. P.S. Can you clarify, or give an example of, what you mean by bad initializations?
@yebai What I observed is that: 1) if numerical errors occur during sampling, our adaptation can now fix them (i.e. the numerical errors disappear after a few iterations, which was not the case before); 2) if there is a numerical error at the very beginning, the sampling keeps throwing numerical errors. My current understanding is: basically, if the initial point already triggers numerical errors, the sampler never produces the valid statistics that adaptation needs in order to recover.
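One of the fixes in the merged commit list above is "Do not adapt step size when numerical error is caught." Here is a minimal sketch of that guard on top of Nesterov dual averaging; the notation follows Hoffman & Gelman (2014), and `DualAveraging` / this `adapt_step_size!` body are illustrative, not the actual Turing.jl implementation:

```julia
# Illustrative sketch, not Turing.jl code. Dual averaging adapts the
# step size ϵ toward a target acceptance statistic δ.

mutable struct DualAveraging
    μ::Float64        # shrinkage target, log(10 * ϵ₀)
    Hbar::Float64     # running average of (δ - acceptance statistic)
    logϵ::Float64     # current log step size
    logϵbar::Float64  # averaged log step size, used after warm-up
    t::Int            # adaptation step counter
end

DualAveraging(ϵ0::Float64) = DualAveraging(log(10 * ϵ0), 0.0, log(ϵ0), 0.0, 0)

function adapt_step_size!(da::DualAveraging, accept_rate::Float64,
                          numerical_error::Bool;
                          δ = 0.8, γ = 0.05, t0 = 10, κ = 0.75)
    # The guard: a trajectory that hit NaN/Inf reports a meaningless
    # acceptance statistic, so skip the update entirely instead of
    # letting one divergence drag the adapted step size toward zero.
    numerical_error && return exp(da.logϵ)
    da.t += 1
    η = 1 / (da.t + t0)
    da.Hbar = (1 - η) * da.Hbar + η * (δ - accept_rate)
    da.logϵ = da.μ - sqrt(da.t) / γ * da.Hbar
    w = da.t^(-κ)
    da.logϵbar = w * da.logϵ + (1 - w) * da.logϵbar
    return exp(da.logϵ)
end
```

This also makes the observation above plausible: occasional mid-run errors are simply skipped and the adaptation recovers, but if every early trajectory errors out, no update ever happens.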
@ChrisRackauckas Turing.jl has always had domain transforms. I didn't really change Turing.jl's functionality in that PR; I refactored the core code to ensure there are no unexpected side effects, which I now believe were the reason why in-sampling numerical errors were not handled correctly (either in rejection or in adaptation). As I posted in the comment above, there is still an issue with initialization, which is especially critical when the domain is very constrained. I think this is still a problem for the model in SciML/DiffEqBayes.jl#30, as I see a later Travis job fails on DiffEqBayes.
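For context on the domain-transform question: the standard approach, which Turing.jl's internal transformations follow in spirit, is to run HMC/NUTS on an unconstrained reparameterization and correct the log density with the log-abs-det Jacobian. A hand-rolled sketch for a positivity constraint (illustrative helpers only, not Turing.jl's transform code):

```julia
# x > 0  ⇔  y = log(x) ∈ ℝ, with x = exp(y) and |dx/dy| = exp(y).
to_unconstrained(x::Real) = log(x)
to_constrained(y::Real)   = exp(y)

# Log density on the unconstrained scale: add log|dx/dy| = y, so the
# sampler can take arbitrary gradient steps without ever leaving the
# support of the constrained parameter.
logpdf_unconstrained(logpdf_constrained, y::Real) =
    logpdf_constrained(exp(y)) + y
```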
DynamicHMC.jl has a good adaptation design: https://github.com/tpapp/DynamicHMC.jl/blob/master/src/sampler.jl |
This also echoes the discussion in #456. |
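For reference, the design linked above builds warm-up from small composable stages rather than one monolithic adaptation loop. A rough sketch of that pattern, with invented type names (see DynamicHMC.jl's sampler.jl for the real design):

```julia
# Warm-up as data: a list of small stage descriptions run in sequence.

abstract type WarmupStage end

struct FindStepsize <: WarmupStage end  # bracket an initial step size
struct TuneWindow   <: WarmupStage      # adapt metric + step size
    n::Int                              # window length in iterations
end
struct StepsizeOnly <: WarmupStage      # metric frozen, step size only
    n::Int
end

# Doubling windows, as in Stan's Phase II.
default_schedule() = WarmupStage[
    FindStepsize(),
    TuneWindow(25), TuneWindow(50), TuneWindow(100), TuneWindow(200),
    StepsizeOnly(50),
]
```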
NUTS bug fixed in d0dafa9
This is an umbrella issue for adaptation issues of the NUTS algorithm. Before these three