[DISCUSSION] -- Change defaults of `draws` and `tune` in `pm.sample`? #3854
Comments
These seem like more sensible defaults for a wider range of models. |
+1. My one misgiving is that one of pymc3's strengths is "time to first sample", especially with smaller models, and this hurts that. On the other hand, although this will make everything take ~3x as long, most small models take ~1s to fit, and there is a lot more we can do with 1k samples (and 2k warmup steps!) |
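To make the "~3x" figure concrete, here is a minimal sketch of the arithmetic. It assumes the cost of one sampler iteration is roughly the same during warmup and sampling, which is a simplification; `iterations_per_chain` is a hypothetical helper, not part of pymc3.

```python
def iterations_per_chain(draws, tune):
    """Total sampler iterations one chain runs: warmup steps plus kept draws."""
    return draws + tune


# Defaults at the time of this issue versus the values proposed in it.
current = iterations_per_chain(draws=500, tune=500)
proposed = iterations_per_chain(draws=1000, tune=2000)

# Ratio of work per chain, assuming a constant cost per iteration.
slowdown = proposed / current
print(f"{current} -> {proposed} iterations per chain ({slowdown:.1f}x)")
```

Under that assumption, a model that fits in about 1s today would take about 3s with the proposed values.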
As a corollary, I think we should use some of the work @eigenfoo and @dfm did for mass matrix adaptation, and at least use an expanding window to estimate the posterior variance. I have some experiments I was doing with 1000 tuning steps to figure out a good set of default parameters, and I will see if I can dig that up... |
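For readers unfamiliar with the idea, here is a rough sketch of what an expanding window schedule could look like. The doubling rule, the `first=25` starting size, and both function names are assumptions chosen for illustration; they are not taken from the work mentioned above or from pymc3.

```python
import random
import statistics


def expanding_windows(n_tune, first=25):
    """Return (start, end) index pairs of doubling-size windows covering n_tune steps."""
    windows = []
    start, size = 0, first
    while start < n_tune:
        end = start + size
        # If the next (doubled) window would not fit, fold the tail into this one.
        if n_tune - end < 2 * size:
            end = n_tune
        windows.append((start, end))
        start, size = end, 2 * size
    return windows


def variance_per_window(draws, windows):
    """Sample variance of a 1-D sequence of draws within each window."""
    return [statistics.variance(draws[start:end]) for start, end in windows]


windows = expanding_windows(1000)

# Stand-in for one parameter's tuning draws; a real sampler would supply these.
rng = random.Random(0)
fake_draws = [rng.gauss(0.0, 2.0) for _ in range(1000)]
estimates = variance_per_window(fake_draws, windows)

for (start, end), var in zip(windows, estimates):
    print(f"steps {start:4d}-{end:4d}: variance estimate {var:.2f}")
```

Later windows contain more draws, so their variance estimates are less noisy. That is the motivation for letting the windows grow during tuning.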
I think that's sensible, but we would probably need to specify the parameters explicitly in lots of tests (otherwise they might time out). |
Good to hear! I will start working on the PR right away. Changing the defaults seems quite easy. Playing with mass matrix adaptation and adapting tests will be harder but I'm happy to work on that under Colin's and Junpeng's supervision. I agree with @ColCarroll that this increases "time to first sample", but:
|
Many models are totally fine with 500 tuning steps and give reasonable posterior estimates with 500 samples. Going from 500 to 2000 tuning steps seems like quite a dramatic shift and makes the biggest downside of Bayes (slowness) even worse. I could see an argument for going to 1000 tuning and 1000 samples. |
|
@AlexAndorra No model I ever built samples that quickly. However, you do point to a valid argument: Slow and complex models almost always require more tuning. Fast and simple models might be OK with 500 tuning steps, but in that case you don't really care if you're sampling 2000 instead because it's fast. However, even for slow and complex models I rarely need more than 1000 tuning steps. Would like to hear @aseyboldt's perspective too. |
I am in favor of increasing tuning to 1000, and keeping samples at 500, or even tuning at 1500, samples at 500. |
Ha ha yeah I agree @twiecki: none of my real models sample that quickly either. For those, I always have to use more tuning steps and more chains. But that's my point: defaults don't really matter for these models -- they do for simple models however, and you perfectly summed up my argument :) |
I always felt that both the tuning and draws defaults are a bit on the low side. One problem with only a few draws is that the quality control only really works well if the number of draws isn't too small. It's quite possible, for example, to miss divergences when you only have 500 samples and two chains. If you have too few tuning steps, the sampler should tell you that your model did not converge. If you have too few draws, the sampler might not tell you that, and we should care about that case much more. |
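A back-of-the-envelope sketch of why few draws can hide divergences. It assumes each kept draw diverges independently with a fixed probability, which is a simplification (real divergences tend to cluster in problem regions), and the rate of 0.001 is a hypothetical value picked only for illustration.

```python
def prob_no_divergence_seen(rate, draws, chains):
    """Chance that zero divergences show up, if each kept draw diverges
    independently with probability `rate` (a simplifying assumption)."""
    return (1.0 - rate) ** (draws * chains)


rate = 0.001  # hypothetical: one divergent transition per 1000 draws on average

small = prob_no_divergence_seen(rate, draws=500, chains=2)
large = prob_no_divergence_seen(rate, draws=1000, chains=4)

print(f"500 draws x 2 chains:  P(no divergence seen) = {small:.3f}")
print(f"1000 draws x 4 chains: P(no divergence seen) = {large:.3f}")
```

Under these assumptions, the smaller run misses the problem entirely roughly a third of the time, while the larger run almost always surfaces at least one divergence.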
|
I don't actually know of too many examples where increasing tuning from 1000 to 1500 helps. (The target_accept warnings are perhaps a bit too stringent, and we do seem to have some strange issues with the step size adaptation independent of the number of tuning steps.) Maybe that is biased by the kind of models I usually use though.
On Fri, 27 March 2020 at 18:10, Alexandre ANDORRA wrote:
…
- Interesting, thanks @aseyboldt! From the different comments, I think 1000 draws / 1000 tune or 1000 / 1500 would be appropriate defaults.
- I have a preference for the second option because my assumption is that the overwhelming majority of beginners don't read / understand the sampler's warnings. So, we kind of have to be "libertarian paternalists" to nudge them -- and 1500 tune would preemptively solve more issues than 1000, without dragging down speed.
- I agree on the 4 chains default.
|
I don't think this is a terribly important decision because I would never ever recommend that users (new or experienced) call `sample` without explicitly passing both of these arguments. |
Ah really? Why? Being able to do that for simple / exploratory models sounds attractive to me.
On Sat, 28 March 2020 at 00:39, Chris Fonnesbeck wrote:
… I don't think this is a terribly important decision because I would never ever recommend that users (new or experienced) call sample without explicitly passing both of these arguments.
|
@fonnesbeck But that's the inference button (TM) ;). I think 1000 tuning and samples at 500 is reasonable. With 2 cores you get 1000 draws in total, and with 4 cores 2000, which "should be enough for everyone". |
After reading this, I'm changing my vote to be 1000 tune/1000 sample. I realized our warmup does ok after 500, and usually stops improving before 1000. (I'd like to change that, but we aren't there yet!) I think spending more draws tuning than sampling feels too unintuitive to be a default. |
If we really want it to be an inference button, then it should be dynamic, making the choice of tuning according to the complexity of the model and the amount of data. Having a fixed value gives the impression that there is a rule of thumb for the minimum number of draws, which of course there isn't. |
Well that would indeed be awesome! Is that even possible, in theory? Are there papers laying out this possibility? Regarding the defaults, we entered territories where I'm no longer qualified to give my opinion. It seems like 500 draws / 1000 tune or 1000 / 1000 are the most popular. Once you reach a consensus I'll implement it. And what about @aseyboldt's proposal to default to 4 chains instead of 2? |
Let's do 1000 tune / 1000 samples with 2 chains. |
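Whatever the defaults end up being, the agreed values can also be passed explicitly, as suggested earlier in the thread. A minimal sketch follows; the toy normal-mean model, the `fit_toy_model` helper, and the `SAMPLE_KWARGS` name are hypothetical and only there to make the call runnable.

```python
# Explicit sampler settings; the numbers are the ones agreed on in this thread.
SAMPLE_KWARGS = {"draws": 1000, "tune": 1000, "chains": 2}


def fit_toy_model(data):
    """Fit a toy normal-mean model (hypothetical example) with explicit settings."""
    # Imported lazily so the settings above can be inspected without pymc3 installed.
    import pymc3 as pm

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)
        pm.Normal("obs", mu=mu, sigma=1.0, observed=data)
        # Pass draws, tune and chains explicitly instead of relying on defaults.
        return pm.sample(**SAMPLE_KWARGS)
```

Calling `fit_toy_model([0.1, -0.3, 0.8])` would then sample with these settings regardless of what the library defaults are.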
Noted, I'll amend the PR 👌 |
Hi guys!
From what I understood, tuning samples tend to be more important than draws in the development steps of the Bayesian framework. This is not reflected in the current defaults of `draws` and `tune` in `pm.sample` -- both default to 500.
So, why don't we change these to `draws=1000` and `tune=2000`? These are usually good defaults that @fonnesbeck often advises, and it would drive home the point that, at least in development, one should care more about tuning samples than draws.
If you think this would be a sensible change, I would be happy to make a PR.
Stay home and stay healthy,
PyMCheers ✌️