Function to estimate modes in marginalizations #128
Codecov Report
@@ Coverage Diff @@
## master #128 +/- ##
==========================================
- Coverage 30.21% 29.95% -0.27%
==========================================
Files 69 69
Lines 3667 3699 +32
==========================================
Hits 1108 1108
- Misses 2559 2591 +32
|
Hi @oschulz! I have extended one of the plots from the tutorial with a new plotting recipe. It is very easy to evaluate it and it makes the plot "Data, True Model and Best Fit" better. |
We should change the plotting code to use the new |
@oschulz But it seems the new function just calls the old |
Indeed, there is no code duplication. I call old |
Oh, yes - then we should move that code into the new exported function and remove the old one, right? |
Within the scope of this PR, let's replace the We can rename the type |
Which reminds me: We shouldn't use the term "local mode" anywhere in this context any longer (also in the plots). A local mode is a "bump" in a multi-modal distribution, lower than the global mode. But here we're talking about global modes in the marginalizations of the posterior. |
Ok, so maybe call it "marginalized mode(s)" ? |
Yes, or "marginal modes" or so - I sent an email. |
|
@series begin
    ribbon --> (y_ribbons[:,5],y_ribbons[:,6])
    fillcolor --> colors[1]
It seems the colors are inverted compared to our usual convention. So the innermost ribbon should be green, the outermost red.
Fixed! By the way, I was always quite confused by this choice of colors. For me, it would be much more natural to use an intense, bright red in the center (indicating high density) and light green on the periphery (indicating low density). For example, look at this Python Seaborn package.
@VasylHafych I just used the "review" feature to give some comments on the plot recipe. I hope you can work with that. I used this feature for the first time, so just ask me if there is something unclear. The comments are just some possible improvements. But generally this is quite nice work! |
@oschulz Ok, this seems generally like a good idea, but I have some questions on how to realize this.
|
Good point. I believe it should return a marginal distribution. Let's replace BATHistogram by a new type MarginalDist, which would contain a This way, things would be nice and clean: The result of a marginalization could immediately be reused as a prior, and at the same time all necessary information would be available to plotting recipes, without type piracy. In the future, we can then add non-histogrammed, sample-based Distributions to "EmpiricalDistributions.jl", and offer different marginalization algorithms to allow the user to select which kind of marginalization they want. The |
Let's go with the term "marginal mode". |
Hi @Cornelius-G. Thank you for your helpful suggestions. I have included an update function in my latest commit. @oschulz, considering that your suggestions with renamings require more work to be done, I propose to move my plotting recipe to a separate pull request to decouple these two topics. |
Sure! |
@oschulz, a new pull request is ready (#129). @Cornelius-G, if you could rename One more question — should we rename these functions in BAT 2.0 to have name consistency? They all are acting on posterior and returning a DensitySampleVector. For example, we can call them:
|
Yes - it's not just a rename, though. |
Ok, I guess this will be on my agenda then. |
I guess it would be something like
struct MarginalDist{N,D<:Distribution,VS<:AbstractValueShape}
    dims::NTuple{N,Int}
    dist::D
    origvalshape::VS
end
with |
Search the page for "oschulz started a review" |
0 results are matched (Reviewers: No reviews). I guess it is not visible for me or something. |
Oh, sorry, I forgot to click "submit review" :-) |
I was playing around with KernelDensity.jl and realized that we can use it to construct a KDE of our marginal distributions and then use an optimizer to find the marginal modes. In this case, we should be able to find a marginal mode with better precision than by binning alone. What do you think about that? |
Yes, I've been thinking about KDE, too - that should go into the EmpiricalDistributions package though. We could add KDE-based empirical distributions, in addition to the bin-based ones. That would be very valuable! Especially because they would be differentiable, so suitable to use as priors with HMC and so on - bin-based step-function priors wouldn't do so well there. |
@oschulz, Yes, it sounds like an interesting task. I have already played a bit with Julia KDE, so I can volunteer to implement |
Sure, there's no rush - the nice thing is, if we do things properly in BAT, something like that can be added to EmpiricalDistributions later on and will then "just work". |
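For illustration, the KDE idea discussed above can be sketched in plain Julia. This is not the implementation from this PR or from KernelDensity.jl: `gaussian_kde` and `kde_mode` are hypothetical names, the kernel density estimate is hand-rolled with a normal-reference bandwidth, and a golden-section search stands in for a general-purpose optimizer. It only treats a single 1D marginal.

```julia
using Random, Statistics

# Hand-rolled Gaussian KDE (a stand-in for KernelDensity.jl, for illustration only).
function gaussian_kde(samples::AbstractVector{<:Real})
    n = length(samples)
    h = 1.06 * std(samples) * n^(-1/5)   # normal-reference bandwidth (assumption)
    return x -> sum(s -> exp(-0.5 * ((x - s) / h)^2), samples) / (n * h * sqrt(2π))
end

# Hypothetical helper: coarse grid scan, then golden-section refinement of the maximum.
function kde_mode(samples::AbstractVector{<:Real}; ngrid::Integer = 200, tol::Real = 1e-6)
    lo, hi = extrema(samples)
    lo == hi && return float(lo)          # degenerate case: all samples equal
    f = gaussian_kde(samples)
    grid = range(lo, hi; length = ngrid)
    i = argmax(f.(grid))
    # Bracket the maximum between the neighbours of the best grid point.
    a = grid[max(i - 1, 1)]
    b = grid[min(i + 1, ngrid)]
    ϕ = (sqrt(5) - 1) / 2
    while b - a > tol
        c = b - ϕ * (b - a)
        d = a + ϕ * (b - a)
        if f(c) > f(d)
            b = d                         # maximum lies in [a, d]
        else
            a = c                         # maximum lies in [c, b]
        end
    end
    return (a + b) / 2
end

rng = MersenneTwister(1)
samples = 2.0 .+ 0.5 .* randn(rng, 20_000)   # synthetic 1D marginal, true mode at 2.0
kde_mode_est = kde_mode(samples)
```

Unlike a bin-based estimate, the result is not tied to bin centers, which is the precision gain mentioned above. The cost is one pass over all samples per density evaluation.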
I just merged the other PR by @Cornelius-G - could you adapt this PR to the changes, @VasylHafych ? |
Hi @oschulz. Here is an updated version. We now have Do you think we need these functions to operate on a prior, too? |
I think we decided to call this marginal mode, not local mode, right? Or is In addition to We should probably define |
sorry, this name remained from the old implementation. The function is now called Ready to merge. |
Sorry, we a little change in the output of |
Thanks, will merge as soon as tests are through. |
Thanks again! |
The function returns the local mode given posterior samples. The optimal number of bins can be determined using this implementation from Plots.jl.
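A minimal sketch of the bin-based approach, assuming a 1D vector of samples for one parameter: bin the samples and return the center of the most populated bin. `binned_mode` is a hypothetical helper, not the function added in this PR, and the bin count is left as a plain keyword argument rather than being chosen automatically.

```julia
using Random

# Hypothetical helper, not the function added in this PR.
# Returns the center of the most populated bin of a 1D sample vector.
function binned_mode(samples::AbstractVector{<:Real}; nbins::Integer = 50)
    lo, hi = extrema(samples)
    lo == hi && return float(lo)          # degenerate case: all samples equal
    width = (hi - lo) / nbins
    counts = zeros(Int, nbins)
    for x in samples
        # Map x to a bin index in 1:nbins; the maximum goes into the last bin.
        i = min(floor(Int, (x - lo) / width) + 1, nbins)
        counts[i] += 1
    end
    imax = argmax(counts)
    return lo + (imax - 0.5) * width
end

rng = MersenneTwister(1)
samples = 2.0 .+ 0.5 .* randn(rng, 100_000)  # synthetic 1D marginal, true mode at 2.0
mode_est = binned_mode(samples; nbins = 40)
```

The precision of the estimate is limited by the bin width, so the choice of `nbins` trades resolution against statistical noise in the bin counts.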