Fix for logL_birth #324
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
    @@            Coverage Diff            @@
    ##            master     #324    +/-   ##
    =========================================
      Coverage   100.00%  100.00%
    =========================================
      Files           36       36
      Lines         3041     3052    +11
    =========================================
    + Hits          3041     3052    +11
See comments inline.
anesthetic/samples.py (outdated)
    -    self[x].plot(ax=ax, xlabel=xlabel,
    -                 *args, **kwargs)
    +    if x == 'logL_birth':
    +        self[x].replace(-np.inf, np.nan
    +                        ).dropna().plot(ax=ax, xlabel=xlabel,
    +                                        *args, **kwargs)
    +    else:
    +        self[x].plot(ax=ax, xlabel=xlabel, *args, **kwargs)
- The `.dropna()` shouldn't be necessary.
- Shall we separate inf replacement and plotting to reduce code duplication? We could have

        if x == 'logL_birth':
            selfx = self[x].replace(...)
        else:
            selfx = self[x]
        selfx.plot(...)

- I feel like changing infs to nans shouldn't happen silently. Shall we at least have a warning that we are setting `-inf` to `nan` for plotting purposes?
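The second and third suggestions above can be combined in a minimal sketch: do the replacement once, warn only when it actually changes something, and leave a single plot call afterwards. The helper name `finite_for_plotting` and the warning text are hypothetical and not part of anesthetic; a plain pandas `Series` stands in for `self[x]`.

```python
import warnings

import numpy as np
import pandas as pd


def finite_for_plotting(column):
    """Return `column` with -inf replaced by nan, warning if any were found.

    Hypothetical helper for illustration only; not anesthetic's API.
    """
    n_inf = int(np.isneginf(column.to_numpy(dtype=float)).sum())
    if n_inf > 0:
        warnings.warn(
            f"Setting {n_inf} -inf value(s) in '{column.name}' to nan "
            "for plotting purposes."
        )
        # replace() returns a new Series; the input is left untouched.
        return column.replace(-np.inf, np.nan)
    return column


logL_birth = pd.Series([-np.inf, -np.inf, -3.2, -1.5], name="logL_birth")
cleaned = finite_for_plotting(logL_birth)
# A single cleaned.plot(...) call would follow here for every column.
```

Because the helper returns the column unchanged when no `-inf` is present, columns without infs stay silent and only the affected ones trigger a warning.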
`dropna` is necessary for the 2D KDE, but not otherwise, so I've removed those.
I've reduced code repetition as suggested.
With regard to infs to nans, I'm not that keen to have a warning for default behaviour (i.e. `samples.plot_1d()`).
> With regard to infs to nans, I'm not that keen to have a warning for default behaviour (i.e. `samples.plot_1d()`)
Well, I'd argue here that `samples.plot_1d()` or `samples.plot_2d()` is not default behaviour for nested sampling data (nor for anesthetic dataframes in general, which typically come with very many columns), otherwise this would have come up earlier, too. By default the user should specify which columns are to be plotted.
So I think a warning message could help the new user (who is most likely to call the plotting functions without specifying columns) to learn faster how to use the plotting command. This is also how this PR came to be: a new user simply trying `samples.plot_1d()` to see if it works...
I agree that too many warning messages are annoying and get in the way, but how often do we intentionally plot the posterior distribution of `logL_birth`? Also, people who do want to plot `logL_birth` intentionally can easily avoid the warning by dropping infs first.
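As a rough illustration of "dropping infs first", a user could mask out the non-finite rows before plotting. This is only a sketch: a plain pandas `DataFrame` with made-up values stands in for an anesthetic samples object, and the commented `plot_1d` call is an assumption about how the filtered object would then be used.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data; a real run would come from a nested sampling chain.
samples = pd.DataFrame({
    "x0": [0.1, 0.4, 0.7],
    "logL_birth": [-np.inf, -5.0, -2.0],
})

# Keep only rows where logL_birth is finite (drops +/-inf and nan).
finite = samples[np.isfinite(samples["logL_birth"])]

# Assumed usage on an anesthetic samples object:
# finite.plot_1d(["logL_birth"])
```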
I am persuaded by this, so I think we should have a warning message.
If we're going down a warning route, should we restrict this to `logL_birth`? The code would be much neater if we just moved the relevant inf guards into `kde_contour_plot_2d`, `kde_1d` and `kde_2d`. This would then be more consistent with the behaviour of the other functions, e.g. `samples.plot()`, which just ignores nans and infs.
> should we restrict this to logL_birth ?

We could do either `logL_birth` specifically, or anything with infs generally. Happy with both, but the warning should say something along the lines of "there are infs in columns [logL_birth, ...]".
The infs will also cause problems for histograms, so I think catching them in `plot_1d` and `plot_2d` is probably the right place...?
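The general variant, a single warning that names every affected column, might look like the sketch below. The function name `warn_about_infs` and the exact message are hypothetical, and a plain pandas `DataFrame` with made-up values is used in place of an anesthetic samples object.

```python
import warnings

import numpy as np
import pandas as pd


def warn_about_infs(data, columns):
    """Warn once, naming every requested column that contains +/-inf.

    Hypothetical helper for illustration only; not anesthetic's API.
    """
    has_inf = [c for c in columns
               if np.isinf(data[c].to_numpy(dtype=float)).any()]
    if has_inf:
        warnings.warn(f"There are infs in columns {has_inf}; "
                      "these will be set to nan for plotting.")
    return has_inf


data = pd.DataFrame({
    "x0": [0.1, 0.4, 0.7],
    "logL": [-5.0, -2.0, -1.0],
    "logL_birth": [-np.inf, -5.0, -2.0],
})
flagged = warn_about_infs(data, ["x0", "logL", "logL_birth"])
# flagged == ['logL_birth']
```

Calling such a check at the top of `plot_1d` and `plot_2d` would cover both the KDE and the histogram cases with one message, instead of one warning per subplot.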
My input from stumbling across this discussion via #332:
Perhaps the default behavior is sensible (include nlive, logL and logL_birth in the plot if nothing is specified), but it strikes me that the easiest way to make the various KDE errors less impactful is to add a kind=scatter and make that the default for the specific case that the columns to plot are empty?
Note that this has come up in the past in #96, which this PR obsoletes, so that test is now removed.
The current question is whether we should change the default behaviour to ignoring the columns. I would vote against that default change. Those columns are more specific
I'd keep the kde default, but am open to be convinced otherwise.
Added
I would also vote to keep things in
…ns are actually ok here
…the newly added warnings and to keep our pytest output clean
@williamjameshandley, if you are happy with my tweaks, feel free to squash and merge.
* test that limits get accurately updated by successive plots with logscale axes, adjusting to new data limits, see issue handley-lab#381
* fix typo from PR handley-lab#324
* bump version to 2.8.10
* update logscale plot limits to datalimits at the end, making use of `ax.dataLim`
* allow matplotlib 3.9
* bump version to 2.8.10
* Fix logscale limit updates (#383)
* test that limits get accurately updated by successive plots with logscale axes, adjusting to new data limits, see issue #381
* fix typo from PR #324
* bump version to 2.8.10
* update logscale plot limits to datalimits at the end, making use of `ax.dataLim`
* Fix macOS CI (#385)
* attempt at fixing macOS CI by brew installing hdf5
* update from `miniconda@v2` to `miniconda@v3`
* bump version to 2.8.11
* try newer `tables` version, which was previously restricted to 3.8.0 in #379
* Revert "attempt at fixing macOS CI by brew installing hdf5"
  This reverts commit 968bdb3.
* Reapply "attempt at fixing macOS CI by brew installing hdf5"
  This reverts commit 204014a. Seems like this is needed after all, otherwise macOS is struggling to find a local HDF5.
---------
Co-authored-by: Will Handley
* Fix to `color='C2'` plot_2d error post pandas 2 (#382)
* Added failing test
* bump version to 2.8.10
* Get color from self.color
* Update README.rst
* Update _version.py
* Update README.rst
* Update _version.py
---------
Co-authored-by: Lukas Hergt
* bump version to 2.8.10
* bump version to 2.8.11
* bump version to 2.8.13
---------
Co-authored-by: Lukas Hergt
Co-authored-by: Will Handley
Description
This PR fixes #310 by adding an explicit guard for `logL_birth`, dropping nans and infs before plotting.
It's not the neatest solution, so any suggestions for improvements are welcome.
Checklist:
* `flake8 anesthetic tests`
* `pydocstyle --convention=numpy anesthetic`
* `python -m pytest`