Sanitise bin calculations #6212
base: main
Merge branch 'main' into sanitise_bins # Conflicts: # man/ggplot2-ggproto.Rd
binwidth <- dual_param(binwidth, list(NULL, NULL))
breaks <- dual_param(breaks, list(NULL, NULL))
fun = "mean", fun.args = list(),
boundary = 0, closed = NULL, center = NULL) {
bins <- dual_param(bins, list(x = 30, y = 30))
Ideally this is also done in `setup_params()`, but then the default argument to `compute_group()` should be `bins = list(x = 30, y = 30)`, which causes a test to complain that there is a mismatch between `compute_group()` defaults and `stat_*()` constructor defaults.
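For readers unfamiliar with the helper in the diff above, here is a minimal sketch of how a `dual_param()`-style function might expand a single argument into separate `x` and `y` values. The name `dual_param_sketch()` and its exact behaviour are assumptions made for illustration; this is not claimed to be ggplot2's internal implementation.

```r
# Hypothetical sketch; not ggplot2's actual dual_param().
# Expands one argument into a list with separate `x` and `y` entries.
dual_param_sketch <- function(x, default = list(x = NULL, y = NULL)) {
  if (is.null(x)) {
    # Nothing supplied: fall back to the default pair.
    return(default)
  }
  if (length(x) == 2) {
    if (is.list(x) && !is.null(names(x))) {
      # Already a named list, assumed to be of the form list(x = ..., y = ...).
      return(x)
    }
    # Two unnamed values: the first is for x, the second for y.
    return(list(x = x[[1]], y = x[[2]]))
  }
  # A single value (or a function) is shared by both directions.
  list(x = x, y = x)
}

dual_param_sketch(NULL, list(x = 30, y = 30))
dual_param_sketch(10)
dual_param_sketch(c(10, 20))
```

Under this sketch, resolving the default inside the function body (as in the diff) keeps the formal default of `bins` simple, which is the trade-off the comment describes.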
expect_snapshot_error(bin_breaks_width(3))
expect_snapshot_error(comp_bin(dat, binwidth = letters))
expect_snapshot_error(comp_bin(dat, binwidth = -4))

expect_snapshot_error(bin_breaks_bins(3))
No longer testing `bin_breaks_width(3)` and `bin_breaks_bins(3)` because these are always called from inside `compute_bins()`, which protects against improper ranges.
Thanks for this - it'll be great to have the bin handling unified! I noticed that the documentation of the 2D binning layers has been a bit neglected and maybe this would be a good opportunity to bring them up to date? Specifically:
Thanks for these suggestions, I've adapted the docs.
LGTM - great work
}
check_numeric(breaks)
bins <- bin_breaks(breaks, closed)
return(bins)
Should we warn about ignoring the other arguments here?
`bins` usually gets prepopulated, which results in false-positive warnings. We can warn about the other arguments, but I think it'd be slightly inconsistent.
This PR aims to fix #6207.
However, the linked issue is merely an occasion to sanitise bin calculations across ggplot2.
Most binning functions in ggplot2 behaved similarly, but not quite identically. For example, `stat_bin(boundary)` has replaced the `stat_bin(origin)` argument since ggplot2 2.1.0, but `stat_bin_2d(origin)` continued to exist.

The main changes in this PR:

- `compute_bins()` is now used by all binning functionality. It is modelled after the `stat_bin()` functionality, as this was the most polished version. `stat_bin()` is able to deal with 0-width data elegantly, so this fixes "`geom_bin_2d()` should recognise 0-width data" (#6207).
- `boundary`/`center` are used instead of `origin`. They were never formal arguments to most binning stats.
- `stat_bin(keep.zeroes)`, introduced in "Treatment options for zeroes in histograms" (#6139), was fulfilling the same or a very similar role as `stat_bin_2d(drop)`. However, `stat_bin(drop)` has been deprecated (also since 2.1.0). This PR renames `stat_bin(keep.zeroes)` to `stat_bin(drop)`, effectively resurrecting a formerly deprecated argument. There is no need for backward compatibility here since `keep.zeroes` was introduced in this cycle of development.
- `StatBin2d` is now a subclass of `StatSummary2d`, which prevents duplicated code.
- `stat_bin(bins)` can be a function now, purely for symmetry with the `stat_bin(breaks, binwidth)` arguments, which can also be functions. You can use this to do `bins = nclass.Sturges`, for example.

Reprex from the linked issue:
Created on 2024-12-04 with reprex v2.1.1
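
As an illustration of the function-valued `bins` argument mentioned in the description, the following base-R sketch shows one way a bin count could be resolved from either a number or a function of the data. The helper `resolve_bins()` is hypothetical and exists only for this example; `nclass.Sturges()` is the real function from grDevices.

```r
# Hypothetical helper (not part of ggplot2): accept `bins` either as a
# single number or as a function that computes a bin count from the data.
resolve_bins <- function(x, bins = 30) {
  if (is.function(bins)) {
    bins <- bins(x)
  }
  stopifnot(is.numeric(bins), length(bins) == 1, bins >= 1)
  as.integer(bins)
}

x <- 1:100
resolve_bins(x)                                    # uses the fixed default
resolve_bins(x, bins = grDevices::nclass.Sturges)  # ceiling(log2(100) + 1)
```

With the changes in this PR, the equivalent user-facing call would presumably be `stat_bin(bins = nclass.Sturges)`, as stated in the description.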