Rewrite `case_when()` using `vec_case_when()` #6300
#' `.default` participates in the computation of the common type with the RHS
#' inputs.
#'
#' If `NULL`, the default, a missing value will be used.
Why not use `.default = NA` as default?
Hmm, I think of it as "you didn't supply a `.default`" so nothing fills in the locations where a default value would otherwise be used. This makes sense to me if you think of the output as being the result of starting with `vec_init(ptype, size)` and then filling in the values of the cases. When you don't supply a `.default`, you just get the missing values that resulted from the `vec_init()` call.
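The "start from `vec_init()` and fill in the cases" mental model can be sketched roughly as follows. This is a minimal illustration, not dplyr's actual `vec_case_when()`; the helper name `sketch_case_when()` and its argument handling are assumptions made for this example.

```r
library(vctrs)

# Hypothetical helper for illustration only; not dplyr's vec_case_when().
sketch_case_when <- function(conditions, values, ptype, size) {
  # Start with an output made entirely of missing values.
  out <- vec_init(ptype, size)
  # Track which locations have not been filled by an earlier case.
  unused <- rep(TRUE, size)

  for (i in seq_along(conditions)) {
    condition <- conditions[[i]]
    # Only locations where the condition is TRUE (not FALSE, not NA) are hit.
    hit <- unused & !is.na(condition) & condition
    value <- vec_recycle(values[[i]], size)
    vec_slice(out, hit) <- vec_slice(value, hit)
    unused <- unused & !hit
  }

  # With no default supplied, untouched locations keep the missing
  # values produced by vec_init().
  out
}

x <- c(1, 5, NA)
res <- sketch_case_when(
  conditions = list(x > 2, x <= 2),
  values = list("big", "small"),
  ptype = character(),
  size = 3
)
```

In this sketch the third element of `res` stays missing because no condition is `TRUE` there and nothing overwrites the initial value.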
Co-authored-by: Lionel Henry <[email protected]>
Hints at the fact that this has to be tailored to your specific usage of `case_when()`, making it hard to generalize (like with a `.missing` argument)
2ed15ce to 44ad4ac
Closes #6261
Closes #6206
Closes #5106
Closes #6145
Closes #6225
Supersedes #6286
This is a more conservative update of `case_when()` vs what was done in #6286. There shouldn't be any breaking changes here (but I'll run revdeps), and it should make everyone pretty happy.

The intention is to move `vec_case_when()` to vctrs and rewrite in C, but getting the semantics right in R is more important for now.

`vec_case_when()` will also be used to back `if_else()` and `coalesce()` for sure. I think it could also back `na_if()`.

Important notes:
- I have decided to keep the formula interface. I think my main complaint about `case_when()` is less about the formula interface, and more about `TRUE ~`. It also doesn't seem like we are going to add a multi-condition `mutate_when()` that would use an `fcase()`-like interface, so there is less pressure to try and be consistent with something like that.
- I have added a `.default` argument, with the intention of being a direct replacement for `TRUE ~`. In a future release, I would like to start deprecating usage of `TRUE ~` so we can stop recycling all of the inputs against each other. Providing the `.default` argument and switching all our docs over to it is the first step towards that.
- New `.ptype` and `.size` arguments, useful for predictable output.
- `vec_case_when()` now takes `conditions` and `values` lists, rather than `...` with an `fcase()` interface. Providing two parallel lists has actually proven to be much more useful when programming with `vec_case_when()` because typically you have a user facing interface that collects those two lists.
- After I move `vec_case_when()` to vctrs, I will come back and mention that if you want to program with `case_when()`, then you might want to look at `vctrs::vec_case_when()` instead.

The next section is a reference for future us if we ever think about changing how `.default` works with regards to `NA`.

An extremely important point is that `.default` works exactly the same as `TRUE ~`. We debated for a long time if `.default` should only be applied where all conditions are `FALSE` (rather than for `FALSE` or `NA`, as is currently done), and we also considered adding a `.missing` argument for handling the case where at least one condition returned `NA` and none returned `TRUE`. This ended up being more surprising in some cases than just handling the missing values manually. Ultimately there are too many ways that missing values can appear / disappear from the conditions, so I'm convinced that you have to have an explicit way to handle them that is specific to your usage of `case_when()`.
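As a small illustration of the behavior described above, the following sketch compares `.default` with `TRUE ~` and shows one way to handle missing values explicitly. It assumes a dplyr version that includes the `.default` argument introduced by this PR; the input vector and labels are made up for the example.

```r
library(dplyr)

x <- c(1, 5, NA)

# `.default` and the older `TRUE ~` idiom fill the same locations,
# including the one where the condition is NA.
with_default <- case_when(x > 2 ~ "big", .default = "other")
with_true <- case_when(x > 2 ~ "big", TRUE ~ "other")

# Handling missing values explicitly, in a way specific to this usage,
# by listing the NA case before the other conditions.
explicit_na <- case_when(
  is.na(x) ~ "missing",
  x > 2 ~ "big",
  .default = "other"
)
```

Here the `NA` input falls through to the default in the first two calls, and only the third call, which tests `is.na(x)` up front, labels it separately.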
For example, here is where changing `.default` and adding `.missing` would have worked well:

But it gives confusing results when you use `%in%` or complex conditions with `&` - these cases remove the missing values from the original input:
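A minimal base R illustration of how missing values can disappear from a condition (this is not the original example from the PR; the vectors are invented for the sketch):

```r
x <- c("a", NA, "c")

# `==` propagates the missing value into the condition...
eq_a <- x == "a"

# ...but `%in%` never returns NA, so the missing input becomes FALSE.
in_set <- x %in% c("a", "b")

n <- c(5, NA, NA)
flag <- c(TRUE, FALSE, TRUE)

# `NA & FALSE` is FALSE while `NA & TRUE` is NA, so a compound condition
# can silently drop some of the missing values from `n`.
both <- n > 2 & flag
```

Under a rule where a default applied only to all-`FALSE` locations and a hypothetical `.missing` applied to `NA` locations, the same missing input would be routed differently depending on whether the condition was written with `==`, `%in%`, or `&`.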