-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Formally switch to radix ordering in vec_order()
#1375
Formally switch to radix ordering in vec_order()
#1375
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are changing the default I think we should consider a C sort variant with letters swapped so that it's closer to POSIX sorting. It's probably nearly as efficient and will sort more usefully with English datasets (and other alphbetical languages).
Eventually we'll have character classes that attach a locale to a vector. Compared to sorting with the current locale, these will have the advantage of producing a consistent sort across platforms. The most obvious way to implement this class is with an order proxy. So I'm wondering what is the interaction between |
We're going to argue that this change increases consistency with stringr and data.table; I think using some custom sort would be a step in the wrong direction. |
- Mark it as internal - Update internal usage of `vec_order()` to instead point towards `vec_order_radix()` where applicable
Also folds `order-radix.R` into `order.R`
Since this is also used in `?vec_rank`, and mentioning a Details section there would be confusing
Like `stringi::stri_sort_key()` now does, as of this commit gagolews/stringi@606296a
204c5ea
to
cf52adc
Compare
Prepares for r-lib/vctrs#1375
Superseded by #1435 which is a clean rebase of this on master |
Closes #1302
vec_order()
to newvec_order_base()
vec_order_radix()
tovec_order()
order-radix.R
intoorder.R
(similar with tests, and C side)...
to bothvec_order()
andvec_sort()
, which feels appropriate given that we have so many optional arguments to them now, and could potentially add more.Will likely wait quite a while to merge this PR, as it will immediately affect dplyr and slider