Update group_by() algorithm to utilize vec_locate_sorted_groups() #6018
Closed
DavisVaughan wants to merge 6 commits into tidyverse:main from DavisVaughan:feature/group-by-radix-order
DavisVaughan commented Oct 5, 2021
Comment on lines +3 to +15
* `group_by()` uses a new algorithm for computing and ordering groups. This is
  often faster than the previous approach, especially when there are many
  groups. In most cases, there should be no user visible changes. However,
  character grouping columns are now ordered in the C locale rather than the
  system locale, for performance. This change shows up in functions that use
  the group data, such as `summarise()` or `group_split()`, where the order
  of the results may have changed due to the usage of a different locale. If
  the ordering of the results of a call to `summarise()` is important (i.e.
  for constructing a table to be used in a report), you should explicitly call
  `arrange()` after `summarise()` to sort as needed. If needed, the global
  option `dplyr.legacy_group_by_locale` can be set to `TRUE` to revert to the
  old algorithm, but this should be used extremely sparingly and will be
  removed in a future version of dplyr.
Probably need to compact this NEWS bullet and link to the tidyup, like we did in the arrange() PR
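To make the NEWS bullet's advice concrete, here is a minimal sketch of what C-locale ordering looks like and of sorting explicitly after `summarise()`. The data frame `df` and its columns `g` and `x` are hypothetical, made up for illustration. Base R's `sort(method = "radix")` is used only to show C-locale ordering of character keys; it is not claimed to be what `group_by()` calls internally. The locale that `arrange()` itself uses depends on the installed dplyr version.

```r
library(dplyr)

# Hypothetical data: group keys mix upper- and lowercase letters.
df <- tibble(
  g = c("b", "A", "a", "B", "a"),
  x = c(1, 2, 3, 4, 5)
)

# Base R's radix method always sorts character vectors in the C locale,
# so uppercase ASCII letters come before lowercase ones.
c_order <- sort(unique(df$g), method = "radix")
c_order
#> [1] "A" "B" "a" "b"

# The row order of a grouped summary follows how group_by() orders its keys.
# When the order matters downstream (e.g. a report table), sort explicitly
# instead of relying on the grouping order.
res <- df %>%
  group_by(g) %>%
  summarise(total = sum(x)) %>%
  arrange(g)
```

In a system locale such as `en_US`, the same keys would typically collate as `a A b B`, which is the kind of difference the bullet warns about.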
DavisVaughan commented Oct 5, 2021
Comment on lines -5 to +19
Message <simpleMessage>
Message <rlib_message_name_repair>
New names:
* a -> a...1
* b -> b...2
* a -> a...3
* b -> b...4
* `a` -> `a...1`
* `b` -> `b...2`
* `a` -> `a...3`
* `b` -> `b...4`

# bind_cols() handles unnamed list with name repair (#3402)

Code
df <- bind_cols(list(1, 2))
Message <simpleMessage>
Message <rlib_message_name_repair>
New names:
* NA -> ...1
* NA -> ...2
* `` -> `...1`
* `` -> `...2`
These come from using dev vctrs
Probably merge after #5942 when we are working on dplyr 1.1.0
Closes #5808
Closes #4406