-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple uniqueness checks for sorting-related functions #3312
Merged
Merged
Changes from 23 commits
Commits
Show all changes
31 commits
Select commit
Hold shift + click to select a range
376b569
Basic support for checkunique kwarg in sorting-related functions
alonsoC1s b9fc81d
Added tests for the checkunique kwarg
alonsoC1s c35b338
Fixing Tests + Changing Error to ArgumentError
alonsoC1s 9de0421
Accounting for the case with order(...) clauses
alonsoC1s 01761ee
Improved tests for selectors with checkunique
alonsoC1s add0ad6
Fixed broken uniqueness testing
alonsoC1s e310687
Merge branch 'JuliaData:main' into simple_uniqueness_sorting
alonsoC1s 7de6e70
Added missing tests for sort
alonsoC1s 262db54
Fixed Docstrings + Replace == with ===
alonsoC1s d433cf1
Apply suggestions from code review
alonsoC1s f7d2ad9
Fixed sorting with order kwarg
alonsoC1s bdff445
Fixed complex sorting order detection
alonsoC1s d68dda9
Improved tests for complex order clauses
alonsoC1s 651eab6
Merge branch 'main' into simple_uniqueness_sorting
alonsoC1s f49023d
Fixed logic error in is_complex + added tests
alonsoC1s 73a31bb
Apply suggestions from code review
alonsoC1s 5ef1e4d
More robust complexity checks for Order.Orderings + tests
alonsoC1s e5bd016
Update src/abstractdataframe/sort.jl
alonsoC1s bac221b
Merge branch 'main' into simple_uniqueness_sorting
alonsoC1s b672135
Added error message for unsupported Order types
alonsoC1s 8746370
Added complexity checks and tests for other subtypes of Order
alonsoC1s 1576c07
Merge branch 'main' into simple_uniqueness_sorting
alonsoC1s 2a71db8
Finalizing checks for multiple Ordering subtypes + tests
alonsoC1s ce79d07
Update src/abstractdataframe/sort.jl
bkamins 536926a
Update src/abstractdataframe/sort.jl
alonsoC1s cf59d15
Merge branch 'main' into simple_uniqueness_sorting
alonsoC1s d333474
Stylistic changes
alonsoC1s ac25bb5
Merge branch 'main' into simple_uniqueness_sorting
alonsoC1s e002b7b
Update src/abstractdataframe/sort.jl
bkamins 56c228e
Removed explanation from sort docstring
alonsoC1s 929cacd
Apply suggestions from code review
bkamins File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the reasoning behind "as their application can create duplicate items inadvertently". Doesn't that make that argument precisely useful to protect against inadvertent introduction of duplicate elements? We don't have to support this right away, but no need to provide misleading reasons for that. ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the comments!
My goal was to explain that the disallowed combination is not unsupported because we ran out of time, but rather than the process of sorting after the use of the
by
andlt
functions is a bit unintuitive, as the actual sorting happens on numbers that the user never sees. This would make it harder(er) to reason about why a particular dataframe and an exact copy look different even after being sorted the same way (which I think isn't supposed to happen because sorting is stable, but was something brought up on the issue that motivates this PR)In other words, the message tries to say "we assume you are concerned about duplicates, and it isn't always obvious why the use of
by
andlt
can create "invisible duplicates" at the time of sorting, and that is why we still don't support it at the moment"I don't know if I convey that properly in the current docstring, I'd love to hear your take
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just propose to remove the justification part. It is not strictly needed in the docstring (it needs to describe precisely what is being supported). It is just not supported now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea. I'll remove them and rework the docstrings a bit