Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support descending sort for character and other non-numeric data #175

Merged
merged 7 commits into from
Jun 25, 2024

Conversation

toppyy
Copy link
Contributor

@toppyy toppyy commented Jun 2, 2024

Closes #92.

Creating a desc macro feels a bit like a work-around as it is created for the sole purpose of not raising a function-not-found -error. Couldn't think of an easy alternative. What do you think @krlmlr ?

@krlmlr
Copy link
Member

krlmlr commented Jun 2, 2024

Thanks. We need to adopt the translation of desc() and remove the macro. The translation lives in relational.R in the rel_translate() function. It's not very clean, and this is a shame because this is the space where we want and need to improve. Happy to guide you through and improve the code organization step by step.

@toppyy
Copy link
Contributor Author

toppyy commented Jun 2, 2024

Thanks! Before diving into the code organization of the rel_translate() function, I thought about what to translate desc into. To me it seems that one solution is to translate it into an expression reference of it's first argument with an attribute dictating the descending sort order.

So, for example, translating the expression desc(some_column) would return something like:
translate

Then we could use the attribute "desc" to derive the sort order in rel_order.duckdb_relation()

Would this make sense?

@krlmlr
Copy link
Member

krlmlr commented Jun 2, 2024

Thanks, good catch. I think we can just special-case desc() in arrange(), this doesn't need to live in rel_translate() . In other words, failing to translate desc() in all cases except when used as the surrounding call in an arrange() expression is fine.

@toppyy
Copy link
Contributor Author

toppyy commented Jun 5, 2024

I pushed a new version that handles calls to desc() in arrange(). We extract the sort order first and then manipulate the arguments to arrange so that calls to desc() are removed before passing the arguments for translation.

The arguments ("dots") is iterated twice which is not ideal, but I found it to be cleaner than manipulating dots-argument in place.

@toppyy
Copy link
Contributor Author

toppyy commented Jun 17, 2024

Do you have comments on the above @krlmlr ? If this was already on your todo-list, sorry for the unnecessary ping.

@krlmlr
Copy link
Member

krlmlr commented Jun 17, 2024

Thanks for the ping. I was waiting for duckdb 1.0.0 to hit CRAN, which now happened. We'll need duckdb (>= 1.0.0) in DESCRIPTION, I've started the checks to see if the new code passes checks.

Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks workable, just a few remarks on style.

R/arrange.R Outdated
Comment on lines 22 to 26
ascending <- sapply(dots,function(dot) {
expr <- get_expr(dot)
if (typeof(expr) != "language") return(TRUE)
return(expr[[1]] != "desc")
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • map_lgl()
  • is.call() perhaps?
  • Avoid return() as last statement
  • Extract this and the next map into a function that returns a data frame (or a list with two parallel vectors) and uses ... a for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I refactored the extraction of sort order and manipulation of dots into a separate function.

Also the parameter ascending in rel_order() now has NULL as a default argument (the tests for previous commit failed due to not having a default argument)

Copy link
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, let me play with it and finish it.

R/arrange.R Outdated Show resolved Hide resolved
R/arrange.R Outdated
if (length(expr) > 2) cli::cli_abort("desc() must be called with exactly one argument.")

ascending[i] <- FALSE
dots[[i]] <- new_quosure(expr[[2]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use the original environment here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Good point.

@krlmlr krlmlr changed the title fix: Fix descending sort for character data feat: Support descending sort for character and other non-numeric data Jun 24, 2024
@krlmlr krlmlr merged commit 4f4d682 into tidyverse:main Jun 25, 2024
9 checks passed
@krlmlr
Copy link
Member

krlmlr commented Jun 25, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix descending sort for character data
2 participants