refactor lit, col, DataFrame, Series #369
Merge remote-tracking branch 'origin/main' into tidy_up_lit_col_dataframe
news section is not complete
Nice refactoring, I just have a comment about the new arg explode in date_range().
Otherwise, LGTM (I just had to fix tiny errors in tidypolars but I can confirm there's no breaking change there).
The as_polars_series() also works in tidypolars, this looks nice
It might also be important to note that
library(dplyr, warn.conflicts = FALSE)
library(ivs)
library(clock)
library(polars)
# a data.frame with {ivs} class + vctrs_rcrd
df <- tribble(
~x, ~start, ~end,
810L, "2020-12-15", "2020-12-16"
) |>
mutate(foo = iv(start, end))
df
#> # A tibble: 1 × 4
#> x start end foo
#> <int> <chr> <chr> <iv<chr>>
#> 1 810 2020-12-15 2020-12-16 [2020-12-15, 2020-12-16)
class(df$foo)
#> [1] "ivs_iv" "vctrs_rcrd" "vctrs_vctr"
# a data.frame with {clock} class + vctrs_rcrd
df2 <- tibble(foo = year_month_day(2019, 1, 30, 9) |> as_naive_time())
df2
#> # A tibble: 1 × 1
#> foo
#> <naive<hour>>
#> 1 2019-01-30T09
class(df2$foo)
#> [1] "clock_naive_time" "clock_time_point" "clock_rcrd" "vctrs_rcrd"
#> [5] "vctrs_vctr"
as_polars_series.vctrs_rcrd = function(x, ...) {
pl$DataFrame(unclass(x))$to_struct()
}
pl$DataFrame(df)
#> shape: (1, 4)
#> ┌─────┬────────────┬────────────┬─────────────────────────────┐
#> │ x ┆ start ┆ end ┆ foo │
#> │ --- ┆ --- ┆ --- ┆ --- │
#> │ i32 ┆ str ┆ str ┆ struct[2] │
#> ╞═════╪════════════╪════════════╪═════════════════════════════╡
#> │ 810 ┆ 2020-12-15 ┆ 2020-12-16 ┆ {"2020-12-15","2020-12-16"} │
#> └─────┴────────────┴────────────┴─────────────────────────────┘
pl$DataFrame(df2)
#> shape: (1, 1)
#> ┌─────────────────────┐
#> │ foo │
#> │ --- │
#> │ struct[2] │
#> ╞═════════════════════╡
#> │ {2.1475e9,430233.0} │
#> └─────────────────────┘
When this is merged, I'd like to experiment a bit in
That is about the same requirement for arrow/polars struct so there is a fair chance it would work for any
It seems that if polars defines these methods, the user (or an external package?) can still override them.
# given that polars internally keeps this definition in series__trait.R
# as_polars_series.vctrs_rcrd = function(x, ...) {
# pl$DataFrame(unclass(x))$to_struct()
# }
library(dplyr, warn.conflicts = FALSE)
library(ivs)
library(polars)
library(tidypolars)
#> Warning: package 'tidypolars' was built under R version 4.3.1
#> Registered S3 method overwritten by 'tidypolars':
#> method from
#> print.DataFrame polars
t_date <- as.Date("2020-05-05")
test_df <- tibble(id = 1:5,
grp = c("a", "a", "b", "b", "b"),
start = rep(t_date+1:5),
end = rep(t_date+11:7))
# adding an iv-variable to the dataframe
test_df_iv <- test_df |>
mutate(range = ivs::iv(start, end))
pl$DataFrame(test_df_iv)
#> shape: (5, 5)
#> ┌─────┬─────┬────────────┬────────────┬─────────────────────────┐
#> │ id ┆ grp ┆ start ┆ end ┆ range │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ i32 ┆ str ┆ date ┆ date ┆ struct[2] │
#> ╞═════╪═════╪════════════╪════════════╪═════════════════════════╡
#> │ 1 ┆ a ┆ 2020-05-06 ┆ 2020-05-16 ┆ {2020-05-06,2020-05-16} │
#> │ 2 ┆ a ┆ 2020-05-07 ┆ 2020-05-15 ┆ {2020-05-07,2020-05-15} │
#> │ 3 ┆ b ┆ 2020-05-08 ┆ 2020-05-14 ┆ {2020-05-08,2020-05-14} │
#> │ 4 ┆ b ┆ 2020-05-09 ┆ 2020-05-13 ┆ {2020-05-09,2020-05-13} │
#> │ 5 ┆ b ┆ 2020-05-10 ┆ 2020-05-12 ┆ {2020-05-10,2020-05-12} │
#> └─────┴─────┴────────────┴────────────┴─────────────────────────┘
# the user overrides the polars definition; could an external package do that as well?
as_polars_series.vctrs_rcrd = function(x, ...) {
pl$DataFrame(unclass(x))$to_series(0L) # get the first column only
}
pl$DataFrame(test_df_iv)
#> shape: (5, 5)
#> ┌─────┬─────┬────────────┬────────────┬────────────┐
#> │ id ┆ grp ┆ start ┆ end ┆ range │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ i32 ┆ str ┆ date ┆ date ┆ date │
#> ╞═════╪═════╪════════════╪════════════╪════════════╡
#> │ 1 ┆ a ┆ 2020-05-06 ┆ 2020-05-16 ┆ 2020-05-06 │
#> │ 2 ┆ a ┆ 2020-05-07 ┆ 2020-05-15 ┆ 2020-05-07 │
#> │ 3 ┆ b ┆ 2020-05-08 ┆ 2020-05-14 ┆ 2020-05-08 │
#> │ 4 ┆ b ┆ 2020-05-09 ┆ 2020-05-13 ┆ 2020-05-09 │
#> │ 5 ┆ b ┆ 2020-05-10 ┆ 2020-05-12 ┆ 2020-05-10 │
#> └─────┴─────┴────────────┴────────────┴────────────┘
Created on 2023-08-31 with reprex v2.0.2
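The struct conversion in the reprex above works because a record-style vector is, underneath its class, a named list of equal-length fields. The following base R sketch illustrates that assumption with a hypothetical toy_rcrd class (a stand-in written for this example, not part of vctrs, ivs, or polars):

```r
# Hypothetical stand-in for a vctrs_rcrd: a classed list of equal-length fields.
new_toy_rcrd <- function(start, end) {
  stopifnot(length(start) == length(end))
  structure(list(start = start, end = end), class = "toy_rcrd")
}

x <- new_toy_rcrd(
  start = as.Date("2020-05-05") + 1:5,
  end   = as.Date("2020-05-05") + 11:7
)

# Dropping the class leaves a plain named list, one element per field.
fields <- unclass(x)
names(fields)
lengths(fields)

# Equal-length fields line up as columns, which is the shape a struct needs.
as_columns <- as.data.frame(fields)
dim(as_columns)
```

A class whose unclassed form is not a list of equal-length fields would presumably need its own method rather than this generic one.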
#' @export
as_polars_series.POSIXlt = function(x, ...) {
as.POSIXct(x)
}
I think this is not implemented correctly.
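The comment does not elaborate, so the following is only one possible reading: the method body stops at as.POSIXct(x), and in base R that call returns a POSIXct vector rather than anything polars-specific. A minimal base R check of that behaviour (the timestamp is an arbitrary example value):

```r
# as.POSIXct() on a POSIXlt value only changes the date-time representation.
x <- as.POSIXlt("2023-08-31 12:00:00", tz = "UTC")
y <- as.POSIXct(x)

class(x)  # "POSIXlt" "POSIXt"
class(y)  # "POSIXct" "POSIXt"
```

If that reading is right, the result would still have to be passed on to whatever converts POSIXct values into a Series.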
as_polars_series.vctrs_rcrd = function(x, ...) {
pl$DataFrame(unclass(x))$to_struct()
}
It does not seem tested.
#368 showed some shortcomings in the conversions in polars.
I took it as an occasion to tidy up.
tidying
new features