-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify dplyr_reconstruct()
#5160
Conversation
I think this also fixes #5132? |
That issue links to #4617, which has this example: library(dplyr, warn.conflicts = FALSE)
df1 <- tibble(
a = 1:2,
b = 2:1
)
df2 <- tibble(
a = structure(1:2, foo = "bar"),
c = 2:1
)
left_join(df1, df2, by = "a")$a
#> [1] 1 2
left_join(df2, df1, by = "a")$a
#> [1] 1 2
#> attr(,"foo")
#> [1] "bar" That behavior isn't touched by this PR. But honestly I sort of feel like this behavior is right. You get the attributes on the columns of the first input? |
With |
attr_new <- attributes(data) | ||
attrs <- attributes(template) | ||
|
||
to_copy <- setdiff(names(attr_old), c("class", "row.names", "names", ".drop")) | ||
attr_new[to_copy] <- attr_old[to_copy] | ||
attrs$names <- names(data) | ||
attrs$row.names <- .row_names_info(data, type = 0L) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would benefit from a lightweight C function that walks the pairlist, replaces in place and appends if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that might be a useful helper for vctrs or similar. It doesn't need to block this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @DavisVaughan!
Closes #5158
This drastically simplifies
dplyr_reconstruct.data.frame()
by taking an approach similar todplyr_col_modify.data.frame()
, which just restores the attributes directly.I've also pulled over the tests from #5158, but I think the attributes one might have been backwards. I think we care about keeping the attributes from
template
, notdata
.The job of the data frame method of
dplyr_reconstruct()
is now very simple and well defined. It keeps all attributes oftemplate
except:names
in case columns were modifiedrow.names
in case rows were modifiedThere was some code about the
.drop
attribute in the olddplyr_reconstruct.data.frame()
, but I don't think it belonged there because we have agrouped_df
method as well. It also wouldn't have worked right because the way to get to the drop attribute is through the group data of the data frame. It's not on the data frame directly.