-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow cudf.concat to process a dictionary of frames #15115 #15126
Conversation
/ok to test |
/ok to test |
1 similar comment
/ok to test |
d = { | ||
'first': cudf.DataFrame({'A': [1, 2], 'B': [3, 4]}), | ||
'second': cudf.DataFrame({'A': [5, 6], 'B': [7, 8]}), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we parametrize to include a few additional test cases?
- A single entry in
d
- A third entry in
d
with a different number of elements, e.g.,:d = { 'first': pd.DataFrame({'A': [1, 2], 'B': [3, 4]}), 'second': pd.DataFrame({'A': [5, 6], 'B': [7, 8]}), 'third': pd.DataFrame({'C': [1, 2, 3]}) }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't single entry in d
and a third entry in d
with just 1 key-value pair the same thing? Are you saying there should be a test for d = {'first': cudf.DataFrame({'A': [1, 2], 'B': [3, 4]})}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I mistyped! I've edited my comment
@@ -1921,3 +1921,21 @@ def test_concat_mixed_list_types_error(s1, s2): | |||
|
|||
with pytest.raises(NotImplementedError): | |||
cudf.concat([s1, s2], ignore_index=True) | |||
|
|||
|
|||
def test_horizontal_concat_dictionary(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add tests for concatenating dict-of-Series and dict-of-Index objects?
If those cases don't work currently - that's OK. We should at least raise a NotImplementedError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add tests for these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is looking pretty good! Just left a comment asking for more tests.
Looks like the PR is failing style checks. Can you run locally:
and fix up any style issues reported? |
@shwina I screwed up and deleted the branch by mistake which I think closed the PR. How do I start this up again? |
Here's what I would do; the following creates a local branch called
Then you can push the branch Does that help? Feel free to ask if you have any questions! |
@shwina just to clarify, do I create a new pull request? |
Yes I think that's best |
Closes #15115.
Unlike
pandas.concat
,cudf.concat
doesn't work with a dictionary of objects. The following code raises an error.This commit resolves this issue.
Checklist