[REVIEW] Avoid index materialization when DataFrame is created with un-named Series objects #10071

Merged
merged 4 commits into from
Jan 19, 2022

Conversation

galipremsagar
Contributor

Fixes: #10070

This PR removes index materialization when a list of unnamed Series objects is passed to the DataFrame constructor.
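To illustrate the idea (not cuDF's actual implementation), here is a minimal pure-Python sketch. `FakeSeries` and `row_labels` are hypothetical names invented for this example, and Python's built-in `range` stands in for a lazy, RangeIndex-like object. The point is only that when every input Series is unnamed, the row labels can stay a lazy range rather than being expanded into an explicit collection of integers.

```python
from dataclasses import dataclass
from typing import Hashable, List, Optional, Sequence, Union


@dataclass
class FakeSeries:
    """Hypothetical stand-in for a Series; only the name matters here."""

    values: list
    name: Optional[Hashable] = None


def row_labels(series_list: Sequence[FakeSeries]) -> Union[range, List[Hashable]]:
    """Pick row labels for a frame built from a list of series.

    Assumption for this sketch: if every series is unnamed, a lazy
    ``range`` is enough, so no explicit list of labels is built.
    """
    if all(s.name is None for s in series_list):
        return range(len(series_list))  # lazy: nothing is materialized
    # Otherwise build explicit labels: the name when present, else the position.
    return [s.name if s.name is not None else i for i, s in enumerate(series_list)]


unnamed = [FakeSeries([1, 2, 3]), FakeSeries([4, 5, 6])]
named = [FakeSeries([1, 2, 3], name="a"), FakeSeries([4, 5, 6], name="b")]

labels_unnamed = row_labels(unnamed)  # stays a range object
labels_named = row_labels(named)      # becomes an explicit list
```

In this sketch the unnamed case costs constant memory regardless of how many Series are passed, which is the kind of saving the PR description refers to.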

@galipremsagar galipremsagar added the labels bug (Something isn't working), 3 - Ready for Review (Ready for review by team), Python (Affects Python cuDF API.), 4 - Needs cuDF (Python) Reviewer, and non-breaking (Non-breaking change) Jan 18, 2022
@galipremsagar galipremsagar requested review from vyasr and shwina January 18, 2022 21:06
@galipremsagar galipremsagar requested a review from a team as a code owner January 18, 2022 21:06
@galipremsagar galipremsagar self-assigned this Jan 18, 2022
Contributor

@vyasr vyasr left a comment


We might be able to simplify the code by just inlining _get_union_of_series_names since it's a single-use function and the indirection makes this behavior a little more confusing than it needs to be, but for now this LGTM.

@galipremsagar galipremsagar added the label 5 - Ready to Merge (Testing and reviews complete, ready to merge) and removed the labels 3 - Ready for Review (Ready for review by team) and 4 - Needs cuDF (Python) Reviewer Jan 18, 2022
@codecov

codecov bot commented Jan 18, 2022

Codecov Report

Merging #10071 (e7cdab2) into branch-22.02 (967a333) will decrease coverage by 0.07%.
The diff coverage is n/a.


@@               Coverage Diff                @@
##           branch-22.02   #10071      +/-   ##
================================================
- Coverage         10.49%   10.41%   -0.08%     
================================================
  Files               119      119              
  Lines             20305    20541     +236     
================================================
+ Hits               2130     2139       +9     
- Misses            18175    18402     +227     
Impacted Files Coverage Δ
python/custreamz/custreamz/kafka.py 29.16% <0.00%> (-0.63%) ⬇️
python/dask_cudf/dask_cudf/sorting.py 92.66% <0.00%> (-0.25%) ⬇️
python/dask_cudf/dask_cudf/core.py 70.85% <0.00%> (-0.17%) ⬇️
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/frame.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/index.py 0.00% <0.00%> (ø)
python/cudf/cudf/io/parquet.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/dtypes.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/scalar.py 0.00% <0.00%> (ø)
... and 31 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@galipremsagar
Contributor Author

@gpucibot merge

@galipremsagar galipremsagar removed the request for review from shwina January 19, 2022 02:02
@galipremsagar
Contributor Author

@gpucibot merge

@galipremsagar
Contributor Author

Thanks @shwina !

@gpucibot merge

@galipremsagar
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit e416188 into rapidsai:branch-22.02 Jan 19, 2022
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge), bug (Something isn't working), non-breaking (Non-breaking change), Python (Affects Python cuDF API.)
Development

Successfully merging this pull request may close these issues.

[BUG] Avoid materializing index when list of Series are used to construct DataFrame
3 participants