[REVIEW] Fix/read parquet for empty DataFrame #6294

Merged

Conversation

marlenezw
Contributor

Closes #6167

I've added a try/except statement in parquet.pyx and updated the unit tests in test_parquet.py. This prevents a KeyError when a Parquet file created from an empty pandas DataFrame is read with cudf.
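
For readers unfamiliar with this kind of fix, here is a minimal sketch of the general pattern, not the actual change in parquet.pyx. It assumes the KeyError comes from looking up an index column that the pandas metadata names but the file's data does not contain. The helper `pick_index_columns`, the plain dict standing in for the table, and the metadata values are all hypothetical.

```python
import json


def pick_index_columns(table_columns, pandas_meta_json):
    """Return index column names from pandas metadata that exist in the table.

    Hypothetical helper for illustration only; not the cuDF implementation.
    """
    meta = json.loads(pandas_meta_json)
    found = []
    for name in meta.get("index_columns", []):
        # Assumption: only string entries name real columns; other
        # entries (e.g. range-index descriptors) are skipped here.
        if not isinstance(name, str):
            continue
        try:
            # Without the guard, this lookup raises KeyError when the
            # metadata names a column the table does not have.
            table_columns[name]
        except KeyError:
            continue
        found.append(name)
    return found


# Made-up metadata that names an index column missing from an empty table.
empty_meta = json.dumps({"index_columns": ["__index_level_0__"]})
result_empty = pick_index_columns({}, empty_meta)

# The same metadata with a table that does contain the index column.
result_full = pick_index_columns(
    {"__index_level_0__": [0, 1], "a": [1, 2]}, empty_meta
)
```

In this sketch the empty table yields no index columns instead of raising, while a populated table still resolves its index column as before.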

raydouglass and others added 30 commits March 30, 2020 11:03
…-empty series. Changed test so dtypes for empty series are ignored.
…ithub.com/marlenezw/cudf into feature/concat_empty_and_non_empty_series

I'd like to merge my remote branch with my local branch
…-empty series. Changed test so dtypes for empty series are ignored.
Added this file by mistake from the old 0.14 cudf branch. Deleting it because it is not in the current branch.
…om/marlenezw/cudf into fix/output_based_on_dtype_for_acos

I'd like to pull in my origin branch changes
@marlenezw marlenezw added the Python Affects Python cuDF API. label Sep 22, 2020
@marlenezw marlenezw self-assigned this Sep 22, 2020
@codecov

codecov bot commented Sep 22, 2020

Codecov Report

Merging #6294 into branch-0.16 will increase coverage by 0.02%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.16    #6294      +/-   ##
===============================================
+ Coverage        83.16%   83.18%   +0.02%     
===============================================
  Files               90       90              
  Lines            14692    14693       +1     
===============================================
+ Hits             12218    12222       +4     
+ Misses            2474     2471       -3     
Impacted Files                          Coverage Δ
python/cudf/cudf/testing/fuzzer.py      0.00% <0.00%> (ø)
python/cudf/cudf/core/abc.py            91.48% <0.00%> (+4.25%) ⬆️
python/cudf/cudf/utils/gpu_utils.py     58.53% <0.00%> (+4.87%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fe3072f...c92c28a.

@marlenezw marlenezw changed the title [WIP] Fix/read parquet for empty DataFrame [REVIEW] Fix/read parquet for empty DataFrame Sep 24, 2020
@marlenezw marlenezw added 3 - Ready for Review Ready for review by team 4 - Needs cuDF (Python) Reviewer and removed 2 - In Progress Currently a work in progress labels Sep 24, 2020
@kkraus14 kkraus14 added the cuIO cuIO issue label Sep 24, 2020
python/cudf/cudf/_lib/parquet.pyx (outdated, resolved)
python/cudf/cudf/tests/test_parquet.py (outdated, resolved)
python/cudf/cudf/tests/test_parquet.py (outdated, resolved)
python/cudf/cudf/tests/test_parquet.py (outdated, resolved)
@kkraus14 kkraus14 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team 4 - Needs cuDF (Python) Reviewer labels Sep 25, 2020
@kkraus14 kkraus14 merged commit be9b549 into rapidsai:branch-0.16 Sep 25, 2020
cwharris pushed a commit to cwharris/cudf that referenced this pull request Oct 2, 2020
Successfully merging this pull request may close these issues.

[BUG] KeyError in cudf.read_parquet for empty dataframe
6 participants