Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add column_device_view pointers to EncColumnDesc #7097

Merged
merged 22 commits into from
Feb 4, 2021
Merged

Add column_device_view pointers to EncColumnDesc #7097

merged 22 commits into from
Feb 4, 2021

Conversation

kaatish
Copy link
Contributor

@kaatish kaatish commented Jan 7, 2021

Closes #6893, closes #6894. Contributes to #5682

Reduce usage of stats_column_desc members in Parquet writer with column_device_view members.

@kaatish kaatish added code quality libcudf Affects libcudf (C++/CUDA) code. cuIO cuIO issue labels Jan 7, 2021
@kaatish kaatish requested a review from devavret January 7, 2021 19:28
Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going great so far!

@kaatish kaatish requested a review from devavret January 21, 2021 04:03
@kaatish kaatish added 2 - In Progress Currently a work in progress non-breaking Non-breaking change bug Something isn't working labels Jan 21, 2021
Avoid early exit for empty table
@codecov
Copy link

codecov bot commented Jan 21, 2021

Codecov Report

Merging #7097 (3d80045) into branch-0.18 (8860baf) will increase coverage by 0.11%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.18    #7097      +/-   ##
===============================================
+ Coverage        82.09%   82.20%   +0.11%     
===============================================
  Files               97      100       +3     
  Lines            16474    16952     +478     
===============================================
+ Hits             13524    13936     +412     
- Misses            2950     3016      +66     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/avro.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/csv.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/fuzzer.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/json.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/main.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/orc.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/parquet.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/utils.py 0.00% <ø> (ø)
... and 83 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe5e07d...fa5858d. Read the comment docs.

@devavret
Copy link
Contributor

You should be able to remove stringdata_to_nvstrdesc() and the code around its call site now.

@kaatish kaatish requested a review from devavret January 22, 2021 18:21
@kaatish kaatish marked this pull request as ready for review January 22, 2021 20:33
Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial review.
Gotta say, static_cast -> element and strdesc mess -> string_view look 🔥

Copy link
Contributor

@davidwendt davidwendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you may need to add some parquet tests that use strings columns that contain non-ASCII UTF-8 characters.

Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, couple of small changes.

Comment on lines 2003 to 2004
* @param[out] col_desc Column description array [column_id]
* @param[out] leaf_column_views Device vector to store leaf columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, prefer return value to output parameters.

@kaatish kaatish requested a review from vuule February 1, 2021 20:53
@vuule
Copy link
Contributor

vuule commented Feb 2, 2021

rerun tests

@harrism
Copy link
Member

harrism commented Feb 2, 2021

@kaatish @vuule is this ready to merge assuming we can get CI / style to pass? Please update labels if so.

@vuule
Copy link
Contributor

vuule commented Feb 2, 2021

@kaatish @vuule is this ready to merge assuming we can get CI / style to pass? Please update labels if so.

I think we need @devavret 's review to merge.

Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one tiny thing but I've already changed it in my branch so can be let go.

@devavret
Copy link
Contributor

devavret commented Feb 3, 2021

rerun tests

1 similar comment
@vuule
Copy link
Contributor

vuule commented Feb 4, 2021

rerun tests

@vuule
Copy link
Contributor

vuule commented Feb 4, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 8334700 into rapidsai:branch-0.18 Feb 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 - In Progress Currently a work in progress cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
5 participants