Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Adding decimal writing support to parquet #7017

Merged

Conversation

hyperbolic2346
Copy link
Contributor

Since cudf doesn't support precision, the precision must be passed in as a write option. This is handled as a vector of uint8's that indicates the precision of each flattened column in order to support nested types.

Partially closes #6474

…o parquet, the user must pass in a vector of uint8's that indicate the precision of the columns.
@hyperbolic2346 hyperbolic2346 requested a review from a team as a code owner December 16, 2020 04:02
@hyperbolic2346 hyperbolic2346 added 3 - Ready for Review Ready for review by team 4 - Needs cuIO Reviewer improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Dec 16, 2020
@codecov
Copy link

codecov bot commented Dec 16, 2020

Codecov Report

Merging #7017 (a1e1bcc) into branch-0.18 (1963111) will increase coverage by 0.08%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.18    #7017      +/-   ##
===============================================
+ Coverage        82.02%   82.11%   +0.08%     
===============================================
  Files               96       97       +1     
  Lines            16381    16477      +96     
===============================================
+ Hits             13437    13530      +93     
- Misses            2944     2947       +3     
Impacted Files Coverage Δ
python/cudf/cudf/io/csv.py 93.33% <0.00%> (-0.42%) ⬇️
python/cudf/cudf/core/frame.py 89.97% <0.00%> (-0.38%) ⬇️
python/cudf/cudf/core/column/timedelta.py 89.16% <0.00%> (-0.38%) ⬇️
python/cudf/cudf/core/column/numerical.py 94.41% <0.00%> (-0.16%) ⬇️
python/cudf/cudf/__init__.py 100.00% <0.00%> (ø)
python/cudf/cudf/core/dataframe.py 90.70% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/fuzzer.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 89.42% <0.00%> (ø)
python/cudf/cudf/utils/hash_vocab_utils.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py 100.00% <0.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1963111...a1e1bcc. Read the comment docs.

@mike-wendt
Copy link
Contributor

rerun tests

Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions/questions.

cpp/include/cudf/fixed_point/fixed_point.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/fixed_point/fixed_point.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/io/parquet.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/io/detail/parquet.hpp Outdated Show resolved Hide resolved
cpp/src/io/parquet/writer_impl.cu Show resolved Hide resolved
cpp/src/io/parquet/writer_impl.cu Outdated Show resolved Hide resolved
cpp/tests/io/parquet_test.cpp Outdated Show resolved Hide resolved
cpp/src/io/parquet/writer_impl.cu Outdated Show resolved Hide resolved
Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one small suggestion

cpp/include/cudf/io/parquet.hpp Outdated Show resolved Hide resolved
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment on naming.

cpp/include/cudf/io/parquet.hpp Outdated Show resolved Hide resolved
cpp/include/cudf_test/column_wrapper.hpp Outdated Show resolved Hide resolved
cpp/src/io/functions.cpp Outdated Show resolved Hide resolved
@rapids-bot rapids-bot bot merged commit 31c0d29 into rapidsai:branch-0.18 Jan 5, 2021
@hyperbolic2346 hyperbolic2346 deleted the mwilson/decimal_write branch February 3, 2021 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuIO cuIO issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] support read/write parquet/orc with fixed-point decimal data
5 participants