Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Introduces make_optional_iterator for nullable column and scalars #7772

Merged

Conversation

robertmaynard
Copy link
Contributor

@robertmaynard robertmaynard commented Mar 31, 2021

Introduces make_optional_iterator for nullable column and scalars, as the first step in fixing issues brought up in #6952 and #7573.

The iterator produces thrust::optional<T> to better represent nullable column elements and scalars.

make_optional_iterator supports three different contains_null modes:

- `YES` means that the column supports nulls and has null values, therefore
 the optional might not contain a value

- `NO` means that the column has no null values, therefore the optional will
 always have a value

- `DYNAMIC` defers the assumption of nullability to runtime with the users stating
 on construction of the iterator if column has nulls.

@robertmaynard robertmaynard requested review from a team as code owners March 31, 2021 00:36
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Mar 31, 2021
@robertmaynard robertmaynard force-pushed the fea/introduce_optional_iterator branch from a9ca157 to 997c64d Compare March 31, 2021 12:23
@robertmaynard robertmaynard force-pushed the fea/introduce_optional_iterator branch from 997c64d to 0639524 Compare April 1, 2021 19:30
@robertmaynard robertmaynard force-pushed the fea/introduce_optional_iterator branch 2 times, most recently from 8e2d1a0 to 67f527f Compare April 7, 2021 12:59
@robertmaynard robertmaynard changed the title Introduces make_optional_iterator for nullable column and scalars [REVIEW] Introduces make_optional_iterator for nullable column and scalars Apr 7, 2021
@robertmaynard robertmaynard force-pushed the fea/introduce_optional_iterator branch from 67f527f to 1523ac4 Compare April 7, 2021 14:02
This is a first step in fixing issues brought up in rapidsai#6952 and rapidsai#7573.

The iterator produces `thrust::optional<T>` to better represent nullable
column elements or scalars.

`make_optional_iterator` supports three different `contains_null` modes:

    - `YES` means that the column supports nulls and has null values, therefore
     the optional might not contain a value

    - `NO` means that the column has no null values, therefore the optional will
     always have a value

    - `DYNAMIC` defers the assumption of nullability to runtime with the users stating
     on construction of the iterator if column has nulls.
@robertmaynard robertmaynard force-pushed the fea/introduce_optional_iterator branch from 69163c3 to ce6d864 Compare April 8, 2021 13:38
@jrhemstad jrhemstad added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 8, 2021
@robertmaynard
Copy link
Contributor Author

I was curious what impact thrust::optional had on compilation times so I ported the minmax and clamp algorithms to optional ( https://github.com/robertmaynard/cudf/commits/fea/use_optional_iterator ).

source pair_iterator optional_iterator
clamp 30.16s 30.21s
minmax 45.68s 47.36s

So it looks like we shouldn't see a significant compilation cost by switching algorithms over to use optional iterators.

@codecov
Copy link

codecov bot commented Apr 12, 2021

Codecov Report

Merging #7772 (610f459) into branch-0.20 (599f62d) will increase coverage by 0.61%.
The diff coverage is 92.81%.

❗ Current head 610f459 differs from pull request most recent head 6437e44. Consider uploading reports for the commit 6437e44 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.20    #7772      +/-   ##
===============================================
+ Coverage        82.30%   82.91%   +0.61%     
===============================================
  Files              101      103       +2     
  Lines            17053    17658     +605     
===============================================
+ Hits             14035    14642     +607     
+ Misses            3018     3016       -2     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 88.64% <85.36%> (+1.20%) ⬆️
python/cudf/cudf/core/_internals/where.py 94.61% <94.61%> (ø)
python/cudf/cudf/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/_compat.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/_internals/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/column/datetime.py 89.91% <100.00%> (+0.71%) ⬆️
python/cudf/cudf/utils/dtypes.py 83.44% <0.00%> (-6.45%) ⬇️
python/cudf/cudf/utils/gpu_utils.py 53.65% <0.00%> (-4.88%) ⬇️
... and 55 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 348ad4d...6437e44. Read the comment docs.

@robertmaynard
Copy link
Contributor Author

@gpucibot merge

cpp/include/cudf/column/column_device_view.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/iterator.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/detail/iterator.cuh Outdated Show resolved Hide resolved
@robertmaynard
Copy link
Contributor Author

rerun tests

@harrism
Copy link
Member

harrism commented Apr 20, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 5c2f744 into rapidsai:branch-0.20 Apr 20, 2021
@robertmaynard robertmaynard deleted the fea/introduce_optional_iterator branch April 20, 2021 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants