Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract Syntax Tree Cleanup and Tests #7418

Merged
merged 22 commits into from
May 11, 2021

Conversation

codereport
Copy link
Contributor

Resolves #6320

@codereport codereport added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. tech debt improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 22, 2021
@codereport codereport self-assigned this Feb 22, 2021
@codereport codereport added 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond and removed 2 - In Progress Currently a work in progress labels Feb 22, 2021
@codereport codereport marked this pull request as ready for review February 22, 2021 14:42
@codereport codereport requested a review from a team as a code owner February 22, 2021 14:42
@codereport codereport added 2 - In Progress Currently a work in progress and removed 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond labels Feb 22, 2021
@codereport codereport marked this pull request as draft February 22, 2021 14:53
Copy link
Contributor

@jrhemstad jrhemstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at all this, I think it could be further improved by making the ast_plan own both the host_data_buffer and device_data_buffer as well as provide member functions for accessing the aliased pointers into the device data buffer in order to move that messy logic inside the class.

cpp/include/cudf/ast/detail/transform.cuh Outdated Show resolved Hide resolved
@codereport
Copy link
Contributor Author

Looking at all this, I think it could be further improved by making the ast_plan own both the host_data_buffer and device_data_buffer as well as provide member functions for accessing the aliased pointers into the device data buffer in order to move that messy logic inside the class.

Yea, I realized there was more logic/work that could be moved to the ctor (as pointed out by the linked comment in the issue that I clicked on but GitHub hung for a while retrieving it - so I went and worked on something and forgot to go back and check). Will address.

@codecov
Copy link

codecov bot commented Feb 22, 2021

Codecov Report

Merging #7418 (98b219a) into branch-0.20 (51336df) will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.20    #7418      +/-   ##
===============================================
+ Coverage        82.88%   82.92%   +0.03%     
===============================================
  Files              103      104       +1     
  Lines            17668    17901     +233     
===============================================
+ Hits             14645    14845     +200     
- Misses            3023     3056      +33     
Impacted Files Coverage Δ
python/cudf/cudf/core/abc.py 91.66% <ø> (+0.17%) ⬆️
python/cudf/cudf/core/algorithms.py 82.35% <ø> (ø)
python/cudf/cudf/core/column/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/core/column/categorical.py 92.37% <ø> (+0.13%) ⬆️
python/cudf/cudf/core/column/column.py 88.20% <ø> (-0.44%) ⬇️
python/cudf/cudf/core/column/datetime.py 88.03% <ø> (-1.88%) ⬇️
python/cudf/cudf/core/column/decimal.py 91.04% <ø> (-1.89%) ⬇️
python/cudf/cudf/core/column/interval.py 91.66% <ø> (+0.55%) ⬆️
python/cudf/cudf/core/column/lists.py 86.98% <ø> (-0.43%) ⬇️
python/cudf/cudf/core/column/numerical.py 94.72% <ø> (+0.29%) ⬆️
... and 50 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7623f39...98b219a. Read the comment docs.

@github-actions
Copy link

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions github-actions bot added the conda label Apr 21, 2021
@vyasr vyasr marked this pull request as ready for review April 21, 2021 19:30
@vyasr vyasr requested a review from a team as a code owner April 21, 2021 19:30
@vyasr vyasr self-assigned this Apr 23, 2021
@codereport codereport added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Apr 26, 2021
@vyasr vyasr mentioned this pull request May 3, 2021
10 tasks
Copy link
Contributor

@nvdbaranec nvdbaranec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small stuff.

cpp/include/cudf/ast/detail/transform.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/ast/detail/transform.cuh Outdated Show resolved Hide resolved
cpp/src/ast/transform.cu Outdated Show resolved Hide resolved
@vyasr vyasr requested a review from nvdbaranec May 6, 2021 00:01
@vyasr vyasr mentioned this pull request May 11, 2021
@vyasr
Copy link
Contributor

vyasr commented May 11, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 9a063b6 into rapidsai:branch-0.20 May 11, 2021
rapids-bot bot pushed a commit that referenced this pull request Jul 14, 2021
This PR implements conditional joins using expressions that are decomposed into abstract syntax trees for evaluation. This PR builds on the AST evaluation framework established in #5494 and #7418, but significantly refactors the internals and generalizes them to enable 1) expressions on two tables and 2) operations on nullable columns. This PR uses the nested loop join code created in #5397 for inner joins, but also substantially generalizes that code to enable 1) all types of joins, 2) joins with arbitrary AST expressions rather than just equality, and 3) handling of null values (with user-specified `null_equality`). A significant chunk of the code is currently out of place, but since this changeset is rather large I've opted not to move things in ways that will make reviewing this PR significantly more challenging. I will make a follow-up to address those issues once this PR is merged.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Jake Hemstad (https://github.com/jrhemstad)
  - Conor Hoekstra (https://github.com/codereport)

URL: #8214
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Remaining testing and cleanup tasks for Abstract Syntax Tree
7 participants