[gpuCI] Forward-merge branch-21.12 to branch-22.02 [skip gpuci] #9671

Merged 2 commits into branch-22.02 on Nov 12, 2021

GPUtester
Collaborator

Forward-merge triggered by push to branch-21.12 that creates a PR to keep branch-22.02 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge.

This PR removes implementations of `__sizeof__` from cudf classes. Previously, `__sizeof__` was overridden to return the total GPU memory usage, which is inconsistent with the standard Python semantics of this method. The appropriate way to query total GPU memory usage is the `memory_usage` method, which is now standardized across various objects. The sizeof dispatch for dask is set to use `memory_usage` as well, so that dask-based code does not break.
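
The distinction described above can be illustrated with a minimal, self-contained sketch. It does not use cudf or dask: `DeviceColumn` is a hypothetical stand-in for a GPU-backed object, its device size is simulated, and `functools.singledispatch` stands in for dask's sizeof dispatch. It only shows the pattern of leaving `__sizeof__` alone and routing size queries through `memory_usage`.

```python
import sys
from functools import singledispatch


class DeviceColumn:
    """Hypothetical stand-in for a GPU-backed column; not a cudf class."""

    def __init__(self, n_rows, itemsize):
        self.n_rows = n_rows
        self.itemsize = itemsize

    def memory_usage(self):
        # Bytes the data would occupy on the device (simulated here).
        return self.n_rows * self.itemsize

    # No __sizeof__ override: sys.getsizeof() reports only the host-side
    # Python object, which matches the standard Python semantics.


@singledispatch
def sizeof(obj):
    # Default: fall back to the Python-level object size.
    return sys.getsizeof(obj)


@sizeof.register(DeviceColumn)
def _sizeof_device_column(obj):
    # Route size queries for device-backed objects to memory_usage().
    return obj.memory_usage()


col = DeviceColumn(n_rows=1_000_000, itemsize=8)
host_size = sys.getsizeof(col)    # size of the Python wrapper only
device_size = col.memory_usage()  # simulated device footprint in bytes
dispatched = sizeof(col)          # dispatch returns memory_usage()
```

With this arrangement, `sys.getsizeof(col)` stays small regardless of how much data the object refers to, while a scheduler that asks the dispatch function for a size still sees the full data footprint.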

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Charles Blackmon-Luca (https://github.com/charlesbluca)
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #9544

This PR is the minimal set of changes that appears to be required to maximize inlining of the code involved in AST-based expression evaluation. Specifically, applying the `__forceinline__` qualifier to the type dispatcher significantly improves performance when nullable data is passed through the AST evaluator. I observe roughly a 2x performance improvement with a negligible increase in compilation time. Note that the specific improvements appear to depend heavily on the architecture being tested. Interestingly, removing `__forceinline__` from the `evaluate` methods causes performance to regress to its original level even though those methods are defined inline, which is unexpected.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Jake Hemstad (https://github.com/jrhemstad)

URL: #9530

GPUtester requested review from a team as code owners on November 12, 2021 14:35
GPUtester merged commit 9ec8b30 into branch-22.02 on Nov 12, 2021
@GPUtester
Collaborator Author

SUCCESS - forward-merge complete.

github-actions bot added labels on Nov 12, 2021: Python (Affects Python cuDF API), libcudf (Affects libcudf (C++/CUDA) code)