Compile-time ipow computation with array lookup #15110

Merged
merged 7 commits into rapidsai:branch-24.04 on Feb 28, 2024

Conversation

pmattione-nvidia
Contributor

@pmattione-nvidia pmattione-nvidia commented Feb 21, 2024

Description

Computes ipow() at compile time and looks the result up in an array. This gives up to an 8% speedup for decimal64 -> double conversions. The improvement for other conversions is negligible, but none of them get slower. A new benchmark test will follow in a separate PR. This PR also fixes the fixed_point -> string conversion test and corrects the rounding comments. Closes #9346
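To illustrate the general idea of a compile-time power table with array lookup, here is a minimal host-side C++17 sketch. It is not the code from this PR: the names `make_pow10_table`, `pow10_table`, and `pow10_lookup` are hypothetical, it only handles base 10 with `std::int64_t`, and it ignores any device-side (CUDA) considerations.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds {10^0, 10^1, ..., 10^(N-1)} at compile time.
template <std::size_t N>
constexpr std::array<std::int64_t, N> make_pow10_table()
{
  std::array<std::int64_t, N> table{};
  std::int64_t value = 1;
  for (std::size_t i = 0; i < N; ++i) {
    table[i] = value;
    if (i + 1 < N) { value *= 10; }  // skip the final multiply to avoid overflow
  }
  return table;
}

// 10^18 is the largest power of 10 that fits in int64_t, hence 19 entries.
inline constexpr auto pow10_table = make_pow10_table<19>();

// Hypothetical lookup: the caller must guarantee 0 <= exponent <= 18.
constexpr std::int64_t pow10_lookup(int exponent)
{
  return pow10_table[static_cast<std::size_t>(exponent)];
}

// The table is fully evaluated by the compiler, so these checks cost nothing at run time.
static_assert(pow10_lookup(0) == 1);
static_assert(pow10_lookup(18) == 1'000'000'000'000'000'000);
```

With a table like this, a power with a run-time exponent becomes a single indexed load instead of a loop of multiplies. The trade-off is that the array has to live somewhere (registers, constant memory, or local memory on a GPU).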

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@pmattione-nvidia pmattione-nvidia requested a review from a team as a code owner February 21, 2024 19:06

copy-pr-bot bot commented Feb 21, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Feb 21, 2024
@davidwendt davidwendt added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 21, 2024
@davidwendt
Contributor

/ok to test

@davidwendt
Contributor

/ok to test

@davidwendt
Contributor

/ok to test

Member

@PointKernel PointKernel left a comment

LGTM

I've updated the PR description a bit so the corresponding issue can be closed once this PR gets merged. Thanks!

cpp/include/cudf/fixed_point/fixed_point.hpp (outdated review thread, resolved)
@davidwendt
Contributor

/ok to test

@pmattione-nvidia
Contributor Author

Note that it's unclear (to me) what the optimal algorithm is for get_power(). It previously used the (logarithmic) squaring algorithm rather than this recursive one. However, since we have to compute every power to fill the array, the compiler may be smart enough to optimize away the redundant work, or the computation may benefit from caching effects. Either way this call is only performed at compile time (good candidate for consteval in C++20), and the compile time is dominated by the type dispatcher anyway. So we'll just use the recursive algorithm for now (it is the simplest, and perhaps the easiest for the compiler to optimize).
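For readers unfamiliar with the two algorithms being compared, the following is a rough C++17 sketch of each. The function names `power_recursive` and `power_squaring` are hypothetical and are not the `get_power()` from this PR. Both assume a non-negative exponent and a result that fits in `T`.

```cpp
#include <cstdint>

// Linear recursion: one multiply per step, recursion depth equals the exponent.
template <typename T>
constexpr T power_recursive(T base, int exponent)
{
  return exponent == 0 ? T{1} : base * power_recursive(base, exponent - 1);
}

// Exponentiation by squaring: O(log exponent) multiplies.
template <typename T>
constexpr T power_squaring(T base, int exponent)
{
  T result{1};
  while (exponent > 0) {
    if (exponent & 1) { result *= base; }  // fold in the current bit's factor
    exponent >>= 1;
    if (exponent > 0) { base *= base; }    // skip the last, unused (possibly overflowing) square
  }
  return result;
}

// Both variants agree, and both are evaluated entirely at compile time here.
static_assert(power_recursive<std::int64_t>(10, 18) == power_squaring<std::int64_t>(10, 18));
static_assert(power_squaring<std::int32_t>(2, 30) == (1 << 30));
```

When every power from 0 up to some maximum is needed to fill a table, the linear version computes each entry as one multiply on top of the previous power, which is one reason the simpler recursion can be a reasonable choice for compile-time use.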

Contributor

@shrshi shrshi left a comment

Looks good to me, thank you for adding the comment!

@PointKernel
Member

Either way this call is only performed at compile time (good candidate for consteval in C++20), and the compile time is dominated by the type dispatcher anyway.

Valid point. I won't worry much about a build-time recursive call.

@pmattione-nvidia
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 896b5bc into rapidsai:branch-24.04 Feb 28, 2024
69 checks passed
rapids-bot bot pushed a commit that referenced this pull request Mar 11, 2024
Adding an array of integers to this function created too much register pressure across our code base. The function is used by the fixed_point constructor and cast operators, so it potentially affects every kernel. Too many unrelated kernels suffered performance degradation to justify the change. This commit reverts the algorithm introduced in #15110 to its previous form, with some very minor tweaks.

Authors:
  - Paul Mattione (https://github.com/pmattione-nvidia)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Shruti Shivakumar (https://github.com/shrshi)
  - MithunR (https://github.com/mythrocks)

URL: #15242
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Development

Successfully merging this pull request may close these issues.

[FEA] numeric::detail::ipow optimization for Base 10 & 2
4 participants