Reducing runtime of JSON reader options benchmark #15681

Merged on May 8, 2024 (1 commit)

Contributor

@shrshi shrshi commented May 7, 2024

Description

This PR cleans up the JSON reader options benchmark by reducing the number of benchmarked configurations from 162 to 20.
The reasoning behind splitting the benchmark:

  1. normalize_single_quotes and normalize_whitespace are pre-processing options that do not affect each other; the runtimes of their finite-state transducers (FSTs) are additive, so each option can be benchmarked on its own.
  2. The performance of raw input ingestion (row_selection::ALL and row_selection::BYTE_RANGE) is independent of the token generation and tree algorithms, so it does not need to be crossed with the options that affect them.
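
To illustrate why splitting helps, here is a minimal sketch of the counting argument. It is not taken from the benchmark source: the helper names `cross_product_configs` and `split_configs` are hypothetical, and the axis sizes in the usage note are illustrative, not the benchmark's actual axes. Crossing every axis with every other multiplies the axis sizes, while benchmarking independent groups separately only adds the per-group products.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: number of configurations when every axis is
// crossed with every other axis (product of the axis sizes).
inline std::size_t cross_product_configs(const std::vector<std::size_t>& axis_sizes)
{
  return std::accumulate(axis_sizes.begin(),
                         axis_sizes.end(),
                         std::size_t{1},
                         std::multiplies<std::size_t>{});
}

// Hypothetical helper: number of configurations when independent groups
// of axes are benchmarked separately (sum of the per-group products).
inline std::size_t split_configs(const std::vector<std::vector<std::size_t>>& groups)
{
  std::size_t total = 0;
  for (const auto& group : groups) {
    total += cross_product_configs(group);
  }
  return total;
}
```

For example, with illustrative axis sizes {2, 2, 2, 3}, a full cross product yields 2 × 2 × 2 × 3 = 24 configurations, whereas splitting into the independent groups {2}, {2}, and {2, 3} yields 2 + 2 + 6 = 10.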

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the libcudf label May 7, 2024
@shrshi shrshi added the non-breaking, Performance, libcudf, and improvement labels and removed the libcudf label May 7, 2024
@shrshi shrshi marked this pull request as ready for review May 7, 2024 14:15
@shrshi shrshi requested a review from a team as a code owner May 7, 2024 14:15
Contributor

@karthikeyann karthikeyann left a comment

LGTM 👍

Member

@mhaseeb123 mhaseeb123 left a comment

This looks good to me. Though I am not too well-versed in NVBENCH.

Minor comment:
Do we need to update any documentation against this change and mark the PR checkbox?

Contributor Author

shrshi commented May 8, 2024

> This looks good to me. Though I am not too well-versed in NVBENCH.
>
> Minor comment: Do we need to update any documentation against this change and mark the PR checkbox?

Thanks!
No documentation needs to be updated - I've marked the checkbox now :)

Contributor Author

shrshi commented May 8, 2024

/merge

@rapids-bot rapids-bot bot merged commit f965f3c into rapidsai:branch-24.06 May 8, 2024
70 checks passed
Labels
improvement: Improvement / enhancement to an existing function
libcudf: Affects libcudf (C++/CUDA) code.
non-breaking: Non-breaking change
Performance: Performance related issue
4 participants