[FEA] Update IO benchmarks for consistency between formats #12739

Open
11 of 12 tasks
GregoryKimball opened this issue Feb 8, 2023 · 0 comments
Labels
2 - In Progress Currently a work in progress cuIO cuIO issue feature request New feature or request good first issue Good for newcomers libcudf Affects libcudf (C++/CUDA) code.

GregoryKimball commented Feb 8, 2023

Is your feature request related to a problem? Please describe.

Additional context
The initial set of topics came from a comparison of file read throughput across the supported formats in cuIO.
[Image: file read throughput across the supported formats in cuIO]
We are also preparing for a comparison of memory footprint across cuIO, especially with Zstd compression/decompression.

@GregoryKimball GregoryKimball added feature request New feature or request Needs Triage Need team to review and classify labels Feb 8, 2023
@GregoryKimball GregoryKimball added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. cuIO cuIO issue and removed Needs Triage Need team to review and classify labels Feb 8, 2023
rapids-bot bot pushed a commit that referenced this issue Mar 1, 2023
- Add a JSON writer benchmark, modeled after the CSV writer benchmark.
- Add a JSON reader benchmark with a file data source ([NESTED_JSON](https://github.com/rapidsai/cudf/blob/branch-23.04/cpp/benchmarks/io/json/nested_json.cpp?rgh-link-date=2023-02-08T22%3A43%3A38Z) only does parsing, and only on device buffers). This benchmark is modeled after BM_csv_read_io.

fixes part of #12739

Authors:
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - David Wendt (https://github.com/davidwendt)

URL: #12753
@vuule vuule added the good first issue Good for newcomers label Apr 5, 2023
@GregoryKimball GregoryKimball added this to the Benchmarking milestone Jul 23, 2023
@GregoryKimball GregoryKimball removed this from libcudf Oct 26, 2023
rapids-bot bot pushed a commit that referenced this issue Dec 11, 2023
Addresses issue: [#12739](#12739)


This PR converts the compression and IO types into string axes so that individual values can be selected from the CLI, which removes the need to run every value when an automated run needs only some of them. It also introduces two new functions, `retrieve_io_type_enum` and `retrieve_compression_type_enum`, which convert the string input into the corresponding enum type for use in the benchmarking functions.

IO Benchmarks:
- [x] PARQUET READER 


For example:
`./PARQUET_READER_NVBENCH -b parquet_read_io_compression --axis io_type=[HOST_BUFFER] --axis compression_type=[NONE]`
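
As a rough illustration of the string-to-enum conversion described above, here is a minimal sketch. Only the two function names come from this PR; the function bodies, the lookup-table approach, the error handling, and the stand-in enums (including which enumerators they list) are assumptions made for the example and are not the actual libcudf code.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Stand-in enums for illustration only (assumption); the real benchmark
// code would use the enum types defined by libcudf instead.
enum class io_type { FILEPATH, HOST_BUFFER, DEVICE_BUFFER };
enum class compression_type { NONE, SNAPPY, ZSTD };

// Hypothetical body: look up the string axis value and return the enum.
io_type retrieve_io_type_enum(std::string const& name)
{
  static std::unordered_map<std::string, io_type> const table = {
    {"FILEPATH", io_type::FILEPATH},
    {"HOST_BUFFER", io_type::HOST_BUFFER},
    {"DEVICE_BUFFER", io_type::DEVICE_BUFFER},
  };
  auto const it = table.find(name);
  if (it == table.end()) { throw std::invalid_argument("Unsupported io_type: " + name); }
  return it->second;
}

// Hypothetical body: same lookup pattern for the compression axis.
compression_type retrieve_compression_type_enum(std::string const& name)
{
  static std::unordered_map<std::string, compression_type> const table = {
    {"NONE", compression_type::NONE},
    {"SNAPPY", compression_type::SNAPPY},
    {"ZSTD", compression_type::ZSTD},
  };
  auto const it = table.find(name);
  if (it == table.end()) { throw std::invalid_argument("Unsupported compression_type: " + name); }
  return it->second;
}
```

In this sketch, an unknown axis value such as a misspelled `--axis io_type=[...]` entry throws instead of silently falling back to a default, so a typo on the command line fails early.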

Authors:
  - Suraj Aralihalli (https://github.com/SurajAralihalli)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Nghia Truong (https://github.com/ttnghia)

URL: #14347
karthikeyann pushed a commit to karthikeyann/cudf that referenced this issue Dec 12, 2023