Implement ORC chunked reader #15094

Merged
merged 356 commits into from
May 2, 2024
Changes from 250 commits
Commits
a38b115
Fix stripe lookup bug
ttnghia Feb 24, 2024
75cec9b
Fix a bug
ttnghia Feb 24, 2024
a7bd47a
Fix another bug
ttnghia Feb 24, 2024
db768fb
Debugging
ttnghia Feb 25, 2024
f8652d7
All tests pass
ttnghia Feb 25, 2024
537ea0c
Reverse tests
ttnghia Feb 25, 2024
24e1552
Fix for temp concatenation
ttnghia Feb 25, 2024
df8d9b3
Turn off debug printing
ttnghia Feb 25, 2024
4e1614e
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Feb 26, 2024
12dff3b
Some fixes
ttnghia Feb 26, 2024
5401826
Fix host memory issue
ttnghia Feb 26, 2024
e53cf56
Some cleanup
ttnghia Feb 26, 2024
f2ec94c
Compute table row size
ttnghia Feb 27, 2024
fd325b6
Compute column row size
ttnghia Feb 27, 2024
416d810
Test column size
ttnghia Feb 27, 2024
b745787
Test column sizes using `segmented_bit_count`
ttnghia Feb 28, 2024
d0ed05a
Compute table sizes using `segmented_bit_count`
ttnghia Feb 28, 2024
ae06017
Temporary store multiple decoded tables
ttnghia Feb 29, 2024
6cccca3
Add test file
ttnghia Feb 29, 2024
2488cb2
Add comment
ttnghia Feb 29, 2024
e3db4dc
Add `output_row_granularity` parameter
ttnghia Feb 29, 2024
94d66ad
Fix segment length
ttnghia Feb 29, 2024
e270aa3
Use chunking for chunked reader
ttnghia Feb 29, 2024
b307b80
Fix bug in chunking
ttnghia Feb 29, 2024
fcdc9c1
Add debug info
ttnghia Feb 29, 2024
915a3fc
Fix a bug in setting row granularity
ttnghia Feb 29, 2024
de4a365
Fix test
ttnghia Feb 29, 2024
119002e
Improve tests
ttnghia Feb 29, 2024
818cfb7
Implement adaptive size limit for decoding
ttnghia Feb 29, 2024
bce6e8d
Update `row_bit_count.cu`
ttnghia Mar 1, 2024
f6fc6f0
Fix caller to `segmented_row_bit_count`
ttnghia Mar 1, 2024
bd198dc
Remove adaptive size for decoding
ttnghia Mar 1, 2024
d23591d
Update test
ttnghia Mar 1, 2024
a581e96
Fix segment size processing
ttnghia Mar 1, 2024
6072ffa
Add more test
ttnghia Mar 1, 2024
afb4ffa
Add test with strings
ttnghia Mar 1, 2024
4b1665e
Add more tests
ttnghia Mar 1, 2024
f8ae741
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 1, 2024
e08984f
Add more test
ttnghia Mar 1, 2024
d555b54
Implement test limit function
ttnghia Mar 1, 2024
cfb8345
Implement `load_limit_ratio`
ttnghia Mar 1, 2024
e072124
Add new test
ttnghia Mar 1, 2024
37aaeeb
Add strong type for limits
ttnghia Mar 1, 2024
4531ab3
Fix test check
ttnghia Mar 1, 2024
9a80faf
Cleanup
ttnghia Mar 2, 2024
6279ad6
Fix bug in stream data access
ttnghia Mar 2, 2024
dc6bc2e
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 2, 2024
3a89549
Add temp docs
ttnghia Mar 2, 2024
d1cc44c
Add new tests
ttnghia Mar 2, 2024
ac97dc2
Add test
ttnghia Mar 2, 2024
9b2bbaa
Allow to control number of rows per stripe
ttnghia Mar 2, 2024
a959db2
Write a bit larger stripes to test
ttnghia Mar 2, 2024
81b78ea
Add the final test
ttnghia Mar 2, 2024
5537033
Change debug info
ttnghia Mar 2, 2024
41b9f52
Implement peak memory usage
ttnghia Mar 3, 2024
6597699
Optimize memory usage
ttnghia Mar 3, 2024
277758e
Add debug info
ttnghia Mar 3, 2024
83ba727
Fix a bug in memory write, and add debug info for memory usage
ttnghia Mar 3, 2024
5dcd612
Debugging memory leak
ttnghia Mar 3, 2024
04acd0f
Fix memory leak
ttnghia Mar 3, 2024
97f80c8
Change comments
ttnghia Mar 4, 2024
e425e41
Change memory stats
ttnghia Mar 4, 2024
8d77309
Change read limit ratio
ttnghia Mar 4, 2024
c4f98ee
Test read with very large file
ttnghia Mar 4, 2024
ae665a0
Support `skip_rows` and `num_rows`
ttnghia Mar 4, 2024
883ccc0
Fix test with very large file
ttnghia Mar 4, 2024
625d0f4
Some refactors
ttnghia Mar 4, 2024
974bb7f
Update debug info
ttnghia Mar 4, 2024
bdb586e
Add a temporary test
ttnghia Mar 4, 2024
1bee174
Revert "Add a temporary test"
ttnghia Mar 4, 2024
2b3b92e
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 4, 2024
17096d3
Fix format
ttnghia Mar 4, 2024
18a4e9f
Temporarily fix use-after-free bug
ttnghia Mar 5, 2024
4886db4
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 5, 2024
9697813
Revert "Temporarily fix use-after-free bug"
ttnghia Mar 5, 2024
0016935
This is indeed the fix for use-after-free bug
ttnghia Mar 5, 2024
759246d
Final workaround for use-after-free bug
ttnghia Mar 5, 2024
d5912b9
Split file
ttnghia Mar 6, 2024
87fd67b
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 6, 2024
c44f0ec
Change comment and docs
ttnghia Mar 6, 2024
b842118
Add error check for `output_row_granularity`
ttnghia Mar 6, 2024
248f0ef
Update docs
ttnghia Mar 6, 2024
497eea5
Update docs
ttnghia Mar 6, 2024
33aff94
Cleanup and change docs
ttnghia Mar 7, 2024
d071f46
Support 64bit size for `rows_to_read`
ttnghia Mar 7, 2024
388adb3
Implement `cumulative_size_and_row`
ttnghia Mar 7, 2024
7e451ab
Split if num rows exceeds size limit
ttnghia Mar 7, 2024
758e2d0
Add test
ttnghia Mar 7, 2024
5de8179
Changing skip and num rows
ttnghia Mar 7, 2024
31f6b6d
Fix test
ttnghia Mar 7, 2024
07a095a
Fix skip rows and num rows
ttnghia Mar 7, 2024
6a6061a
Add test
ttnghia Mar 7, 2024
d8c7c44
Fix a bug
ttnghia Mar 7, 2024
c33ebce
Fix return order bug
ttnghia Mar 7, 2024
e7958cc
Change local test
ttnghia Mar 7, 2024
2cc1fe7
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 7, 2024
5295187
Add changes in `hostdevice_vector.hpp` ahead of time
ttnghia Mar 8, 2024
fe2f55e
Fix style
ttnghia Mar 8, 2024
0ced9f4
Fix doxygen
ttnghia Mar 8, 2024
223f078
Rename struct
ttnghia Mar 8, 2024
112131f
Change error message
ttnghia Mar 8, 2024
be544f5
Reverse changes in parquet code
ttnghia Mar 8, 2024
ead3124
Fix option access
ttnghia Mar 8, 2024
74d14d1
Remove outdated test
ttnghia Mar 8, 2024
fb92f17
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 8, 2024
07103ad
Wrap the debug print lines in `#ifdef/#endif`
ttnghia Mar 8, 2024
971296f
Update benchmark
ttnghia Mar 8, 2024
10b7ca7
Revert changes in `orc_read_input.cpp`
ttnghia Mar 8, 2024
0b8a2b5
Revert changes in `parquet/reader_impl_helpers.cpp`
ttnghia Mar 8, 2024
5899751
Implement chunked read benchmark
ttnghia Mar 8, 2024
9df437b
Remove redundant parameters, and rewrite docs
ttnghia Mar 9, 2024
dff0235
Cleanup
ttnghia Mar 9, 2024
28e631f
Rename variables
ttnghia Mar 9, 2024
d8163db
Rename variable
ttnghia Mar 9, 2024
3f57b5f
Change variable name
ttnghia Mar 9, 2024
7a04022
Change data type
ttnghia Mar 9, 2024
bcdfab8
Change from chunk to range
ttnghia Mar 9, 2024
98d82fc
Cleanup
ttnghia Mar 9, 2024
1ec9dc0
Cleanup and rename variable
ttnghia Mar 10, 2024
64c155a
Further cleanup and rename variable
ttnghia Mar 10, 2024
5b361fb
Cleanup
ttnghia Mar 10, 2024
1206ba1
Cleanup and rename variables
ttnghia Mar 10, 2024
71386a2
Cleanup heavily
ttnghia Mar 10, 2024
17c3393
Continue cleaning up
ttnghia Mar 10, 2024
86e429f
Cleanup and add docs
ttnghia Mar 11, 2024
c500719
Rename variables
ttnghia Mar 11, 2024
a03cb3d
Change return type of `get_range`
ttnghia Mar 11, 2024
cebb051
More cleanup
ttnghia Mar 11, 2024
a0492fd
Fix num stripes
ttnghia Mar 11, 2024
d2e892d
Update docs
ttnghia Mar 11, 2024
10b8558
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 11, 2024
1531f98
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 11, 2024
5e4b16f
Cleanup and add docs
ttnghia Mar 11, 2024
3ec50ef
Cleanup, docs, and rename variables
ttnghia Mar 11, 2024
73c1a19
Update `hostdevice_vector.hpp`
ttnghia Mar 11, 2024
a897155
Optimize `tz_table` parameter usage
ttnghia Mar 11, 2024
91f9cce
Make `null_count_prefix_sums` local to decoding step
ttnghia Mar 11, 2024
dd7e850
Make `lvl_chunks` local to decoding step and some cleanup
ttnghia Mar 11, 2024
89a2ac0
Reorder variables
ttnghia Mar 11, 2024
c585c44
Cleanup and rename variables
ttnghia Mar 11, 2024
f339b23
Reorder code
ttnghia Mar 11, 2024
9a2cee0
More cleanup and code reordering
ttnghia Mar 11, 2024
a0d1528
Update docs
ttnghia Mar 11, 2024
75a96d1
Change variable types
ttnghia Mar 11, 2024
96274ab
More cleanup
ttnghia Mar 11, 2024
246dd5b
Complete cleaning up
ttnghia Mar 11, 2024
027f899
Revert error message
ttnghia Mar 11, 2024
961d468
Revert error handling that may be wrong
ttnghia Mar 11, 2024
b445ca6
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 11, 2024
30b5899
Fix spell
ttnghia Mar 11, 2024
40b28fa
Update python code
ttnghia Mar 12, 2024
51c2abf
Update copyright year
ttnghia Mar 12, 2024
de5cf15
Fix style
ttnghia Mar 12, 2024
10945a6
Change benchmark
ttnghia Mar 12, 2024
9c9a3c9
Change benchmark
ttnghia Mar 12, 2024
bc34e40
Fix python code
ttnghia Mar 12, 2024
de5ce78
Fix spell
ttnghia Mar 12, 2024
79dd429
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 12, 2024
56750bd
Disable mem stat
ttnghia Mar 13, 2024
44d8e4a
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 13, 2024
89009dd
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 13, 2024
8e8579a
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 14, 2024
c97150e
Change memory limits for data loading and decoding
ttnghia Mar 15, 2024
710734c
Fix tests due to changing internal parameters
ttnghia Mar 15, 2024
7b89094
Cleanup
ttnghia Mar 15, 2024
cb41a65
Cleanup
ttnghia Mar 15, 2024
d68562c
Fix a bug in stripe rows computation
ttnghia Mar 15, 2024
86ec436
Cleanup `reader_impl_chunking.cu`
ttnghia Mar 15, 2024
0f78b0d
Cleanup `reader_impl_decode.cu`
ttnghia Mar 15, 2024
f197946
Cleanup `reader_impl_chunking.hpp`
ttnghia Mar 15, 2024
74d806b
Change row selection test
ttnghia Mar 15, 2024
2a67770
Cleanup test
ttnghia Mar 15, 2024
f76f61e
Construct timezone table in global step
ttnghia Mar 15, 2024
de72389
Use `rmm::exec_policy_nosync`
ttnghia Mar 15, 2024
4b64637
Merge branch 'branch-24.04' into chunked_orc_reader
ttnghia Mar 20, 2024
28f7cfc
Optimize benchmark code
ttnghia Mar 20, 2024
f527c99
Do not sync
ttnghia Mar 20, 2024
96f89a1
Simplify code
ttnghia Mar 20, 2024
6960f92
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Mar 28, 2024
20d8e81
Add assertion to `num_rows` in parquet reader
ttnghia Mar 28, 2024
99afb2e
Fix comment
ttnghia Mar 28, 2024
38d8748
Add assertion to `skip_rows`
ttnghia Mar 28, 2024
6e658dc
Update docs
ttnghia Mar 28, 2024
734dcf3
Separate `impl` class from `reader` resulting into `reader_impl`, and…
ttnghia Mar 28, 2024
75c5b7c
Fix spell
ttnghia Mar 28, 2024
861cdcc
Only update `total_stripe_sizes` if in chunked read mode
ttnghia Mar 28, 2024
ee50701
Implement optimized code path splitting stripe range in special cases
ttnghia Mar 28, 2024
b976d99
Remove test
ttnghia Mar 29, 2024
ec7303f
Add `read_mode` param to `decompress_and_decode`, and change comments
ttnghia Mar 29, 2024
acd9689
Compute compinfo on the fly
ttnghia Mar 29, 2024
b9f07a2
Separate `compinfo_map` into levels
ttnghia Mar 29, 2024
ab1afdc
Revert "Separate `compinfo_map` into levels"
ttnghia Mar 29, 2024
bf5b111
Simplify code
ttnghia Mar 29, 2024
4d01ad7
Remove local `stream_compinfo_map`
ttnghia Mar 29, 2024
a06cf49
Optimize hashing by combining `orc_col_idx` and `kind`
ttnghia Mar 29, 2024
54ed4fd
Optimize by using one array of compinfo for all levels
ttnghia Mar 29, 2024
0447271
Use only one array of compinfo for all levels
ttnghia Mar 29, 2024
33c92a9
Use only one `device_buffer` for storing all stripe data
ttnghia Mar 29, 2024
0a16bb4
Revert "Use only one `device_buffer` for storing all stripe data"
ttnghia Mar 31, 2024
3f8a220
Revert changes to `reader` class
ttnghia Apr 2, 2024
cbb3858
Use byte count instead of bit count
ttnghia Apr 2, 2024
1c62ba7
Change bench limits
ttnghia Apr 2, 2024
15639eb
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Apr 2, 2024
1c2bdc1
Update cpp/benchmarks/io/orc/orc_reader_input.cpp
ttnghia Apr 2, 2024
7d12de2
Update cpp/benchmarks/io/orc/orc_reader_input.cpp
ttnghia Apr 2, 2024
c426e4c
Fix/add comment and cleanup
ttnghia Apr 3, 2024
faea7bc
Use pointers instead of optionals
ttnghia Apr 3, 2024
7bfcdf5
Require `output_row_granularity` to be positive all the time
ttnghia Apr 3, 2024
a915f33
Reorganize code, removing constructors
ttnghia Apr 3, 2024
69d70c5
Rename functor
ttnghia Apr 3, 2024
4e94d53
Using `host_span` instead of `const&`
ttnghia Apr 3, 2024
70adb9c
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Apr 23, 2024
8c05654
Use `device_async_resource_ref`
ttnghia Apr 23, 2024
3b4d7f2
Rename `global_preprocess` into `preprocess_file`
ttnghia Apr 23, 2024
33f6d15
Optimize memory usage
ttnghia Apr 23, 2024
5a253bd
Fix overflow handling
ttnghia Apr 23, 2024
7a9c436
Remove `reader.cu`
ttnghia Apr 23, 2024
2193722
Add a test
ttnghia Apr 23, 2024
4d3ddd1
Rewrite benchmark
ttnghia Apr 23, 2024
80488f9
Rename parameters
ttnghia Apr 23, 2024
b5343dc
Rename parameter
ttnghia Apr 23, 2024
bdc92a0
Rename functions
ttnghia Apr 23, 2024
252d546
Fix format
ttnghia Apr 23, 2024
8ebbb2c
Change comments
ttnghia Apr 24, 2024
a793eb7
Change comments and rename variable
ttnghia Apr 24, 2024
1a7c3a9
Change comments
ttnghia Apr 24, 2024
6c3bb4f
Inline a small function
ttnghia Apr 24, 2024
df9a6fe
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Apr 24, 2024
673b034
Fix format
ttnghia Apr 24, 2024
767e35f
Allocate `null_count_prefix_sums` as just one buffer
ttnghia Apr 26, 2024
ca15afc
Change initialization style
ttnghia Apr 26, 2024
f34b7b6
Change comment
ttnghia Apr 26, 2024
1d19ede
Reserve vector
ttnghia Apr 26, 2024
4087199
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Apr 26, 2024
3ed1e2e
Change variable order
ttnghia Apr 26, 2024
6335586
Move data to output
ttnghia Apr 29, 2024
437c9c0
Rename function
ttnghia Apr 29, 2024
cc174bb
Change `has_no_data` into `has_data`
ttnghia Apr 29, 2024
1e69335
Rename variable
ttnghia Apr 29, 2024
ad69236
Implement `size` for range
ttnghia Apr 29, 2024
890abb4
Change docs
ttnghia Apr 29, 2024
cb21a6d
Rename `_config` into `_options`
ttnghia Apr 29, 2024
4e64eb7
Change comments
ttnghia Apr 29, 2024
d42ed14
Change `cumulative_size_and_row` to subclass `cumulative_size`
ttnghia Apr 29, 2024
bd5290c
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia Apr 29, 2024
a0ca333
Address some review comments
ttnghia May 2, 2024
536078b
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia May 2, 2024
cf18e4c
Remove handling for `READ_ALL` when number of rows exceed 2B rows
ttnghia May 2, 2024
7b8a96f
Merge branch 'branch-24.06' into chunked_orc_reader
ttnghia May 2, 2024
42601b2
Rename parameter
ttnghia May 2, 2024
3 changes: 2 additions & 1 deletion cpp/CMakeLists.txt
@@ -395,8 +395,9 @@ add_library(
src/io/orc/dict_enc.cu
src/io/orc/orc.cpp
src/io/orc/reader_impl.cu
src/io/orc/reader_impl_chunking.cu
src/io/orc/reader_impl_decode.cu
src/io/orc/reader_impl_helpers.cpp
src/io/orc/reader_impl_preprocess.cu
src/io/orc/stats_enc.cu
src/io/orc/stripe_data.cu
src/io/orc/stripe_enc.cu
106 changes: 83 additions & 23 deletions cpp/benchmarks/io/orc/orc_reader_input.cpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -24,31 +24,59 @@

#include <nvbench/nvbench.cuh>

namespace {

// Size of the data in the benchmark dataframe; chosen to be low enough to allow benchmarks to
// run on most GPUs, but large enough to allow highest throughput
constexpr int64_t data_size = 512 << 20;
constexpr cudf::size_type num_cols = 64;
constexpr std::size_t data_size = 512 << 20;
constexpr std::size_t Mbytes = 1024 * 1024;

template <bool is_chunked_read>
void orc_read_common(cudf::size_type num_rows_to_read,
cuio_source_sink_pair& source_sink,
nvbench::state& state)
{
cudf::io::orc_reader_options read_opts =
cudf::io::orc_reader_options::builder(source_sink.make_source_info());
auto const read_opts =
cudf::io::orc_reader_options::builder(source_sink.make_source_info()).build();

auto mem_stats_logger = cudf::memory_stats_logger(); // init stats logger
state.set_cuda_stream(nvbench::make_cuda_stream_view(cudf::get_default_stream().value()));
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch& launch, auto& timer) {
try_drop_l3_cache();

timer.start();
auto const result = cudf::io::read_orc(read_opts);
timer.stop();

CUDF_EXPECTS(result.tbl->num_columns() == num_cols, "Unexpected number of columns");
CUDF_EXPECTS(result.tbl->num_rows() == num_rows_to_read, "Unexpected number of rows");
});
if constexpr (is_chunked_read) {
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch&, auto& timer) {
try_drop_l3_cache();
auto const output_limit_MB =
static_cast<std::size_t>(state.get_int64("chunk_read_limit_MB"));
auto const read_limit_MB = static_cast<std::size_t>(state.get_int64("pass_read_limit_MB"));

auto reader =
cudf::io::chunked_orc_reader(output_limit_MB * Mbytes, read_limit_MB * Mbytes, read_opts);
cudf::size_type num_rows{0};

timer.start();
do {
auto chunk = reader.read_chunk();
num_rows += chunk.tbl->num_rows();
} while (reader.has_next());
timer.stop();

CUDF_EXPECTS(num_rows == num_rows_to_read, "Unexpected number of rows");
});
} else { // not is_chunked_read
state.exec(
nvbench::exec_tag::sync | nvbench::exec_tag::timer, [&](nvbench::launch&, auto& timer) {
try_drop_l3_cache();

timer.start();
auto const result = cudf::io::read_orc(read_opts);
timer.stop();

CUDF_EXPECTS(result.tbl->num_columns() == num_cols, "Unexpected number of columns");
CUDF_EXPECTS(result.tbl->num_rows() == num_rows_to_read, "Unexpected number of rows");
});
}

auto const time = state.get_summary("nv/cold/time/gpu/mean").get_float64("value");
state.add_element_count(static_cast<double>(data_size) / time, "bytes_per_second");
@@ -57,6 +85,8 @@ void orc_read_common(cudf::size_type num_rows_to_read,
state.add_buffer_size(source_sink.size(), "encoded_file_size", "encoded_file_size");
}

} // namespace

template <data_type DataType, cudf::io::io_type IOType>
void BM_orc_read_data(nvbench::state& state,
nvbench::type_list<nvbench::enum_type<DataType>, nvbench::enum_type<IOType>>)
@@ -79,13 +109,11 @@ void BM_orc_read_data(nvbench::state& state,
return view.num_rows();
}();

orc_read_common(num_rows_written, source_sink, state);
orc_read_common<false>(num_rows_written, source_sink, state);
}

template <cudf::io::io_type IOType, cudf::io::compression_type Compression>
void BM_orc_read_io_compression(
nvbench::state& state,
nvbench::type_list<nvbench::enum_type<IOType>, nvbench::enum_type<Compression>>)
template <cudf::io::io_type IOType, cudf::io::compression_type Compression, bool chunked_read>
void orc_read_io_compression(nvbench::state& state)
{
auto const d_type = get_type_or_group({static_cast<int32_t>(data_type::INTEGRAL_SIGNED),
static_cast<int32_t>(data_type::FLOAT),
@@ -95,15 +123,21 @@ void BM_orc_read_io_compression(
static_cast<int32_t>(data_type::LIST),
static_cast<int32_t>(data_type::STRUCT)});

cudf::size_type const cardinality = state.get_int64("cardinality");
cudf::size_type const run_length = state.get_int64("run_length");
auto const [cardinality, run_length] = [&]() -> std::pair<cudf::size_type, cudf::size_type> {
if constexpr (chunked_read) {
return {0, 4};
} else {
return {static_cast<cudf::size_type>(state.get_int64("cardinality")),
static_cast<cudf::size_type>(state.get_int64("run_length"))};
}
}();
cuio_source_sink_pair source_sink(IOType);

auto const num_rows_written = [&]() {
auto const tbl = create_random_table(
cycle_dtypes(d_type, num_cols),
table_size_bytes{data_size},
data_profile_builder().cardinality(cardinality).avg_run_length(run_length));
data_profile_builder{}.cardinality(cardinality).avg_run_length(run_length));
auto const view = tbl->view();

cudf::io::orc_writer_options opts =
@@ -113,7 +147,23 @@ void BM_orc_read_io_compression(
return view.num_rows();
}();

orc_read_common(num_rows_written, source_sink, state);
orc_read_common<chunked_read>(num_rows_written, source_sink, state);
}

template <cudf::io::io_type IOType, cudf::io::compression_type Compression>
void BM_orc_read_io_compression(
nvbench::state& state,
nvbench::type_list<nvbench::enum_type<IOType>, nvbench::enum_type<Compression>>)
{
return orc_read_io_compression<IOType, Compression, false>(state);
}

template <cudf::io::compression_type Compression>
void BM_orc_chunked_read_io_compression(nvbench::state& state,
nvbench::type_list<nvbench::enum_type<Compression>>)
{
// Only run benchmark using HOST_BUFFER IO.
return orc_read_io_compression<cudf::io::io_type::HOST_BUFFER, Compression, true>(state);
}

using d_type_list = nvbench::enum_type_list<data_type::INTEGRAL_SIGNED,
@@ -146,3 +196,13 @@ NVBENCH_BENCH_TYPES(BM_orc_read_io_compression, NVBENCH_TYPE_AXES(io_list, compr
.set_min_samples(4)
.add_int64_axis("cardinality", {0, 1000})
.add_int64_axis("run_length", {1, 32});

// Should have the same parameters as `BM_orc_read_io_compression` for comparison.
NVBENCH_BENCH_TYPES(BM_orc_chunked_read_io_compression, NVBENCH_TYPE_AXES(compression_list))
.set_name("orc_chunked_read_io_compression")
.set_type_axes_names({"compression"})
.set_min_samples(4)
// The input has approximately 520MB and 127K rows.
// The limits below are given in MBs.
.add_int64_axis("chunk_read_limit_MB", {50, 250, 700})
.add_int64_axis("pass_read_limit_MB", {50, 250, 700});
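
The chunked branch of the benchmark above drains the reader with a `do`/`while` loop: call `read_chunk()` at least once, then continue while `has_next()` is true. A minimal, self-contained sketch of that loop shape follows. It does not use cudf: `mock_chunked_reader` and `read_all_chunks` are hypothetical names invented for illustration, the mock models only an output size limit in bytes (it ignores the separate pass read limit), and it returns row counts where the real reader returns tables.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a chunked reader (not the cudf class). It hands out
// rows in pieces whose estimated size stays within `chunk_read_limit` bytes.
// A limit of 0 is treated here as "no limit".
class mock_chunked_reader {
 public:
  mock_chunked_reader(std::size_t chunk_read_limit,
                      std::size_t total_rows,
                      std::size_t bytes_per_row)
    : _rows_left{total_rows}
  {
    assert(bytes_per_row > 0);
    // Always emit at least one row per chunk so the read loop makes progress.
    _rows_per_chunk = chunk_read_limit == 0
                        ? total_rows
                        : std::max<std::size_t>(1, chunk_read_limit / bytes_per_row);
  }

  [[nodiscard]] bool has_next() const { return _rows_left > 0; }

  // Returns the number of rows in the next chunk.
  std::size_t read_chunk()
  {
    auto const n = std::min(_rows_per_chunk, _rows_left);
    _rows_left -= n;
    return n;
  }

 private:
  std::size_t _rows_left;
  std::size_t _rows_per_chunk{};
};

// Drains the reader with the same do/while shape as the benchmark and returns
// the row count of every chunk that was produced.
std::vector<std::size_t> read_all_chunks(mock_chunked_reader& reader)
{
  std::vector<std::size_t> chunk_rows;
  do {
    chunk_rows.push_back(reader.read_chunk());
  } while (reader.has_next());
  return chunk_rows;
}
```

With a 100-byte limit and 30-byte rows, the mock fits three rows per chunk, so ten rows come back as four chunks. The `do`/`while` form means an empty input still yields one (empty) chunk, which matches how the benchmark loop is written.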
64 changes: 60 additions & 4 deletions cpp/include/cudf/io/detail/orc.hpp
@@ -38,13 +38,15 @@ class chunked_orc_writer_options;

namespace orc::detail {

// Forward declaration of the internal reader class
class reader_impl;

/**
* @brief Class to read ORC dataset data into columns.
*/
class reader {
private:
class impl;
std::unique_ptr<impl> _impl;
std::unique_ptr<reader_impl> _impl;

public:
/**
@@ -68,10 +70,63 @@ class reader {
/**
* @brief Reads the entire dataset.
*
* @param options Settings for controlling reading behavior
* @return The set of columns along with table metadata
*/
table_with_metadata read(orc_reader_options const& options);
table_with_metadata read();
};

/**
* @brief The reader class that supports iterative reading from an array of data sources.
*/
class chunked_reader {
private:
std::unique_ptr<reader_impl> _impl;

public:
/**
* @copydoc cudf::io::chunked_orc_reader::chunked_orc_reader(std::size_t, std::size_t, size_type,
* orc_reader_options const&, rmm::cuda_stream_view, rmm::device_async_resource_ref)
*
* @param sources Input `datasource` objects to read the dataset from
*/
explicit chunked_reader(std::size_t chunk_read_limit,
std::size_t pass_read_limit,
size_type output_row_granularity,
std::vector<std::unique_ptr<cudf::io::datasource>>&& sources,
orc_reader_options const& options,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);
/**
* @copydoc cudf::io::chunked_orc_reader::chunked_orc_reader(std::size_t, std::size_t,
* orc_reader_options const&, rmm::cuda_stream_view, rmm::device_async_resource_ref)
*
* @param sources Input `datasource` objects to read the dataset from
*/
explicit chunked_reader(std::size_t chunk_read_limit,
std::size_t pass_read_limit,
std::vector<std::unique_ptr<cudf::io::datasource>>&& sources,
orc_reader_options const& options,
rmm::cuda_stream_view stream,
rmm::device_async_resource_ref mr);

/**
* @brief Destructor explicitly declared to avoid being inlined in the header.
*
* Since the definition of the internal `_impl` object's class does not exist in this header, this
* destructor needs to be defined in a separate source file that can access that definition.
*/
~chunked_reader();

/**
* @copydoc cudf::io::chunked_orc_reader::has_next
Contributor:
Since `cudf::io::chunked_orc_reader` derives from `cudf::io::chunked_reader`, wouldn't it be preferable to have the main docs here, and the `@copydoc` in `chunked_orc_reader`?
Have I misread the code?

Contributor Author:
Thanks for the reminder. I'll move the docs to `chunked_orc_reader` in `io/orc.hpp`, which is the public header. The detail header `io/detail/orc.hpp` should just have `@copydoc`.

*/
[[nodiscard]] bool has_next() const;

/**
* @copydoc cudf::io::chunked_orc_reader::read_chunk
*/
[[nodiscard]] table_with_metadata read_chunk() const;
};

/**
@@ -126,5 +181,6 @@ class writer {
*/
void close();
};

} // namespace orc::detail
} // namespace cudf::io
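
The destructor comment on `chunked_reader` describes a general C++ constraint: a class that holds a `std::unique_ptr` to a forward-declared type must define its destructor where that type is complete. The sketch below shows the pattern in a single file, with comments marking what would live in the header and what would live in the source file. The names `widget` and `widget_impl` are hypothetical and are not part of cudf.

```cpp
#include <cassert>
#include <memory>

// --- What the header would contain ---
class widget_impl;  // forward declaration only; the type is incomplete here

class widget {
 public:
  explicit widget(int value);
  ~widget();  // declared here, defined where widget_impl is complete
  [[nodiscard]] int value() const;

 private:
  std::unique_ptr<widget_impl> _impl;
};

// --- What the source file would contain ---
class widget_impl {
 public:
  explicit widget_impl(int value) : _value{value} {}
  [[nodiscard]] int value() const { return _value; }

 private:
  int _value;
};

widget::widget(int value) : _impl{std::make_unique<widget_impl>(value)} {}

// Defaulting the destructor here, after widget_impl is complete, is what lets
// unique_ptr's deleter call `delete` on a complete type.
widget::~widget() = default;

int widget::value() const
{
  assert(_impl != nullptr);
  return _impl->value();
}
```

If the destructor were left implicit, it would be generated inline wherever `widget` is destroyed, and code that sees only the header would fail to compile because `widget_impl` is incomplete at that point.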