Write string data directly to column_buffer in Parquet reader #13302

Merged
merged 193 commits on Jun 23, 2023
Changes from 5 commits
Commits
63a2d88
Rework of level decoding to be considerably more parallel. Previousl…
nvdbaranec Apr 23, 2023
85dfe8a
Merge branch 'branch-23.06' into parquet_level_optimization
nvdbaranec Apr 23, 2023
eb37a59
Merge branch 'branch-23.06' into parquet_level_optimization
nvdbaranec Apr 24, 2023
9211bcc
Style formatting.
nvdbaranec Apr 24, 2023
2a2f6b2
checkpoint
etseidl Apr 25, 2023
0cd8481
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl Apr 25, 2023
2b1f7d5
checkpoint
etseidl Apr 25, 2023
6569684
fix is_bounds_page()
etseidl Apr 25, 2023
2f8836b
pass decoders into page_bounds
etseidl Apr 25, 2023
db7e2a4
copy over changes from string_cols
etseidl Apr 25, 2023
90e214c
works except skip_rows
etseidl Apr 26, 2023
567a0ab
fix bug with skip_rows
etseidl Apr 26, 2023
fb45e8c
debug prints
etseidl Apr 26, 2023
6d89752
fix bug in page_bounds
etseidl Apr 26, 2023
5035703
optimization for countDictEntries
etseidl Apr 26, 2023
595e2e1
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl Apr 26, 2023
37f7d46
fix another skip_rows bug, and round robin the countDictEntries calc
etseidl Apr 26, 2023
19396bf
fix for chunked reads
etseidl Apr 26, 2023
3780494
fix bug with setting the offsets for null values...chunked reader
etseidl Apr 27, 2023
4373b8f
fix edge case where skip_rows ends on a page boundary
etseidl Apr 27, 2023
3a39970
move test for long strings
etseidl Apr 28, 2023
743b3f5
more string tweaks
etseidl Apr 28, 2023
ad651cf
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl Apr 28, 2023
08b68d7
change offsets to size_type
etseidl Apr 28, 2023
269043d
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 1, 2023
b79c9ec
Remove definition and repetition levels from page_data_s struct to de…
nvdbaranec May 1, 2023
38792e1
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 1, 2023
3320cde
fixes after merging
etseidl May 1, 2023
897db8c
split out separate decoder for string columns
etseidl May 1, 2023
15f4e12
remove test for string hash
etseidl May 1, 2023
b9399c0
get rid of little used variables
etseidl May 1, 2023
57d7aa8
fix a few edge cases
etseidl May 2, 2023
655c048
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 2, 2023
72d301a
use char parallel strcpy when avg string len is 32 or higher
etseidl May 2, 2023
7768ae5
overlap decode kernels using stream pool
etseidl May 3, 2023
cbabce2
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 3, 2023
2d42bf3
Squeeze level values into uint16_t instead of uint32_t, shrink deocde…
nvdbaranec May 3, 2023
51624c8
refactor to remove string decoding code from page_data.cu
etseidl May 3, 2023
b3afd25
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 3, 2023
59bd2d6
finish merge
etseidl May 3, 2023
8d81822
Merge branch 'branch-23.06' into parquet_level_optimization
nvdbaranec May 3, 2023
077ff39
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 4, 2023
cfbd5e1
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 4, 2023
8986afb
clean up
etseidl May 4, 2023
02bf251
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl May 4, 2023
6b352a5
clean up
etseidl May 4, 2023
6aa1120
add docstrings
etseidl May 4, 2023
fbd9fc6
more docstrings and clean up
etseidl May 4, 2023
d85a8e4
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 4, 2023
f4cf521
PR review fixes. Removed unused shuffle_ptr() function. Corrected a…
nvdbaranec May 4, 2023
305bf09
test for string col earlier
etseidl May 4, 2023
2d406c2
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 4, 2023
ad231f8
Change the level_decode_buf (temp space) to use rmm::mr::get_current_…
nvdbaranec May 4, 2023
301cce8
need to call setupLocalPageInfo or bad things happen
etseidl May 4, 2023
6db20c1
add todo
etseidl May 4, 2023
07b0d73
final fix for restoring decode cache. add some consts.
etseidl May 5, 2023
2c8dbb4
more consts
etseidl May 5, 2023
00a87aa
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 5, 2023
8cb430e
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 5, 2023
7b392f6
simplify string col detection
etseidl May 5, 2023
6156e66
more consts
etseidl May 5, 2023
fa0cdfc
cleanup
etseidl May 5, 2023
62b61a2
add some TODOs
etseidl May 5, 2023
ab6d42e
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 5, 2023
3d5c1c8
Use a dynamically sized type for level/repetition data. In almost al…
nvdbaranec May 8, 2023
ee1a6fc
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 8, 2023
e01404a
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 8, 2023
ecb336e
finish merge
etseidl May 8, 2023
c917d27
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl May 8, 2023
c1aebf3
fix string buffer length
etseidl May 8, 2023
217e12f
fix for columns that start with null values
etseidl May 8, 2023
b49ff95
fix for decimal columns
etseidl May 8, 2023
8054f10
another fix for null handlng
etseidl May 9, 2023
84762b9
one more bug cleaning up nulls
etseidl May 9, 2023
86ec00d
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 9, 2023
24fb8f2
PR review feedback.
nvdbaranec May 9, 2023
e85577d
Merge branch 'branch-23.06' into parquet_level_optimization
nvdbaranec May 9, 2023
abf4153
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 9, 2023
32ce89b
minor cleanup
etseidl May 9, 2023
859eb43
PR review feedback.
nvdbaranec May 10, 2023
809d4e9
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 10, 2023
d994b0c
finish merge
etseidl May 10, 2023
f12bcc9
Fix a bug where specific usage of skip_rows/num_rows could cause a ra…
nvdbaranec May 10, 2023
2d0739e
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 10, 2023
26d03a3
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 10, 2023
9ceaf18
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 12, 2023
8804007
PR review feedback.
nvdbaranec May 15, 2023
8bbbab1
Merge branch 'branch-23.06' into parquet_level_optimization
nvdbaranec May 15, 2023
5bbf9a1
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 15, 2023
86e8c2a
Merge remote-tracking branch 'origin/parquet_level_optimization' into…
etseidl May 15, 2023
80219c9
Merge remote-tracking branch 'cudf/branch-23.06' into feature/string_…
etseidl May 15, 2023
dc681ec
Merge branch 'branch-23.06' into feature/string_cols_v2
vuule May 15, 2023
9d09842
spelling
etseidl May 16, 2023
f23f9cf
simplify out_thread0 calc
etseidl May 16, 2023
6f73510
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl May 16, 2023
e80c07b
fix for string col detection
etseidl May 16, 2023
9d754fd
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 16, 2023
140749d
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 16, 2023
62befec
Merge remote-tracking branch 'cudf/branch-23.06' into feature/string_…
etseidl May 17, 2023
6e89596
finish merge
etseidl May 17, 2023
f196e60
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 18, 2023
d7db9bb
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 19, 2023
f1669bc
alternate way to do column_buffer
etseidl May 19, 2023
2b5a5e0
remove unused constructor
etseidl May 19, 2023
30bfe9f
get rid of another unnecessary function
etseidl May 19, 2023
dea407c
rearrange some
etseidl May 19, 2023
c4781c0
Merge branch 'rapidsai:branch-23.06' into col_buf_v2
etseidl May 20, 2023
7388dec
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 20, 2023
10d00d0
move make_column into policy object
etseidl May 22, 2023
22b3d55
reduce diff
etseidl May 22, 2023
b907a1a
unify interfaces for policy objects
etseidl May 22, 2023
f1d2b84
Merge branch 'col_buf_v2' into feature/string_cols_v2
etseidl May 22, 2023
73b8229
change template param name to string_policy
etseidl May 22, 2023
e921bd9
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 22, 2023
af1e407
restore null_count_back_copier
etseidl May 22, 2023
1e18f1c
fix for page spanning rows
etseidl May 22, 2023
681d57d
undo some reformatting of comments
etseidl May 23, 2023
d8bb072
change make_column to make_string_column
etseidl May 23, 2023
2c596f9
move gpuDecodeRleBooleans
etseidl May 23, 2023
ca84c23
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 23, 2023
06521bd
remove t from docs
etseidl May 23, 2023
c20214b
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 24, 2023
6625690
CRTP, I think
vuule May 25, 2023
fcc6ca3
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 25, 2023
321eb46
Merge branch 'rapidsai:branch-23.06' into feature/string_cols_v2
etseidl May 25, 2023
eaf457f
Merge branch 'pr/etseidl/13302-1' into feature/string_cols_v2
etseidl May 25, 2023
d742498
checkpoint CRTP changes
etseidl May 25, 2023
3e70821
better fix for initializing _strings
etseidl May 25, 2023
cb934a8
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl May 25, 2023
a120a4a
json no longer needs to fully qualify make_column
etseidl May 25, 2023
755918b
calculate col_sizes on device to save a round trip for the PageInfo
etseidl May 26, 2023
e1fd103
calculate offsets with exclusive scan
etseidl May 26, 2023
539ef1f
cleanups
etseidl May 26, 2023
5070a20
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl May 26, 2023
723e21d
only call string decode kernel if there are string columns
etseidl May 26, 2023
769945e
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl May 26, 2023
2bf8c19
offsets can be page local now
etseidl May 30, 2023
449ab65
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl May 30, 2023
15986a9
move create() back to cpp file
etseidl May 30, 2023
614c460
remove memory resource from column buffer. instead pass it in when
etseidl May 30, 2023
7f1d245
remove if and add CUDF_EXPECTS to allocate_strings_data()
etseidl May 30, 2023
a0fb80e
get rid of anonymous namespace
etseidl May 30, 2023
fac2f3f
delete copy constructor
etseidl May 30, 2023
a03e3ca
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl May 30, 2023
53b38b8
revert removal of memory resource
etseidl May 31, 2023
d35236c
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl May 31, 2023
fc696e7
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 1, 2023
30d1698
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 2, 2023
d9614ff
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 2, 2023
6408f5b
fix some bitrot and add explanation for presence of gpuDecodeStringPa…
etseidl Jun 2, 2023
204d2a2
add const versions of pointer accessors
etseidl Jun 2, 2023
f2028d2
document template params
etseidl Jun 2, 2023
2a9b0ff
only need one version of is_string_col
etseidl Jun 6, 2023
2b30f23
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 6, 2023
519ce00
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 7, 2023
7989135
check for string overflow
etseidl Jun 7, 2023
f1b3d23
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 7, 2023
1bfece4
more size_type -> size_t
etseidl Jun 7, 2023
6a3400f
implement suggestion from review
etseidl Jun 7, 2023
0799fed
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 7, 2023
eb94a91
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 7, 2023
59bfd1b
use thrust to calculate page string offsets
etseidl Jun 9, 2023
5f3e5af
some cleanup
etseidl Jun 9, 2023
6c3959b
Merge remote-tracking branch 'cudf/branch-23.08' into feature/string_…
etseidl Jun 9, 2023
cbd742f
throw std::overflow_error if string column gets too big
etseidl Jun 9, 2023
7aef9e8
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 9, 2023
da830a2
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 9, 2023
967ecf6
only allocate memory for string nesting data if there are string columns
etseidl Jun 9, 2023
8749724
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 9, 2023
7842c00
Merge remote-tracking branch 'cudf/branch-23.08' into feature/string_…
etseidl Jun 12, 2023
0d1ac33
east const for new files
etseidl Jun 12, 2023
9f054dc
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl Jun 12, 2023
774e88f
Merge branch 'branch-23.08' into feature/string_cols_v2
ttnghia Jun 12, 2023
9653fef
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 14, 2023
b09f31c
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 15, 2023
1c84c50
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 20, 2023
98b345f
add new worst-case benchmark for strings
etseidl Jun 21, 2023
3de3554
use stream pool for decode kernels
etseidl Jun 21, 2023
a4548e7
move stream pool to impl object
etseidl Jun 21, 2023
6ce50af
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 21, 2023
a2ebf32
Merge branch 'feature/string_cols_v2' of github.com:etseidl/cudf into…
etseidl Jun 21, 2023
572836e
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 22, 2023
47cd9e1
filter on data types in setupLocalPageInfo
etseidl Jun 22, 2023
a1304c2
remove experimental decode kernel
etseidl Jun 22, 2023
ce2acbe
Revert "move stream pool to impl object"
etseidl Jun 22, 2023
8653b93
finish moving back to static stream pool
etseidl Jun 22, 2023
6ee7b29
add comment for NUM_DECODERS
etseidl Jun 22, 2023
a0db39c
call synch on _stream before launching decode kernels
etseidl Jun 22, 2023
a42137a
Merge branch 'branch-23.08' into feature/string_cols_v2
etseidl Jun 22, 2023
2d1d556
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 22, 2023
b3ebab5
workaround for nvbench shutdown error
etseidl Jun 22, 2023
19487af
Merge branch 'rapidsai:branch-23.08' into feature/string_cols_v2
etseidl Jun 23, 2023
5b3d070
move page bounds check into setupLocalPageInfo
etseidl Jun 23, 2023
21 changes: 1 addition & 20 deletions cpp/src/io/parquet/page_data.cu
@@ -761,34 +761,15 @@ __global__ void __launch_bounds__(decode_block_size) gpuDecodePageData(
   int page_idx = blockIdx.x;
   int t = threadIdx.x;
   int out_thread0;
+  [[maybe_unused]] null_count_back_copier _{s, t};

   if (!setupLocalPageInfo(
         s, &pages[page_idx], chunks, min_row, num_rows, non_string_filter{chunks}, true)) {
     return;
   }

-  // this needs to be declared after we've decided to process this page
-  [[maybe_unused]] null_count_back_copier _{s, t};
   bool const has_repetition = s->col.max_level[level_type::REPETITION] > 0;

-  // FIXME do this in setupLocalPageInfo
-  //
-  // if we have no work to do (eg, in a skip_rows/num_rows case) in this page.
-  //
-  // corner case: in the case of lists, we can have pages that contain "0" rows if the current row
-  // starts before this page and ends after this page:
-  //       P0        P1        P2
-  //  |---------|---------|----------|
-  //        ^------------------^
-  //      row start           row end
-  // P1 will contain 0 rows
-  //
-  if (s->num_rows == 0 &&
-      !(has_repetition && (is_bounds_page(s, min_row, num_rows, has_repetition) ||
-                           is_page_contained(s, min_row, num_rows)))) {
-    return;
-  }
-
   if (s->dict_base) {
     out_thread0 = (s->dict_bits > 0) ? 64 : 32;
   } else {
75 changes: 44 additions & 31 deletions cpp/src/io/parquet/page_decode.cuh
@@ -975,21 +975,22 @@ inline __device__ bool setupLocalPageInfo(page_state_s* const s,
                                           bool is_decode_step)
 {
   int t = threadIdx.x;
-  int chunk_idx;

   // Fetch page info
   if (!t) {
     s->page = *p;
     s->nesting_info = nullptr;
+    s->col = chunks[s->page.chunk_idx];
   }
   __syncthreads();

   // return false if this is a dictionary page or it does not pass the filter condition
-  if ((s->page.flags & PAGEINFO_FLAGS_DICTIONARY) != 0 || (!filter(s->page))) { return false; }
+  if ((s->page.flags & PAGEINFO_FLAGS_DICTIONARY) != 0 || !filter(s->page)) { return false; }

-  // Fetch column chunk info
-  chunk_idx = s->page.chunk_idx;
-  if (!t) { s->col = chunks[chunk_idx]; }
+  // our starting row (absolute index) is
+  // col.start_row == absolute row index
+  // page.chunk-row == relative row index within the chunk
+  size_t const page_start_row = s->col.start_row + s->page.chunk_row;

   // if we can use the nesting decode cache, set it up now
   auto const can_use_decode_cache = s->page.nesting_info_size <= max_cacheable_nesting_decode_info;
@@ -1011,8 +1012,28 @@ inline __device__ bool setupLocalPageInfo(page_state_s* const s,
       depth += blockDim.x;
     }
   }
+
   if (!t) {
     s->nesting_info = can_use_decode_cache ? s->nesting_decode_cache : s->page.nesting_decode;
+
+    // NOTE: s->page.num_rows, s->col.chunk_row, s->first_row and s->num_rows will be
+    // invalid/bogus during first pass of the preprocess step for nested types. this is ok
+    // because we ignore these values in that stage.
+    auto const max_row = min_row + num_rows;
+
+    // if we are totally outside the range of the input, do nothing
+    if ((page_start_row > max_row) || (page_start_row + s->page.num_rows < min_row)) {
+      s->first_row = 0;
+      s->num_rows = 0;
+    }
+    // otherwise
+    else {
+      s->first_row = page_start_row >= min_row ? 0 : min_row - page_start_row;
+      auto const max_page_rows = s->page.num_rows - s->first_row;
+      s->num_rows = (page_start_row + s->first_row) + max_page_rows <= max_row
+                      ? max_page_rows
+                      : max_row - (page_start_row + s->first_row);
+    }
   }

   __syncthreads();
@@ -1030,14 +1051,27 @@ inline __device__ bool setupLocalPageInfo(page_state_s* const s,
   }
   __syncthreads();

+  // if we have no work to do (eg, in a skip_rows/num_rows case) in this page.
+  //
+  // corner case: in the case of lists, we can have pages that contain "0" rows if the current row
+  // starts before this page and ends after this page:
+  //       P0        P1        P2
+  //  |---------|---------|----------|
+  //        ^------------------^
+  //      row start           row end
+  // P1 will contain 0 rows
+  //
+  // NOTE: this check needs to be done after the null counts have been zeroed out
+  bool const has_repetition = s->col.max_level[level_type::REPETITION] > 0;
+  if (is_decode_step && s->num_rows == 0 &&
+      !(has_repetition && (is_bounds_page(s, min_row, num_rows, has_repetition) ||
+                           is_page_contained(s, min_row, num_rows)))) {
+    return false;
+  }
+
   if (!t) {
     s->error = 0;

-    // our starting row (absolute index) is
-    // col.start_row == absolute row index
-    // page.chunk-row == relative row index within the chunk
-    size_t page_start_row = s->col.start_row + s->page.chunk_row;
-
     // IMPORTANT : nested schemas can have 0 rows in a page but still have
     // values. The case is:
     // - On page N-1, the last row starts, with 2/6 values encoded
@@ -1126,27 +1160,6 @@ inline __device__ bool setupLocalPageInfo(page_state_s* const s,
         s->dtype_len = 8; // Convert to 64-bit timestamp
       }

-      // NOTE: s->page.num_rows, s->col.chunk_row, s->first_row and s->num_rows will be
-      // invalid/bogus during first pass of the preprocess step for nested types. this is ok
-      // because we ignore these values in that stage.
-      {
-        auto const max_row = min_row + num_rows;
-
-        // if we are totally outside the range of the input, do nothing
-        if ((page_start_row > max_row) || (page_start_row + s->page.num_rows < min_row)) {
-          s->first_row = 0;
-          s->num_rows = 0;
-        }
-        // otherwise
-        else {
-          s->first_row = page_start_row >= min_row ? 0 : min_row - page_start_row;
-          auto const max_page_rows = s->page.num_rows - s->first_row;
-          s->num_rows = (page_start_row + s->first_row) + max_page_rows <= max_row
-                          ? max_page_rows
-                          : max_row - (page_start_row + s->first_row);
-        }
-      }
-
       // during the decoding step we need to offset the global output buffers
       // for each level of nesting so that we write to the section this page
       // is responsible for.
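
The row-range clamp that this diff moves earlier in setupLocalPageInfo can be checked on the host with a small worked example. The sketch below mirrors the arithmetic from the diff; the function name `clamp_page_rows`, the `RowRange` struct, and the use of `std::size_t` for every operand are assumptions made for illustration and are not part of cudf.

```cpp
#include <cstddef>

// Hypothetical result type: the first row to decode within the page, and how many rows.
struct RowRange {
  std::size_t first_row;
  std::size_t num_rows;
};

// Hypothetical host-side mirror of the first_row/num_rows computation in the diff.
// page_start_row is the absolute row index at which the page begins.
RowRange clamp_page_rows(std::size_t page_start_row,
                         std::size_t page_num_rows,
                         std::size_t min_row,
                         std::size_t num_rows)
{
  std::size_t const max_row = min_row + num_rows;

  // Page lies entirely outside the requested range: nothing to decode.
  if (page_start_row > max_row || page_start_row + page_num_rows < min_row) { return {0, 0}; }

  // Skip the rows of this page that precede min_row.
  std::size_t const first_row = page_start_row >= min_row ? 0 : min_row - page_start_row;
  std::size_t const max_page_rows = page_num_rows - first_row;

  // Trim the tail of the page if it runs past max_row.
  std::size_t const rows = (page_start_row + first_row) + max_page_rows <= max_row
                             ? max_page_rows
                             : max_row - (page_start_row + first_row);
  return {first_row, rows};
}
```

For a page starting at absolute row 100 with 50 rows and a request of min_row = 120, num_rows = 20, this yields first_row = 20 and num_rows = 20; the same page against min_row = 200 falls outside the range and yields zero rows.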
21 changes: 1 addition & 20 deletions cpp/src/io/parquet/page_string_decode.cu
@@ -663,38 +663,19 @@ __global__ void __launch_bounds__(decode_block_size) gpuDecodeStringPageData(
   page_state_buffers_s* const sb = &state_buffers;
   int const page_idx = blockIdx.x;
   int const t = threadIdx.x;
+  [[maybe_unused]] null_count_back_copier _{s, t};
Contributor

How does this avoid the race condition when two separate kernels visit the same page? Won't one of them erroneously zero out a page to which the other may have already written a valid value?

Contributor Author

Only one invocation should make it past the filter. That one will zero out the null count and then the back copier will copy it back to the page. @vuule added the logic to make the back copy a no-op if the setup returns early.

Contributor

Ah, I see. Checking to see if the nesting_info pointer is null.
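
The exchange above can be illustrated with a minimal host-side sketch of a scope guard that only publishes its result when setup got past the filter. Everything here is a hypothetical stand-in (`Page`, `PageState`, `NullCountBackCopier`, `setup`, `decode`); the real kernels use page_state_s, PageInfo and null_count_back_copier, whose internals are not shown in this diff. The only behavior taken from the discussion is that a null nesting_info pointer turns the back copy into a no-op.

```cpp
// Hypothetical stand-in for a page record shared by both decode kernels.
struct Page {
  bool is_string;
  int null_count;
};

// Hypothetical stand-in for the per-block page state.
struct PageState {
  Page* page = nullptr;
  int* nesting_info = nullptr;  // stays null until setup gets past the filter
  int local_null_count = 0;
};

// Scope guard: on destruction, copy the local null count back to the page,
// but only if setup claimed the page (nesting_info is non-null).
struct NullCountBackCopier {
  PageState* s;
  ~NullCountBackCopier()
  {
    if (s->nesting_info != nullptr) { s->page->null_count = s->local_null_count; }
  }
};

// Hypothetical setup: returns false (leaving nesting_info null) when the page is filtered out.
bool setup(PageState* s, Page* p, bool want_strings, int* nesting_storage)
{
  s->page = p;
  s->nesting_info = nullptr;
  if (p->is_string != want_strings) { return false; }
  s->nesting_info = nesting_storage;
  s->local_null_count = 0;  // zeroed only by the invocation that owns the page
  return true;
}

// Hypothetical decode routine: the guard is declared before setup, as in the diff.
void decode(Page* p, bool want_strings, int nulls_found)
{
  PageState s;
  int nesting_storage = 0;
  NullCountBackCopier guard{&s};
  if (!setup(&s, p, want_strings, &nesting_storage)) { return; }  // guard is a no-op here
  s.local_null_count += nulls_found;
}
```

Calling `decode` on a string page with `want_strings == false` leaves the page's null count untouched, while the call with `want_strings == true` overwrites it, which is the property the reviewer asked about.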


   if (!setupLocalPageInfo(
         s, &pages[page_idx], chunks, min_row, num_rows, string_filter{chunks}, true)) {
     return;
   }

-  // this needs to be declared after we've decided to process this page
-  [[maybe_unused]] null_count_back_copier _{s, t};
   bool const has_repetition = s->col.max_level[level_type::REPETITION] > 0;

   // offsets are local to the page
   if (t == 0) { last_offset = 0; }
   __syncthreads();

-  // FIXME do this in setupLocalPageInfo
-  //
-  // if we have no work to do (eg, in a skip_rows/num_rows case) in this page.
-  //
-  // corner case: in the case of lists, we can have pages that contain "0" rows if the current row
-  // starts before this page and ends after this page:
-  //       P0        P1        P2
-  //  |---------|---------|----------|
-  //        ^------------------^
-  //      row start           row end
-  // P1 will contain 0 rows
-  //
-  if (s->num_rows == 0 &&
-      !(has_repetition && (is_bounds_page(s, min_row, num_rows, has_repetition) ||
-                           is_page_contained(s, min_row, num_rows)))) {
-    return;
-  }
-
   int const out_thread0 = s->dict_base && s->dict_bits == 0 ? 32 : 64;
   int const leaf_level_index = s->col.max_nesting_depth - 1;
   PageNestingDecodeInfo* const nesting_info_base = s->nesting_info;
8 changes: 6 additions & 2 deletions cpp/src/io/parquet/reader_impl.cpp
@@ -31,8 +31,12 @@ int constexpr STREAM_POOL_SIZE = NUM_DECODERS * APPROX_NUM_THREADS;

 auto& get_stream_pool()
 {
-  static auto pool = rmm::cuda_stream_pool(STREAM_POOL_SIZE);
-  return pool;
+  // TODO: creating this on the heap because there were issues with trying to call the
+  // stream pool destructor during cuda shutdown that lead to a segmentation fault in
+  // nvbench. this allocation is being deliberately leaked to avoid the above, but still
+  // results in non-fatal warnings when running nvbench in cuda-gdb.
+  static auto pool = new rmm::cuda_stream_pool{STREAM_POOL_SIZE};
+  return *pool;
 }

 } // namespace
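
The reader_impl.cpp change relies on the intentionally leaked function-local static idiom. The sketch below shows that idiom in plain C++ without CUDA or RMM; `FakeStreamPool`, `g_destroyed` and `get_leaked_pool` are hypothetical names used only for illustration, and the pool size of 8 is arbitrary.

```cpp
#include <cstddef>

// Counts destructor calls so the leak can be observed (hypothetical instrumentation).
int g_destroyed = 0;

// Hypothetical stand-in for rmm::cuda_stream_pool.
struct FakeStreamPool {
  explicit FakeStreamPool(std::size_t n) : size{n} {}
  ~FakeStreamPool() { ++g_destroyed; }
  std::size_t size;
};

// The pointer is a function-local static, so the pool is created once on first use.
// Because only the pointer has static storage duration, no destructor runs for the
// pool object at program exit: the heap allocation is deliberately leaked.
FakeStreamPool& get_leaked_pool()
{
  static auto* pool = new FakeStreamPool{8};
  return *pool;
}
```

Every call returns the same object, and `g_destroyed` stays at zero for the life of the process. The trade-off, as the TODO in the diff notes, is that tools which track allocations or CUDA resources at shutdown may still report the leak.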