Commit
Fix race in ORC string dictionary creation (#13214)
Unfortunately this is really hard to reproduce. For whatever reason I had to reproduce it on a relatively small data set with at least 140,001 rows, where one column is a LIST<STRING> in which every list is empty and another column is a STRUCT with two STRING child columns in which every string is empty. I also had to sort and partition the data before doing the write, and it had to run in a very specific environment with T4 GPUs. I don't know why all of those conditions were needed to make the race happen regularly, but they were. Because reproducing it is this complex, I have not added any unit tests.

The problem was essentially a race when calculating dictionary duplication for strings in ORC. As part of this, a function `LoadNonNullIndices` was called that was supposed to set a value `nnz` in a shared memory location `s`. In the normal case a loop was taken in which `__syncthreads()` was called. If there were no rows in the column (the LIST<STRING> column), the loop was not taken, and it was a race as to whether the `nnz` value that thread 0 set to 0 was visible to all of the other threads.

What made this crash is that `nnz` determines what the rest of the kernel does: whether it reads data, whether it writes to temp memory (which is not allocated if previous processing shows there is no need for it), and so on. If a thread sees a non-zero `nnz`, it tries to do all of those things and bad stuff starts to happen.

Authors:
  - Robert (Bobby) Evans (https://github.com/revans2)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #13214
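The race described in the message can be illustrated with a minimal CUDA sketch. This is not the cuDF code: `shared_state`, `load_non_null_indices`, `count_kernel`, and `threads_disagreeing` are hypothetical names, the counting loop is a simplified stand-in for the real logic, and the unconditional `__syncthreads()` after the write is only an assumption about how such a race is typically closed (the message does not spell out the exact change). The point it shows is that when `num_rows` is 0 the loop body never runs, so the barrier inside the loop cannot be what makes thread 0's write of `nnz` visible to the rest of the block.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical stand-in for the shared-memory state `s` from the commit message.
struct shared_state {
  int nnz;
};

// Simplified, hypothetical analogue of LoadNonNullIndices.
__device__ void load_non_null_indices(shared_state* s, const bool* valid, int num_rows)
{
  if (threadIdx.x == 0) { s->nnz = 0; }
  // Assumed fix: a barrier that runs even when the loop below is skipped.
  // Without it, a zero-row column leaves no barrier between thread 0's write
  // and the other threads' reads, so they may see uninitialized shared memory.
  __syncthreads();

  // The trip count is the same for every thread in the block, so calling
  // __syncthreads() inside the loop is safe.
  for (int base = 0; base < num_rows; base += blockDim.x) {
    int const row = base + threadIdx.x;
    if (row < num_rows && valid[row]) { atomicAdd(&s->nnz, 1); }
    __syncthreads();
  }
}

// Every thread records the nnz value it observes after the load.
__global__ void count_kernel(const bool* valid, int num_rows, int* observed)
{
  __shared__ shared_state s;
  load_non_null_indices(&s, valid, num_rows);
  observed[threadIdx.x] = s.nnz;
}

// Host helper: launches one block with every row marked valid and returns how
// many threads observed a value other than `expected` (-1 on a CUDA error).
int threads_disagreeing(int num_rows, int expected)
{
  constexpr int block_size = 256;
  size_t const valid_bytes = num_rows > 0 ? static_cast<size_t>(num_rows) : 1;
  bool* d_valid   = nullptr;
  int* d_observed = nullptr;
  if (cudaMalloc(&d_valid, valid_bytes) != cudaSuccess) { return -1; }
  if (cudaMalloc(&d_observed, block_size * sizeof(int)) != cudaSuccess) {
    cudaFree(d_valid);
    return -1;
  }
  cudaMemset(d_valid, 1, valid_bytes);  // every byte is 1, i.e. every row is valid

  count_kernel<<<1, block_size>>>(d_valid, num_rows, d_observed);

  std::vector<int> h_observed(block_size);
  cudaError_t const err = cudaMemcpy(
    h_observed.data(), d_observed, block_size * sizeof(int), cudaMemcpyDeviceToHost);
  cudaFree(d_valid);
  cudaFree(d_observed);
  if (err != cudaSuccess) { return -1; }

  int disagreeing = 0;
  for (int v : h_observed) { disagreeing += (v != expected) ? 1 : 0; }
  return disagreeing;
}
```

With the barrier in place, `threads_disagreeing(0, 0)` should report that no thread disagrees for the empty-column case. Removing the barrier would not necessarily make this small sketch fail on any given run, which matches how hard the original race was to reproduce.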