[BUG] decompressed batches corrupt if they are made spillable #7827
Labels: `bug` (Something isn't working), `reliability` (Features to improve reliability or bugs that severely impact the reliability of the plugin), `shuffle` (things that impact the shuffle plugin)
While working on #7777 I ran into an issue where a decompressed batch (via nvcomp/UCX) was made spillable, but then I got a corrupted batch out when calling `getColumnarBatch`. This is likely an issue in 23.04.

So far, when we decompress batches in `GpuCoalesceBatches` (https://github.com/NVIDIA/spark-rapids/blob/branch-23.02/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCoalesceBatches.scala#L645), we take the `TableMeta` directly from the compressed vector, instead of building a new one without compression info. Code that uses the metadata to rebuild the `ColumnarBatch` would produce an invalid batch, because we do different things when the batch has codecs defined: https://github.com/NVIDIA/spark-rapids/blob/branch-23.02/sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsDeviceMemoryStore.scala#L282, and `sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsBufferStore.scala` (line 309 at 96ed06b).
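The failure mode can be sketched with a few hypothetical, heavily simplified types. None of these match the plugin's actual `TableMeta` or buffer APIs; this is only a minimal model of the stale-metadata problem: if the decompressed buffer is stored with the compressed vector's `TableMeta`, readers still see a codec and take the decode path on data that is already plain, whereas rebuilding the metadata without compression info steers them to the direct path.

```scala
// Hypothetical, simplified stand-ins for the plugin's types.
// The real TableMeta is a FlatBuffer with much more state.
case class TableMeta(numRows: Long, codec: Option[String])

// A buffer made spillable after decompression. If `meta` still carries the
// codec of the compressed source, readers mistake it for compressed data.
case class SpillableBuffer(data: Array[Byte], meta: TableMeta)

// Models the branch readers take when rebuilding a ColumnarBatch:
// a codec in the metadata selects the decompression path.
def batchRebuildPath(buf: SpillableBuffer): String =
  buf.meta.codec match {
    case Some(_) => "decode-compressed" // wrong for already-decompressed data
    case None    => "direct"            // correct for a plain device buffer
  }

// Buggy: reuse the compressed vector's metadata verbatim.
def makeSpillableBuggy(decompressed: Array[Byte],
                       compressedMeta: TableMeta): SpillableBuffer =
  SpillableBuffer(decompressed, compressedMeta)

// Fixed: rebuild the metadata without compression info.
def makeSpillableFixed(decompressed: Array[Byte],
                       compressedMeta: TableMeta): SpillableBuffer =
  SpillableBuffer(decompressed, compressedMeta.copy(codec = None))
```

Under this model, the buggy variant routes an already-decompressed buffer down the decode path, which is where the corruption shows up.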
I think I can fix this with #7777, but I am filing an issue since it's a bug, and I am not sure whether it was made worse in 23.04 because of #7572, since we now rely on `TableMeta` on the first creation of a batch from a `RapidsBuffer`.