[FEA] Make SpillableColumnarBatch inform Spill code of actual usage of the batch #6561
Labels
performance
A performance related task/issue
reliability
Features to improve reliability or bugs that severely impact the reliability of the plugin
Is your feature request related to a problem? Please describe.
Currently SpillableColumnarBatch does some things that are far from ideal for the spill code. When you get a batch it will lock the underlying spill id, create the ColumnarBatch, and then release the spill id. After that the regular reference counting is used to keep the buffers that make up the ColumnarBatch around until they are no longer needed.
The problem with this is that the RapidsBufferCatalog thinks that all of the buffers are spillable, even when reference counts prevent them from actually being freed. Ideally, as long as someone still holds a reference to the underlying buffer, we would not release the spill id. I think we can do this, but we would need to add a layer of indirection at the DeviceMemoryBuffer layer. We could create a new SpillableMemoryBuffer that would hold both a DeviceMemoryBuffer and the buffer/spill id. It would have a set of reference counts separate from the DeviceMemoryBuffer. When the SpillableMemoryBuffer reaches a reference count of 0, it would release the spill id. Then the spill code would be allowed to actually free the underlying DeviceMemoryBuffer when spilling.
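To make the idea concrete, here is a rough sketch of what the wrapper could look like. It is not the plugin's code: `SketchCatalog` and `SketchDeviceBuffer` are hypothetical stand-ins for RapidsBufferCatalog and DeviceMemoryBuffer, and every method name is an assumption made only for illustration. The point it shows is that the spill id stays held until the wrapper's own reference count reaches 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the catalog: counts how many holders pin each spill id. */
final class SketchCatalog {
    private final Map<Integer, Integer> pins = new HashMap<>();

    synchronized void acquire(int spillId) {
        pins.merge(spillId, 1, Integer::sum);
    }

    synchronized void release(int spillId) {
        Integer n = pins.get(spillId);
        if (n == null) {
            throw new IllegalStateException("spill id " + spillId + " is not held");
        }
        if (n == 1) {
            pins.remove(spillId);
        } else {
            pins.put(spillId, n - 1);
        }
    }

    /** A buffer is only a spill candidate when nobody holds its spill id. */
    synchronized boolean isSpillable(int spillId) {
        return !pins.containsKey(spillId);
    }
}

/** Hypothetical stand-in for a device buffer; it owns no real memory. */
final class SketchDeviceBuffer {
}

/** Wrapper with its own reference count, separate from the device buffer's. */
final class SpillableMemoryBuffer implements AutoCloseable {
    private final SketchCatalog catalog;
    private final int spillId;
    private final SketchDeviceBuffer buffer;
    private int refCount = 1;

    SpillableMemoryBuffer(SketchCatalog catalog, int spillId, SketchDeviceBuffer buffer) {
        this.catalog = catalog;
        this.spillId = spillId;
        this.buffer = buffer;
        catalog.acquire(spillId); // hold the spill id for as long as the wrapper is referenced
    }

    synchronized SpillableMemoryBuffer incRefCount() {
        if (refCount <= 0) {
            throw new IllegalStateException("buffer is already closed");
        }
        refCount++;
        return this;
    }

    synchronized SketchDeviceBuffer getBuffer() {
        if (refCount <= 0) {
            throw new IllegalStateException("buffer is already closed");
        }
        return buffer;
    }

    @Override
    public synchronized void close() {
        if (refCount <= 0) {
            throw new IllegalStateException("buffer is already closed");
        }
        refCount--;
        if (refCount == 0) {
            // Last reference is gone: release the spill id so the spill code
            // is free to actually reclaim the underlying device buffer.
            catalog.release(spillId);
        }
    }
}
```

With something like this, a ColumnarBatch built from a SpillableColumnarBatch would hold references to the wrapper rather than directly to the device buffer, so the catalog would only see the buffer as spillable once every consumer has closed its reference.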