`CompactProtocolWriter` and `ProtobufWriter` API provide no encapsulation of the output buffer
#7015
Labels: cuIO, improvement, libcudf
Problem:
The protocol writer classes take a pointer to a vector and use it as the output buffer. Writes change the size of this vector, and the vector is also modified (including size changes) outside of the writers. The ORC and Parquet writers each have a `std::vector` data member that is reused for protocol writes and manually reset between uses. The ORC writer also reuses the `ProtobufWriter` object. In addition, the Parquet writer reuses the output buffer to output data unrelated to `CompactProtocolWriter`. All of this makes the API error-prone to use.

Solution proposal:
Modify the protocol writer API to use an internal output buffer and only provide getters for it. Protocol writer objects should also not be reused (and cannot be, with the proposed API). There shouldn't be a buffer data member in `xyz::writer::impl`. These changes would limit the scope of the state to individual functions instead of the lifetime of `xyz::writer::impl` objects.
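
To make the proposal concrete, here is a rough sketch of what an encapsulated writer could look like. It is not the libcudf implementation: the class name `encapsulated_writer`, its methods `put_byte`, `put_uint`, `buffer`, and `release`, and the helper `encode_example_header` are all hypothetical names chosen for illustration, and the varint encoding is only there to give the writer something to write. The rvalue-qualified `release()` is one possible way to discourage reuse; the issue does not prescribe a specific mechanism.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a protocol writer; NOT the libcudf class.
// The output buffer is owned by the writer and cannot be resized from outside.
class encapsulated_writer {
 public:
  encapsulated_writer() = default;
  // Non-copyable, so the buffer state cannot be duplicated by accident.
  encapsulated_writer(encapsulated_writer const&)            = delete;
  encapsulated_writer& operator=(encapsulated_writer const&) = delete;

  // Append a single raw byte.
  void put_byte(uint8_t v) { buffer_.push_back(v); }

  // Append an unsigned integer as a base-128 varint (7 bits per byte,
  // high bit set on every byte except the last).
  void put_uint(uint64_t v)
  {
    while (v > 0x7f) {
      put_byte(static_cast<uint8_t>((v & 0x7f) | 0x80));
      v >>= 7;
    }
    put_byte(static_cast<uint8_t>(v));
  }

  // Read-only getter: callers can inspect the bytes but not modify them.
  std::vector<uint8_t> const& buffer() const& { return buffer_; }

  // Callable only on an rvalue: hands over the bytes and ends the writer's use.
  std::vector<uint8_t> release() && { return std::move(buffer_); }

 private:
  std::vector<uint8_t> buffer_;  // internal output buffer
};

// Hypothetical caller: the writer and its buffer live only inside this
// function, so no buffer data member is needed in the enclosing class.
std::vector<uint8_t> encode_example_header(uint64_t num_rows)
{
  encapsulated_writer w;
  w.put_byte(0x08);      // illustrative tag byte
  w.put_uint(num_rows);  // illustrative payload
  return std::move(w).release();
}
```

With this shape, each serialization step creates its own writer, and the only way to obtain the bytes is through the getter or the final `release()`, so there is no shared vector to reset manually between uses.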