Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CompactProtocolWriter and ProtobufWriter API provide no encapsulation of the output buffer #7015

Open
vuule opened this issue Dec 15, 2020 · 3 comments
Labels
cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.

Comments

@vuule
Copy link
Contributor

vuule commented Dec 15, 2020

Problem:
The protocol writer classes take a pointer to a vector and use it as the output buffer. Writes change the size of this vector. This vector is also modified (including size changes) outside of the writers. The ORC/Parquet writers have a std::vector data member that is reused for protocol writes and manually reset between uses. ORC writer also reuses the ProtobufWriter object. In addition, Parquet writer reuses the output buffer to output data unrelated to CompactProtocolWriter. All this makes the use error-prone.

Solution proposal:
Modify the protocol writer API to use an internal output buffer and only provide getters for it. Also, protocol writer objects should not be reused (and cannot, with the proposed API). There shouldn't be a buffer data member in xyz::writer::impl.
These changes would limit the scope of the state to functions instead of the lifetime of xyz::writer::impl objects.

@vuule vuule added libcudf Affects libcudf (C++/CUDA) code. cuIO cuIO issue tech debt improvement Improvement / enhancement to an existing function labels Dec 15, 2020
@harrism harrism changed the title CompactProtocolWriter and ProtobufWriter API provide no encapsulation of the output buffer [BUG] CompactProtocolWriter and ProtobufWriter API provide no encapsulation of the output buffer Dec 16, 2020
@vuule vuule changed the title [BUG] CompactProtocolWriter and ProtobufWriter API provide no encapsulation of the output buffer CompactProtocolWriter and ProtobufWriter API provide no encapsulation of the output buffer Dec 16, 2020
@github-actions
Copy link

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@github-actions github-actions bot added the stale label Feb 16, 2021
@vuule
Copy link
Contributor Author

vuule commented Feb 17, 2021

Issue still relevant.

@github-actions
Copy link

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuIO cuIO issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.
Projects
None yet
Development

No branches or pull requests

2 participants