-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix leaks in read_table #449
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #449 +/- ##
=======================================
Coverage ? 78.16%
=======================================
Files ? 55
Lines ? 11594
Branches ? 11594
=======================================
Hits ? 9062
Misses ? 2032
Partials ? 500 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the fixed problems seem to be due to g_arrow?
Yep. So much g_object_unref
:)
Could you elaborate on what parts are kernel footguns and why?
Main things I was referencing are:
get_raw_arrow_data
-> returns a leaked struct with two elements. This means you have to consider the lifetimes of the raw schema and data, which generally will go into garrow or similar, and the actual struct itself, which is confusing. I added a todo in the code for this where we could probably use a callback.kernel_scan_data_next
-> This uses a callback model, but the things we pass into the callback are owned by engine. This is an inversion of how we usually handle it in which we say that things that get passed into a callback will be freed by kernel after the callback completes. I'd like to have a look at this and see if there's a good reason we don't free on kernel side, or if we can just change those semantics.- Longer lived things like a
Scan
which also get "scan data" and then we have to individually free each. I'm not sure there's much we can do here, but it's still important that we carefully document this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
What changes are proposed in this pull request?
Fix leaks in
read_table
. It now runs cleanly with-fsanitize=address
.Some of these are probably footguns in kernel that we should fix. I've also added comments for now in kernel for some places where engine is responsible to free things. We should look at these and see if we can/should move to a callback model so the freeing is not necessary.
The good news is, this didn't find any leaks in kernel itself, just bad c code :)
Fixes #448
How was this change tested?
valgrind