feat: Implement ArrowArrayViewValidateFull()
#174
Codecov Report
@@ Coverage Diff @@
## main #174 +/- ##
=======================================
Coverage ? 93.42%
=======================================
Files ? 7
Lines ? 1750
Branches ? 54
=======================================
Hits ? 1635
Misses ? 84
Partials ? 31
LGTM, just a couple questions
src/nanoarrow/array.c
Outdated
"Expected element size >0 but found element size %ld at position %ld",
(long)diff, (long)i);
nit: the message is kind of a non-sequitur for the error case here
I took another pass at making them all a bit better!
Just a few small nits and questions.
/// type_id == union_type_id_map[128 + child_index]. This value may be
/// NULL in the case where child_id == type_id.
Is that part of the Arrow format? I couldn't find such a detail.
This is an implementation detail of the ArrowArrayView. In the spec the type ids are always there. Arguably they should always be here, too, but I discovered at least one test that exploits this behaviour, so I figured I should probably document it. I think it ended up that way because you can call ArrowArrayViewInitFromType() and ArrowArrayViewAllocateChildren(), and letting the type id map stay NULL saves some special casing of unions there.
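To make the documented behaviour concrete, here is a minimal sketch of a lookup that tolerates a NULL map. `TypeIdForChild` is a hypothetical helper, not a nanoarrow function; it assumes only what the doc comment above states (the type id for a child lives at index `128 + child_index`, and a NULL map means the type id equals the child index).

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of nanoarrow): resolve the union type id for
// a child. Assumes child_index is in the range 0..127 and that, when the map
// is present, it has at least 256 entries.
int8_t TypeIdForChild(const int8_t* union_type_id_map, int8_t child_index) {
  if (union_type_id_map == NULL) {
    // Documented special case: no map means type_id == child_index.
    return child_index;
  }

  // Documented layout: type_id == union_type_id_map[128 + child_index].
  return union_type_id_map[128 + child_index];
}
```

A caller that always goes through a helper like this does not need to know whether the map was ever populated.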
static int ArrowAssertInt8In(struct ArrowBufferView view, const int8_t* values,
                             int64_t n_values, struct ArrowError* error) {
  for (int64_t i = 0; i < view.size_bytes; i++) {
    int item_found = 0;
Question: why int over bool?
There's not really bool in C (there is _Bool, I suppose with C99 we can use that and I guess we're assuming C99 here?)
😮
There's #include <stdbool.h> too, which defines macros for bool, true, and false. I didn't use it initially because there are some functions that return true/false in public headers and I wanted as few includes there as possible. I should probably go through and make it consistent (I may have used char in some places).
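For comparison, a minimal sketch of the same kind of membership check written with <stdbool.h>. `Int8AllIn` is a hypothetical stand-in that takes plain pointers instead of an ArrowBufferView and reports no error message; it is not the function under review.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical stand-in (not part of nanoarrow): returns true if every
// element of data[0..n_data) appears somewhere in values[0..n_values).
bool Int8AllIn(const int8_t* data, int64_t n_data, const int8_t* values,
               int64_t n_values) {
  for (int64_t i = 0; i < n_data; i++) {
    bool item_found = false;
    for (int64_t j = 0; j < n_values; j++) {
      if (data[i] == values[j]) {
        item_found = true;
        break;
      }
    }

    if (!item_found) {
      return false;
    }
  }

  return true;
}
```

With C99 the int and bool versions behave identically here; the choice mostly affects readability and which headers a public API has to pull in.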
Co-authored-by: Will Jones <[email protected]>
Looks good!
This is required for IPC reading because corrupted offset and/or union type ID buffers could result in consumers accessing out-of-bounds elements.
ArrowArrayViewSetArray() already checked the last element of offset buffers against lengths but didn't check the first element and didn't check for negative differences between sequential offsets.
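The checks described above can be sketched roughly as follows for 32-bit offsets. `CheckOffsets32` and its parameters are hypothetical and only illustrate the idea (first offset non-negative, offsets never decreasing, last offset within the data buffer); the actual ArrowArrayViewValidateFull() implementation may differ.

```c
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper (not part of nanoarrow): returns 0 if the offsets are
// safe to use for indexing a data buffer of data_size_bytes bytes, or -1
// after writing a description of the first problem found to msg.
int CheckOffsets32(const int32_t* offsets, int64_t n_offsets,
                   int64_t data_size_bytes, char* msg, size_t msg_size) {
  if (n_offsets == 0) {
    return 0;
  }

  // The first offset must not be negative.
  if (offsets[0] < 0) {
    snprintf(msg, msg_size, "Expected first offset >= 0 but found %ld",
             (long)offsets[0]);
    return -1;
  }

  // Each offset must be >= the previous one (no negative element sizes).
  for (int64_t i = 1; i < n_offsets; i++) {
    int64_t diff = (int64_t)offsets[i] - offsets[i - 1];
    if (diff < 0) {
      snprintf(msg, msg_size,
               "Expected element size >= 0 but found %ld at position %ld",
               (long)diff, (long)i);
      return -1;
    }
  }

  // The last offset must not point past the end of the data buffer.
  if (offsets[n_offsets - 1] > data_size_bytes) {
    snprintf(msg, msg_size, "Last offset %ld exceeds data size %ld",
             (long)offsets[n_offsets - 1], (long)data_size_bytes);
    return -1;
  }

  return 0;
}
```

Because the offsets are checked to be non-decreasing, validating only the first and last values is enough to bound every element in between.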