Fixed empty reader panic for NDJSON type infer #974
assert_eq!(
    coerce_data_type(&[DataType::Null, DataType::Boolean]),
    DataType::Utf8
);
Is this intended? Should it not be just DataType::Boolean?
Similarly, {"a": null} is currently inferred as DataType::Struct(vec![]) and not DataType::Struct(vec![Field::new("a", DataType::Null, true)]).
Good observation. I think it is intended: infer_array does not pass DataType::Null to coerce_data_type, since arrow arrays are nullable and thus we handle Null ourselves before passing the types to coerce_data_type. The idea is here.
With that said, imo it is not the cleanest design and maybe we should clean it up?
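To make the split of responsibilities concrete, here is a minimal standalone sketch. It does not use arrow2: `ToyType`, `coerce` and `infer_column` are hypothetical stand-ins, and the coercion rules are simplified assumptions rather than the crate's actual logic. It only illustrates why coercing `[Null, Boolean]` directly falls back to Utf8, while a caller that strips Null first ends up with a nullable Boolean.

```rust
// Toy stand-in for arrow2's DataType. The variants and rules below are
// illustrative assumptions, not the crate's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Null,
    Boolean,
    Int64,
    Float64,
    Utf8,
}

/// Coerce a slice of types to one common type.
/// Identical types are kept, Int64/Float64 widen to Float64, and any
/// other mix (including one that contains Null) falls back to Utf8.
fn coerce(types: &[ToyType]) -> ToyType {
    let first = match types.first() {
        Some(t) => t.clone(),
        None => return ToyType::Null, // assumption: nothing to coerce
    };
    types[1..].iter().fold(first, |acc, t| {
        if acc == *t {
            acc
        } else {
            match (&acc, t) {
                (ToyType::Int64, ToyType::Float64)
                | (ToyType::Float64, ToyType::Int64) => ToyType::Float64,
                _ => ToyType::Utf8,
            }
        }
    })
}

/// Caller-side handling: drop Null before coercing and record
/// nullability separately, so `coerce` never sees Null next to other types.
fn infer_column(types: &[ToyType]) -> (ToyType, bool) {
    let non_null: Vec<ToyType> = types
        .iter()
        .filter(|t| **t != ToyType::Null)
        .cloned()
        .collect();
    let nullable = non_null.len() != types.len();
    (coerce(&non_null), nullable)
}
```

In this sketch, `coerce(&[ToyType::Null, ToyType::Boolean])` gives `ToyType::Utf8`, whereas `infer_column` on the same input gives `(ToyType::Boolean, true)`. A cleanup along the lines discussed above could move the Null handling into the coercion step itself, which is a design choice this sketch leaves open.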
Codecov Report
@@ Coverage Diff @@
## main #974 +/- ##
==========================================
+ Coverage 71.50% 71.52% +0.02%
==========================================
Files 356 356
Lines 19671 19679 +8
==========================================
+ Hits 14065 14076 +11
+ Misses 5606 5603 -3
This looks wonderful. Thanks a lot, very clean and super easy to follow.
Feel welcome to use this project to learn Rust - e.g. via incomplete PRs, PR reviews and/or challenging the design or code. :)
Hi there,
I am currently in the process of learning some Rust & what better way than to contribute ;)
infer. In deserialize as well, to have consistent behavior.
DataType::Null. I decided against that, since an empty string is not valid JSON. But null is, which also panicked; fixed that. Or do you prefer a separate PR?
coerce_data_type
Fixes #911
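For readers following along, a minimal sketch of the distinction drawn above between an empty reader and a `null` row. This is not the code in this PR: `InferError`, `read_rows` and `is_null_row` are hypothetical names, and both reporting empty input as an error and skipping blank lines are assumptions made for the illustration.

```rust
use std::io::BufRead;

// Hypothetical error type, used only in this sketch.
#[derive(Debug, PartialEq)]
enum InferError {
    Empty,      // the reader produced no rows
    Io(String), // an underlying read failed
}

/// Collect the non-blank rows of an NDJSON reader.
/// An empty reader is reported as an error instead of letting later code
/// index into an empty Vec and panic.
fn read_rows<R: BufRead>(reader: R) -> Result<Vec<String>, InferError> {
    let mut rows = Vec::new();
    for line in reader.lines() {
        let line = line.map_err(|e| InferError::Io(e.to_string()))?;
        // Assumption: blank lines are skipped rather than treated as rows.
        if !line.trim().is_empty() {
            rows.push(line);
        }
    }
    if rows.is_empty() {
        return Err(InferError::Empty);
    }
    Ok(rows)
}

/// A row holding the JSON literal `null` is valid JSON, unlike an empty
/// string, so it counts as a row and needs its own handling downstream.
fn is_null_row(row: &str) -> bool {
    row.trim() == "null"
}
```

With this shape, `read_rows("".as_bytes())` returns `Err(InferError::Empty)`, while `read_rows("null\n".as_bytes())` returns one row for which `is_null_row` is true.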