-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: remove panics in datafusion-common::scalar by making more operations return Result
#7901
chore: remove panics in datafusion-common::scalar by making more operations return Result
#7901
Conversation
The CI appears to be failing. Marking as draft until they are passing. If there are specific questions about this PR, please let us know. |
Thank you for the work @junjunjd |
863ed9a
to
4b61807
Compare
@alamb CI is fixed. This MR is ready for final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @junjunjd 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Epic work @junjunjd thanks
I'm thinking if we can get rid of .expect
? Or at least provide more details in expect message?
I removed the |
4b61807
to
60d1543
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get rid of .expect
in tests 🤔?
@Weijun-H @comphead using unwrap and expect in tests is actually the preferred practice, see https://github.com/influxdata/influxdb/blob/main/docs/style_guide.md#dont-return-result-from-test-functions. It makes the test failure easier to parse for a human and the test framework will already provide all the necessary context on failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @junjunjd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR @junjunjd. We very much appreciate the effort -- I am sorry that the ticket description may have been misleading
Buried in #3313, it says
The goal is not to remove all panics but review and make sure we are using them appropriately. Bonus points for adding documentation for invariants.
Can you explain why you removed the panics that you did? I think most of them are "unreachable" so forcing client code to check for errors that will never happen makes it harder to work with (and is why this PR adds around 500 new lines of code)
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue { | |||
let arr2 = list_arr2.value(i); | |||
|
|||
let lt_res = | |||
arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap(); | |||
arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This panic is reachable, for example if arr1
and arr2
have different data type, arrow::compute::kernels::cmp::lt
will panic.
I think it makes sense to return None
here instead of panicking and exiting since user just performs a partial order comparison.
This does not require any code change in client side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential downside of returning None rather than panic'ing is that it may mask a real bug and make it harder to track down -- comparing scalars of different types likely means they should have been coerced before
@@ -1970,13 +2020,14 @@ impl ScalarValue { | |||
), | |||
}, | |||
ScalarValue::Fixedsizelist(..) => { | |||
unimplemented!("FixedSizeList is not supported yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb What would be the preferred way to handle unimplemented errors in datafusion
? There are many places where a NotImplemented
error is returned instead of using unimplemented!
and panicking. IMO returning an error makes more sense as user can choose to ignore unimplemented errors instead of panicking and exiting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree returning NotYetImplemented is a better choice
@alamb Thanks for the review. The majority of the added lines should be caused by line wrap reformat in tests. I have added comments to the panics I removed in scalar.rs. To summarize, these panics can be categorized into five types in general:
|
Result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, thank you to @junjunjd for investing so much time, not just in the code but also evaluating the implications of the changes.
In general I think there are tradeoffs between panic
and returning Err
s -- specifically:
- Panic's are not as user friendly, but they stop computation immediately when something "unexpected" happens and thus are often easier to debug and locate the problem
- Errs are more user friendly, and can return messages that may help users workaround/fix whatever is wrong.
I realize there is a judgement call required to decide if something is "expected" or not and how much information users can get from error messages vs panics and weighing off a better user experience vs code that is more efficient to debug
I also realize the existing DataFusion codebase is not consistent in its handling of panics and errors.
On the balance I think this PR is an improvement to the code, and therefore I think it could be merged. Users of the affected APIs can simply unwrap
the Result and get the same panic behavior as before.
I think it would be ok not to merge it too if other reviewers feel strongly in the other direction.
@@ -330,9 +330,9 @@ impl PartialOrd for ScalarValue { | |||
let arr2 = list_arr2.value(i); | |||
|
|||
let lt_res = | |||
arrow::compute::kernels::cmp::lt(&arr1, &arr2).unwrap(); | |||
arrow::compute::kernels::cmp::lt(&arr1, &arr2).ok()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The potential downside of returning None rather than panic'ing is that it may mask a real bug and make it harder to track down -- comparing scalars of different types likely means they should have been coerced before
@@ -1970,13 +2020,14 @@ impl ScalarValue { | |||
), | |||
}, | |||
ScalarValue::Fixedsizelist(..) => { | |||
unimplemented!("FixedSizeList is not supported yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree returning NotYetImplemented is a better choice
@junjunjd can you please merge and resolve the conflicts in this PR? Then we can merge it in |
Marking as draft as it needs conflicts resolved prior to being mergable. |
60d1543
to
691ebf4
Compare
691ebf4
to
2026514
Compare
@alamb Thank you for the review! I rebased the MR. It is ready for final review/merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @junjunjd
This reverts commit e642cc2.
Which issue does this PR close?
It removes the majority of panics in datafusion-common::scalar #3313.
Rationale for this change
Important move towards closing #3313
Closes #3313
What changes are included in this PR?
Replace most of the panics in datafusion-common::scalar by
internal_err
,not_impl_err
or otherDataFusionError
variantsAre these changes tested?
Yes
Are there any user-facing changes?
No