-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add additional testing for unwrap_cast_in_comparison
#4147
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -379,8 +379,10 @@ fn try_cast_literal_to_type( | |
|
||
#[cfg(test)] | ||
mod tests { | ||
use super::*; | ||
use crate::unwrap_cast_in_comparison::UnwrapCastExprRewriter; | ||
use arrow::datatypes::DataType; | ||
use arrow::compute::{cast_with_options, CastOptions}; | ||
use arrow::datatypes::{DataType, Field}; | ||
use datafusion_common::{DFField, DFSchema, DFSchemaRef, ScalarValue}; | ||
use datafusion_expr::expr_rewriter::ExprRewritable; | ||
use datafusion_expr::{cast, col, in_list, lit, try_cast, Expr}; | ||
|
@@ -653,4 +655,196 @@ mod tests { | |
fn null_decimal(precision: u8, scale: u8) -> Expr { | ||
lit(ScalarValue::Decimal128(None, precision, scale)) | ||
} | ||
|
||
#[test] | ||
fn test_try_cast_to_type_nulls() { | ||
// test values that can be cast to/from all integer types | ||
let scalars = vec![ | ||
ScalarValue::Int8(None), | ||
ScalarValue::Int16(None), | ||
ScalarValue::Int32(None), | ||
ScalarValue::Int64(None), | ||
ScalarValue::Decimal128(None, 3, 0), | ||
ScalarValue::Decimal128(None, 8, 2), | ||
]; | ||
|
||
for s1 in &scalars { | ||
for s2 in &scalars { | ||
expect_cast( | ||
s1.clone(), | ||
s2.get_datatype(), | ||
ExpectedCast::Value(s2.clone()), | ||
); | ||
} | ||
} | ||
} | ||
|
||
#[test] | ||
fn test_try_cast_to_type_int_in_range() { | ||
// test values that can be cast to/from all integer types | ||
let scalars = vec![ | ||
ScalarValue::Int8(Some(123)), | ||
ScalarValue::Int16(Some(123)), | ||
ScalarValue::Int32(Some(123)), | ||
ScalarValue::Int64(Some(123)), | ||
ScalarValue::Decimal128(Some(123), 3, 0), | ||
ScalarValue::Decimal128(Some(12300), 8, 2), | ||
]; | ||
|
||
for s1 in &scalars { | ||
for s2 in &scalars { | ||
expect_cast( | ||
s1.clone(), | ||
s2.get_datatype(), | ||
ExpectedCast::Value(s2.clone()), | ||
); | ||
} | ||
} | ||
} | ||
|
||
#[test] | ||
fn test_try_cast_to_type_int_out_of_range() { | ||
let max_i64 = ScalarValue::Int64(Some(i64::MAX)); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If want to cover this case maybe adding the test then notate it with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe we should support unsigned data type in the rule There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah- as @liukun4515 mentions I have a PR queued up to support the unsigned case :) |
||
let max_u64 = ScalarValue::UInt64(Some(u64::MAX)); | ||
expect_cast(max_i64.clone(), DataType::Int8, ExpectedCast::NoValue); | ||
|
||
expect_cast(max_i64.clone(), DataType::Int16, ExpectedCast::NoValue); | ||
|
||
expect_cast(max_i64, DataType::Int32, ExpectedCast::NoValue); | ||
|
||
expect_cast(max_u64, DataType::Int64, ExpectedCast::NoValue); | ||
|
||
// decimal out of range | ||
expect_cast( | ||
ScalarValue::Decimal128(Some(99999999999999999999999999999999999900), 38, 0), | ||
DataType::Int64, | ||
ExpectedCast::NoValue, | ||
); | ||
|
||
expect_cast( | ||
ScalarValue::Decimal128(Some(-9999999999999999999999999999999999), 37, 1), | ||
DataType::Int64, | ||
ExpectedCast::NoValue, | ||
); | ||
} | ||
|
||
#[test] | ||
fn test_try_decimal_cast_in_range() { | ||
expect_cast( | ||
ScalarValue::Decimal128(Some(12300), 5, 2), | ||
DataType::Decimal128(3, 0), | ||
ExpectedCast::Value(ScalarValue::Decimal128(Some(123), 3, 0)), | ||
); | ||
|
||
expect_cast( | ||
ScalarValue::Decimal128(Some(12300), 5, 2), | ||
DataType::Decimal128(8, 0), | ||
ExpectedCast::Value(ScalarValue::Decimal128(Some(123), 8, 0)), | ||
); | ||
|
||
expect_cast( | ||
ScalarValue::Decimal128(Some(12300), 5, 2), | ||
DataType::Decimal128(8, 5), | ||
ExpectedCast::Value(ScalarValue::Decimal128(Some(12300000), 8, 5)), | ||
); | ||
} | ||
|
||
#[test] | ||
fn test_try_decimal_cast_out_of_range() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe I have tested unsupported cases in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree you have covered most (if not all ) of these cases already in The reason I want to add redundant coverage was while write tests for unwrapping timestamp casts I also wanted to verify they were consistent with the cast kernel which I couldn't figure out how to do easily. Also, the tests as written tested both So since the only code that is different for different data types was |
||
// decimal would lose precision | ||
expect_cast( | ||
ScalarValue::Decimal128(Some(12345), 5, 2), | ||
DataType::Decimal128(3, 0), | ||
ExpectedCast::NoValue, | ||
); | ||
|
||
// decimal would lose precision | ||
expect_cast( | ||
ScalarValue::Decimal128(Some(12300), 5, 2), | ||
DataType::Decimal128(2, 0), | ||
ExpectedCast::NoValue, | ||
); | ||
} | ||
|
||
#[test] | ||
fn test_try_cast_to_type_unsupported() { | ||
// int64 to list | ||
expect_cast( | ||
ScalarValue::Int64(Some(12345)), | ||
DataType::List(Box::new(Field::new("f", DataType::Int32, true))), | ||
ExpectedCast::NoValue, | ||
); | ||
} | ||
|
||
#[derive(Debug, Clone)] | ||
enum ExpectedCast { | ||
/// test successfully cast value and it is as specified | ||
Value(ScalarValue), | ||
/// test returned OK, but could not cast the value | ||
NoValue, | ||
} | ||
|
||
/// Runs try_cast_literal_to_type with the specified inputs and | ||
/// ensure it computes the expected output, and ensures the | ||
/// casting is consistent with the Arrow kernels | ||
fn expect_cast( | ||
literal: ScalarValue, | ||
target_type: DataType, | ||
expected_result: ExpectedCast, | ||
) { | ||
let actual_result = try_cast_literal_to_type(&literal, &target_type); | ||
|
||
println!("expect_cast: "); | ||
println!(" {:?} --> {:?}", literal, target_type); | ||
println!(" expected_result: {:?}", expected_result); | ||
println!(" actual_result: {:?}", actual_result); | ||
|
||
match expected_result { | ||
ExpectedCast::Value(expected_value) => { | ||
let actual_value = actual_result | ||
.expect("Expected success but got error") | ||
.expect("Expected cast value but got None"); | ||
|
||
assert_eq!(actual_value, expected_value); | ||
|
||
// Verify that calling the arrow | ||
// cast kernel yields the same results | ||
// input array | ||
let literal_array = literal.to_array_of_size(1); | ||
let expected_array = expected_value.to_array_of_size(1); | ||
let cast_array = cast_with_options( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. great test for the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am glad I had it too 😅 as I rediscovered that the arrow cast kernel doesn't support decimal <--> unsigned and so my initial implementation to support unsigned would have been inconsistent. |
||
&literal_array, | ||
&target_type, | ||
&CastOptions { safe: true }, | ||
) | ||
.expect("Expected to be cast array with arrow cast kernel"); | ||
|
||
assert_eq!( | ||
&expected_array, &cast_array, | ||
"Result of casing {:?} with arrow was\n {:#?}\nbut expected\n{:#?}", | ||
literal, cast_array, expected_array | ||
); | ||
|
||
// Verify that for timestamp types the timezones are the same | ||
// (ScalarValue::cmp doesn't account for timezones); | ||
if let ( | ||
DataType::Timestamp(left_unit, left_tz), | ||
DataType::Timestamp(right_unit, right_tz), | ||
) = (actual_value.get_datatype(), expected_value.get_datatype()) | ||
{ | ||
assert_eq!(left_unit, right_unit); | ||
assert_eq!(left_tz, right_tz); | ||
} | ||
} | ||
Comment on lines
+810
to
+838
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I want this coverage in particular to ensure that the casting we implement in unwrap cast is consistent with the arrow cast kernel -- this is especially important for timestamps (see #3938) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is super cool, double verification 💯 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
thanks @alamb There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, indeed this is why I care so much about timestamps :) |
||
ExpectedCast::NoValue => { | ||
let actual_value = actual_result.expect("Expected success but got error"); | ||
|
||
assert!( | ||
actual_value.is_none(), | ||
"Expected no cast value, but got {:?}", | ||
actual_value | ||
); | ||
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit confusing that this description is the same as the one below, so might make sense to actually emphasize this is only testing null values for integer types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, from the test case i got your mean.
You want to test that the
from value
isNull
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a copy/pasted comment 🤦 -- I will fix it in the next PR. Sorry for the confusion and nice spot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in #4148