-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CAST(tinyint/smallint/integer/bigint as varbinary) for Spark #9819
Conversation
✅ Deploy Preview for meta-velox canceled.
|
@PHILO-HE Would you like to take a review? Thanks. |
acab0aa
to
59045ff
Compare
Do you need to update the document? |
1d4bbb0
to
2dae8eb
Compare
@jinchengchenghh Updated the documentation. Thanks. |
a6a213a
to
6d1ecd0
Compare
@mbasmanova Could you help review this change? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rui-mo Looks great. Do we have Fuzzer coverage for this type of CAST?
@@ -184,3 +184,22 @@ Valid example | |||
SELECT cast(' -3E+2' as decimal(12, 2)); -- -300.00 | |||
SELECT cast('-3E+2 ' as decimal(12, 2)); -- -300.00 | |||
SELECT cast(' -3E+2 ' as decimal(12, 2)); -- -300.00 | |||
|
|||
Cast to Varbinary | |||
--------------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please, extend to cover full length of the title
velox/expression/CastExpr.cpp
Outdated
const BaseVector& input) { | ||
VELOX_USER_CHECK( | ||
hooks_->canCastIntToBinary(), | ||
"Cannot cast {} to VARBINARY.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems too late for this type of check. Shouldn't we fail much earlier when compiling the expression and creating an instance of CastExpr?
CC: @kagamiori
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this check to constructSpecialForm which will be called by compileRewrittenExpression.
velox/velox/expression/ExprCompiler.cpp
Lines 408 to 409 in b842c14
result = specialForm->constructSpecialForm( | |
resultType, std::move(compiledInputs), trackCpuUsage, config); |
velox/expression/CastExpr.cpp
Outdated
VectorPtr result; | ||
context.ensureWritable(rows, VARBINARY(), result); | ||
(*result).clearNulls(rows); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be simplified to
auto result = BaseVector::create<FlatVector<StringView>>(VARBINARY(), rows.end(), context.pool());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Updated.
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
956cfbd
to
9d05813
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have Fuzzer coverage for this type of CAST?
@mbasmanova We need to add below signatures in getSignaturesForCast for this type.
for (auto fromType : {"tinyint", "smallint", "integer", "bigint"}) {
signatures.push_back(makeCastSignature(fromType, "varbinary"));
}
velox/velox/expression/tests/ExpressionFuzzer.cpp
Lines 125 to 126 in b842c14
std::vector<facebook::velox::exec::FunctionSignaturePtr> | |
getSignaturesForCast() { |
But this modification would cause Presto fuzzer test to fail because of this check #9819 (comment). I wonder if I can follow-up below TODO in a separate PR to provide custom cast signatures for Presto and Spark. Thanks.
velox/velox/expression/tests/ExpressionFuzzer.cpp
Lines 251 to 254 in b842c14
"cast", | |
/// TODO: Add supported Cast signatures to CastTypedExpr and expose | |
/// them to fuzzer instead of hard-coding signatures here. | |
getSignaturesForCast(), |
That would be great. Thanks. |
velox/expression/CastExpr.cpp
Outdated
VectorPtr result = BaseVector::create<FlatVector<StringView>>( | ||
VARBINARY(), rows.end(), context.pool()); | ||
auto flatResult = result->asFlatVector<StringView>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BaseVector::create<FlatVector<StringView>>
returns FlatVectorPtr<StringView>
. If you want VectorPtr, you can just call BaseVector::create
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks the the pointer. Updated.
@@ -980,6 +1029,15 @@ ExprPtr CastCallToSpecialForm::constructSpecialForm( | |||
std::vector<ExprPtr>&& compiledChildren, | |||
bool trackCpuUsage, | |||
const core::QueryConfig& config) { | |||
const auto inputKind = compiledChildren[0]->type()->kind(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this happen after checking that compiledChildren.size() == 1 on L1041?
Might be safer to do that in CastExpr's constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch.
Might be safer to do that in CastExpr's constructor.
SparkCastExpr extends CastExpr, so the check would impact Spark as well if it's in the constructor of CastExpr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I thought you had an API in the CastHooks that tells whether this cast is supported or not. I see that you removed that API now.
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@rui-mo I'm seeing "Conbench performance report — Found 2 regressions". Would you take a look? CC: @assignUser @kgpai |
@mbasmanova merged this pull request in 9446f67. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
@mbasmanova Thanks for your review. It appears that no regression is mentioned in the most recent report. Please kindly contact me if I'm missing something. |
@rui-mo Perf testing can be flaky at times. This must be one of these cases. CC: @assignUser |
@mbasmanova @rui-mo Yeah looks like at the time this was committed the times where very close together so false positives can happen. But while checking this out I noticed that PR a day later clearly made the results worse: https://velox-conbench.voltrondata.run/compare/benchmark-results/0665116962307f828000737782ce0ccc...06651355ec2377e8800073bd529feb0a/ |
…ebookincubator#9819) Summary: Fixes facebookincubator#9820. Pull Request resolved: facebookincubator#9819 Reviewed By: amitkdutta Differential Revision: D57756701 Pulled By: mbasmanova fbshipit-source-id: 954440314175b4eb9a9dcba553909d448babd935
…ebookincubator#9819) Summary: Fixes facebookincubator#9820. Pull Request resolved: facebookincubator#9819 Reviewed By: amitkdutta Differential Revision: D57756701 Pulled By: mbasmanova fbshipit-source-id: 954440314175b4eb9a9dcba553909d448babd935
Fixes #9820.