[BUG] NDS query 72 logs codegen fallback exception and produces incorrect results #4403

Closed
jlowe opened this issue Dec 20, 2021 · 2 comments · Fixed by #4407
Labels: bug (Something isn't working), P0 (Must have for release)

Comments


jlowe commented Dec 20, 2021

Describe the bug
Running NDS query 72 on Spark 3.1.2 against partitioned data logs the following exception and then produces an empty result, which is incorrect.

21/12/20 21:58:53 WARN Predicate: Expr codegen error and falling back to interpreter mode
java.lang.RuntimeException: Unsupported literal type class org.apache.spark.sql.catalyst.expressions.UnsafeRow [0,2567cd,2a41,144e]
	at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:90)
	at org.apache.spark.sql.catalyst.expressions.InSet.$anonfun$genCodeWithSwitch$2(predicates.scala:542)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:321)
	at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:977)
	at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:977)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at scala.collection.AbstractSet.scala$collection$SetLike$$super$map(Set.scala:51)
	at scala.collection.SetLike.map(SetLike.scala:104)
	at scala.collection.SetLike.map$(SetLike.scala:104)
	at scala.collection.AbstractSet.map(Set.scala:51)
	at org.apache.spark.sql.catalyst.expressions.InSet.genCodeWithSwitch(predicates.scala:542)
	at org.apache.spark.sql.catalyst.expressions.InSet.doGenCode(predicates.scala:513)
	at org.apache.spark.sql.execution.InSubqueryExec.doGenCode(subquery.scala:159)
	at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:146)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:141)
	at org.apache.spark.sql.catalyst.expressions.DynamicPruningExpression.doGenCode(DynamicPruning.scala:93)
	at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:146)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:141)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.$anonfun$generateExpressions$1(CodeGenerator.scala:1187)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.TraversableLike.map(TraversableLike.scala:238)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
	at scala.collection.immutable.List.map(List.scala:298)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodegenContext.generateExpressions(CodeGenerator.scala:1187)
	at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.create(GeneratePredicate.scala:41)
	at org.apache.spark.sql.catalyst.expressions.codegen.GeneratePredicate$.generate(GeneratePredicate.scala:33)
	at org.apache.spark.sql.catalyst.expressions.Predicate$.createCodeGeneratedObject(predicates.scala:88)
	at org.apache.spark.sql.catalyst.expressions.Predicate$.createCodeGeneratedObject(predicates.scala:85)
	at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:52)
	at org.apache.spark.sql.catalyst.expressions.Predicate$.create(predicates.scala:101)
	at org.apache.spark.sql.rapids.GpuFileSourceScanExec.dynamicallySelectedPartitions$lzycompute(GpuFileSourceScanExec.scala:132)
	at org.apache.spark.sql.rapids.GpuFileSourceScanExec.dynamicallySelectedPartitions(GpuFileSourceScanExec.scala:120)
	at org.apache.spark.sql.rapids.GpuFileSourceScanExec.inputRDD$lzycompute(GpuFileSourceScanExec.scala:316)
	at org.apache.spark.sql.rapids.GpuFileSourceScanExec.inputRDD(GpuFileSourceScanExec.scala:293)
	at org.apache.spark.sql.rapids.GpuFileSourceScanExec.doExecuteColumnar(GpuFileSourceScanExec.scala:386)

Steps/Code to reproduce bug
Run NDS query 72 against partitioned data at scale factor 100.

Expected behavior
The query should run without emitting a warning and produce the same results as it does when run on the CPU.
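
Editorial sketch: reading the stack trace above from the bottom up, `GpuFileSourceScanExec.dynamicallySelectedPartitions` builds a `Predicate`, code generation reaches `InSet.genCodeWithSwitch`, and that method maps every element of the set through `Literal.apply`, which rejects an `UnsafeRow`. The Python below is only an analogy for that pattern, not Spark or plugin code. `Row`, `to_literal`, `make_predicate`, `prune`, and the partition key values are all hypothetical. It also illustrates one plausible way the empty result could arise, which the issue itself does not confirm: after the fallback, a set holding whole rows never matches a scalar partition value, so every partition is pruned.

```python
class Row:
    """Hypothetical stand-in for a row object wrapping column values."""

    def __init__(self, *values):
        self.values = values

    def __repr__(self):
        return f"Row{self.values}"


def to_literal(value):
    """Render a scalar as a literal for generated code; reject anything else."""
    if isinstance(value, (bool, int, float, str)):
        return repr(value)
    raise RuntimeError(
        f"Unsupported literal type {type(value).__name__} {value!r}"
    )


def make_predicate(in_set, log):
    """Try the 'codegen' path first, then fall back to plain membership."""
    try:
        literals = frozenset(to_literal(v) for v in in_set)
        return lambda key: repr(key) in literals
    except RuntimeError as err:
        log.append(f"WARN codegen error, falling back to interpreter mode: {err}")
        return lambda key: key in in_set


def prune(partition_keys, predicate):
    """Keep only the partition keys the predicate accepts."""
    return [k for k in partition_keys if predicate(k)]


log = []
partitions = [2451545, 2451546, 2451547]  # arbitrary illustrative keys

# Set built from whole rows: the codegen path fails, and the fallback matches
# nothing because a scalar partition key never equals a Row object.
wrapped = {Row(2451545), Row(2451547)}
pruned_wrapped = prune(partitions, make_predicate(wrapped, log))

# Set built from the unwrapped column values: the codegen path succeeds.
unwrapped = {2451545, 2451547}
pruned_unwrapped = prune(partitions, make_predicate(unwrapped, log))
```

In this toy model the wrapped set logs one warning and keeps no partitions, while the unwrapped set logs nothing and keeps the two matching keys. Whether the real fix takes this shape is for PR #4407 to say.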

jlowe added the bug, ? - Needs Triage, and P0 labels on Dec 20, 2021

jlowe commented Dec 20, 2021

I tracked down the cause of the failure to #4385. If I build the plugin just before that change, the query runs normally. cc: @sperlingxx

sperlingxx commented

Hi @jlowe, this is due to a blunder on my part. I have opened PR #4407 to fix the bug.

sameerz removed the ? - Needs Triage label on Dec 21, 2021