10:05:47 FAILED ../../src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_partial_replace_fallback[false-false-{'spark.rapids.sql.variableFloatAgg.enabled': 'true', 'spark.rapids.sql.hasNans': 'true', 'spark.rapids.sql.castStringToFloat.enabled': 'true', 'spark.rapids.sql.hashAgg.replaceMode': 'partial'}-[('a', RepeatSeq(Long)), ('b', RepeatSeq(Boolean)), ('c', LongRange(not_null))]][IGNORE_ORDER({'local': True}), APPROXIMATE_FLOAT, ALLOW_NON_GPU(ObjectHashAggregateExec,SortAggregateExec,ShuffleExchangeExec,HashPartitioning,SortExec,SortArray,Alias,Literal,Count,CollectList,CollectSet,GpuToCpuCollectBufferTransition,CpuToGpuCollectBufferTransition,AggregateExpression)]
and a lot of other parameterizations of that same test.
E py4j.protocol.Py4JJavaError: An error occurred while calling z:com.nvidia.spark.rapids.ExecutionPlanCaptureCallback.assertContains.
10:05:47 E : java.lang.AssertionError: assertion failed: Could not find GpuCollectList in the Spark plan
10:05:47 E ObjectHashAggregate(keys=[a#1739656L], functions=[collect_list(b#1739657, 0, 0), collect_set(b#1739657, 0, 0)], output=[a#1739656L, sort_array(collect_list(b), true)#1739667, sort_array(collect_set(b), true)#1739668])
10:05:47 E +- GpuColumnarToRow false
10:05:47 E +- GpuShuffleCoalesce 2147483647
10:05:47 E +- GpuCustomShuffleReader coalesced
10:05:47 E +- ShuffleQueryStage 0, Statistics(sizeInBytes=4.3 KiB, rowCount=100, isRuntime=true)
10:05:47 E +- GpuColumnarExchange gpuhashpartitioning(a#1739656L, 12), true, [id=#218025]
10:05:47 E +- GpuProject [a#1739656L, b#1739657]
10:05:47 E +- GpuProject [a#1739656L, b#1739657]
10:05:47 E +- GpuRowToColumnar targetsize(2147483647)
10:05:47 E +- *(1) Scan ExistingRDD[a#1739656L,b#1739657,c#1739658L]
10:05:47 E
10:05:47 E at scala.Predef$.assert(Predef.scala:223)
10:05:47 E at com.nvidia.spark.rapids.ExecutionPlanCaptureCallback$.assertContains(Plugin.scala:336)
10:05:47 E at com.nvidia.spark.rapids.ExecutionPlanCaptureCallback$.assertContains(Plugin.scala:341)
10:05:47 E at com.nvidia.spark.rapids.ExecutionPlanCaptureCallback.assertContains(Plugin.scala)
10:05:47 E at sun.reflect.GeneratedMethodAccessor472.invoke(Unknown Source)
10:05:47 E at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
10:05:47 E at java.lang.reflect.Method.invoke(Method.java:498)
10:05:47 E at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
10:05:47 E at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
10:05:47 E at py4j.Gateway.invoke(Gateway.java:295)
10:05:47 E at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
10:05:47 E at py4j.commands.CallCommand.execute(CallCommand.java:79)
10:05:47 E at py4j.GatewayConnection.run(GatewayConnection.java:251)
10:05:47 E at java.lang.Thread.run(Thread.java:748)
10:05:47
tgravescs changed the title from "[BUG] Databricks build fails test_hash_groupby_collect_partial_replace_fallback" to "[BUG] Databricks test fails test_hash_groupby_collect_partial_replace_fallback" on Aug 30, 2021
The test failed because the Databricks (DB) runtime eliminated the map-side combine of the map-reduce aggregation, so the physical plan contains only the reduce-side AggregateExec. The test, however, assumes that there are two AggregateExecs: one on the CPU and the other on the GPU.
I am trying to fix the tests, but I found that we need to rework tagForReplaceMode and the corresponding Python tests to adapt to the DB runtime, especially for distinct aggregation cases. I am working on it.
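To make the mismatch concrete, here is a rough sketch in plain Python that counts the aggregate nodes in a textual plan. It is not the plugin's `assertContains` logic; the helper `aggregate_nodes`, the regex, and the abbreviated plan strings are all assumptions made for illustration. The second plan is hypothetical and only shows the two-stage shape the test expects, including an assumed `GpuHashAggregate` node name.

```python
import re

# Plan abbreviated from the failure log in this issue (expression IDs dropped).
DB_PLAN = """\
ObjectHashAggregate(keys=[a], functions=[collect_list(b, 0, 0), collect_set(b, 0, 0)])
+- GpuColumnarToRow false
   +- GpuShuffleCoalesce 2147483647
      +- GpuColumnarExchange gpuhashpartitioning(a, 12), true
         +- GpuProject [a, b]
            +- GpuRowToColumnar targetsize(2147483647)
               +- Scan ExistingRDD[a,b,c]
"""

# Hypothetical two-stage plan (NOT from a real run): a final CPU aggregate
# above the shuffle and a partial GPU aggregate below it.
TWO_STAGE_PLAN = """\
ObjectHashAggregate(keys=[a], functions=[collect_list(b, 0, 0)])
+- GpuColumnarToRow false
   +- GpuColumnarExchange gpuhashpartitioning(a, 12), true
      +- GpuHashAggregate(keys=[a], functions=[partial_collect_list(b)])
         +- GpuRowToColumnar targetsize(2147483647)
            +- Scan ExistingRDD[a,b,c]
"""

# Matches a plan line whose node name looks like an aggregate operator,
# optionally prefixed with "Gpu". The name list is an assumption.
AGG_NODE = re.compile(r"^[\s:+\-]*((?:Gpu)?(?:ObjectHash|Hash|Sort)Aggregate)\b")


def aggregate_nodes(plan_text):
    """Return the aggregate node names found in a textual plan, top to bottom."""
    nodes = []
    for line in plan_text.splitlines():
        match = AGG_NODE.match(line)
        if match:
            nodes.append(match.group(1))
    return nodes


def has_gpu_aggregate(plan_text):
    """True if any aggregate node in the plan carries the Gpu prefix."""
    return any(name.startswith("Gpu") for name in aggregate_nodes(plan_text))


if __name__ == "__main__":
    print(aggregate_nodes(DB_PLAN), has_gpu_aggregate(DB_PLAN))
    print(aggregate_nodes(TWO_STAGE_PLAN), has_gpu_aggregate(TWO_STAGE_PLAN))
```

With only one aggregate left in the plan, a check that expects a GPU aggregate (and therefore `GpuCollectList`) next to a CPU one has nothing to match, which is consistent with the assertion error in the log.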