Fix test failures in hash_aggregate_test.py #11018

Closed
Tracked by #11004
razajafri opened this issue Jun 8, 2024 · 1 comment · Fixed by #11219
Labels: bug (Something isn't working), Spark 4.0+ (Spark 4.0+ issues)

Comments

@razajafri
Collaborator

FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_array
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_map
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_struct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_arithmetic_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_computation_in_grpby_columns
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_count_distinct_with_nan_floats
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_decimal128_count_group_by
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_decimal128_count_reduction
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_distinct_count_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_distinct_float_count_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_exceptAll
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_generic_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_force_pre_sort
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_with_nan_keys
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_with_struct_keys
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_aggregate_complete_with_grouping_expressions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_avg_nulls_partial_only
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_count_with_filter
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_byte_scalar
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal128_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal32_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal64_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_long_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_partial_replace_with_distinct_fallback
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_set
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_with_multi_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_with_single_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_single_distinct_collect
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_avg
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_avg_nulls
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum_count_action
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum_full_decimal
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_filters
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_grpby_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_mode_query
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_mode_query_avg_distincts
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_pivot_groupby_duplicates_fallback
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_query_max_with_multiple_distincts
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_query_multiple_distincts_with_non_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_avg_nulls
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_collect_set
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_decimal_overflow_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum_count_action
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum_full_decimal
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_intersectAll
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_array
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_map
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_struct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_struct_cast_groupby_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_struct_count_distinct_cast
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_subquery_in_agg
razajafri added the `bug` and `? - Needs Triage` labels on Jun 8, 2024
razajafri added the `Spark 4.0+` label on Jun 8, 2024
mattahrens removed the `? - Needs Triage` label on Jun 11, 2024
mythrocks self-assigned this on Jun 11, 2024
@mythrocks
Collaborator

mythrocks commented Jun 11, 2024

Trying to tackle the biggish ones first. It looks like the majority of the problems here occur with `spark.sql.ansi.enabled=true`. The tests pass with ANSI mode disabled:

=============== 1661 passed, 435 warnings in 1137.02s (0:18:57) ================
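
For readers unfamiliar with why the ANSI setting changes test outcomes, here is a minimal sketch of one well-known difference: overflow handling when summing 64-bit integers. It models the behaviour in plain Python and does not call Spark or the RAPIDS plugin. The function name `sum_longs` and its `ansi_enabled` parameter are assumptions made for illustration only.

```python
# Illustrative model only: plain Python, no Spark involved.
# sum_longs and ansi_enabled are hypothetical names, not project code.
LONG_MIN = -(2 ** 63)
LONG_MAX = 2 ** 63 - 1


def sum_longs(values, ansi_enabled):
    """Sum 64-bit integers, modelling overflow handling in each mode."""
    total = 0
    for v in values:
        total += v
        if total < LONG_MIN or total > LONG_MAX:
            if ansi_enabled:
                # ANSI mode: overflow is reported as an error.
                raise ArithmeticError("long overflow")
            # Non-ANSI mode: wrap around like a two's-complement long.
            total = (total - LONG_MIN) % (2 ** 64) + LONG_MIN
    return total
```

With ANSI off, `sum_longs([LONG_MAX, 1], ansi_enabled=False)` wraps to `LONG_MIN`; with ANSI on, the same call raises. A test written against one behaviour can therefore fail under the other.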

mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Jul 16, 2024
Fixes NVIDIA#11018.

This commit fixes the hash aggregate tests that fail with ANSI enabled.

These tests fail most visibly on Spark 4.0, where ANSI mode is enabled by default.

Signed-off-by: MithunR <[email protected]>
mythrocks added a commit that referenced this issue Jul 18, 2024

* Fix hash-aggregate tests failing in ANSI mode

Fixes #11018.

This commit fixes the tests in `hash_aggregate_test.py` so that they run correctly with ANSI enabled. This is essential for running the tests with Spark 4.0, where ANSI mode is on by default.

The vast majority of the tests here exercise aggregations such as `SUM`, `COUNT`, and `AVG`, which fall back to the CPU on account of #5114. These tests have been marked with `@disable_ansi_mode` so that they run to completion correctly. They may be revisited after #5114 has been addressed.
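
As a rough illustration of what a marker like `@disable_ansi_mode` accomplishes, the sketch below tags a test function and has a harness force `spark.sql.ansi.enabled` to `false` for tagged tests. This is not the project's implementation: `disable_ansi_mode_sketch`, `effective_conf`, and the example functions are hypothetical names, and the real marker may well be wired up differently.

```python
# Hypothetical sketch; not the spark-rapids implementation of the marker.
ANSI_KEY = "spark.sql.ansi.enabled"


def disable_ansi_mode_sketch(test_fn):
    """Tag a test function so a harness knows to turn ANSI mode off."""
    test_fn.ansi_disabled = True
    return test_fn


def effective_conf(test_fn, session_defaults):
    """Return the conf a harness might apply before running test_fn."""
    conf = dict(session_defaults)  # copy, so the defaults stay untouched
    if getattr(test_fn, "ansi_disabled", False):
        conf[ANSI_KEY] = "false"
    return conf


@disable_ansi_mode_sketch
def example_sum_test():
    pass  # body omitted; stands in for an aggregation test


def example_unmarked_test():
    pass  # stands in for a test that runs under the session default


# Assumed defaults for a session where ANSI mode is on by default.
ansi_on_defaults = {ANSI_KEY: "true"}
```

Under these assumptions, `effective_conf(example_sum_test, ansi_on_defaults)` yields a conf with ANSI off, while the unmarked test keeps the session default.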

In cases where #5114 does not apply, the tests have been modified to run with ANSI on and off.
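
Running a test "with ANSI on and off" can be pictured as executing the same check once per setting. The following sketch does this with plain Python helpers; `ansi_variants` and `run_in_both_modes` are hypothetical names chosen for illustration, and in a pytest suite the same effect would typically be achieved with `pytest.mark.parametrize` instead.

```python
# Hypothetical helpers; names are illustrative, not taken from the project.
ANSI_KEY = "spark.sql.ansi.enabled"


def ansi_variants(base_conf):
    """Yield one conf per ANSI setting, leaving base_conf untouched."""
    for ansi_enabled in ("true", "false"):
        conf = dict(base_conf)
        conf[ANSI_KEY] = ansi_enabled
        yield conf


def run_in_both_modes(check, base_conf):
    """Call check(conf) once per mode and collect results keyed by mode."""
    return {conf[ANSI_KEY]: check(conf) for conf in ansi_variants(base_conf)}
```

For example, `run_in_both_modes(lambda conf: conf[ANSI_KEY] == "true", {})` returns a dictionary with one entry per mode, which makes it easy to see that both settings were exercised.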

---------

Signed-off-by: MithunR <[email protected]>