test_hash_groupby_approx_percentile_long_repeated_keys failed during a Databricks premerge CI run with the following error:
[2021-09-29T00:36:22.861Z] =================================== FAILURES ===================================
[2021-09-29T00:36:22.861Z] ____________ test_hash_groupby_approx_percentile_long_repeated_keys ____________
[2021-09-29T00:36:22.861Z] [gw3] linux -- Python 3.7.10 /databricks/conda/envs/databricks-ml-gpu/bin/python
[2021-09-29T00:36:22.861Z]
[2021-09-29T00:36:22.861Z]     @ignore_order(local=True)
[2021-09-29T00:36:22.861Z]     def test_hash_groupby_approx_percentile_long_repeated_keys():
[2021-09-29T00:36:22.861Z]         compare_percentile_approx(
[2021-09-29T00:36:22.861Z]             lambda spark: gen_df(spark, [('k', RepeatSeqGen(LongGen(), length=20)),
[2021-09-29T00:36:22.861Z]                                          ('v', LongRangeGen())], length=100),
[2021-09-29T00:36:22.862Z] >           [0.05, 0.25, 0.5, 0.75, 0.95])
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z] ../../src/main/python/hash_aggregate_test.py:1084:
[2021-09-29T00:36:22.862Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z] df_fun = <function test_hash_groupby_approx_percentile_long_repeated_keys.<locals>.<lambda> at 0x7f088f769f80>
[2021-09-29T00:36:22.862Z] percentiles = [0.05, 0.25, 0.5, 0.75, 0.95]
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]     def compare_percentile_approx(df_fun, percentiles):
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         # create SQL statements for exact and approx percentiles
[2021-09-29T00:36:22.862Z]         p_exact_sql = create_percentile_sql("percentile", percentiles)
[2021-09-29T00:36:22.862Z]         p_approx_sql = create_percentile_sql("approx_percentile", percentiles)
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         def run_exact(spark):
[2021-09-29T00:36:22.862Z]             df = df_fun(spark)
[2021-09-29T00:36:22.862Z]             df.createOrReplaceTempView("t")
[2021-09-29T00:36:22.862Z]             return spark.sql(p_exact_sql)
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         def run_approx(spark):
[2021-09-29T00:36:22.862Z]             df = df_fun(spark)
[2021-09-29T00:36:22.862Z]             df.createOrReplaceTempView("t")
[2021-09-29T00:36:22.862Z]             return spark.sql(p_approx_sql)
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         # run exact percentile on CPU
[2021-09-29T00:36:22.862Z]         exact = run_with_cpu(run_exact, 'COLLECT', _approx_percentile_conf)
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         # run approx_percentile on CPU and GPU
[2021-09-29T00:36:22.862Z]         approx_cpu, approx_gpu = run_with_cpu_and_gpu(run_approx, 'COLLECT', _approx_percentile_conf)
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]         for result in zip(exact, approx_cpu, approx_gpu):
[2021-09-29T00:36:22.862Z]             # assert that keys match
[2021-09-29T00:36:22.862Z]             assert result[0]['k'] == result[1]['k']
[2021-09-29T00:36:22.862Z]             assert result[1]['k'] == result[2]['k']
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]             exact = result[0]['the_percentile']
[2021-09-29T00:36:22.862Z]             cpu = result[1]['the_percentile']
[2021-09-29T00:36:22.862Z]             gpu = result[2]['the_percentile']
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z]             if exact is not None:
[2021-09-29T00:36:22.862Z]                 if isinstance(exact, list):
[2021-09-29T00:36:22.862Z] >                   for x in zip(exact, cpu, gpu):
[2021-09-29T00:36:22.862Z] E                   TypeError: zip argument #3 must support iteration
[2021-09-29T00:36:22.862Z]
[2021-09-29T00:36:22.862Z] ../../src/main/python/hash_aggregate_test.py:1151: TypeError
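The trace shows that the third argument to `zip`, the `gpu` value, did not support iteration even though `exact` was a list. The log does not say what `gpu` actually was; `None` is one value that would produce this error and is used here purely as an assumption. The following minimal sketch reproduces the failure mode outside Spark and shows one possible guard. The helper names `reproduces_type_error` and `zip_percentiles` are hypothetical and are not part of `hash_aggregate_test.py`.

```python
def reproduces_type_error(exact, cpu, gpu):
    """Return True if zipping the three values raises TypeError.

    zip() calls iter() on every argument up front, so a non-iterable
    argument (for example None) fails before any element is produced.
    """
    try:
        list(zip(exact, cpu, gpu))
    except TypeError:
        return True
    return False


def zip_percentiles(exact, cpu, gpu):
    """Hypothetical guard: fail with a readable message instead of a TypeError.

    Assumes that when `exact` is a list, `cpu` and `gpu` should be lists too.
    """
    for name, value in (("cpu", cpu), ("gpu", gpu)):
        if not isinstance(value, list):
            raise AssertionError(
                f"{name} percentile is {value!r}, expected a list like exact={exact!r}"
            )
    return list(zip(exact, cpu, gpu))


# Assumed inputs for illustration only; the real values are not in the log.
print(reproduces_type_error([1.0, 2.0], [1.0, 2.0], None))
print(zip_percentiles([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]))
```

A guard like this does not address why the GPU result differs from the CPU result; it only turns the opaque `TypeError` into an assertion message that names the offending value.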
andygrove