[FEA] Write test wrapper to run SQL queries via pyspark #317

Closed
mythrocks opened this issue Jul 2, 2020 · 0 comments · Fixed by #383
Labels: feature request, test

Comments

@mythrocks
Collaborator

There are Python integration tests that construct data sources and run SQL queries. Consider this window-function test:

https://github.com/NVIDIA/spark-rapids/blob/branch-0.2/integration_tests/src/main/python/window_function_test.py#L79
def test_window_aggs_for_ranges(data_gen):
    df = with_cpu_session(
        lambda spark : gen_df(spark, data_gen, length=2048))
    df.createOrReplaceTempView("window_agg_table")
    assert_gpu_and_cpu_are_equal_collect(
        lambda spark: spark.sql(
            'select '
            ' sum(c) over '
            '   (partition by a order by cast(b as timestamp) asc  '
            '       range between interval 1 day preceding and interval 1 day following) as sum_c_asc '
            # ...
            'from window_agg_table'))

It would be good to have a pytest wrapper that does the following:

  1. Takes a test input data_gen, and a SQL string.
  2. Constructs an input DataFrame, and a corresponding input table.
  3. Runs the supplied SQL on CPU and GPU, and compares results.

Should be a tiny utility.
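
The three steps above could be sketched roughly as follows. This is only an illustration, not the utility that was eventually merged: the name `assert_sql_matches_on_cpu_and_gpu` and its parameters are hypothetical, and the comparison helper is passed in as an argument (assumed to behave like `assert_gpu_and_cpu_are_equal_collect` in the excerpt above) so that the sketch does not depend on the integration-test modules.

```python
from typing import Any, Callable


def assert_sql_matches_on_cpu_and_gpu(
    df_fun: Callable[[Any], Any],
    table_name: str,
    sql: str,
    assert_equal_collect: Callable[[Callable[[Any], Any]], None],
) -> None:
    """Hypothetical wrapper: register df_fun's output as a temp view, then run sql.

    df_fun(spark) builds the input DataFrame, for example
    ``lambda spark: gen_df(spark, data_gen, length=2048)``.
    assert_equal_collect is assumed to call the function it receives once per
    session (CPU and GPU) and to compare the collected results.
    """

    def run_query(spark):
        # Build the input and expose it to SQL inside the session that runs
        # the query, so each run registers its own copy of the view.
        df = df_fun(spark)
        df.createOrReplaceTempView(table_name)
        return spark.sql(sql)

    assert_equal_collect(run_query)
```

Under these assumptions, a test such as `test_window_aggs_for_ranges` would shrink to a single call that passes a `gen_df` lambda, the table name `"window_agg_table"`, the SQL string, and the existing comparison helper.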

@mythrocks added the "feature request" and "? - Needs Triage" labels Jul 2, 2020
@mythrocks self-assigned this Jul 2, 2020
@mythrocks added the "test" label and removed the "? - Needs Triage" label Jul 2, 2020
@mythrocks mythrocks added this to the Jul 6 - Jul 17 milestone Jul 2, 2020
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
[auto-merge] bot-auto-merge-branch-22.06 to branch-22.08 [skip ci] [bot]