Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Updated join tests for cache #286

Merged
merged 17 commits into from
Jul 21, 2020

Conversation

razajafri
Copy link
Collaborator

@razajafri razajafri commented Jun 25, 2020

This PR only deals with join for now. I will add more tests to this PR as I work through them.

  • Join tests
  • GenerateExec tests
  • ExpandExec tests
  • Sort tests
  • Filter tests

@revans2
Copy link
Collaborator

revans2 commented Jun 25, 2020

I'm not sure we want to modify the existing join tests to deal with caching. The reason for this is that cache is very complex, it has both an input and an output. Also the output can be reused, which is the entire reason for the cache. To really test that we are likely going to require some changes to how we do assertions to make this work.

This means we are going to want to run something on the CPU that produces a cache. Collect the results. Run something else on the CPU that uses the cache and collect its results. Then run the first thing on the GPU, and then the second thing on the GPU. Then finally assert that the first runs match and that the second runs match.

Tests that cache a df, then use the cached df to
execute another operation
@revans2
Copy link
Collaborator

revans2 commented Jun 29, 2020

build

integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/spark_session.py Outdated Show resolved Hide resolved
@sameerz sameerz added the test Only impacts tests label Jul 2, 2020
@razajafri razajafri added this to the Jul 6 - Jul 17 milestone Jul 2, 2020
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
@razajafri razajafri changed the title [WIP] Updated join tests for cache [REVIEW] Updated join tests for cache Jul 13, 2020
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved

assert_gpu_and_cpu_are_equal_collect(do_join)

#sort locally because of https://github.com/NVIDIA/spark-rapids/issues/84
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't actually sorting locally. Instead you removed -0.0 to avoid this issue.

integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
integration_tests/src/main/python/cache_test.py Outdated Show resolved Hide resolved
revans2
revans2 previously approved these changes Jul 17, 2020
@revans2
Copy link
Collaborator

revans2 commented Jul 17, 2020

build

1 similar comment
@revans2
Copy link
Collaborator

revans2 commented Jul 17, 2020

build

@revans2
Copy link
Collaborator

revans2 commented Jul 17, 2020

One of your new tests faild.

@razajafri
Copy link
Collaborator Author

build

@revans2 revans2 merged commit 8f038ae into NVIDIA:branch-0.2 Jul 21, 2020
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
pxLi pushed a commit to pxLi/spark-rapids that referenced this pull request May 12, 2022
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this pull request Nov 30, 2023
Signed-off-by: spark-rapids automation <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Only impacts tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants