[REVIEW] Updated join tests for cache #286
I'm not sure we want to modify the existing join tests to deal with caching. Caching is complex: it has both an input and an output, and the output can be reused, which is the entire reason for the cache. To really test that, we will likely need to change how we do assertions. We would run something on the CPU that produces a cache and collect the results, then run something else on the CPU that uses the cache and collect its results. Then run the first thing on the GPU, and then the second thing on the GPU. Finally, assert that the first runs match and that the second runs match.
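A rough sketch of that flow, in plain Python so it runs without Spark. Everything here is hypothetical and for illustration only: `run_cached_pair`, `assert_cached_pair_equal`, and the dict standing in for a session are not existing helpers in this repo, and the toy join is just a list comprehension rather than a real DataFrame operation.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in for a Spark session: holds the backend name and the cached result.
Session = Dict[str, Any]
Step = Callable[[Session], List[Any]]


def run_cached_pair(backend: str, produce: Step, consume: Step) -> Tuple[List[Any], List[Any]]:
    """Run the cache-producing step, then the cache-consuming step, on one backend."""
    session: Session = {"backend": backend, "cache": None}
    first = produce(session)      # the operation whose output gets cached
    session["cache"] = first      # stand-in for caching the result
    second = consume(session)     # a second operation that reuses the cache
    return first, second


def assert_cached_pair_equal(produce: Step, consume: Step) -> None:
    """CPU first and second runs, then GPU first and second runs, then compare pairwise."""
    cpu_first, cpu_second = run_cached_pair("cpu", produce, consume)
    gpu_first, gpu_second = run_cached_pair("gpu", produce, consume)
    assert cpu_first == gpu_first, "cache-producing runs differ between CPU and GPU"
    assert cpu_second == gpu_second, "cache-consuming runs differ between CPU and GPU"


def produce_join(session: Session) -> List[Any]:
    # Toy inner join on the first tuple element.
    left = [(1, "a"), (2, "b"), (3, "c")]
    right = [(2, "x"), (3, "y"), (4, "z")]
    return sorted((k, lv, rv) for k, lv in left for rk, rv in right if k == rk)


def consume_cached(session: Session) -> List[Any]:
    # Second operation that reads only from the cached join output.
    return [row for row in session["cache"] if row[0] > 2]


assert_cached_pair_equal(produce_join, consume_cached)
```

The point of the shape is that both results are collected per backend before anything is compared, so a mismatch in the cached output and a mismatch in the operation that reuses it are reported separately.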
Tests that cache a df, then use the cached df to execute another operation
build
assert_gpu_and_cpu_are_equal_collect(do_join)
#sort locally because of https://github.com/NVIDIA/spark-rapids/issues/84
This isn't actually sorting locally. Instead, you removed -0.0 to avoid this issue.
build
One of your new tests failed.
build
Signed-off-by: spark-rapids automation <[email protected]>
This PR only deals with join for now. I will add more tests to this PR as I work through them.