Enable memcheck for jni unit tests #1321
Signed-off-by: Liangcai Li <[email protected]>
some questions about the tooling, thanks
You can specify where Compute Sanitizer writes its text output; by default, it prints all output to stdout. Per my local tests, this tool only records the errors in the log, without interrupting the running process.
@GaryShen2008 may know what we should do.
Don't we want the premerge to fail if it detects an error? If it just records into a log on the side without failing the premerge or nightly build, nobody is going to think to check the log.
We could try |
Moving to draft, since the requirement is quite different from what I thought.
Thx, and yes, this option works: setting it to -1 makes the sanitizer fail if it detects any error, even if the application succeeds.
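To illustrate why a nonzero exit code is what lets a premerge or nightly job fail, here is a minimal Python sketch. It uses a stand-in child process that exits with -1 instead of a real sanitizer run, and the helper name `run_and_check` is hypothetical, not something from this PR.

```python
import subprocess
import sys


def run_and_check(cmd):
    """Run cmd and return (exit_code, failed) as a CI step would see it."""
    result = subprocess.run(cmd)
    # Any nonzero exit code is treated as a failed step.
    return result.returncode, result.returncode != 0


# Stand-in for a sanitizer run that detected errors: the child exits with -1.
code, failed = run_and_check([sys.executable, "-c", "import sys; sys.exit(-1)"])
print(code, failed)
```

On POSIX systems an exit value of -1 is typically reported to the parent as 255, since only the low 8 bits of the status are kept; either way it is nonzero, which is all the build step needs.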
Depends on rapidsai/cudf#13872
Please do not trigger premerge until rapidsai/cudf#13872 is merged.
Co-authored-by: Jason Lowe <[email protected]>
…pids-jni into tests-with-memcheck
@jlowe @abellina Liangcai helped record the test duration w/ sanitizer in the PR description
build
do you think we are okay to add it to nightly only (or a dedicated sanitizer pipeline) but not pre-merge CI?
Yes, that's totally reasonable to me. I'm fine with this going in as-is or we can disable the sanitizer option in the premerge-build.sh before it goes in.
@pxLi note that we should also enable the sanitizer builds on the cudf submodule sync job as well, since we don't want to pull in a cudf change that the sanitizer thinks is broken. @firestarman this is running the sanitizer on the Java unit test, but I assume it is not running it on the spark-rapids-jni C++ unit tests, correct? If so that can be a followup, since we should be running the sanitizer on those as well.
@jlowe Yeah, this is only for the Java unit tests. BTW, where can I find the spark-rapids-jni C++ unit tests? There is only a json file under https://github.com/NVIDIA/spark-rapids-jni/tree/branch-23.10/src/test/cpp/ .
I will follow up on updating the submodule sync-up CI.
So I am going to merge this as-is and file a new PR.
Here it is: #1347
Under src/main/cpp/tests.
Thx for filing #1353
closes #1221
compute-sanitizer is a tool that can detect some GPU memory related issues. This PR enables it in memory check mode for the JNI unit tests.
Some explanation of the Compute Sanitizer parameters:
--log-file should be used to avoid a corrupted-output issue from the surefire plugin.
--error-exitcode is used to fail the build process if any error is caught by the Compute Sanitizer.
--launch-timeout is set to 10 minutes, which should be enough since we monitor only the forked test processes.
Please note that it takes much more time to run the tests with the Compute Sanitizer. There are some numbers for tests whose elapsed time became quite long.
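As a rough illustration of how the three parameters above fit together, the following Python sketch assembles a compute-sanitizer command line in memcheck mode. The wrapped test command, the log file name, and the helper `build_sanitizer_command` are hypothetical; the PR's actual Maven/surefire wiring is not shown here. The sketch also assumes that `--launch-timeout` is given in seconds (so 10 minutes is 600) and uses 1 as an example error exit code.

```python
import shlex


def build_sanitizer_command(test_cmd, log_file, error_exitcode=1, launch_timeout_s=600):
    """Wrap test_cmd with compute-sanitizer in memcheck mode (illustrative only)."""
    return [
        "compute-sanitizer",
        "--tool", "memcheck",                      # memory check mode
        "--log-file", log_file,                    # keep sanitizer text out of stdout
        "--error-exitcode", str(error_exitcode),   # nonzero exit when errors are found
        "--launch-timeout", str(launch_timeout_s), # assumed to be in seconds
        *test_cmd,
    ]


# Hypothetical test command and log file name, for illustration only.
cmd = build_sanitizer_command(["java", "-jar", "tests.jar"], "sanitizer.log")
print(shlex.join(cmd))
```

Writing to a log file keeps the sanitizer's text away from the stream that surefire reads from the forked test process, and the nonzero error exit code is what allows the build to fail when memcheck reports a problem.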