
Reduce error handling verbosity in CI tests scripts #1113

Merged
2 commits merged into rapidsai:branch-23.04 on Feb 10, 2023

Conversation

AjayThorve
Member

This PR adds a less verbose trap method for error handling, to help ensure that we capture all potential error codes in our test scripts. It works as follows:

  • setting an environment variable, EXITCODE, with a default value of 0
  • setting a trap statement triggered by ERR signals, which sets EXITCODE=1 whenever a command returns a non-zero exit code (a sketch of the pattern is shown below)

cc @ajschmidt8
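For illustration, a minimal sketch of that pattern might look like the following (the commands here are placeholders, not the exact contents of the scripts touched by this PR):

#!/bin/bash
# Default to success; the ERR trap below flips this to 1 if any command fails.
EXITCODE=0
trap "EXITCODE=1" ERR
# Do not abort on the first failure so every remaining test still runs.
set +e

pytest dask_cuda                    # placeholder test command
python benchmarks/run_benchmark.py  # placeholder benchmark command

# Exit non-zero if any command above failed, zero otherwise.
exit ${EXITCODE}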

@AjayThorve requested a review from a team as a code owner February 8, 2023 23:41
@github-actions bot added the gpuCI label Feb 8, 2023
@madsbk added the improvement and non-breaking labels Feb 9, 2023
Member

@pentschev left a comment


Unless I'm missing some detail, I must say I don't like this approach. The problem is, and correct me if I'm wrong, that the script will terminate upon the first error and we would thus miss running all the other tests, which IMO is not something we want; if there's a problem, we would like to see everything that is failing, not just the first failure.

@AjayThorve
Member Author

Unless I'm missing some detail, I must say I don't like this approach. The problem is, and correct me if I'm wrong, that the script will terminate upon the first error and we would thus miss running all the other tests, which IMO is not something we want; if there's a problem, we would like to see everything that is failing, not just the first failure.

@pentschev from what I understand the set -e command makes sure we do not abort the script on error. This allows all tests to run regardless of pass/fail, but relies on the ERR trap above to manage the EXITCODE for the script.

So in essence, the behavior of the test scripts should not change with these changes; it's just a less verbose method of doing the same thing. @ajschmidt8 correct me if I am wrong.
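A quick way to see that in action (a toy snippet, not taken from the PR):

#!/bin/bash
EXITCODE=0
trap "EXITCODE=1" ERR
set +e

false                                       # fails; the ERR trap sets EXITCODE=1
echo "still running, EXITCODE=${EXITCODE}"  # prints: still running, EXITCODE=1
true                                        # later successes do not reset EXITCODE

exit ${EXITCODE}                            # exits with 1 because one command failed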

@pentschev
Member

@AjayThorve I think you meant set +e, but yes, it seems you're right. I did not know it would also apply to trap. I wrote a small script to confirm:

trap "echo CTRL+C pressed, exiting; break" SIGINT
set +e

while true
do
    echo "Press CTRL+C to stop"
    sleep 1
done

echo Trapped

With set +e on line 2, we see Trapped being printed after pressing CTRL+C, but if we write set -e instead, the script exits instantly. Therefore, I agree, this looks like a better way of handling issues. The only part I think is a pity to lose is the custom messages; depending on how a test ends, it may be difficult to discern which test(s) failed. However, I think we can still do that by rewriting the trap, for example:

set +e

trap "echo CTRL+C pressed, going to next loop; break" SIGINT
while true
do
    echo "Press CTRL+C to stop first trap"
    sleep 1
done

trap "echo CTRL+C pressed again, exiting; break" SIGINT
while true
do
    echo "Press CTRL+C to stop second trap"
    sleep 1
done

echo Trapped

The code above outputs the following:

$ bash trap.sh
Press CTRL+C to stop first trap
Press CTRL+C to stop first trap
^CCTRL+C pressed, going to next loop
Press CTRL+C to stop second trap
Press CTRL+C to stop second trap
^CCTRL+C pressed again, exiting
Trapped

I'm not sure this is absolutely necessary, but I'm throwing it out as an idea to see what you think.
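Applied to the ERR trap in the CI scripts, the same idea might look roughly like this (a sketch only; the commands and messages are hypothetical):

EXITCODE=0
set +e

trap "EXITCODE=1; echo 'pytest suite failed'" ERR
pytest dask_cuda                     # hypothetical test invocation

trap "EXITCODE=1; echo 'benchmark run failed'" ERR
python benchmarks/run_benchmark.py   # hypothetical benchmark invocation

exit ${EXITCODE}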

@ajschmidt8
Member

@pentschev, I'd like to keep this PR as it is for consistency with other projects.

To your point about easily discerning test failures: our shared workflow for Python tests makes use of the test-summary/action (see here for where we implement it).

This means that pytest job summaries can be viewed on the GitHub Action workflow summary cards (in addition to the job's raw log output).

Here's a quick example of what that looks like for a failing pytest for cuml.

@ajschmidt8
Member

@pentschev, I'll wait for your reply to merge.

Or feel free to merge this yourself if you're okay with the current changes.

@pentschev
Member

This means that pytest job summaries can be viewed on the GitHub Action workflow summary cards (in addition to the job's raw log output).

In our CI, not everything is a pytest; we also have some benchmarks that are just CLI tools, for example this and this. It's not a big deal currently, but I'd be interested in knowing if there's anything we can also add for non-pytest executables. Do you know if there's anything we can report based on their outputs, or just on their exit codes?

@ajschmidt8
Member

do you know if there's anything we can report based on their outputs, or just on their exit codes?

@pentschev, what's your ideal way to report?

GitHub Summary Cards are markdown files that you can redirect output to.

So you could effectively print anything you want to it.

In your shell script, you would just have to do something like:

echo "this is my output" >> "$GITHUB_STEP_SUMMARY"

@pentschev
Member

GitHub Summary Cards are markdown files that you can redirect output to.

This looks super cool. I think we could print a summary of our results there, but we'd probably have to rethink how we run those benchmarks, and whether it makes sense to run them here. In any case, I think this is a little beyond this PR, so let's merge it as is for now and I'll try to rethink this a bit later. Thanks so much for the suggestion, I was not aware of that!

@pentschev
Member

/merge

@rapids-bot bot merged commit f44ba33 into rapidsai:branch-23.04 Feb 10, 2023
@AjayThorve deleted the ci/update-error-handling branch February 28, 2023 18:42