ci: keep waiting on failing workloads for sending slack alerts #9371
✅ Deploy Preview for determined-ui canceled.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #9371   +/-   ##
=======================================
  Coverage   45.28%   45.29%
=======================================
  Files        1227     1227
  Lines      154048   154048
  Branches     2404     2403     -1
=======================================
+ Hits        69767    69773     +6
+ Misses      84089    84083     -6
  Partials      192      192
This reverts commit 6b99a16.
@@ -41,7 +41,7 @@ def send_alerts_for_failed_jobs(sent_alerts: Set[str]) -> bool:
             continue

         workflow_id = w["id"]
-        if not workflows_are_running and w["status"] == "running":
+        if not workflows_are_running and w["status"] in ["running", "failing"]:
Maybe w["stopped_at"] would be better to use?
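A rough sketch of what this suggestion could look like. It assumes each workflow dict carries a "stopped_at" field that stays unset (None or missing) until the workflow has fully stopped; that behavior is not confirmed anywhere in this thread, and is_workflow_finished is a hypothetical helper, not a function in the script:

```python
from typing import Any, Dict


def is_workflow_finished(w: Dict[str, Any]) -> bool:
    """Treat a workflow as finished only once it has a stop timestamp.

    Assumption: "stopped_at" is None or absent while any job in the
    workflow is still in progress, regardless of the "status" value.
    """
    return w.get("stopped_at") is not None


# Hypothetical sample data for illustration only.
in_progress = {"id": "wf-1", "status": "failing", "stopped_at": None}
done = {"id": "wf-2", "status": "failed", "stopped_at": "2024-05-13T22:00:00Z"}
```

If that assumption holds, the check would not need to enumerate every in-progress status by name.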
This reverts commit 0c6d970.
Ticket
Description
Sometimes we miss alerts because the alert job stops before all jobs have completed.
https://hpe-aiatscale.slack.com/archives/C9LFPNA3Y/p1715639036613519
Update the script to keep waiting while a workflow is in the failing state as well as the running state.
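A minimal sketch of the changed check, assuming each workflow is a dict with a "status" key as in the diff. The helper name any_workflow_still_active, the ACTIVE_STATUSES constant, and the sample data are hypothetical and only illustrate the idea:

```python
from typing import Any, Dict, Iterable

# Statuses in which the alert job should keep waiting. Treating "failing"
# as still in progress is the assumption behind this change: a job has
# already failed, but other jobs in the workflow may not be done yet.
ACTIVE_STATUSES = {"running", "failing"}


def any_workflow_still_active(workflows: Iterable[Dict[str, Any]]) -> bool:
    """Return True if at least one workflow may still have jobs in progress."""
    return any(w.get("status") in ACTIVE_STATUSES for w in workflows)


# Hypothetical sample data for illustration only.
sample_workflows = [
    {"id": "wf-1", "status": "success"},
    {"id": "wf-2", "status": "failing"},
]
```

With only "running" in the set, the sample above would count as finished and the alert job could exit before the remaining jobs in wf-2 report their results.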
Test Plan
Monitor whether alerts are still being missed after this change.
Checklist
Release notes should be added under docs/release-notes/. See Release Note for details.