feat: scale up for long waiting jobs (job retry) #4064
Nice! I have yet to test these changes, will do it after pressing this request changes button.
Co-authored-by: Brend Smits <[email protected]>
3726de8 to 161e50e
Looks good. I tested it and works :)
Good job 👍🏼
🤖 I have created a release *beep* *boop*

---

## [5.15.0](philips-labs/terraform-aws-github-runner@v5.14.1...v5.15.0) (2024-08-16)

### Features

* add time zone support for pool schedules ([#4063](https://github.com/philips-labs/terraform-aws-github-runner/issues/4063)) ([b8f9eb4](philips-labs/terraform-aws-github-runner@b8f9eb4)) @janslow
* scale up for long waiting jobs (job retry) ([#4064](https://github.com/philips-labs/terraform-aws-github-runner/issues/4064)) ([6120571](philips-labs/terraform-aws-github-runner@6120571))

### Bug Fixes

* **lambda:** bump axios from 1.7.2 to 1.7.4 in /lambdas ([#4071](https://github.com/philips-labs/terraform-aws-github-runner/issues/4071)) ([2f32195](philips-labs/terraform-aws-github-runner@2f32195))
* **lambda:** bump the aws group in /lambdas with 5 updates ([#4057](https://github.com/philips-labs/terraform-aws-github-runner/issues/4057)) ([5ecdbad](philips-labs/terraform-aws-github-runner@5ecdbad))
* **lambda:** bump the aws-powertools group in /lambdas with 3 updates ([#4058](https://github.com/philips-labs/terraform-aws-github-runner/issues/4058)) ([f9533f3](philips-labs/terraform-aws-github-runner@f9533f3))
* **lambda:** Prevent scale-up lambda from starting runner for user repo if org level runners is enabled ([#3909](https://github.com/philips-labs/terraform-aws-github-runner/issues/3909)) ([98b1560](philips-labs/terraform-aws-github-runner@98b1560)) @PerGon

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: forest-releaser[bot] <80285352+forest-releaser[bot]@users.noreply.github.com>
Thanks for this interesting option.
Description
This feature adds the capability to retry scaling up a runner when a job is still queued after a defined delay. It is added to avoid the need for a pool when using ephemeral runners.
Implementation
The module is extended with configuration to optionally enable one or more retries. Once enabled, the scale-up lambda publishes the same message it receives, extended with a counter, on a retry job queue with a delay. A new lambda picks the message from this queue and checks whether the job is still queued (via the GitHub API). If it is still queued, the message is published again on the job queue, which is the incoming queue of the scale-up lambda.
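The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the message fields, the `RetryConfig` shape, and the function names `buildRetryMessage` and `handleRetry` are all hypothetical, and the queue and GitHub lookups are passed in as plain callbacks rather than real SQS or Octokit calls.

```typescript
// Hypothetical message shape; the real module's schema may differ.
interface JobMessage {
  id: number;
  repositoryOwner: string;
  repositoryName: string;
  retryCounter?: number; // absent on the first delivery
}

// Hypothetical retry settings (names are assumptions for this sketch).
interface RetryConfig {
  enabled: boolean;
  maxAttempts: number;
  delayInSeconds: number;
}

interface DelayedMessage {
  body: JobMessage;
  delaySeconds: number;
}

// Scale-up side: copy the incoming message, bump the counter, and attach the delay.
// Returns undefined when retries are disabled or the attempt budget is used up.
function buildRetryMessage(message: JobMessage, config: RetryConfig): DelayedMessage | undefined {
  const attempt = (message.retryCounter ?? 0) + 1;
  if (!config.enabled || attempt > config.maxAttempts) {
    return undefined;
  }
  return { body: { ...message, retryCounter: attempt }, delaySeconds: config.delayInSeconds };
}

// Retry side: after the delay, look up the job status and republish the message
// to the scale-up queue only if the job is still queued.
async function handleRetry(
  message: JobMessage,
  lookupStatus: (m: JobMessage) => Promise<string>,
  publishToJobQueue: (m: JobMessage) => Promise<void>,
): Promise<boolean> {
  const status = await lookupStatus(message);
  if (status !== 'queued') {
    return false; // job was picked up or finished; nothing to do
  }
  await publishToJobQueue(message);
  return true;
}
```

Because the counter travels with the message, the scale-up lambda can decide on each pass whether another retry is allowed without keeping any state of its own.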
Consequences
Testing
Testing can be done as follows:
1. Trigger a workflow.
2. Terminate the created instance before the job starts.
3. Wait. After the delay, the retry job should publish the message again, which triggers the creation of a new instance.
Multi runners.
Default runners, not enabled, requires configuration update
Tasks