Adds swap space for ALI runners and report to ci and metrics if it is used #6058

Merged (2 commits) on Dec 13, 2024

jeanschmidt
Contributor

Adds a swap space for the autoscaled runners.

In the post_job step, prints whether swap usage was detected while the job was running, and sends per-job swap usage metrics.
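
A minimal sketch of how a post-job check could detect swap usage, assuming a Linux runner that exposes `/proc/meminfo`. The function names `swap_used_kb` and `report_swap_usage` are hypothetical and not taken from this PR, and the sketch only samples swap usage at one moment rather than tracking it over the whole job.

```shell
# Hypothetical helper (not from this PR): reads /proc/meminfo-style text on
# stdin and prints the amount of swap in use (SwapTotal - SwapFree) in kB.
swap_used_kb() {
  awk '
    $1 == "SwapTotal:" { total = $2 }
    $1 == "SwapFree:"  { free = $2 }
    END { print total - free }
  '
}

# Hypothetical post-job report: prints one line saying whether any swap is
# in use at the time of the check. Expects meminfo-style text on stdin.
report_swap_usage() {
  used="$(swap_used_kb)"
  if [ "$used" -gt 0 ]; then
    echo "swap in use at end of job: ${used} kB"
  else
    echo "no swap in use at end of job"
  fi
}

# Usage on a Linux host:
#   report_swap_usage < /proc/meminfo
```

Reading from stdin keeps the parsing separate from the data source, so the same function can be exercised with canned input.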

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 13, 2024
@@ -138,6 +138,11 @@ fi

${post_install}

sudo fallocate -l 3G /swapfile
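
Only the `fallocate` line is visible in this excerpt. For context, the sketch below prints (without running) the conventional Linux sequence for turning such a file into active swap. The `chmod`, `mkswap`, and `swapon` steps are assumptions based on standard practice, not lines confirmed from this diff, and `swapfile_commands` is a hypothetical helper name.

```shell
# Hypothetical helper (not from this PR): prints the usual command sequence
# for creating and enabling a swapfile. It does not execute anything, so it
# is safe to run without root.
#   $1: size understood by fallocate (e.g. 3G)
#   $2: path of the swapfile (e.g. /swapfile)
swapfile_commands() {
  size="$1"
  path="$2"
  echo "sudo fallocate -l ${size} ${path}"  # reserve the space (shown in the diff)
  echo "sudo chmod 600 ${path}"             # assumed: restrict access to root
  echo "sudo mkswap ${path}"                # assumed: write the swap signature
  echo "sudo swapon ${path}"                # assumed: enable the swap area
}

# Usage:
#   swapfile_commands 3G /swapfile
```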

@huydhn huydhn left a comment

LGTM! I'll test this out and land it next Monday to avoid deploying during the weekend.

@jeanschmidt jeanschmidt merged commit 1577e6b into main Dec 13, 2024
5 checks passed
@jeanschmidt jeanschmidt deleted the jeanschmidt/add_sawp_space branch December 13, 2024 21:19
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Dec 17, 2024
A swapfile on Linux runner has been prepared by pytorch/test-infra#6058.  So this PR does 2 things:

* Start using the swapfile on all Linux build and test jobs
* Testing the rollout https://github.com/pytorch-labs/pytorch-gha-infra/pull/582

### Testing

Run `swapon` inside the container and the swapfile shows up correctly:

```
jenkins@259dfb0a314c:~/workspace$ swapon
NAME      TYPE SIZE USED PRIO
/swapfile file   3G 256K   -2
```
Pull Request resolved: #143316
Approved by: https://github.com/ZainRizvi, https://github.com/atalman
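
The manual `swapon` check in the Testing section could be scripted along these lines. This is a sketch that assumes the table layout shown above (a header row, then one row per swap area with the name in the first column); `swap_is_active` is a hypothetical helper, not part of either PR.

```shell
# Hypothetical helper (not from either PR): reads a `swapon`-style table on
# stdin and succeeds (exit status 0) only if a swap area with the given
# NAME is listed.
#   $1: swap area name to look for (e.g. /swapfile)
swap_is_active() {
  awk -v path="$1" '
    NR > 1 && $1 == path { found = 1 }  # NR > 1 skips the header row
    END { exit (found ? 0 : 1) }
  '
}

# Usage on a host with util-linux:
#   swapon --show | swap_is_active /swapfile && echo "swapfile is active"
```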
aditew01 pushed a commit to aditew01/pytorch that referenced this pull request Dec 18, 2024