add profile runner deploy scripts
Signed-off-by: lrq619 <[email protected]>
leokondrashov authored and lrq619 committed Jan 18, 2024
1 parent 6f46bda commit a7b5284
Showing 3 changed files with 142 additions and 11 deletions.
77 changes: 77 additions & 0 deletions scripts/github_runner/README.md
@@ -0,0 +1,77 @@
These are the deployment scripts for the GitHub self-hosted runners, which execute some of the unit tests.

There are four self-hosted runners in total:
* cri-firecracker: Used for [firecracker cri tests](../../.github/workflows/integration_tests.yml)
* cri-gvisor: Used for [gvisor cri tests](../../.github/workflows/gvisor_cri_tests.yml)
* integ: Used for [integration tests](../../.github/workflows/integration_tests.yml)
* profile: Used for [profile unit tests](../../.github/workflows/unit_tests.yml), job: `profile-unit-test`

# Deploy Runners
Physical node configuration for the runners:
four nodes, each with 4 cores and 8 GB of memory (4C-8G) and 100 GB of storage. Suggested system image: `ubuntu-20.04-2nic`

How to deploy the four nodes:
1. Build the runner deployer
```
go build .
```
2. Modify `conf.json`

Edit `conf.json` so that it follows this format:
```
{
  "ghOrg": "<GitHub account>",
  "ghPat": "<GitHub PAT>",
  "hostUsername": "<username>",
  "runners": {
    "<hostname-1>": {
      "type": "cri",
      "sandbox": "firecracker"
    },
    "<hostname-2>": {
      "type": "cri",
      "sandbox": "gvisor"
    },
    "<hostname-3>": {
      "type": "integ",
      "num": 2,
      "restart": false
    },
    "<hostname-4>": {
      "type": "profile"
    }
  }
}
```

In `conf.json`, set `ghOrg` to `vhive-serverless` and `ghPat` to your own account's Personal Access Token (PAT). Your account must have the correct permissions for the `vhive-serverless` org.

`<username>` and `<hostname-1/2/3/4>` are the SSH username and the hostnames of the runner nodes. If you use `SCSE` cloud nodes as runners, `<hostname-1/2/3/4>` should be their IP addresses.
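
Before deploying, it can help to confirm that no template placeholders are left in the file. The following is a minimal sketch, not part of the repository: `check_placeholders` is a hypothetical helper that assumes placeholders are written as `<...>`, as in the template above, and that real values never contain angle brackets.

```shell
#!/bin/bash
# Hypothetical pre-flight check (not part of the vHive repository).
# Prints any lines that still contain a <...> placeholder and returns 1;
# returns 0 when the file exists and no placeholder remains.
check_placeholders() {
    local conf_file="$1"
    if [ ! -f "$conf_file" ]; then
        echo "config file not found: $conf_file" >&2
        return 1
    fi
    # Assumption: template placeholders look like <GitHub PAT> or <hostname-1>.
    if grep -n '<[^<>]*>' "$conf_file" >&2; then
        echo "unfilled placeholders remain in $conf_file" >&2
        return 1
    fi
    return 0
}
```

For example, `check_placeholders conf.json && ./deploy_runners` would start the deployer only when every placeholder has been filled in.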

After editing `conf.json`, deploy the runners remotely by running:
```
./deploy_runners
```

If this fails with an error like `dial unix: missing address`, start an SSH agent and add your key:
```
eval `ssh-agent`
ssh-add ~/.ssh/<private_key>
```
Here `<private_key>` is the key that has SSH access to all four runners, typically `id_rsa`.
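
The `dial unix: missing address` error typically means that no SSH agent socket was found. A plausible cause, assumed here rather than confirmed by the deployer's source, is that the `SSH_AUTH_SOCK` environment variable is empty. A small sketch of a check to run beforehand (`ssh_agent_available` is a hypothetical helper, not part of the repository):

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if SSH_AUTH_SOCK is set and
# points at an existing socket file, i.e. an agent is likely reachable.
ssh_agent_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "${SSH_AUTH_SOCK}" ]
}
```

If `ssh_agent_available` fails, start an agent and add the key with the two commands above before running the deployer again.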

# Restart Runners
On `SCSE` cloud, rebuild the four nodes and redeploy them.

# When to Restart Runners
Restart the runners when a firecracker or gvisor CRI test gets stuck at `helloworld is waiting for a Revision to be ready`:
<img width="814" alt="bc67c34ef2308282b8285077534667f" src="https://github.com/vhive-serverless/vHive/assets/58351056/78cea3f8-b42f-4807-ad7a-10fea14a8eea">

This usually means that the firecracker and gvisor CRI runners need to be restarted (you can also restart only the affected runner).
However, if a firecracker or gvisor CRI test passes the `Setup vHive CRI test environment` step and fails in the `Run vHive CRI tests` step, it is typically a sporadic failure that can be resolved by re-running the tests with the re-run button on the GitHub web page.

# Notes on the GitHub PAT
Steps for generating a GitHub PAT:
1. On your personal GitHub settings page, click `Developer settings` > `Personal access tokens` > `Tokens (classic)`. Do not generate `Fine-grained tokens`.
2. You can choose any expiration date. For the PAT scopes, checking `repo` is sufficient.
3. **NEVER** push your PAT to GitHub or any other public space. Doing so puts your GitHub account at risk, and when GitHub's scanning detects a publicly exposed PAT, the PAT is revoked.
16 changes: 5 additions & 11 deletions scripts/github_runner/deploy_runners.go
@@ -69,9 +69,7 @@ func main() {
 "num": 2
 }
 "pc75.cloudlab.umass.edu": {
-"type": "integ",
-"num": 6,
-"restart": true
+"type": "profile"
 }
 }
 }
@@ -124,7 +122,7 @@ func deployRunner(host string, runnerConf RunnerConf, deployerConf *DeployerConf
 }

 log.Debugf("Cloning vHive repository on %s@%s", deployerConf.HostUsername, host)
-out, err := client.Exec(fmt.Sprintf("rm -rf ./vhive ./runner && git clone --depth=1 https://github.com/%s/vhive", deployerConf.GhOrg))
+out, err := client.Exec(fmt.Sprintf("rm -rf ./vhive ./actions-runner && git clone --depth=1 https://github.com/%s/vhive", deployerConf.GhOrg))
 log.Debug(string(out))
 if err != nil {
 log.Fatalf("Failed to clone vHive repository on %s@%s: %s", deployerConf.HostUsername, host, err)
@@ -136,12 +134,6 @@ func deployRunner(host string, runnerConf RunnerConf, deployerConf *DeployerConf
 log.Debugf("Adding redeploy crontab task to %s@%s", deployerConf.HostUsername, host)
 setupCmd = fmt.Sprintf("cd vhive && ./scripts/github_runner/setup_bare_metal_runner.sh %s %s %s", deployerConf.GhOrg,
 deployerConf.GhPat, runnerConf.Sandbox)
-var redeploySetupCmd string = fmt.Sprintf("echo '10 4 * * * root rm -rf ./runners/ && %s' >> /etc/crontab", setupCmd)
-out, err = client.Exec(redeploySetupCmd)
-log.Debug(string(out))
-if err != nil {
-log.Fatalf("Failed to setup redeploy task on %s@%s: %s", deployerConf.HostUsername, host, err)
-}
 case "integ":
 var restart string
 if runnerConf.Restart {
@@ -160,8 +152,10 @@ func deployRunner(host string, runnerConf RunnerConf, deployerConf *DeployerConf

 setupCmd = fmt.Sprintf("cd vhive && ./scripts/github_runner/setup_integ_runners.sh %d %s %s %s", runnerConf.Num,
 deployerConf.GhOrg, deployerConf.GhPat, restart)
+case "profile":
+setupCmd = fmt.Sprintf("cd vhive && chmod +x ./scripts/github_runner/setup_profile_runner.sh && ./scripts/github_runner/setup_profile_runner.sh %s %s", deployerConf.GhOrg, deployerConf.GhPat)
 default:
-log.Fatalf("Invalid runner type: '%s', expected 'cri' or 'integ'", runnerConf.Type)
+log.Fatalf("Invalid runner type: '%s', expected 'cri', 'integ' or 'profile'", runnerConf.Type)
 }

 log.Debugf("Setting up runner on %s@%s", deployerConf.HostUsername, host)
60 changes: 60 additions & 0 deletions scripts/github_runner/setup_profile_runner.sh
@@ -0,0 +1,60 @@
#!/bin/bash

# MIT License
#
# Copyright (c) 2023 Lai Ruiqi, Dmitrii Ustiugov and vHive team
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# setup runner for profile unit test
GH_ORG=$1
GH_PAT=$2

sudo apt-get update
sudo apt-get install -y jq tmux

# Based on https://github.com/actions/runner/blob/0484afeec71b612022e35ba80e5fe98a99cd0be8/scripts/create-latest-svc.sh#L112-L131
RUNNER_TOKEN=$(curl -s -X POST https://api.github.com/repos/"$GH_ORG"/vhive/actions/runners/registration-token -H "accept: application/vnd.github.everest-preview+json" -H "authorization: token $GH_PAT" | jq -r '.token')
if [ "null" == "$RUNNER_TOKEN" ] || [ -z "$RUNNER_TOKEN" ]; then
    echo "Failed to get a runner token"
    exit 1
fi

cd "$HOME"
if [ ! -d "$HOME/actions-runner" ]; then
    mkdir actions-runner && cd actions-runner
    # Look up the download URL of the latest linux-x64 runner release
    LATEST_VERSION=$(curl -s https://api.github.com/repos/actions/runner/releases/latest | grep 'browser_' | cut -d\" -f4 | grep 'linux-x64-[0-9\.]*.tar.gz')
    curl -o actions-runner-linux-x64.tar.gz -L -C - "$LATEST_VERSION"
    tar xzf "./actions-runner-linux-x64.tar.gz"
    rm actions-runner-linux-x64.tar.gz
    chmod +x ./config.sh
    chmod +x ./run.sh
    # Register the runner with the repository under the "profile" label
    RUNNER_ALLOW_RUNASROOT=1 ./config.sh --url "https://github.com/$GH_ORG/vHive" \
        --token "${RUNNER_TOKEN}" \
        --name "profile-test-github-runner" \
        --work "$HOME/actions-runner/_work" \
        --labels "profile" \
        --unattended \
        --replace

fi

cd "$HOME/actions-runner"
# Start the runner in a detached tmux session
tmux new-session -d -s session_name 'RUNNER_ALLOW_RUNASROOT=1 ./run.sh'
echo "SETUP PROFILE FINISHED"
