Add initial GitHub Actions runner support #5970

Merged: bduffany merged 4 commits into master from runs-on-buildbuddy on Feb 28, 2024

Conversation

@bduffany bduffany commented Feb 22, 2024

When enabled (behind a flag), this PR allows using BuildBuddy to run GitHub Actions jobs in a warm microVM. Users can install the BuildBuddy GitHub app, then change their GitHub workflow YAML to specify runs-on: buildbuddy. BuildBuddy then spawns an ephemeral runner (as an RBE action) for each workflow_job.queued event and executes the runner within a warm microVM.
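As a rough illustration of the trigger condition described above, the sketch below decides whether a webhook event should result in a runner being spawned. The WorkflowJobEvent struct and the shouldSpawnRunner function are hypothetical simplifications for illustration and are not taken from this PR; only the "queued" action and the buildbuddy label come from the description.

```go
package main

import "fmt"

// WorkflowJobEvent is a simplified, hypothetical stand-in for the
// workflow_job webhook payload; only the fields used here are modeled.
type WorkflowJobEvent struct {
	Action string   // e.g. "queued", "in_progress", "completed"
	Labels []string // labels requested by the job's runs-on setting
}

// runnerLabel is the runs-on value described in the PR.
const runnerLabel = "buildbuddy"

// shouldSpawnRunner reports whether an ephemeral runner should be started:
// only for newly queued jobs that request the buildbuddy label.
func shouldSpawnRunner(e WorkflowJobEvent) bool {
	if e.Action != "queued" {
		return false
	}
	for _, l := range e.Labels {
		if l == runnerLabel {
			return true
		}
	}
	return false
}

func main() {
	e := WorkflowJobEvent{Action: "queued", Labels: []string{"buildbuddy"}}
	fmt.Println(shouldSpawnRunner(e))
}
```

The actual handler may apply additional checks (for example, the feature flag mentioned above) that are not modeled here.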

The ephemeral runner uses a "just-in-time config" which acts as a one-time-use token that allows the runner to execute a single job. Once the runner is done executing the job, it unregisters itself from the repo (i.e. it will no longer be usable for executing more jobs and won't show up in Settings > Actions > Runners). The config is scoped to a single repo but is not scoped to a particular job ID. So the scheduling model is less like "1 job ID => 1 runner ID" and more like "N jobs => N runners".
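For context, GitHub exposes a REST endpoint that creates a just-in-time runner configuration for a repository. The sketch below only builds such a request and does not send it. Whether this PR calls the endpoint directly or through a client library is not shown here; the helper name newJITConfigRequest, the runner name, and the runner group ID of 1 are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// jitConfigRequest mirrors the JSON body accepted by GitHub's
// generate-jitconfig endpoint for repository runners.
type jitConfigRequest struct {
	Name          string   `json:"name"`
	RunnerGroupID int64    `json:"runner_group_id"`
	Labels        []string `json:"labels"`
}

// newJITConfigRequest (hypothetical helper) builds the HTTP request that
// asks GitHub for a single-use runner config scoped to owner/repo.
func newJITConfigRequest(owner, repo, token, runnerName string) (*http.Request, error) {
	body, err := json.Marshal(jitConfigRequest{
		Name:          runnerName,
		RunnerGroupID: 1, // assumption: the default runner group
		Labels:        []string{"buildbuddy"},
	})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf(
		"https://api.github.com/repos/%s/%s/actions/runners/generate-jitconfig",
		owner, repo)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	req, err := newJITConfigRequest("example-org", "example-repo", "TOKEN", "runner-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

The response to this request carries an encoded config string that is handed to the runner at startup. Note that the request names a repository but no job ID, which matches the "N jobs => N runners" model described above.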

Limitations:

  • We don't yet actively monitor workflow job progress, so if a runner crashes or otherwise fails, jobs may be left stuck in the "queued" state. If this becomes a major issue while testing in dev, we can explicitly track workflow jobs and runner executions, and start extra runner executions if the number of jobs exceeds the number of runner executions. See the comment on actions/runner#620 ("Option to limit runner to a particular run or job") for more background.
  • We may occasionally spawn too many runners, e.g. if the user cancels a workflow job before it is accepted by a runner. This case is mitigated by applying an "idle timeout" to each runner. We execute the runner alongside a "monitor" process, which kills the runner after 5 minutes if it hasn't printed "Running job:" to its logs yet.
  • The runner image is based on our rbe-ubuntu20-04-workflows image, not GitHub's Ubuntu 20.04 machine image, so some system tools may be missing. This means that runs-on: buildbuddy does not yet "just work".
  • The runner image is pretty huge (3GB) and we don't warm it up yet during executor startup. We could either add the actions runner setup into our current workflows image (since we already warm up the workflows image), or find some way of speeding up image conversion, e.g. https://github.com/buildbuddy-io/buildbuddy-internal/issues/3137
  • Making use of a warm Bazel workspace requires a fair amount of manual setup in the action. In future PRs, we can provide a GitHub action like "buildbuddy-io/bazel-ci-setup" to make it easier. I did some testing in a separate private repo to make sure that this is feasible (and that we can reuse a Bazel server instance etc.). It took a while to sort out the issues, but the results are looking good so far: I tested with buildbuddy-io/buildbuddy and got some runs that take only a few seconds, with the git repository and Bazel server persisting between runs.
  • We don't yet set GIT_BRANCH env vars etc. in order to get the same quality of task => snapshot matching and executor routing that we have for BuildBuddy workflows. We can add this in later if the workflow_job event contains the necessary info, but for now I think we can live with actions being assigned to suboptimal snapshots just for GitHub Actions.

Related issues: N/A

@bduffany bduffany marked this pull request as ready for review February 22, 2024 16:25
@@ -204,7 +244,122 @@ func (a *GitHubApp) handleInstallationEvent(ctx context.Context, eventType strin
return nil
}

-func (a *GitHubApp) handleWorkflowEvent(ctx context.Context, eventType string, event any) error {
+func (a *GitHubApp) handleWorkflowJobEvent(ctx context.Context, eventType string, event *github.WorkflowJobEvent) error {
Member

I would be more explicit about calling these githubWorkflow events.

For better or worse, "workflow jobs" already means something to all of us, so it's kind of confusing to change the meaning of that name now.

Member Author

Renamed startWorkflowJob to startGitHubActionsRunnerTask.

I would like to keep the name of this func, handleWorkflowJobEvent, the same though. In handleWebhookEvent above, the function that handles *github.FooEvent is called handleFooEvent, which I think is a nice convention. To make this more obvious, the latest commit also gets rid of handleBuildBuddyWorkflowEvent, which was being used as a catch-all for the remaining event types and seemed unclear. It is replaced with explicit handlers for Push, PullRequest, and PullRequestReview events, and those handlers now call maybeTriggerBuildBuddyWorkflow.
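The naming convention discussed in this reply can be illustrated with a small dispatch sketch. The event types and the app struct below are hypothetical stand-ins defined locally, not the go-github types or the real GitHubApp, and the returned strings only label which path would be taken.

```go
package main

import "fmt"

// Hypothetical stand-ins for the webhook event types.
type PushEvent struct{}
type PullRequestEvent struct{}
type WorkflowJobEvent struct{}

type app struct{}

// Each handler is named handleFooEvent after the FooEvent type it accepts.
func (a *app) handlePushEvent(e *PushEvent) string {
	return "maybe trigger BuildBuddy workflow"
}

func (a *app) handlePullRequestEvent(e *PullRequestEvent) string {
	return "maybe trigger BuildBuddy workflow"
}

func (a *app) handleWorkflowJobEvent(e *WorkflowJobEvent) string {
	return "start GitHub Actions runner task"
}

// handleWebhookEvent dispatches on the concrete event type, with one
// explicit case per event instead of a catch-all handler.
func (a *app) handleWebhookEvent(event any) (string, error) {
	switch e := event.(type) {
	case *PushEvent:
		return a.handlePushEvent(e), nil
	case *PullRequestEvent:
		return a.handlePullRequestEvent(e), nil
	case *WorkflowJobEvent:
		return a.handleWorkflowJobEvent(e), nil
	default:
		return "", fmt.Errorf("unsupported event type %T", event)
	}
}

func main() {
	a := &app{}
	out, err := a.handleWebhookEvent(&WorkflowJobEvent{})
	fmt.Println(out, err)
}
```

With explicit cases, adding a new event type means adding one case and one handler, and unsupported events fail loudly instead of falling into a shared handler.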

enterprise/server/githubapp/githubapp.go (outdated, resolved)
@bduffany bduffany merged commit 4d707a2 into master Feb 28, 2024
16 checks passed
@bduffany bduffany deleted the runs-on-buildbuddy branch February 28, 2024 00:43
2 participants