Release Management

Pete Tollestrup edited this page Mar 22, 2023 · 11 revisions

Environments

As described in Environment Management, there are three full environments for FAM, called "DEV", "TEST" and "PROD". Each of these three environments contains the stack to support three different types of customer OIDC clients (inconveniently also named "DEV", "TEST" and "PROD"). This "nested" structure is ignored for the purposes of this document. When we refer to "DEV", "TEST" and "PROD" here, we mean the versions of the FAM stack that are deployed in the three corresponding AWS project environments.

Deployments

Deployments are currently triggered manually from GitHub Actions using workflows with the "workflow_dispatch" trigger. When triggering a workflow, the default for "Use workflow from" will be "Branch: main". The other options are to use a different branch or to specify a tag.

Note: The FAM team uses tag-based deployments for the TEST and PROD environments.

To run a deployment to the FAM "Development" environment:

  • Navigate to the Actions page of the repository.
  • Browse to the workflow labelled "DEPLOY Development Environment" (you may need to click on "Show more workflows" in order to see it).
  • Click "Run Workflow". The default for "Use workflow from" will be "Branch: main". This is normally the right choice.
  • Click "Run Workflow" (the green one).
  • You may have to refresh the page to see the new run. The workflow will be blocked pending approval (status of "Waiting"). In the DEV environment, the whole DEV team has access to run the workflows.
  • Click on the new workflow run.
  • Click "Review Deployments". (If you don't see "Review Deployments" it means you don't have permission.)
  • Select the "dev" checkbox and click "Approve and Deploy" to start the workflow run.
  • Currently the workflow is split into frontend and backend jobs. When the backend job is finished, the frontend job will wait for an approval to deploy (TODO: this should really be made atomic).
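
The approval steps above follow from how GitHub Actions treats jobs that target a protected environment. The fragment below is a minimal sketch, not the actual FAM workflow file: the job names and steps are placeholders, and it assumes the "dev" GitHub environment has required reviewers configured.

```yaml
# Hypothetical sketch; not the real FAM workflow file.
name: DEPLOY Development Environment

on:
  workflow_dispatch: # manual trigger; runs against the branch or tag chosen under "Use workflow from"

jobs:
  deploy-backend: # placeholder job name
    runs-on: ubuntu-latest
    environment: dev # the job waits for approval if "dev" has required reviewers
    steps:
      - uses: actions/checkout@v4
      - run: echo "deploy backend here" # placeholder step

  deploy-frontend: # placeholder job name
    needs: deploy-backend # starts only after the backend job succeeds
    runs-on: ubuntu-latest
    environment: dev # reached later, so approval is requested again
    steps:
      - uses: actions/checkout@v4
      - run: echo "deploy frontend here" # placeholder step
```

In a layout like this, the frontend job only reaches the environment gate once the backend job has finished, which would explain the second approval prompt.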

The currently available and supported deployment workflows are called:

  • DEPLOY Development Environment
  • DEPLOY Test Environment
  • Caution!! DEPLOY Production Environment

Note: The deploy workflows have a known race condition that causes them to intermittently fail the first time they are run after a destroy workflow. Running the workflow again corrects the problem.

Teardown

Teardown workflows exist for the backend only. We leave the frontend alone because the CloudFront URL assigned to it is configured in a DNS entry, and if the frontend is destroyed and recreated the URL will change. Changing DNS entries is not under the control of the Fingerprint team, so it is easier to leave the frontend in place and redeploy on top of it. Additionally, the frontend resources are cheap (static files on S3 and CloudFront), so tearing them down is not worth the effort.

The currently available and supported teardown workflows are called:

  • DESTROY Development Backend
  • DESTROY Test Backend
  • Caution!! DESTROY Production Backend

Workflow Security

The FAM repository is public, but GitHub Actions workflows with the workflow_dispatch trigger can only be invoked by team members. It is common to use workflows that are triggered on pull requests (our check-in tests are triggered this way). Care needs to be taken not to allow public GitHub users to run malicious actions (see this StackOverflow post).

Our workflows always target one of the GitHub environments that we created ("dev", "test", and "prod"). These environments are set up with environment protections that limit who can approve a workflow run into those environments (team members for "dev" and ops managers for "test" and "prod"), as well as which branches can be deployed into them ("main" for "test" and "prod"). Repository administrators can change these settings.

Additionally, each environment has its own set of GitHub Actions Secrets that must be present for the workflows to correctly deploy code or tear down an environment. Without these secrets it is impossible to set up a new workflow that will do anything destructive to an environment.
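
As an illustration of how environment-scoped secrets tie into the protections above, here is a minimal sketch. The job name, secret name and deploy script are placeholders, not values from the FAM repository.

```yaml
# Hypothetical sketch; names below are placeholders.
jobs:
  deploy: # placeholder job name
    runs-on: ubuntu-latest
    environment: test # environment secrets are only available to jobs that reference the environment
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh # placeholder deploy script
        env:
          # Placeholder secret name; resolves only after the "test" protection rules pass
          DEPLOY_ROLE_ARN: ${{ secrets.DEPLOY_ROLE_ARN }}
```

A job that omits the `environment:` key does not receive that environment's secrets, so a newly added workflow cannot reach them without first passing the environment's approval rules.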

Versioning

Versioning is auto-managed by Release Please integrated into the GitHub Actions pipeline. This job calculates the version tag for each release by inspecting the commit messages of the commits that have been made to the main branch since the last release.
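
For readers unfamiliar with Release Please, the sketch below shows one common way to wire it into GitHub Actions. It is an assumption-laden example rather than the FAM configuration: the workflow name and the `release-type` value are placeholders, and the real setup may differ.

```yaml
# Hypothetical sketch; not the actual FAM configuration.
# Release Please reads Conventional Commit prefixes on main. By default:
#   fix: ...                                   -> patch bump
#   feat: ...                                  -> minor bump
#   feat!: ... or a "BREAKING CHANGE:" footer  -> major bump
name: release-please

on:
  push:
    branches: [main]

permissions:
  contents: write # needed to create tags and releases
  pull-requests: write # needed to open and update the release pull request

jobs:
  release-please:
    runs-on: ubuntu-latest
    steps:
      - uses: googleapis/release-please-action@v4
        with:
          release-type: simple # assumption; the real release type is not stated in this page
```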

Typically, a version of the project will eventually be released to the test environment. At that time the main branch is tagged with a release version, and the DEPLOY Test Environment GitHub Actions workflow can be triggered from that tag. When the tested version is ready for prod, the Caution!! DEPLOY Production Environment workflow can be triggered from that same tag, ensuring that the exact version that was tested is the one that is released.

For more information about the release creation process, please reference the FAM Ops Guide ("Creating a Release").

Future Improvements

Automatic Deployments

Automatic deployments can also be triggered by merging code. The Fingerprint team found that the Terraform runs were taking a long time (and costing taxpayers money per run), so we split Continuous Integration (CI) and Continuous Deployment (CD) into two steps. Unit tests run on PR creation and on merge to main. Deployments are triggered manually. This can be adjusted if necessary.

Exception: We currently trigger deployments to the DEV environment on each merge to the main branch.
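
A merge-triggered deployment of this kind is typically expressed with a `push` trigger alongside the manual one. The fragment below is a sketch of that pattern, not a copy of the FAM workflow.

```yaml
# Hypothetical sketch of the trigger section only.
on:
  push:
    branches: [main] # runs on every push to main, including merged pull requests
  workflow_dispatch: # keeps the manual "Run workflow" button available
```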

Pull Request Environments

Some teams like to manage multiple "test" environments that are created one per pull request and torn down when the pull request is merged. Our stack creates and destroys expensive AWS components (RDS Proxy, for example) that we don't want to leave lying around in multiples if not necessary. Additionally, supporting this feature would require a multitude of Terraform Workspaces, which would need to be created in advance by the Cloud Pathfinder team. We are living without it for now.

Automatic Environment Teardown

Teardowns (and deploys) can be configured in GitHub Actions to execute on a schedule. We could save our project some money by tearing down DEV and TEST during non-working hours and on weekends. These functions are currently manual.
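
If scheduled teardowns were adopted, the trigger could look roughly like the sketch below. The cron expression is a placeholder chosen for illustration; the actual hours would need to be agreed by the team.

```yaml
# Hypothetical sketch of the trigger section for a teardown workflow.
on:
  schedule:
    # GitHub evaluates cron expressions in UTC.
    # Placeholder: 03:00 UTC, Tuesday through Saturday.
    - cron: "0 3 * * 2-6"
  workflow_dispatch: # keeps the manual trigger available
```

Scheduled workflows run against the latest commit on the repository's default branch, so a scheduled redeploy would pick up whatever is on main at that time rather than a specific tag.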
