
Database Migration Management

Basil Vandegriend edited this page Dec 13, 2022 · 3 revisions

As per the FAM Architecture page, FAM uses AWS RDS, provisioned using Terraform, for its database. The FAM team manages database migrations in an automated fashion using GitHub Actions and Terraform.

Requirements

  1. The database structure (database objects like tables, views, etc.) can change over time, both during development and especially after production go-live.
  2. Data changes need to be handled. This can be reference data (e.g. code sets) or test data.
  3. Automation should be smart (i.e. idempotent, desired-state configuration) in that it applies only changes not yet made in a particular environment. Rerunning the same automation a second time results in no changes.
  4. A method of rolling back bad changes is required, ideally using a fail-safe approach like database snapshots.
  5. There is a general 'DevOps' requirement to be able to provision new environments in a fully-automated fashion and apply updates to these environments as part of the CD pipeline.
  6. Prefer the use of open source tooling that is actively supported and broadly used within the BC government.
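
Requirement 3 can be illustrated with a minimal sketch (not the FAM implementation; Flyway provides this behavior out of the box): an applier records each script in a state table and skips any already applied, so a rerun is a no-op. All names here are illustrative.

```python
import sqlite3

# Minimal sketch of idempotent (desired-state) migrations: a state table
# records which scripts have run; a second run applies nothing new.
# Illustrative only -- Flyway provides this behavior for FAM.

MIGRATIONS = [
    ("V1__create_users.sql", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("V2__add_email.sql", "ALTER TABLE users ADD COLUMN email TEXT"),
]

def apply_migrations(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history (version TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_history")}
    ran = []
    for version, sql in MIGRATIONS:
        if version in applied:
            continue  # already applied in this environment: skip
        conn.execute(sql)
        conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
        ran.append(version)
    conn.commit()
    return ran

conn = sqlite3.connect(":memory:")
print(apply_migrations(conn))  # first run applies both scripts
print(apply_migrations(conn))  # second run is a no-op: empty list
```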

Solution

Key Technical Considerations

RDS runs inside a VPC and is not meant to expose connections to the general internet. Therefore there is no database connectivity from the GitHub Actions runner environment or the Terraform Cloud runner environment. Any database scripts must be run from a component that is inside the VPC and has privileges to connect to the database.

Options:

  1. Provision a temporary bastion host and connect via Session Manager
  2. Use an ECS task which runs a script
  3. Create a Lambda function inside the VPC and give it permissions to connect to the database

After a fairly extensive spike, we decided to run a Lambda function that could be invoked as a step in our normal pipeline.

Solution Overview

  • Terraform provisions RDS Aurora (PostgreSQL) with a sysadmin owner and a random password. The password is stored in an AWS secret.
  • Terraform provisions a Lambda function that lives in the VPC. The function comes from an open source project called flyway-lambda and has permissions to access the AWS secret holding the sysadmin username and password.
  • Terraform provisions a database secret containing the username and a randomly generated password for the API user (used later in SQL scripts that create the user in the database).
  • The deploy pipeline creates a database snapshot from a Terraform job each time code is pushed (deleting the snapshot from the previous deployment).
  • The deploy pipeline invokes the flyway-lambda function with the right parameters each time code is pushed. In particular, the Terraform job that invokes the Lambda function passes any necessary variables to the SQL scripts (for example, the username and password of the API database user).
  • The flyway-lambda function downloads the SQL scripts from GitHub and applies them to the database in the right order (replacing any variables as necessary).
  • The flyway-lambda function updates its own state table in the database to keep track of which migrations have been run.
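
The password and secret provisioning described above might look roughly like the following Terraform sketch. Resource names, attribute values, and the secret layout are illustrative assumptions, not FAM's actual configuration:

```hcl
# Illustrative sketch only -- names and attributes are assumptions,
# not FAM's actual Terraform configuration.

resource "random_password" "famdb_master_password" {
  length  = 32
  special = false
}

resource "aws_rds_cluster" "famdb" {
  cluster_identifier = "famdb-cluster"
  engine             = "aurora-postgresql"
  master_username    = "sysadmin"
  master_password    = random_password.famdb_master_password.result
  # ... VPC, subnet group, and snapshot settings omitted
}

resource "aws_secretsmanager_secret" "famdb_mastercreds" {
  name = "famdb_mastercreds"
}

resource "aws_secretsmanager_secret_version" "famdb_mastercreds" {
  secret_id = aws_secretsmanager_secret.famdb_mastercreds.id
  secret_string = jsonencode({
    username = "sysadmin"
    password = random_password.famdb_master_password.result
  })
}
```

The Lambda function's execution role would then be granted `secretsmanager:GetSecretValue` on this secret so flyway-lambda can read the credentials at run time.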

Usage Instructions

  1. Check out bcgov/nr-forests-access-management
  2. Change to the dev branch
  3. Create a new local branch from dev
  4. Add a new file in the directory "server/flyway/sql", following the Flyway naming conventions. If the change adds a database table, you will need to add a statement granting access to ${api_db_username}.
  5. Commit your change locally and push to a new branch on GitHub
  6. Create a pull request from the new GitHub branch back to the dev branch
  7. Wait for checks to pass and reviews to be approved
  8. Squash & merge the change to the dev branch
  9. GitHub Actions will run a Terraform Cloud job, which will invoke Flyway to execute your script
  10. Eventually, dev will be merged to test and prod via pull requests, and the same scripts will be run in those environments
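
A migration script from step 4 might look like the following. The file name, schema, and table are hypothetical; the ${api_db_username} placeholder is real and is substituted by the pipeline as described above:

```sql
-- Hypothetical migration file: server/flyway/sql/V5__create_example_table.sql
-- (version number, schema, and table are illustrative only)
CREATE TABLE app_fam.example_table (
    example_id  SERIAL PRIMARY KEY,
    description VARCHAR(200) NOT NULL
);

-- New tables need a grant so the API user can access them;
-- ${api_db_username} is replaced by the pipeline at deploy time.
GRANT SELECT, INSERT, UPDATE, DELETE ON app_fam.example_table TO ${api_db_username};
```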