There are two main backup actions to be taken:
- Back up the raw AdWords reports to S3
- Back up the database to S3
The first is simple and is working fine (for now…).
The second is proving more complicated:
- Need to restore roles/globals as well as data
- Smallest DigitalOcean box struggles a bit with the size of the SQL file (Docker images seem to eat a lot of disk space)
- Current solution is inelegant
Let’s think about what circumstances I want the system to be able to cope with. This will make it easier to come up with a viable solution.
- System is run for the first time (i.e. no backups exist to restore). System should set up appropriate databases/tables
- System is run for the first time on a new machine (i.e. backups exist and need to be restored). System should restore backups
- System is out of date. No backup needs to be restored, but database schema changes may need to be made.
There is overlap between scenarios one and three, and between scenarios two and three: in every case the schema may need to be brought up to date.
- Use pg_dumpall and pipe its output to gzip to avoid a big temp file
Could do even better using pg_dumpall --globals-only and then pg_dump on each database, but this seems complicated. The advantage is that both backup and restore could be parallelised.
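The "pipe to gzip" idea can be sketched in Python as below. This is only an illustration, not the script actually in use: the helper name `dump_to_gzip` and the output path are made up, and it assumes pg_dumpall is on the PATH and can connect without prompting for a password. The dump is streamed straight into the gzip file, so the uncompressed SQL never touches the disk.

```python
import gzip
import shutil
import subprocess


def dump_to_gzip(dump_cmd, out_path):
    """Run dump_cmd and stream its stdout into a gzip file at out_path.

    dump_cmd is a list such as ["pg_dumpall", "--username=postgres"].
    Nothing uncompressed is written to disk.
    """
    proc = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    with gzip.open(out_path, "wb") as gz:
        # Copy in chunks so memory use stays small even for large dumps.
        shutil.copyfileobj(proc.stdout, gz)
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError(f"dump command failed: {dump_cmd}")
    return out_path
```

Calling `dump_to_gzip(["pg_dumpall", "--username=postgres"], "backup.sql.gz")` would then produce a compressed dump of roles and all databases, ready to upload to S3 (the username here is an assumption).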
The restore script must do the following:
- Check whether the database exists and has the expected data etc.
- If not, check whether a backup exists
- If a backup exists, restore it
- Check for, and apply, any schema updates
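The decision logic above can be sketched as a small pure function. The names (`plan_restore`, the step labels) are hypothetical, and the no-backup branch assumes that creating an empty database and then applying schema updates is enough to set up the tables on a first run. The actual checks (does the database exist, is there a backup on S3) are left to the caller.

```python
def plan_restore(db_ready, backup_exists):
    """Return the ordered list of steps the restore script should run.

    db_ready:      True if the database exists and holds the expected data.
    backup_exists: True if a backup is available to restore from.
    """
    steps = []
    if not db_ready:
        if backup_exists:
            # New machine: bring back roles/globals and data from the backup.
            steps.append("restore_backup")
        else:
            # Very first run: nothing to restore, start from an empty database.
            steps.append("create_empty_database")
    # Always finish by bringing the schema up to date.
    steps.append("apply_schema_updates")
    return steps
```

This covers all three scenarios: a first run gives `create_empty_database` then `apply_schema_updates`, a new machine gives `restore_backup` then `apply_schema_updates`, and an out-of-date system gives only `apply_schema_updates`.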