Disaster recovery instructions
These are brief instructions; for more thorough instructions, see the Restoring from VM Snapshot Guide.
Create a new machine using the snapshot. If you need to restore the database, follow the instructions below, starting from step 3 ("upload").
Once you have the VM created, SSH in and edit the .env to have the right settings for the current domain/IP you're running from, which include:
- BROKER_URL and RABBITMQ_HOST will need to contain the full DNS name instead of rabbit
- SSL_ALLOWED_HOSTS
- CODALAB_SITE_DOMAIN should be something like competitions.codalab.org
- CHAHUB_API_URL and CHAHUB_API_KEY
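As a minimal sketch of applying these edits non-interactively, the hypothetical helper `set_env_var` below sets one `KEY=value` line in a .env file. It is not part of codalab-competitions, it assumes plain unquoted `KEY=value` lines, and the domain in the usage comment is a placeholder.

```shell
# Sketch: set KEY=VALUE in a .env file (hypothetical helper, not part of the repo).
# Any existing assignment of KEY is dropped and the new one is appended at the end.
set_env_var() {
    local file="$1" key="$2" value="$3"
    local tmp
    [ -f "$file" ] || return 1
    tmp="$(mktemp)"
    # Keep every line except existing assignments of KEY.
    # grep -v exits non-zero when nothing is left, which is not an error here.
    grep -v "^${key}=" "$file" > "$tmp" || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the original file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage (placeholder domain):
#   set_env_var .env CODALAB_SITE_DOMAIN competitions.example.org
#   set_env_var .env RABBITMQ_HOST competitions.example.org
```

Rewriting through a temporary file avoids `sed` escaping problems when a value contains characters such as `/` or `&`, at the cost of moving the key to the end of the file.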
Download the backup from storage. How you do this depends on your storage endpoint and will be different for Minio, S3, Azure, and Google Cloud Storage.
In our case, for production, backups are accessible via SSH on our Minio server. Backups are stored in /data/prod-private/backups.
On your new server...
Set up Docker and docker-compose for your platform
Clone the codalab-competitions GitHub repository:
$ git clone https://github.com/codalab/codalab-competitions.git
$ cd codalab-competitions
Upload your SSL certificates to /path/to/codalab-competitions/certs
Make a docker-compose.override.yml according to these instructions
Check with your administrator for a backed-up .env and put it in /path/to/codalab-competitions/.env
Otherwise, start from the production sample:
cp .env_production_sample .env
Set up the initial environment variables
Pay particular attention to:
- BROKER_URL and RABBITMQ_HOST will need to contain the full DNS name instead of rabbit
- RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS
- FLOWER_BASIC_AUTH
- EMAIL_HOST, EMAIL_HOST_USER, etc. email settings
- ADMINS
- SSL_CERTIFICATE
- SSL_CERTIFICATE_KEY
- SSL_ALLOWED_HOSTS
- CODALAB_SITE_DOMAIN should be something like competitions.codalab.org
- CHAHUB_API_URL and CHAHUB_API_KEY
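Before starting the services, it can help to confirm that none of these variables was left empty. The sketch below uses a hypothetical helper, `check_env_keys`, and assumes plain `KEY=value` lines; a quoted empty value such as `KEY=""` would still count as set.

```shell
# Sketch: report which keys are missing or empty in a .env file.
# check_env_keys is a hypothetical helper, not part of codalab-competitions.
check_env_keys() {
    local file="$1"
    shift
    local key missing=0
    for key in "$@"; do
        # A key counts as set only if something follows the '='.
        if ! grep -Eq "^${key}=.+" "$file"; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    # Non-zero exit status when at least one key needs attention.
    return "$missing"
}

# Example usage with some of the variables named above:
#   check_env_keys .env BROKER_URL RABBITMQ_HOST RABBITMQ_DEFAULT_USER \
#       RABBITMQ_DEFAULT_PASS FLOWER_BASIC_AUTH SSL_ALLOWED_HOSTS CODALAB_SITE_DOMAIN
```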
Make sure the following ports are open (respecting your .env settings):
- 80
- 443
- 5555
- 5671/5672
- 15671/15672
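One way to probe these ports from another machine is bash's `/dev/tcp` redirection, sketched below. The helpers `check_port` and `check_ports` are hypothetical, and the sketch assumes bash and the coreutils `timeout` command are available. A port only reports as open once a service is listening on it, so run this after the containers are up.

```shell
# Sketch: probe TCP ports with bash's /dev/tcp (hypothetical helpers).
check_port() {
    local host="$1" port="$2"
    # Succeeds only if a TCP connection can be opened within 3 seconds.
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

check_ports() {
    local host="$1"
    shift
    local port
    for port in "$@"; do
        if check_port "$host" "$port"; then
            echo "open: ${port}"
        else
            echo "closed or filtered: ${port}"
        fi
    done
}

# Example usage (replace <your domain>; adjust the list to your .env settings):
#   check_ports <your domain> 80 443 5555 5671 5672 15671 15672
```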
docker-compose up -d
Upload your backup to /path/to/codalab-competitions/backups; this directory is accessible from the Postgres Docker container
Using DB_NAME and other settings from .env ...
The backup we uploaded will be available from inside the container at /app/backups/<filename>.dump
# from inside the codalab-competitions directory
$ docker-compose exec postgres bash
container$ dropdb $DB_NAME -U $DB_USER
container$ createdb $DB_NAME -U $DB_USER
container$ pg_restore -U $DB_USER -d $DB_NAME -1 /app/backups/<filename>.dump
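Since `dropdb` runs before the restore, it may be worth checking the backup file first. `pg_restore` reads archive files, not plain SQL dumps, and a `pg_dump` custom-format archive begins with the bytes `PGDMP`. The sketch below assumes the backup was made in custom format (this page does not say how it was made); `is_pg_custom_dump` is a hypothetical helper.

```shell
# Sketch: check that a backup looks like a pg_dump custom-format archive.
# is_pg_custom_dump is a hypothetical helper, not part of codalab-competitions.
is_pg_custom_dump() {
    local file="$1"
    # Custom-format archives start with the magic bytes "PGDMP".
    # tr strips NUL bytes so other binary files compare cleanly.
    [ -f "$file" ] && [ "$(head -c 5 "$file" | tr -d '\0')" = "PGDMP" ]
}

# Example usage, before dropping the database:
#   is_pg_custom_dump backups/<filename>.dump || echo "not a custom-format dump"
```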
The last step, to fix a RabbitMQ problem, is to set the login password properly. Set the guest account password to the same value as the RABBITMQ_DEFAULT_PASS variable from .env. You do this in the RabbitMQ management portal: http://<your domain>:15672/#/users/guest (or whatever your RabbitMQ management port is).
Then, in the "Update this user" section, you can configure the password.
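To avoid copying the password by hand, the value can be read straight from .env. This is a minimal sketch with a hypothetical helper, `get_env_var`, that assumes plain unquoted `KEY=value` lines. The command-line alternative in the usage comment uses `rabbitmqctl change_password`, and it assumes the RabbitMQ service in docker-compose is named `rabbit`, which this page does not confirm.

```shell
# Sketch: print the value of KEY from a .env file (hypothetical helper).
get_env_var() {
    local file="$1" key="$2"
    # Take the last assignment of KEY and keep everything after the first '='.
    grep "^${key}=" "$file" | tail -n 1 | cut -d= -f2-
}

# Example usage:
#   pass="$(get_env_var .env RABBITMQ_DEFAULT_PASS)"
# Possible alternative to the portal (service name "rabbit" is an assumption):
#   docker-compose exec rabbit rabbitmqctl change_password guest "$pass"
```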
- Make sure all ports are open.
- Create a competition.
- Create a queue. (This confirms the RabbitMQ connection; you do not need to make a compute worker.)
- Make a submission to the competition.
- Make sure all ports are open.
- Create a competition.
- Create a queue.
- Edit the competition to point to the queue.
- Create a compute worker machine somewhere pointing to this BROKER_URL.
$ sudo apt-get install -y python3-pip
$ pip3 install tqdm requests
$ python3 /path/to/codalab-competitions/scripts/test_restored_instance.py -w True -d <domain> -u <username> -p <password>