It is possible that Docker hasn't correctly reloaded or restarted between Chef runs, or that a container is stuck.
There are several services required to operate the runners cache.
The blackbox prober instances look like this:
- Docker registry: runners-cache-X.gitlab.com:5000/v2
- Minio object storage: runners-cache-X.gitlab.com:9000/minio/login
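The two probe targets above can also be checked by hand. The following is a minimal sketch, not the prober's actual configuration: the function names `cache_probe_urls` and `probe_cache` are hypothetical, and the `http` default scheme is an assumption, since the targets above are listed without one.

```shell
#!/bin/sh
# Hypothetical helper: print the two probe URLs for a cache host.
# $1: hostname, $2: scheme (assumed to be http; the page does not state it).
cache_probe_urls() {
  printf '%s://%s:5000/v2\n' "${2:-http}" "$1"
  printf '%s://%s:9000/minio/login\n' "${2:-http}" "$1"
}

# Hypothetical helper: request each URL and print "<status> <url>".
# curl reports status 000 when it gets no HTTP response at all.
probe_cache() {
  cache_probe_urls "$@" | while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    printf '%s %s\n' "$code" "$url"
  done
}

# Usage:
#   probe_cache runners-cache-1.gitlab.com
```

A 5xx status or 000 on either line suggests which service to look at first; a 2xx status alone does not prove the cache is healthy.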
Nginx acts as a proxy for the registry, which is backed by minio.
Both the registry and minio run as containers. To check their status, run: sudo docker ps -a
We're assuming the hostname is runners-cache-1.gitlab.com for the rest of this page.
- Log into the runners-cache instance that is alerting.
- Try to open https://runners-cache-1.gitlab.com/minio/login. If you receive a 502 error, the cache is down. Bear in mind that it could be down even if you get the login page.
- If you receive no response at all, check nginx with sudo service nginx status. If the state is Active: inactive, start it with: sudo service nginx start
- Check that minio is up: sudo docker ps | grep minio
- Check whether the registry container is receiving requests: sudo docker logs --tail 1 registry. If the last log entry is more than 10 minutes old, you need to recycle the container.
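The 10-minute check on the registry log can be scripted. This is a sketch under stated assumptions: `is_stale` is a hypothetical helper, it relies on GNU date (`date -d`), and it reads the timestamp that `docker logs --timestamps` prepends to each line rather than parsing the registry's own log format.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a timestamp is older than a threshold.
# $1: RFC 3339 timestamp, $2: max age in seconds (default 600 = 10 minutes),
# $3: "now" as epoch seconds (defaults to the current time; useful for testing).
# The body runs in a subshell so its variables do not leak to the caller.
is_stale() (
  ts=$1
  max_age=${2:-600}
  now=${3:-$(date +%s)}
  last=$(date -u -d "$ts" +%s) || exit 2   # GNU date; exit 2 on a parse error
  [ $((now - last)) -gt "$max_age" ]
)

# Usage on the cache host:
#   last_ts=$(sudo docker logs --timestamps --tail 1 registry 2>&1 | tail -n 1 | cut -d' ' -f1)
#   is_stale "$last_ts" && echo "registry looks idle; consider recycling it"
```

The threshold is a parameter so the same helper can be reused if the 10-minute rule of thumb changes.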
Usually you need to restart the containers:
- Log in to runners-cache-1.gitlab.com.
- Remove both containers, whether running or stopped: sudo docker rm -f minio registry
- Run sudo chef-client to recreate and start them.
- Check that they started correctly by inspecting the logs.
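The restart steps above can be sketched as one function. The names `run` and `recycle_cache` and the `DRY_RUN` switch are hypothetical additions for illustration, and the choice of 20 log lines is arbitrary; the commands themselves are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: remove both containers, let Chef recreate them,
# then show recent log lines from each so startup errors are visible.
recycle_cache() (
  run sudo docker rm -f minio registry
  run sudo chef-client || exit 1
  for name in minio registry; do
    run sudo docker logs --tail 20 "$name"
  done
)

# Usage:
#   DRY_RUN=1 recycle_cache   # print the commands without running them
#   recycle_cache             # run them on the cache host
```

Running with `DRY_RUN=1` first is a cheap way to confirm what will be executed before removing anything.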