Most Dockerfiles start from a parent image. If you need to completely control the contents of your image, at some moment you realize that you need your own base image. An application base image is a special Docker image that is configured for correct use within Docker containers and is aware of your application's requirements and expectations, plus a set of additional tools.
Usually these include, but are not limited to:
- Modifications for Docker-friendliness.
- Application-specific libraries and frameworks.
- Administration tools that are especially useful for troubleshooting inside Docker.
- Mechanisms for easily running multiple processes without violating the Docker philosophy.
Perhaps every company or agency dealing with dockerized applications has implemented such an image. Until now you could usually find it in the form of complex shell scripts and makefiles, especially if the base image is released for multiple OSes.
The closer the ansible-container 1.0.0 release gets, the higher the chance that you will give Ansible a try to take care of your next dockerized application base image.
Let's briefly dive into the components you might use in such an image.
You can launch the build process using any tool you like. From my experience, I have come quite far with the following approach: https://github.com/Voronenko/container-image-boilerplate
requirements.txt - define any specific Python tools and libraries you want to use together with ansible-container.

.projmodules - list of dependencies (roles, deployables, etc.) needed to compile the base image. The format is similar to .gitmodules, but without direct links to commits. Example:
[submodule "roles/sa-nginx-container"]
path = roles/softasap.sa-nginx-container
url = https://github.com/softasap/sa-nginx-container.git
[submodule "roles/sa-uwsgi-container"]
path = roles/softasap.sa-uwsgi-container
url = https://github.com/softasap/sa-uwsgi-container.git
[submodule "roles/sa-include"]
path = roles/softasap.sa-include
url = https://github.com/softasap/sa-include.git
[submodule "deployables/application"]
path = deployables/application
url = https://github.com/voronenko-p/django-sample.git
init.sh && init_quick.sh - resolve external dependencies under the specified locations (roles under roles/, deployables under deployables/, etc.).
ansible.cfg - empty by default, but you can adjust parameters for your build process according to the documentation.
version.txt - version information file for gitflow-based releasing (x.y.z).
image.txt - additional information about the constructed image, used by the tagging and pushing parts.
Makefile && docker-helper.sh - orchestration utilities, discussed in the next chapter.
docker-compose.yml - usually I also provide a docker-compose configuration, so the image can be tried immediately.
container.yml - definition of the image you are going to build.
And, of course, .gitignore - we definitely do not want to commit external resources: p-env (the virtual environment created during the build), ansible-deployment (transient output produced by ansible-container itself), and so on:
roles/*
ansible-deployment/*
p-env/*
deployables/*
*.retry
The Makefile implements the following phases.

clean - resets the project to its initial state by removing all build directories and artifacts:

clean:
	@rm -rf .Python p-env roles ansible-deployment
initialize - parses .projmodules and checks out the necessary roles and deployables:

initialize:
	@./init_quick.sh
Internal tasks - create and initialize a virtual environment with ansible-container under the p-env/ directory:

p-env/bin/ansible-container: p-env/bin/pip
	@touch $@

p-env/bin/pip: p-env/bin/python
	p-env/bin/pip install -r requirements.txt

p-env/bin/python:
	virtualenv -p $(python) --no-site-packages p-env
	@touch $@
build - executes ansible-container for container.yml, providing the path to roles. The project name influences how the produced image will be named:

build: p-env/bin/ansible-container
	@p-env/bin/ansible-container --debug --project-name $(ROLE_NAME) build --roles-path ./roles/ -- -vvv
	@echo "Application docker image was built"
run && stop - potentially, ansible-container allows you to immediately run and stop images, much like docker-compose does. From my experience, that does not always work: container.yml is based on version 2 of the docker-compose spec, while I usually need at least 3.1 in production. Anyway, the steps are provided; they might work in your case. I usually end up with a separate docker-compose.yml (v3.1+) in the directory.

run: p-env/bin/ansible-container
	@echo p-env/bin/ansible-container --debug --project-name $(ROLE_NAME) run --roles-path ./roles/ -- -vvv
	@p-env/bin/ansible-container --debug --project-name $(ROLE_NAME) run --roles-path ./roles/ -- -vvv
	@echo "Application environment was started"

stop: p-env/bin/ansible-container
	@echo p-env/bin/ansible-container --debug --project-name $(ROLE_NAME) stop
	@p-env/bin/ansible-container --debug --project-name $(ROLE_NAME) stop
	@echo "Application environment was stopped"
tag && push - usually, as a result of a successful build, you want to push the artifact to some registry. This highly depends on your project, and you will most likely adjust it for your needs. The example below properly tags the built api and nginx images and pushes them to Docker Hub:

tag:
	@docker tag $(ROLE_NAME)-api:latest softasap/sa-container-box-examples:$(ROLE_NAME).api.$(ROLE_VERSION)
	@docker tag $(ROLE_NAME)-api:latest softasap/sa-container-box-examples:$(ROLE_NAME).api.latest
	@docker tag $(ROLE_NAME)-nginx:latest softasap/sa-container-box-examples:$(ROLE_NAME).nginx.$(ROLE_VERSION)
	@docker tag $(ROLE_NAME)-nginx:latest softasap/sa-container-box-examples:$(ROLE_NAME).nginx.latest

push:
	@docker push softasap/sa-container-box-examples:$(ROLE_NAME).api.$(ROLE_VERSION)
	@docker push softasap/sa-container-box-examples:$(ROLE_NAME).api.latest
	@docker push softasap/sa-container-box-examples:$(ROLE_NAME).nginx.$(ROLE_VERSION)
	@docker push softasap/sa-container-box-examples:$(ROLE_NAME).nginx.latest
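Assuming ROLE_NAME and ROLE_VERSION are derived from image.txt and version.txt (or passed on the command line; the values below are purely illustrative), a full release cycle then looks like:

make clean initialize build
make tag push ROLE_NAME=flask-demo ROLE_VERSION=0.1.0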
compose-up && compose-stop && compose-down - launching, stopping and removing the docker-compose managed containers:

compose-up:
	@docker-compose up --no-recreate

compose-stop:
	@docker-compose stop

compose-down: compose-stop
	@docker-compose rm --force
That's all. Back to the primary target.
For a base image, I usually want:
- an agreed folder organization (I do not want to guess each time where and how I need to put the logic to run);
- a robust init system;
- optional support for running multiple processes per container (as long as the container remains a single logical unit, it does not contradict the Docker philosophy);
- depending on the project, the ability to use different base images without diving into compilation specifics each time;
- the possibility to run logic under custom users inside the container and, if necessary, synchronize files and folders with the host system.
And a final point: as far as possible, I want to keep my code organized better than a series of bash commands that are usually hard to read and hard to adjust, in the way we usually organize programs. And this is where ansible-container helps.
Although people say that a proper Docker microservice should run a single dedicated process, this is not always achievable in the real world. It is more correct to say that Docker suggests the philosophy of running a single logical service per container. A logical service can consist of multiple OS processes. Sometimes you really need to run more than one service in a Docker container. This is especially true if you are adapting an application that previously ran in a standalone VPS environment.
Why is the init process important? Running processes can be visualized as ordered in a tree: each process can spawn child processes, and each process has a parent, except for the top-most process. This top-most process is the init process. It is started when you start your container and always has PID 1. The init process is responsible for starting the rest of the system, including starting your application. When some process terminates, it turns into something referred to as a "defunct process", also known as a "zombie process" (https://en.wikipedia.org/wiki/Zombie_process). In simple words, these are processes that have terminated but have not (yet) been waited for by their parent processes.
But what if the parent process terminates, intentionally or unintentionally? What happens then to its spawned processes? They no longer have a parent process, so they become "orphaned" (https://en.wikipedia.org/wiki/Orphan_process).
And this is where the init process kicks in. It becomes the new parent of (adopts) orphaned child processes, even though they were never created directly by it. The operating system kernel handles adoption automatically. Moreover, the operating system expects the init process to reap adopted children too.
What if it does not? As long as a zombie is not removed from the system via a wait, it will consume a slot in the kernel process table, and if this table fills up, it will not be possible to create further processes on the host system itself. Also, a wrongly implemented init system often leads to incorrect handling of processes and signals, and can result in problems such as containers which can't be gracefully stopped, or leaking containers which should have been destroyed.
More reading on a topic:
- https://medium.com/@gchudnov/trapping-signals-in-docker-containers-7a57fdda7d86
- https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/
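A quick way to feel the problem yourself (a sketch; timings are approximate, and the --init flag requires Docker 1.13+): a plain sleep running as PID 1 ignores SIGTERM, because the kernel applies no default signal handlers to PID 1, so docker stop has to wait out its grace period before sending SIGKILL:

docker run -d --name pid1-demo ubuntu:16.04 sleep 1000
time docker stop pid1-demo    # ~10 s: SIGTERM is ignored, Docker falls back to SIGKILL

docker run -d --init --name pid1-demo2 ubuntu:16.04 sleep 1000
time docker stop pid1-demo2   # near-instant: the injected init forwards SIGTERM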
Upstart, systemd and SysV init are usually too heavy (overkill) to be used inside Docker, and not always easily possible. What are the options?
At the time of writing this article, the most often used init approaches were the following.

The simplest one is a wrapper script, as per the Docker documentation (https://docs.docker.com/engine/admin/multi-service_container/). Take a look at the proof-of-concept example of such a script below:
#!/bin/bash
# Start the first process
./my_first_process -D
status=$?
if [ $status -ne 0 ]; then
    echo "Failed to start my_first_process: $status"
    exit $status
fi
# Start the second process
./my_second_process -D
status=$?
if [ $status -ne 0 ]; then
    echo "Failed to start my_second_process: $status"
    exit $status
fi
# Naive check runs checks once a minute to see if either of the processes exited.
# This illustrates part of the heavy lifting you need to do if you want to run
# more than one service in a container. The container will exit with an error
# if it detects that either of the processes has exited.
# Otherwise it will loop forever, waking up every 60 seconds
while /bin/true; do
    ps aux | grep my_first_process | grep -q -v grep
    PROCESS_1_STATUS=$?
    ps aux | grep my_second_process | grep -q -v grep
    PROCESS_2_STATUS=$?
    # If the greps above find anything, they will exit with 0 status
    # If they are not both 0, then something is wrong
    if [ $PROCESS_1_STATUS -ne 0 -o $PROCESS_2_STATUS -ne 0 ]; then
        echo "One of the processes has already exited."
        exit 1
    fi
    sleep 60
done
This will work, but it really does not guarantee reaping... Let's examine more robust alternatives.
dumb-init is a simple process supervisor and init system designed to run as PID 1 inside minimal container environments (such as Docker). It is deployed as a small, statically-linked binary written in C.
dumb-init enables you to simply prefix your command with dumb-init. It acts as PID 1 and immediately spawns your command as a child process, taking care to properly handle and forward signals as they are received.
Project repo: https://github.com/Yelp/dumb-init
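A minimal usage sketch (paths and the process name are illustrative; the release version is pinned here as an assumption, check the releases page): fetch the static binary during the image build, then prefix your service command with it:

wget -O /usr/local/bin/dumb-init \
  https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64
chmod +x /usr/local/bin/dumb-init

# then, as the container entry point:
/usr/local/bin/dumb-init -- my_first_process -D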
Tini advertises itself as a tiny but valid init for containers. It promises:
- protection from software that accidentally creates zombie processes
- ensures the default signal handlers work for the software you run in your Docker image.
- easy to inject: Docker images that work without Tini will work with Tini without any changes.
It is shipped as a precompiled binary for a huge variety of platforms.
Project repo: https://github.com/krallin/tini
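With Docker 1.13+ you do not even have to modify the image: docker run --init injects tini (shipped with Docker as docker-init) as PID 1. The image and command below are illustrative:

docker run --init --rm my_image my_app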
Runit is a cross-platform Unix init scheme with service supervision; a replacement for sysvinit and other init schemes. It runs on GNU/Linux and BSD, and can easily be adapted to other Unix operating systems. The runit program is intended to run as Unix process no. 1; it is automatically started by runit-init, the /sbin/init replacement, if this is started by the kernel.
Project website: http://smarden.org/runit/
S6 is a project, actually a sibling of runit, by skarnet (http://skarnet.org/software/s6/overview.html). S6 contains a collection of utilities revolving around process supervision and management, logging, and system initialization. Moreover, specifically for Docker there exists a helper project, the so-called "s6-overlay": https://github.com/just-containers/s6-overlay
S6 provides:
- a lightweight init process with support for initialization (cont-init.d) and finalization (cont-finish.d) scripts, as well as fixing ownership and permissions (fix-attrs.d);
- proper PID 1 functionality inside a Docker container via the s6-overlay: zombie processes are properly cleaned up;
- support for multiple processes in a single container ("services");
- usability with all base images: Ubuntu, CentOS, Fedora, etc.;
- distribution as a single .tar.gz file, to keep your image's number of layers small;
- a whole set of handy, composable utilities included in s6 and s6-portable-utils;
- log rotation out of the box through logutil-service, which uses s6-log under the hood.
The my_init script is part of the Phusion baseimage project (https://github.com/phusion/baseimage-docker), which currently targets Ubuntu 16.04 and 14.04. It consists of a custom-written Python script plus optionally wrapped runit.
It provides:
- protection from software that accidentally creates zombie processes: reaping is implemented as part of the control script;
- in addition, support for startup files in the init.d and rc.local directories;
- additional optional services inside the container via runit: cron, ssh;
- additional magic with the environment handling.
It requires Python inside your container and, in its present form, is limited to Ubuntu base systems only.
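Startup hooks for my_init are plain executable scripts dropped into /etc/my_init.d; they run once at container boot, before the runit services are started. A sketch (script name and contents are illustrative):

mkdir -p /etc/my_init.d
cat > /etc/my_init.d/10_prepare_dirs.sh <<'EOF'
#!/bin/sh
# prepare a runtime directory for the application user
mkdir -p /var/run/myapp && chown app:app /var/run/myapp
EOF
chmod +x /etc/my_init.d/10_prepare_dirs.sh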
Supervisord is a well-known process manager, usually used with Python applications. I have often seen people trying to use it as an init system. But supervisor explicitly mentions that it is not meant to be run as your init process: if some subprocess forks itself off, it won't be cleaned up by Supervisor. Thus you would need to choose a different init system anyway.
It is a good choice if you have already been using it with your application.
If you know more candidates, please comment.
Of those mentioned above, the following lightweight options are worth mentioning for the service layer.
Supervisord: a classic supervisor which does not even require root privileges. Its supervisorctl script acts similarly to systemd's systemctl. It supports process restarting, as well as event handlers based on shell protocols.
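A minimal program definition sketch (the conf.d path is the Debian/Ubuntu packaging default; the nginx service is illustrative):

cat > /etc/supervisor/conf.d/nginx.conf <<'EOF'
[program:nginx]
command=nginx -c /etc/nginx/nginx.conf -g 'daemon off;'
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
EOF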
Runit ships with a swiss-army set of utilities; one of them is runsvdir (http://smarden.org/runit/runsvdir.8.html). It allows you to define a set of "service definitions" in some directory, and it takes care of launching them at startup.
Typical interaction examples:

/usr/bin/sv status /etc/service/ - get the status of the services listed in the configuration folder

/usr/bin/sv -w 10 down /etc/service/* - shut down all services with a timeout of 10 seconds
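Defining a new supervised service is just a matter of creating a directory with an executable run file; a sketch for nginx (note daemon off - runit expects the supervised process to stay in the foreground):

mkdir -p /etc/service/nginx
cat > /etc/service/nginx/run <<'EOF'
#!/bin/sh
exec nginx -c /etc/nginx/nginx.conf -g 'daemon off;'
EOF
chmod +x /etc/service/nginx/run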
Unlike supervisord, s6 uses a folder structure to control services, similar to runit. s6-overlay requires the service definitions under /etc/services.d before running init (it then copies them over to /var/run/s6/services).
/etc/services.d
└── nginx
    └── run
Each of those run files is an executable that s6 executes to start the process:
#!/bin/bash
set -e
exec nginx -c /etc/nginx/nginx.conf
Services are controlled by the s6-svc binary. It has a number of options, but the main idea is that you give it the directory of the service. So, for example, if I wanted to send SIGHUP to nginx, I would run s6-svc -h /var/run/s6/services/nginx. Note that it is /var/run and not /etc/services.d; this is a huge difference from runit. And lastly, the -h stands for SIGHUP.
s6-overlay comes with a number of prebuilt versions, so you can download the one that matches your Linux setup. If you want to use s6 directly, users of Alpine and a few other flavors of Linux can just install it from their package manager. We're running Debian and there's no PPA for it, so we would have to compile s6 on our own.
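A sketch of pulling the overlay into an image during the build (the version is pinned here as an assumption; check the releases page for the current one); /init then becomes your entrypoint:

curl -sSL https://github.com/just-containers/s6-overlay/releases/download/v1.21.4.0/s6-overlay-amd64.tar.gz \
  | tar xz -C /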
If you know more candidates, please comment.
By default, Docker containers run as the root user. This is bad because:
- the application might modify things that it shouldn't;
- if the application shares a folder with the base host, all created files will be owned by root;
- if the container is compromised - well, it is still bad if the intruder is root.
If you want to understand how uid and gid work in Docker containers, take a look at this article: https://medium.com/@mccode/understanding-how-uid-and-gid-work-in-docker-containers-c37a01d01cf
What are the options?

Set the user on the level of the Dockerfile:
# Create an app user so our program doesn't run as root.
RUN groupadd -r app && \
    useradd -r -g app -d /home/app -s /sbin/nologin -c "Docker image user" app
...
USER app
Or pass the params on docker run:
docker run --rm --group-add audio --group-add nogroup --group-add 777 busybox id
uid=0(root) gid=0(root) groups=10(wheel),29(audio),99(nogroup),777
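You can also fix the uid and gid directly at run time with --user (the ids below are illustrative):

docker run --rm --user 1000:1000 busybox id
# uid=1000 gid=1000 ...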
This is the success scenario, but it is not always possible if the container communicates with the host.
chpst is a utility shipped with runit (http://smarden.org/runit/chpst.8.html) - you can easily use it if you already have runit as part of your init system.
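A sketch of dropping privileges with it (the app user and binary path are illustrative):

chpst -u app:app /usr/local/bin/my_app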
If you have chosen the phusion image as a basis, you have setuser in the path: https://github.com/phusion/baseimage-docker/blob/master/image/bin/setuser
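Its usage is analogous (again, the user and command are illustrative):

setuser app /usr/local/bin/my_app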
Well, you can install sudo inside the container. Some people do. But do you really want to?
With the information above, you already have a few ideas for constructing your base image.
Sorry for repeating myself, but let me emphasize once more: if you want an alternative to custom makefiles and tons of hard-to-read shell files, give ansible-container a try. It provides a better alternative to the command && command && command (and so on) syntax you have been struggling with to build containers. Since Ansible is at the heart of Ansible Container, you can make container builds completely predictable, repeatable and, moreover, readable.
I have to admit that, although it has already passed 0.9.1, ansible-container is still at an early age, with some cumbersome side effects from time to time (https://medium.com/@V_Voronenko/evaluating-ansible-container-as-a-tool-for-custom-docker-containers-build-500a0395a4c8), but it really becomes more robust with each minor release; thus, if you try it now, your production will be ready for the concept by the time it is released.
Previously, running Ansible inside a container required installing a lot of unnecessary packages, which caused bigger image sizes. Ansible-container introduced a different approach: a combination of a conductor (a managing container with Ansible and the necessary tools installed) and the target container. This keeps the target image size small, and the approach works on any base image that allows Ansible and Python to be installed.
The aim of the current proof of concept: build a bootstrap role for building a base application image, and thus simplify the application play itself.
We want to select the preferred init system: tini, dumb-init, or the init approach by Phusion (phusion-init):
container_init: "phusion-init" # "dumb-init" "tini-init"
container_init_directory: /etc/my_init.d
We want the possibility to choose the internal service layer: runit, supervisord or s6:
container_svc: "runit" # "supervisord"
Following Phusion's base image concept, we want optional cron, sshd and syslog services inside the container:
option_container_cron: true
option_container_sshd: true
option_container_syslog_ng: true
If we ever want ssh inside the container, let's provide a keypair to trust:
container_ssh_private_key: "{{role_dir}}/files/keys/insecure_key"
container_ssh_public_key: "{{role_dir}}/files/keys/insecure_key.pub"
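Such a deliberately insecure development keypair can be generated once and committed to the role; a sketch (the path matches the variables above):

ssh-keygen -t rsa -N '' -C 'insecure container key' -f files/keys/insecure_key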
The resulting play is compact and readable, like your usual Ansible play:
- include: tasks_system_services.yml

- include: tasks_cron.yml
  when: option_container_cron

- include: tasks_sshd.yml
  when: option_container_sshd

- include: tasks_syslog_ng.yml
  when: option_container_syslog_ng
Moreover, we are not limited to a single approach: we are free to select the init system and the service system from the supported list. You can always add a new one.
In order to provide some compatibility between the systems, I usually:
a) put service runners into /etc/service/<name>/run - this supports running via shell, runit and s6 alike; supervisord might re-execute the same script;
b) put image pre-initialization into /etc/my_init.d/*.sh;
c) always name the entry point docker-entrypoint.sh;
d) let the role, thanks to Ansible, detect and target multiple base systems - thus you can choose, for example, the often used Alpine or Jessie.
Take a look at the role example that might be used to build such a base image: https://github.com/softasap/sa-container-bootstrap
A typical ansible-container play that builds an application image using your base role might look like the following (here tini is the init system and supervisord the service layer):
version: "2"
settings:
conductor_base: ubuntu:16.04
volumes:
- temp-space:/tmp # Used to copy static content between containers
services:
tini-supervisord:
from: ubuntu:16.04
container_name: api
roles:
- {
role: "softasap.sa-container-bootstrap",
container_init: "tini-init",
container_svc: "supervisord"
}
- {
role: "softasap.sa-nginx-container",
container_init: "tini-init",
container_svc: "supervisord"
}
- {
role: "../custom-roles/app-nginx-stub-deploy",
container_init: "tini-init",
container_svc: "supervisord"
}
expose:
- '8000'
- '22'
volumes:
- temp-space:/tmp # Used to copy static content between containers
environment:
IN_DOCKER: "1"
volumes:
temp-space:
docker: {}
version: "2"
settings:
conductor_base: ubuntu:16.04
volumes:
- temp-space:/tmp # Used to copy static content between containers
services:
dumb-runit:
from: ubuntu:16.04
container_name: api
roles:
- {
role: "softasap.sa-container-bootstrap",
container_init: "dumb-init"
}
- {
role: "softasap.sa-nginx-container",
container_init: "dumb-init" # optionally uses runit for services management and upstart.
}
- {
role: "../custom-roles/app-nginx-stub-deploy",
container_init: "dumb-init"
}
expose:
- '8000'
- '22'
volumes:
- temp-space:/tmp # Used to copy static content between containers
environment:
IN_DOCKER: "1"
volumes:
temp-space:
docker: {}
version: "2"
settings:
conductor_base: ubuntu:16.04
volumes:
- temp-space:/tmp # Used to copy static content between containers
services:
phusion-runit:
from: ubuntu:16.04
container_name: api
roles:
- {
role: "softasap.sa-container-bootstrap"
}
- {
role: "softasap.sa-nginx-container",
container_init: "phusion-init" # uses runit for services management and upstart.
}
- {
role: "../custom-roles/app-nginx-stub-deploy"
}
expose:
- '8000'
- '22'
volumes:
- temp-space:/tmp # Used to copy static content between containers
environment:
IN_DOCKER: "1"
volumes:
temp-space:
docker: {}
More examples at https://github.com/Voronenko/devops-docker-baseimage-demo
Some resulting sizes (base system Ubuntu 16.04, init system phusion-init):
- base image with all services: 268 MB locally / 101 MB on Docker Hub;
- base image with the demo nginx app installed: 313 MB locally / 133 MB on Docker Hub;
- phusion/baseimage, for comparison: 225 MB locally / 84 MB on Docker Hub.
Ansible-container, once it reaches a stable version, hopefully this year, appears to be a very promising tool for introducing complex Docker-based build pipelines. In addition, the Ansible community lets you reuse a number of available roles, which can potentially streamline your deployment implementation path.