What should you know before implementing a Docker-based solution?
You are on a mission for a client who needs to optimize resources.
Docker seems like a good fit, but do you know what it entails?
- Docker 2017
- What?
- What for/against?
- What beyond?
- Available at github.com
- Available at gitpitch.com
- Available at intranet.softeam.com:
"Docker chez un client? Coast Easy!" - Fully annotated
In the show notes, or directly in the markdown article on GitHub, you will find additional information for each slide.
Daniel CHAFFIOL (Softeam)
- Since 1999
- Development architect
- BNP, SGCIB, HSBC, Amundi
- Full CV: https://stackoverflow.com/cv/vonc
VonC (Stack Overflow)
- Since 2008
- 4th all-time user
- Topics: Version Control (Git), Go, Docker
- Stack Overflow profile: https://stackoverflow.com/users/6309/vonc
- Isolation vs. VM
- Host vs. Guest
- ContainerD, runc, CNCF, OCI, CRI-O, and Co
A container runs on the same machine: it is only isolated in terms of file system, CPU, processes, and memory.
See "Docker containers vs. virtual machines: What’s the difference?" (https://blog.netapp.com/blogs/containers-vs-vms/)
Container:
- No preemptive resource allocation
- Use of namespaces
- Reuses the OS kernel/libs (direct system calls)
VM:
- Preemptive resource allocation
- OS container
- Uses its own OS kernel (hypervisor)
That means a Docker container does not have to be an OS
(Ubuntu, Alpine, Debian)
It can be as simple as
FROM scratch
COPY myprogram /
ENTRYPOINT ["/myprogram"]
Then: docker build -t myprogram . && docker run -it --rm myprogram
See also "CONTAINERS ARE NOT VMS" (https://blog.docker.com/2016/03/containers-are-not-vms/)
On VM:
See https://blog.risingstack.com/operating-system-containers-vs-application-containers/
"OS containers" behave like VMs:
- Meant to be run as an OS, that is, to run multiple services
- No layered filesystems by default
- Built on cgroups, namespaces, and native process resource isolation
- Examples: LXC, OpenVZ, Linux VServer, BSD Jails, Solaris Zones
During your mission with a client, you will often see both.
Source https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-containers-overview
Key: you will need both, almost always.
That is: isolation within a VM on a machine host.
And that means determining the level of support offered
by the actual machine host!
If there is no support, you are on your own at the next "kernel panic" error message.
See also "CONTAINERS AND VMS TOGETHER" (https://blog.docker.com/2016/04/containers-and-vms-together/)
It is rare these days to have a physical machine dedicated to only one instance of Docker.
- If there is any security issue, and one local root process escapes its container, it becomes root on the whole server.
- There are simply too many cores and too much memory for just one service: it is best to allocate part of those resources per VM
- Docker Guest (container)
- Docker Host (OS or VM)
- Machine Host
Warning: it is not an OS
It is just a collection of files needed for one (ideally) or more
programs to run.
Beware of the zombie process issue (link Stack Overflow)
Where is docker installed?
Up to 3 levels!
Docker can be installed directly on your machine, or in a VM.
See "How to access tomcat running in docker container from browser?" (https://stackoverflow.com/a/27481832/6309) as an example.
A client wants to access a container:
Where is the final client?
FROM scratch
COPY hello(.exe) /
ENTRYPOINT ["/hello(.exe)"]
In one word: Architecture (of Docker)
Cf. "Diving Through The Layers: Investigating runc, containerd, and the Docker engine architecture" (https://www.slideshare.net/PhilEstes/diving-through-the-layers-investigating-runc-containerd-and-the-docker-engine-architecture) Phil Estes
From 2013 to early 2015, the docker binary had everything:
- client
- daemon
- build tool
- registry client
Split of the Docker Engine!
Docker 1.11, April 2016
- containerd: a lightweight daemon for handling the container lifecycle, with a simple gRPC API for handling commands.
- containerd-shim: a shim process for holding parent ownership (of a container process) to allow the daemon and containerd to exit/restart without impact.
- runc: the OCI-compliant runtime for executing container processes given a filesystem bundle and OCI configuration.
The year of Moby
See "Demystifying Moby: From OCI to Linuxkit" (http://www.adelzaalouk.me/2017/moby-linuxkit/) from Adel Zaalouk (http://twitter.com/zanetworker)
Adel explains:
- The OCI is an initiative to define common standards for container frameworks to build containers. Container engines such as Docker and rkt provided an easy way to run containers by just providing the container image name and the version.
- runC is a container runtime that knows how to deal with OCI-defined specifications.
- Containerd is a daemon to manage the complete container lifecycle; it abstracts runC details and provides a gRPC-compatible API that can be used natively or from a command terminal.
- Moby is a project that gathers together all the tools used internally by the Docker teams and makes them public, so that developers and contributors can share a common repository to innovate.
"batteries included but removable"
- VM vs. Container (isolation): you need both
- Host vs. Guest: a container makes kernel system calls
- PaaS, SaaS, CaaS, clouds vs. Moving target
- vs. Security (VM Network)
- Dev? Ops? DevOps vs. Legacy
- Principle
- Hybrid
- Serverless
References, found in April 2014:
- https://twitter.com/ivan_curkovic/status/496262057663029248/photo/1
- "Pizza as a Service - On Prem, IaaS, PaaS, and SaaS Explained through Pie (Not Pi)" (http://www.episerver.com/learn/resources/blog/fred-bals/pizza-as-a-service/) from Fred Bals, who mentions:
I was finally able to find the original source, which turned out to be a LinkedIn post by Albert Barron, a software architect at IBM: https://www.linkedin.com/pulse/article/20140730172610-9679881-pizza-as-a-service
"Pizza as-a-Service is Misunderstood" (http://itknowledgeexchange.techtarget.com/cloud-computing-enterprise/pizza-as-a-service-is-misunderstood/) from Brian Gracely (https://twitter.com/bgracely)
Applied to Docker, this is mainly about what is provided (by the container) vs. what you have to manage/install yourself at each upgrade.
A container is a mix between IaaS and PaaS
https://www.slideshare.net/chanezon/programming-the-world-with-docker from Patrick Chanezon.
"Pizza as a Service 2.0" (https://www.linkedin.com/pulse/pizza-service-20-paul-kerrison) from Paul Kerrison (https://www.linkedin.com/in/paulkerrison)
- On-Premises - like a homemade pizza, made from scratch, you do everything yourself (no change so far). Example: Datacentre
- Infrastructure as a Service - You share a kitchen with others. The utilities and oven are provided, but you make and cook the pizza yourself. Example: EC2, AVM
- Containers as a Service - You bring the pizzas but someone else uses their facilities and cooks the pizza for you. Example: ECS, ACS
- Platform as a Service - You order a pizza for collection, the pizzeria make and cook the pizza using their facilities. Example: App Engine
- Function as a Service - You go to a pizzeria with some friends. You order and then eat pizza made by the restaurant. You order drinks from the bar and they're made for you. Example: AWS Lambda, Azure Functions, Google Cloud Functions
- Software as a Service - You go to someone's house for a party, they provide the pizza and invite others round for you to meet. Conversation with the guests is still your responsibility! Example: Gmail, O365 Exchange Online
This is like https://www.infoq.com/news/2016/02/docker-datacenter-caas: some containers on premise, some on the cloud.
This is /not/ "hybrid-cloud" (as described in 2014 in https://sreeninet.wordpress.com/2014/04/20/hybrid-cloud/)
It can be consuming services on the cloud while the application remains on premise: https://www.ibm.com/blogs/bluemix/2015/06/ibm-containers-a-bluemix-runtime-leveraging-docker-technology/ or https://www.ibm.com/blogs/bluemix/2015/06/deploy-containers-premises-hybrid-clouds-ibm-docker/
Pro vs. Con: https://www.spiceworks.com/it-articles/iaas-and-saas-vs-onprem/, http://rancher.com/devops-containers-prem-cloud/
For transient function/application
- https://www.slideshare.net/BrianChristner/docker-serverless
- https://www.contino.io/insights/building-a-serverless-application-with-docker
There is always a server:
- goal: optimize resources
- (physical + VM + container)
- even for "serverless"
- Releases
- Compatibilities (syntax)
- Environments (Cloud)
- Before: x.y
- After: 17.04, 17.05...
- stable release every 3 months (03, 06, 09, 12)
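The naming change above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `docker_release_kind` is hypothetical, any major number below 17 is treated as the old x.y scheme, and the label "edge" (the name of Docker CE's monthly channel in 2017) is used for months outside the quarterly stable releases.

```shell
#!/bin/sh
# Classify a Docker version string according to the 2017 naming change.
# Assumption: a major number below 17 means the old x.y scheme.
docker_release_kind() {
  version=$1
  major=${version%%.*}   # text before the first dot, e.g. 17
  rest=${version#*.}     # text after the first dot, e.g. 06.0-ce
  minor=${rest%%.*}      # the month for calendar versions, e.g. 06
  case $major in
    ''|*[!0-9]*) echo "unknown"; return 1 ;;
  esac
  if [ "$major" -lt 17 ]; then
    echo "legacy"
    return 0
  fi
  case $minor in
    03|06|09|12) echo "stable" ;;   # quarterly stable release
    *)           echo "edge" ;;     # monthly release in between
  esac
}

docker_release_kind 1.13.1
docker_release_kind 17.06.0-ce
docker_release_kind 17.05.0
```

A script like this could guard a deployment that expects a stable release, for instance by feeding it the output of `docker version --format '{{.Server.Version}}'`.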
Cf. My TL;DR: Docker Version/Name Change Highlights +++
- Beware of version
- Dockerfile, compose file, ...
Many environments available:
You will need to manage:
- the delta
- the wrappers
- the support
One application per server: not optimal.
- Why
- With VM
- With VM and Containers
One application per server is not optimal.
Not all VMs are fully used.
Fine-grained resource limits.
- Hardware resources optimization
- Software resources independence
- People resources collaboration
- Patches
- Broken isolation
- Filesystem
- if you patch the OS, it can break everything
- if you patch the VM, you can break the VM
(and its containers)
See also "If it's in a container it's secure right?"
- one line exploit
Dockerfile:
FROM alpine
COPY root.sh /root.sh
CMD ["/bin/sh", "/root.sh"]
root.sh:
chroot /hostOS /bin/sh
Then:
docker build -t rootplease .
docker run -v /:/hostOS -i -t rootplease
A single line (the chroot in root.sh) makes you root, with access to all files on the host.
- https://stackoverflow.com/a/34715019/6309
- https://github.com/chrisfosterelli/dockerrootplease
- https://medium.com/@mccode/understanding-how-uid-and-gid-work-in-docker-containers-c37a01d01cf
Depends on OS Kernel. Means:
- if escape, root on host
- uid/gid must match between host and image
appxray@/appxray/users/appxray/.jfrog/xray:$ l
total 4.0K
drwxr-xr-x 3 appxray appxray 18 Nov 10 14:45 ..
drwxr-xr-x 5 root root 44 Nov 13 10:07 mongodb
drwxr-xr-x 3 999 root 42 Nov 13 10:07 rabbitmq
drwxr-xr-x 7 appxray appxray 87 Nov 13 10:07 .
drwxr-xr-x 2 1035 1035 128 Nov 13 10:07 xray-installer
drwxr-xr-x 11 1035 1035 137 Nov 13 10:07 xray
drwx------ 19 999 root 4.0K Nov 13 10:07 postgres
I want to use my account (appxray), but instead I find root, or some container-local uid/gid (999, 1035).
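A quick way to spot this kind of mismatch on the Docker host is to list the entries of a shared directory that do not belong to the current account. The sketch below is only an illustration: the function name `foreign_owned` is hypothetical, and it assumes a `find` that supports `-mindepth`/`-maxdepth` (GNU, BSD, and BusyBox do).

```shell
#!/bin/sh
# List the entries directly under a directory that are NOT owned by the
# current user. The numeric uid is used, because ids coming from a container
# image often have no matching name on the host.
foreign_owned() {
  dir=${1:-.}
  find "$dir" -mindepth 1 -maxdepth 1 ! -user "$(id -u)" -print
}

# Example: check the current directory.
foreign_owned "$PWD"
```

Run as appxray in the directory listed above, this would report the entries owned by root, 999, and 1035, which are the ones that need a matching uid/gid (or a user namespace mapping) before the account can manage them.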
- More management (cgroup, userns)
- More control (who accesses what)
- More audit
It is a culture.
That is the end goal, but where do you start?
Source: https://martinfowler.com/bliki/DevOpsCulture.html
- Development environment
- Execution environment
- Limit the OS used
- Trace sudo commands
- Control the Docker images used
Where do you start? On the developer's workstation?
Not sure: that is still tricky and requires:
- convincing your Security department
- restricting Docker image access
Execution Environment...
But also:
- Docker Registry
- ACL (or RBAC)
- Deployment
- Monitoring
- Reporting
ELK stack (Elasticsearch, Logstash, Kibana). Typically Splunk or Centreon for log coll
Source: https://www.pinterest.com/pin/567242515558273751/
Change is hard.
You will have to convince:
- Security
- Administrators
- Business
- Still hard on the developer workstation
- Still tricky on the server side
- Still a lot of questions
Source: How to Automate Docker on Vagrant?
- More management
- More monitoring
- Better deployment (blue-green, canary release)
Notions of chain and delivery.
You will need to present and communicate:
- How do you secure the whole system?
- How do you facilitate the administration?
- How do you shorten the TTM (Time To Market)?
- Example: RBAC in Kubernetes. Beware of the LDAP integration.
- Example: portainer.io, Kubernetes dashboard
- Container as a Service: between IaaS and PaaS. BUT: MOVING TARGET!
- Isolation: resource optimization/collaboration. BUT: SECURITY!
- Dev? Ops? DevOps: culture change
- Orchestration (Swarm vs. Kubernetes)
- Persistent volumes
- Declarative approach (for everything)
- Swarm
- Kubernetes
- Now fully integrated?
From https://www.slideshare.net/Dev_Events/building-next-gen-applications-and-microservices
- Containers:
- Encapsulate services and are accessible by IP/port combination
- Service Discovery:
- Provide a way to know when Services have been added/removed and where they are located.
- Service Orchestration:
- Manages Service Topologies
- Ensures availability and utilization
- API Gateway:
- Security
- Routing
Source: Democratizing orchestration with Docker
- On top of docker, you need to monitor the orchestrator
- That has an impact on your architecture
- and you need to secure your persistence
See also "Kubernetes vs Docker Swarm vs DC/OS: May 2017 Orchestrator Shootout" (https://www.linkedin.com/pulse/kubernetes-vs-docker-swarm-dcos-may-2017-orchestrator-arvind-soni)
Source: https://thenewstack.io/containers-container-orchestration/
Swarm alone:
- Gets you started
- Invisible
Source: https://thenewstack.io/containers-container-orchestration/
Kubernetes alone
- More complex
- More complete
- More notions
- Probably the safest choice because it will cover every scenario
- but still evolving rapidly
See "DockerCon Europe 2017: Docker EE and CE to Include Kubernetes Integration" (https://www.infoq.com/news/2017/10/docker-kubernetes-integration) from Daniel Bryant (https://twitter.com/danielbryantu)
- only since 17.10
- result of CRI-O and CNCF
- so not always available
- Principle
- stateless vs. stateful
- drivers
Presentation: Container Storage Best Practices in 2017 (RedHat)
- Bind mounts
- Volumes
- tmpfs (in memory)
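The three storage types map onto the `--mount` option of `docker run` (available for standalone containers since Docker 17.06). The helper below only prints the flag for each type, as a sketch: the function name `mount_flag`, the paths, and the volume name `appdata` are all hypothetical.

```shell
#!/bin/sh
# Print the `docker run --mount` flag for one of the three storage types.
# Arguments: type, target path in the container, source (host path or volume name).
mount_flag() {
  kind=$1
  target=$2
  source=${3:-}
  case $kind in
    bind|volume) echo "--mount type=$kind,source=$source,target=$target" ;;
    tmpfs)       echo "--mount type=tmpfs,target=$target" ;;
    *)           echo "unsupported mount type: $kind" >&2; return 1 ;;
  esac
}

mount_flag bind   /app/conf /srv/conf   # host directory mounted in the container
mount_flag volume /app/data appdata     # named volume managed by Docker
mount_flag tmpfs  /app/cache            # in memory, lost when the container stops

# Usage, assuming an image named "myprogram":
#   docker run --rm $(mount_flag volume /app/data appdata) myprogram
```

Only the volume and the bind mount survive the container; the tmpfs mount is the one to pick for scratch data that must never reach the disk.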
Official documentation:
Copy On Write, but:
- Beware of some configuration/metadata
- in-memory domain state, session
- in-application state
The stateless design simplifies the server design because there is no need to
dynamically allocate storage.
See "The State Of In-Application State: What No One Is Talking About" https://medium.com/@SeanWalshEsq/the-state-of-in-application-state-what-no-one-is-talking-about-c30392033b08
First of all, nearly all applications have state, which is more or less just
the current known information about a domain entity.
State is really a function of what occurs to an entity over time.
Volumes, but:
- need to be replicated/backed up
- need to support PSI
See "About images, containers, and storage drivers" https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/
Presentation: Storage as microservice (OpenEBS)
See also "Docker Reference Architecture: Design Considerations and Best Practices to Modernize Traditional Apps (MTA)" https://success.docker.com/article/Docker_Reference_Architecture-_Design_Considerations_and_Best_Practices_to_Modernize_Traditional_Apps_(MTA)_with_Docker_EE
DVDI: Docker Volume Driver Interface
- to be benchmarked
- to be managed (upgrade/rollback)
As of Docker 1.7, there was a Volume Driver API defined that allows Docker to work
with external storage providers.
The API documentation is available here: https://github.com/docker/docker/blob/master/docs/extend/index.md
Study: Container Storage Architectures: How Does Kubernetes, Docker, and Mesos Compare?
Example: Docker 1.12.1 Swarm Mode & Persistent Storage with DellEMC RexRay
- Depends on the nature of the application
- Depends on the IO throughput
- Depends on the stability
- Trend
- Why: ownership
- Declarative vs. prescriptive
- Build: Maven (pom.xml)
- Compilation: Jenkins 2.x (jenkinsfile)
- Integration: Docker Compose (docker-compose.yml)
- Deployment: Kubernetes (deploy.yaml)
Those (pom.xml, jenkinsfile, yaml) are text files, which means:
- they are versioned
- they are auditable
- they are independent
Typically, if Jenkins goes up in flames, you just start a new Jenkins
and re-declare your job by pointing at the repository which
includes your jenkinsfile.
The same can be said of Docker or Kubernetes.
- Declarative: maven, Jenkins, Dockerfile
- Prescriptive: swarm, Kubernetes
- Mechanics vs. Dynamics.
‘declarative’ in that they describe WHAT it is you’re trying to provision
(as opposed to ‘prescriptive’ models that describe HOW you’re going to get there).
- Descriptive means: if something breaks, the whole process fails and stops
- Prescriptive means: if something breaks, the system tries its hardest to restore its state to a prescribed optimal one.
- You can "read" what you want to do
- You don't depend on a running system
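The "restore the prescribed state" behavior can be pictured as a reconciliation step that an orchestrator repeats forever. The sketch below is a deliberately tiny simulation, not how Swarm or Kubernetes are implemented: the function name `reconcile` and the replica counts are hypothetical.

```shell
#!/bin/sh
# One reconciliation step: compare the wanted and the observed number of
# replicas, and print the action needed to get back to the wanted state.
reconcile() {
  want=$1
  have=$2
  if [ "$have" -lt "$want" ]; then
    echo "start $((want - have))"
  elif [ "$have" -gt "$want" ]; then
    echo "stop $((have - want))"
  else
    echo "ok"
  fi
}

# Simulated history: one replica crashes, then the state is repaired.
desired=3
for actual in 3 2 3; do
  echo "desired=$desired actual=$actual -> $(reconcile "$desired" "$actual")"
done
```

The desired count lives in a text file (a compose file or a Kubernetes manifest), which is what makes it versioned and auditable; the loop only needs the running system to observe the actual count.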
- Orchestrators: Docker Swarm vs. Kubernetes
- Persistence: Volume Drivers are evolving
- Descriptive or Prescriptive: text file, better ownership
Not a new idea:
- What: Isolation for resource optimization
- What for: DevOps means more collaboration, quicker TTM
- What against: Change and Security management
- What beyond: Ownership (data, process)
But remember: A lot to manage!
Image from http://lakehub.co.ke/2016/01/08/docker-calendar-q1/
- Don't install anything
- Online only
- For Docker, Swarm, Kubernetes!
Play with Docker: play-with-docker.com
Play with Moby: play-with-moby.com
Katacoda Swarm: www.katacoda.com
Play with Kubernetes: play-with-k8s.com
Actually redirects to PWD (Play with Docker), since K8s is integrated!
My Mooc: www.my-mooc.com
Docker Events: events.docker.com