Make Docker images suitable for ESS/ECE #49926

pugnascotia · 2019-12-06T16:47:06Z

Cloud currently builds and runs a fully custom Elasticsearch image. The Stack should instead provide an image that Cloud can use directly.

User / Group IDs

Cloud currently expects to be able to set the user and group IDs via environment variables. The entry point script changes the founduser user's UID and GID to those provided in the env vars, and performs a recursive chown on some directories in the container.

The Stack image takes a different approach, in that it follows the OKD image guidelines and uses GID 0 for all files.

The difference in approaches may well be due to Cloud's multi-tenancy. I don't know how OKD handles this.

(There is a related open issue, "Pin USER in Dockerfile complying with Docker best practices" (#46166).)

setuid flags

Cloud ensures that there are no files with setuid, in order to mitigate "stackclash" attacks. Basically, they do this:

RUN find / -xdev -perm -4000 -exec chmod u-s {} +

This could be done in the Stack image.
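As a rough illustration of what that command does, here is a minimal sketch that runs the same find expression against a scratch directory instead of /. The scratch directory and file names are made up for the demo; only the find / chmod invocation comes from the issue.

```shell
#!/bin/sh
# Demo only: strip setuid bits under a throwaway directory rather than /.
set -eu

scratch=$(mktemp -d)
touch "$scratch/plain" "$scratch/suid"
chmod 0755 "$scratch/plain"
chmod 4755 "$scratch/suid"   # hypothetical file carrying the setuid bit

# Count regular files that carry the setuid bit before the cleanup.
before=$(find "$scratch" -xdev -type f -perm -4000 | wc -l | tr -d ' ')

# Same expression as in the issue, scoped to the scratch directory.
find "$scratch" -xdev -perm -4000 -exec chmod u-s {} +

# Count again after the cleanup.
after=$(find "$scratch" -xdev -type f -perm -4000 | wc -l | tr -d ' ')

echo "setuid files: before=$before after=$after"
rm -rf "$scratch"
```

Note that -perm -4000 only matches the setuid bit. Clearing setgid as well would need a separate match on -perm -2000 with chmod g-s.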

Init process

Cloud runs Elasticsearch via a mini-init process in order to avoid zombie processes. There should be no harm in adopting this in the Stack image.
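To make the idea concrete, here is a minimal shell sketch of one part of what an init wrapper does: start the real process as a child, forward TERM to it, and pass its exit status back. This is not Cloud's mini-init (which is not shown in this issue), and it does not cover reaping orphaned grandchildren. The function name run_with_init is made up for illustration.

```shell
#!/bin/sh
# run_with_init CMD [ARGS...]
# Hypothetical helper: runs CMD as a child, forwards TERM/INT to it,
# and returns the child's exit status.
run_with_init() {
  "$@" &
  child=$!

  # Forward termination signals to the child instead of dying first.
  trap 'kill -TERM "$child" 2>/dev/null || true' TERM INT

  status=0
  while :; do
    # wait returns early if a trapped signal arrives, so loop.
    wait "$child" && status=0 || status=$?
    # Stop once the child no longer exists.
    kill -0 "$child" 2>/dev/null || break
  done

  trap - TERM INT
  return "$status"
}

# Usage: the wrapper reports the child's own exit status.
run_with_init sh -c 'exit 7' || echo "child exited with $?"
```

A real init would typically also call waitpid on any process, not only the direct child, which is what prevents zombies from accumulating.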

JVM options

Cloud comments out the heap settings and disables the HeapDumpOnOutOfMemoryError flag in jvm.options. Cloud also passes the relevant heap options on the command line, so commenting out the heap settings is not strictly necessary (this has been the case since 5.0). It does have the advantage that the resulting Elasticsearch command line is tidier. I don't know if that's important to Cloud.
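A sketch of what such an edit might look like, assuming the flag is disabled by commenting it out (the issue does not say how Cloud does it). The sample jvm.options content below is made up and much shorter than the real file.

```shell
#!/bin/sh
set -eu

# Made-up, abbreviated stand-in for config/jvm.options.
opts=$(mktemp)
cat > "$opts" <<'EOF'
-Xms1g
-Xmx1g
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseG1GC
EOF

# Comment out the heap size lines and the heap dump flag,
# leaving every other option untouched.
sed -e 's/^-Xm[sx]/#&/' \
    -e 's/^-XX:+HeapDumpOnOutOfMemoryError/#&/' \
    "$opts" > "$opts.new"
mv "$opts.new" "$opts"

result=$(cat "$opts")
active=$(grep -c -v '^#' "$opts")   # options still in effect

printf '%s\n' "$result"
rm -f "$opts"
```

With the heap lines commented out, the heap size has to come from elsewhere, for example from options passed on the command line as described above.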

Quota-aware filesystem

Cloud changes the default filesystem provider to an implementation that is aware of user quotas. This is achieved by providing a file to the container that contains the quota settings, which the provider then takes into account when answering capacity questions. The file has a default location, which can be changed via a system property. The filesystem provider is changed via a JVM option, set via ES_JAVA_OPTS.

Cloud has suggested that the Stack provide "a native and equivalent way to deal with this". There is a Stack Overflow question that suggests that Java lacks support for this at present. It ought to be possible to perform native calls on Linux (and possibly other operating systems) that return quota information, but it's unclear how feasible or time-consuming that would be. Cloud can continue using their current solution without modifying the base image.

Plugins

The Cloud Docker build process downloads and adds plugins to the image, in a "plugin-archive" directory. Then, when an allocator runs an Elasticsearch image, it runs the elasticsearch-plugin tool to install the active plugins from the archive directory. User-provided plugins are downloaded and installed dynamically.

There's nothing obvious that the Stack image needs to do here.
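For readers unfamiliar with the flow described above, here is a dry-run sketch of installing plugins from an archive directory. The function name, the one-zip-per-plugin layout, and the way the active plugin list is passed in are all assumptions. The function only prints the elasticsearch-plugin commands instead of running them.

```shell
#!/bin/sh
# install_active_plugins ARCHIVE_DIR PLUGIN...
# Hypothetical helper: prints one install command per active plugin
# that has a zip in the archive directory (assumed layout: NAME.zip).
install_active_plugins() {
  archive_dir=$1
  shift
  for name in "$@"; do
    zip="$archive_dir/$name.zip"
    if [ -f "$zip" ]; then
      # --batch skips the interactive permissions prompt.
      echo "elasticsearch-plugin install --batch file://$zip"
    else
      echo "skipping $name: not found in $archive_dir" >&2
    fi
  done
}

# Usage with a throwaway archive directory holding one fake plugin zip.
dir=$(mktemp -d)
touch "$dir/analysis-icu.zip"
out=$(install_active_plugins "$dir" analysis-icu not-archived 2>/dev/null)
echo "$out"
rm -rf "$dir"
```

Printing the commands keeps the sketch runnable without an Elasticsearch installation; a real entry point would execute them instead.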

elasticmachine · 2019-12-06T16:47:11Z

Pinging @elastic/es-core-infra (:Core/Infra/Core)

pugnascotia · 2019-12-06T16:47:23Z

cc @dliappis / @nachogiljaldo for input.

alpar-t · 2019-12-13T09:18:58Z

@mieciu

droberts195 · 2019-12-13T16:40:08Z

Cloud runs Elasticsearch via a mini-init process in order to avoid zombie processes. There should be no harm in adopting this in the Stack image.

Please can whoever does this work ping me so we make sure the kill order of the "mini-init" doesn't cause problems for ML. This problem has recently affected Cloud and is being fixed for Cloud, but also the identical problem came up as the root cause of #46262, so I'd like to make sure it's not accidentally introduced in a third place.

pugnascotia · 2019-12-16T16:52:17Z

Talking to @mieciu, we do want the GID 0 approach for the user running ES, which is good. We may need to support UIDs other than 1000 - I'm trying to find out more.

pugnascotia · 2019-12-17T10:12:52Z

@droberts195 any idea who fixed it in Cloud, or a PR number maybe? Just so I can have a peek.

droberts195 · 2019-12-17T11:02:22Z

any idea who fixed it in Cloud

It was @mieciu. The PR is in a private repo.

Closes elastic#49926 and elastic#46166.
* Add a mini-init process (copied from Cloud)
* Don't use root at all when running the container
* Add an explicit TERM handler
* Ensure no files in the image have the setuid flag
Also improve dependency tracking in the build.

Closes #49926 and #46166. Rework the Docker image so that it comes with a tiny init system, to ensure ML processes are correctly cleaned up, and to run ES as a regular user instead of root. Also:
* Ensure no files in the image have the setuid/setgid flag
* Also improve dependency tracking in the build
* Remove TAKE_FILE_OWNERSHIP option and its documentation

pugnascotia added :Core/Infra/Core Core issues without another label v8.0.0 labels Dec 6, 2019

pugnascotia self-assigned this Dec 16, 2019

pugnascotia mentioned this issue Dec 17, 2019

Make the Docker build more re-usable in Cloud #50277

Merged

pugnascotia closed this as completed in #50277 Jan 23, 2020

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
