Skip to content
This repository has been archived by the owner on Sep 7, 2018. It is now read-only.

Image improvements #146

Merged
merged 18 commits into from
Apr 13, 2018
Merged

Image improvements #146

merged 18 commits into from
Apr 13, 2018

Conversation

xlson
Copy link
Contributor

@xlson xlson commented Mar 23, 2018

Important: do not merge before we are ready to release 5.1.

Re-structured docker image which makes it easier to control what user Grafana runs as, removes default volumes and no longer chowns any folders or files on startup. The default user has been changed from grafana (id: 104) to grafana (id: 472) which will cause permission problems if used together with files created in a previous version. The main grafana docs has been updated to help solve these issues. Docs PR here grafana/grafana#11365.

Closes #141
Closes #140
Closes #50
Closes #124
Closes #102

xlson added 7 commits March 20, 2018 13:43
- chown removed
- based on tar.gz instead of deb pkg
- volumes for logs and config removed
- gosu removed

Related to #141
We don't write logs to disk by default,
but for user that wish to do so it needs
to be possible.
@xlson xlson self-assigned this Mar 23, 2018
@xlson xlson requested a review from bergquist March 23, 2018 12:31
* master:
  Fix permissions problem #141
Dockerfile Outdated
@@ -1,22 +1,33 @@
FROM debian:jessie
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we try something like debian:9.3-slim while at it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, forgot about that. Will look into it.

README.md Outdated
```

Add lines with `RUN grafana-cli ...` for each plugin you wish to install in your custom image. Don't forget to specify what version of Grafana you wish to build from (replace 5.0.0 in the example).
Add lines with `grafana-cli ...` for each plugin you wish to install in your custom image. Don't forget to specify what version of Grafana you wish to build from (replace master in the example).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add support for passing GF_PLUGINS as a build arg, then using RUN to iterate over them in the standard Dockerfile?

Copy link
Contributor Author

@xlson xlson Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that something you think we might use ourselves? Without someone with that explicit need I don't see a big advantage over just extending our Dockerfile instead as actually using this feature would require forking our repo instead of just creating a new file.

Copy link

@Spindel Spindel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except the permissions/user part, it looks okay.

http://jdebp.eu./FGA/dont-abuse-nobody-for-daemons.html

Dockerfile Outdated
curl -L https://github.com/tianon/gosu/releases/download/1.10/gosu-amd64 > /usr/sbin/gosu && \
chmod +x /usr/sbin/gosu && \
RUN apt-get update && apt-get install -qq -y wget tar sqlite libfontconfig curl ca-certificates && \
wget -O /tmp/grafana.tar.gz https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-$GRAFANA_VERSION.linux-x64.tar.gz && \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You install both curl and wget? Just use curl here and don't use wget.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of puting the tar in a file, use

mkdir -p $GF_PATHS_HOME && \
wget -qO- $GF_URL/grafana-$GRAFANA_VERSION.linux-x64.tar.gz |tar xfvz - --strip-components=1 --exclude=tools -C $GF_PATHS_HOME

Skips writing the file to a file and need to cleanup, as well as directly extracts in the GF_PATH_HOME.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. We need tools though. May change in the future.

Dockerfile Outdated
apt-get autoremove -y && \
rm -rf /var/lib/apt/lists/*

RUN mkdir -p /etc/grafana/provisioning/datasources && \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mkdir takes multiple directories as an argument

Copy link
Contributor Author

@xlson xlson Mar 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Not sure why I didn't do that originally.

Dockerfile Outdated
cp $GF_HOME/conf/ldap.toml /etc/grafana/ldap.toml && \
cp $GF_HOME/bin/grafana-server /usr/sbin/grafana-server && \
cp $GF_HOME/bin/grafana-cli /usr/sbin/grafana-cli && \
chown -R nobody:nogroup /var/lib/grafana && \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use nobody as a user for a service, even inside docker. Create a specific user for this service. You don't want someone to bind mount things and files to be owned by the user nobody . nobody is special in unix due to some nasty legacy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't seem to be a good way to solve this problem when you're trying to mount directories from the host filesystem as volumes, because you need to ensure that there is a matching uid & gid between the process running in the container and on the host.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes nobody and even worse option, since nobody has a very explicit meaning and you should never run services as nobody.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some discussion with @xlson the proposed solution is to explicitly set a uid and gid (exact value tbd) for the grafana user & group in the container, make all Grafana files world-readable in the container, and use USER grafana:grafana. Users looking to mount external volumes can either chown them to that uid or use the -u docker option to run as a different uid.

If the uid and gid are specified with ARG it would also be relatively straightforward to build a custom image with whatever uid/gid the end user required.


# start grafana
docker run \
-d \
-p 3000:3000 \
--name=grafana \
--volumes-from grafana-storage \
-v grafana-storage:/var/lib/grafana \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably have :z as an argument as well, in case of SELinux?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never seen that one before. Will look into it. Thanks for reviewing the container, much appreciated.

Copy link

@ChristianKniep ChristianKniep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduce file-system layer site and define ENV in Dockerfile.
Put my changes here: https://github.com/ChristianKniep/grafana-docker/blob/image-improvements-qnib/Dockerfile

Dockerfile Outdated
@@ -1,22 +1,33 @@
FROM debian:jessie

ARG DOWNLOAD_URL="https://s3-us-west-2.amazonaws.com/grafana-releases/master/grafana_latest_amd64.deb"
ARG GRAFANA_VERSION="latest"
ARG GF_HOME="/usr/share/grafana"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of defining defaults in the run.sh I reckon to set the ENV-defaults in the Dockerfile and use GF_PATH_HOME directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Should make it easier to change in the future.

Dockerfile Outdated
curl -L https://github.com/tianon/gosu/releases/download/1.10/gosu-amd64 > /usr/sbin/gosu && \
chmod +x /usr/sbin/gosu && \
RUN apt-get update && apt-get install -qq -y wget tar sqlite libfontconfig curl ca-certificates && \
wget -O /tmp/grafana.tar.gz https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-$GRAFANA_VERSION.linux-x64.tar.gz && \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of puting the tar in a file, use

mkdir -p $GF_PATHS_HOME && \
wget -qO- $GF_URL/grafana-$GRAFANA_VERSION.linux-x64.tar.gz |tar xfvz - --strip-components=1 --exclude=tools -C $GF_PATHS_HOME

Skips writing the file to a file and need to cleanup, as well as directly extracts in the GF_PATH_HOME.

Dockerfile Outdated
ENTRYPOINT ["/run.sh"]
USER nobody

ENTRYPOINT [ "/run.sh" ]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rather use CMD then ENTRYPOINT, as I find it always cumbersome to debug, as one can not simply use docker run -ti --rm grafana/grafana-docker bash.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm like that too. Will have to check if there are any good reasons to keep it as an entrypoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that CMD would have been a better option it would be a breaking change and as such I don't think its enough of an improvement to warrant it.

run.sh Outdated
@@ -2,50 +2,70 @@

: "${GF_PATHS_CONFIG:=/etc/grafana/grafana.ini}"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put the defaults in the Dockerfile using ENV, by doing so one can overwrite the default and it's clear where they are set.

DanCech and others added 2 commits March 25, 2018 16:34
Adds functionality to easily pre-bake with custom plugins.
- cleanup of mkdir
- oneliner for downloading and extracting Grafana
- envs moved from run.sh to Dockerfile
@xlson
Copy link
Contributor Author

xlson commented Mar 25, 2018

@ChristianKniep I didn't see your changes until now. Will implement the one-layer ENV tomorrow and recude the RUN as well. We don't want a volume for /var/log/grafana. The deafult is to write to console in the container, any other logging setup is up to the user. Thanks for your feedback!

Copy link

@brancz brancz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At CoreOS we've been building our own images of Grafana for two reasons

  1. dangling volumes, unnecessarily created by the volume directive.
  2. the ability to run the Grafana container as any user.

Both of these are important aspects of running the Grafana container in a security-wise restricted environments on Kubernetes, where Kubernetes chooses the user and group of a container at random and cannot detect whether the underlying runtime will create dangling volumes. Some runtimes simply reject running containers, which would otherwise create dangling volumes.

We'd love if the re-working of this image could mean that we can get rid of our own image 🙂 .

Dockerfile Outdated
FROM debian:stretch-slim

ARG GRAFANA_VERSION="latest"
ARG GF_UID="472"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this needs to be hardcoded in the image? Why not nouser:nogroup? In Kubernetes when using PodSecurityPolicies or SecurityContextConstraints (OpenShift), then users are not able to choose a user, this would mean that the Grafana container must be run as a more permissive container than others, for no reason.

Copy link
Contributor Author

@xlson xlson Mar 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When starting the remake I used nobody:nogroup (by nouser, do you mean nobody?) but one of the other reviewers was skeptical and preferred a specific grafana user (see earlier review). I don't really have a strong opinion myself on specifically what user it should be. The important thing (as I see it) is that the it should be pinned id and that it should be possible to set a different id at boot time, which it wasn't previously.

With this container it's possible to boot as a different user than the one specified but you will have to make sure /var/lib/grafana is writable by that user (and /usr/share/grafana if you wish to map in aws credentials).

To be honest I don't fully understand how nobody:nogroup is better than grafana:grafana. Could you please explain it to me? :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nobody:nogroup is typically used when you explicitly want to make sure that anybody/nobody owns something, for example in "public" folders that should be shared by multiple users of the same computer.

In the case of the container ensuring that things are "owned" by nobody:nogroup, means that a user can pick the UID/GID at creation/runtime time of the container rather than it being hard coded in to the image.

In the Kubernetes case, Kubernetes makes sure that the volumes created for the container to be consumed have the UID/GID that the container will have.

From the security point of view it's that administrators want to disallow users from picking UID/GID, as if you allow users to pick any UID/GID they can choose 0 aka root, which would be a privilege escalation. Kubernetes has a mode where UID/GID is picked by Kubernetes, which however requires that the image indeed can run with any UID/GID. It seems odd at best if we have to add an exception to the otherwise restrictedly running containers for Grafana, simply because the UID and GID are hardcoded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nobody:nogroup is typically used when you explicitly want to make sure that anybody/nobody owns something, for example in "public" folders that should be shared by multiple users of the same computer.

In the case of the container ensuring that things are "owned" by nobody:nogroup, means that a user can pick the UID/GID at creation/runtime time of the container rather than it being hard coded in to the image.

I don't understand how nobody:nogroup solves the permissions issue. As far as I'm aware the only way to make sure you can set user easily at create/runtime is to chmod a+rw on both /var/lib/grafana and /usr/share/grafana/.aws at build time.

In the Kubernetes case, Kubernetes makes sure that the volumes created for the container to be consumed have the UID/GID that the container will have.

From the security point of view it's that administrators want to disallow users from picking UID/GID, as if you allow users to pick any UID/GID they can choose 0 aka root, which would be a privilege escalation. Kubernetes has a mode where UID/GID is picked by Kubernetes, which however requires that the image indeed can run with any UID/GID. It seems odd at best if we have to add an exception to the otherwise restrictedly running containers for Grafana, simply because the UID and GID are hardcoded.

But isn't setting the user and group to nobody:nogroup just another way to hardcode them? Is nobody handled differently in Kubernetes somehow? Or is using nobody:nogroup something that is beneficial specifially due to how you run kubernetes?

Perhaps you'd be willing to create a pull request that shows off how you would like the image to look like and that would make it easier for me to understand what the benefits are. When googling nobody:nogrup and containers people seem to have warying opinions (prometheus/prometheus#3441). I'm not really opposed to using nobody:nogroup, but I'd like to understand what the benefits are :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the container is built with nouser:nogroup, it means that whatever UID/GID a user chooses to run the container with can write the files/directories within the container, as the operating system specifically allows this for directories/files with nouser:nogroup. Volumes mounted from the outside are then prepared by the user to have exactly the UID/GID they choose instead of it having to be what is hard coded in the image.

At the end of the day it means whether people will have to create special cases for handling any container that hardcodes a UID/GID vs. a generic way of handling all containers, because they can all run with any user. Most containers out there can indeed run as any user.

Copy link
Contributor Author

@xlson xlson Mar 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any examples of this behaviour? It sounds great but I haven't been able to reproduce it. I tried switching to nobody:nogroup and running it on my Ubuntu 16.04 install and when switching to another user than nobody I can no longer write to /var/lib/grafana.

Example using the grafana container in the kube-prometheus repo on my machine:

$ docker run -p 3000:3000 --user 105 quay.io/coreos/monitoring-grafana:5.0.3
t=2018-03-26T12:24:07+0000 lvl=info msg="Starting Grafana" logger=server version=5.0.3 commit=91162f3 compiled=2018-03-16T16:11:16+0000
t=2018-03-26T12:24:07+0000 lvl=info msg="Config loaded from" logger=settings file=/grafana/conf/defaults.ini
t=2018-03-26T12:24:07+0000 lvl=info msg="Config loaded from" logger=settings file=/grafana/conf/config.ini
t=2018-03-26T12:24:07+0000 lvl=info msg="Path Home" logger=settings path=/grafana
t=2018-03-26T12:24:07+0000 lvl=info msg="Path Data" logger=settings path=/data
t=2018-03-26T12:24:07+0000 lvl=info msg="Path Logs" logger=settings path=/data/log
t=2018-03-26T12:24:07+0000 lvl=info msg="Path Plugins" logger=settings path=/data/plugins
t=2018-03-26T12:24:07+0000 lvl=info msg="Path Provisioning" logger=settings path=/grafana/conf/provisioning
t=2018-03-26T12:24:07+0000 lvl=info msg="App mode production" logger=settings
t=2018-03-26T12:24:07+0000 lvl=info msg="Initializing DB" logger=sqlstore dbtype=sqlite3
t=2018-03-26T12:24:07+0000 lvl=info msg="Starting DB migration" logger=migrator
t=2018-03-26T12:24:07+0000 lvl=eror msg="Fail to initialize orm engine" logger=sqlstore error="Sqlstore::Migration failed err: unable to open database file\n"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume I'm missing something obvious 🤷‍♂️

Copy link
Contributor

@DanCech DanCech Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brancz as long as the files within the container that the container process needs to access all have their permissions set to be readable by all users (and executable as needed) then the container can be run as any user and still work. As far as I can see you don't gain anything by making the files within the container owned by nobody vs the proposed arrangement. The only thing that matters is that if you are using a volume to provide a writable filesystem then that needs to have appropriate permissions for whatever uid the process is running as inside the container. By specifying a uid that doesn't change as the default the user can do that either by using chown 472:472 or by running the container with whatever uid they prefer.

Copy link
Contributor Author

@xlson xlson Mar 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brancz The image still uses grafana:grafana but permissions has been modified so that it can be started as any user. Does this iteration of the image solve your problem?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to give this a test run, but I believe it should work.

Dockerfile Outdated
done; \
fi

VOLUME [$GF_PATHS_DATA]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the points outlined in docker-library/mysql#255, it can be incredibly frustrating to handle volumes indirectly created through volume directives. Unless there are strong reasons to keep this I'd suggest not to specify it, and let the user decide whether a volume shall be used or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We should get rid of volumes completely.

xlson added 2 commits March 26, 2018 11:21
Users should create their own volumes
when they need to.
@brancz
Copy link

brancz commented Apr 3, 2018

Which image can I use to test this? grafana/grafana:master doesn't work for me with unprivileged non-grafana user.

@xlson
Copy link
Contributor Author

xlson commented Apr 3, 2018

@brancz Sorry, should have mentioned that we haven't published one officially yet. You can use xlson/grafana:5.0.4 for testing.

* origin/master:
  make a target docker repo an optional param
  Only pushes to the latest tag for releases.
  Add workdir to dockerfile
@brancz
Copy link

brancz commented Apr 3, 2018

xlson/grafana:5.0.4 seems to work nicely 🙂

@xlson
Copy link
Contributor Author

xlson commented Apr 3, 2018

@brancz great news! Thanks for verifying.

Dockerfile Outdated
ARG GRAFANA_VERSION="latest"
ARG GRAFANA_URL="https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-$GRAFANA_VERSION.linux-x64.tar.gz"
ARG GF_UID="472"
ARG GF_GID="472"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO you can add build argument for architecture - see https://github.com/monitoringartist/grafana-xxl/pull/8/files. Then you can build also arm images easily.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jangaraj It's a bit out of scope for this PR but that's definitely something to look into when we have inhouse ARM builds of Grafana.

@brancz
Copy link

brancz commented Apr 4, 2018

lgtm from my side, looking forward to switching to this!

@xlson xlson force-pushed the image-improvements branch from 5e5e7bd to 766ad7d Compare April 4, 2018 08:35
@smur89
Copy link

smur89 commented Apr 6, 2018

xlson/grafana:5.0.4 working well for me too. Needed to swtich to get around some user permissions issues with named volumes

@Chekov2k
Copy link

Chekov2k commented Apr 9, 2018

xlson/grafana:5.0.4 works like a charm on OpenShift Origin v1.5 now :-)

@brancz
Copy link

brancz commented Apr 9, 2018

Is there anything we can help with that's blocking these changes? 🙂 Looking forward to having this as the official image!

@xlson
Copy link
Contributor Author

xlson commented Apr 13, 2018

@brancz not really, thanks for offering though 🙂 The merge had to wait until we were ready to release 5.1 as there are breaking changes in the image (and we build all docker images from master atm). I'm merging the changes today so master builds will be based on the new image from now on and the next stable release (5.1) should be out within 2 weeks.

@xlson xlson merged commit dbabd62 into master Apr 13, 2018
@xlson xlson deleted the image-improvements branch April 13, 2018 13:16
fg2it added a commit to fg2it/grafana-on-raspberry that referenced this pull request Apr 13, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants