Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Default badger path is no longer writtable #4906

Closed
freef4ll opened this issue Oct 30, 2023 · 23 comments · Fixed by #4908
Closed

[Bug]: Default badger path is no longer writtable #4906

freef4ll opened this issue Oct 30, 2023 · 23 comments · Fixed by #4908
Labels

Comments

@freef4ll
Copy link

What happened?

With the release of 1.50, documented badger default path is no longer writable as it is owned by root after the changes in #4783

Steps to reproduce

docker run \
  -e SPAN_STORAGE_TYPE=badger \
  -e BADGER_EPHEMERAL=false \
  -e BADGER_DIRECTORY_VALUE=/badger/data \
  -e BADGER_DIRECTORY_KEY=/badger/key \
  -v /tmp/badger:/badger   -p 16686:16686   jaegertracing/all-in-one:1.50

{"level":"fatal","ts":1698680105.917989,"caller":"./main.go:110","msg":"Failed to init storage factory","error":"Error Creating Dir: "/badger/key" error: mkdir /badger/key: permission denied","stacktrace":"main.main.func1\n\t./main.go:110\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/[email protected]/command.go:940\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/[email protected]/command.go:1068\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/[email protected]/command.go:992\nmain.main\n\t./main.go:243\nruntime.main\n\truntime/proc.go:267"}

Expected behavior

Jaeger to start up

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

1.50

SDK

No response

Pipeline

No response

Stogage backend

No response

Operating system

No response

Deployment model

No response

Deployment configs

No response

@freef4ll freef4ll added the bug label Oct 30, 2023
@yurishkuro
Copy link
Member

yurishkuro commented Oct 30, 2023

Makes sense, but I don't think we're going to fix that. You have a manual workaround of changing permissions on the existing files to non-root user (see #4906 (comment)).

To play devil's advocate, how would you expect Jaeger to handle that? Changing to non-root user was the right choice from security perspective.

The other thing is we should call this out in the release notes as a breaking change and suggest remediation steps, like running chown - would you like to submit a PR for that?

@miparnisari
Copy link

miparnisari commented Oct 31, 2023

The other thing is we should call this out in the release notes as a breaking change and suggest remediation steps, like running chown

Yes please. This is a breaking change, it should be in https://github.com/jaegertracing/jaeger/releases/tag/v1.50.0. I would submit a PR but i don't know what command i should run.

@freef4ll
Copy link
Author

I think if the directory is added into the container and have permissions created for it like this:

FROM alpine
ARG USER_UID=10001
RUN mkdir /badger && chown ${USER_UID} /badger
USER ${USER_UID}
CMD ["/bin/ls", "-l", "/badger'"]

Then it works fine:

$ docker run -v badger-storage:/badger -ti test /bin/sh
~ $ cd /badger
/badger $ touch a
/badger $ ls -l
total 0
-rw-r--r--    1 10001    root             0 Oct 31 09:10 a

Minimal PR created #4910; if you're happy with this then can update all other Dockerfiles.

yurishkuro added a commit that referenced this issue Oct 31, 2023
## Which problem is this PR solving?
- Resolves #4906

## Description of the changes
- Add #4783 to breaking changes for v1.50.0

## Other
* [ ] once this PR is merged, update release notes on GitHub

Signed-off-by: Yuri Shkuro <[email protected]>
@yurishkuro
Copy link
Member

updated changelog and release notes

@yurishkuro
Copy link
Member

thanks for reporting!

@miparnisari
Copy link

miparnisari commented Nov 1, 2023

I'm not a Docker expert, how would I prevent this problem from occurring with my existing Docker-compose file?

services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    container_name: jaeger-demo
    command: [ "--query.max-clock-skew-adjustment", "500ms" ]
    environment:
      - COLLECTOR_OTLP_ENABLED=true
      - SPAN_STORAGE_TYPE=badger
      - BADGER_EPHEMERAL=false
      - BADGER_DIRECTORY_VALUE=/badger/data
      - BADGER_DIRECTORY_KEY=/badger/key
    volumes:
      - jaeger_data:/badger
    ports:
      - "16686:16686" # UI
    networks:
      - default
volumes:
  jaeger_data:

@t-pohl
Copy link

t-pohl commented Dec 8, 2023

@miparnisari Have you found a way to fix this directly in the docker-compose file? I'm facing the same problem.

@miparnisari
Copy link

@t-pohl, no :( I just pinned the version of Jaeger.

@t-pohl
Copy link

t-pohl commented Dec 11, 2023

@miparnisari I managed to upgrade using the following instructions:

Since version 1.50.0 of Jaeger (see https://github.com/jaegertracing/jaeger/releases/tag/v1.50.0) there is a breaking change which requires the external volume to be owned by a different user than root (which is the docker default). The user id is 10001. This requires to change the ownership of the volume folder on the host to the user 10001:
Example:
Find out the location of the jaeger volume on the host using docker volume inspect

docker volume ls
# Find the name of the volume
docker volume inspect traefik_jaeger_data

Start a root shell:

sudo bash

Change the volume ownwership:

cd /var/lib/docker/volumes/traefik_jaeger_data/ # Change to directory of volume on host
 chown -R 10001:10001 _data/

Afterwards start jaeger and check the logs.

Maybe this can help you too.

@de-perotti
Copy link

Turns out you can set user: root in docker compose to get it permissions working without having to chown every time with an external script. I'm pretty sure that's not the best practice but for local development, that's dandy

@yurishkuro
Copy link
Member

I just tried an experiment, on MacOS with Docker Desktop (results could be different on different OS)

With version before change

$ docker run \
  -e SPAN_STORAGE_TYPE=badger \
  -e BADGER_EPHEMERAL=false \
  -e BADGER_DIRECTORY_VALUE=/badger/data \
  -e BADGER_DIRECTORY_KEY=/badger/key \
  -v $HOME/dev/badger-storage:/badger \
  -p 16686:16686 \
  jaegertracing/all-in-one:1.49

$ docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS          PORTS                                                                               NAMES
30eb40642743   jaegertracing/all-in-one:1.49   "/go/bin/all-in-one-…"   17 seconds ago   Up 16 seconds   5775/udp, 5778/tcp, 14250/tcp, 6831-6832/udp, 14268/tcp, 0.0.0.0:16686->16686/tcp   vigilant_brown

$ docker exec -it 30eb40642743 /bin/sh
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:01 /go/bin/all-in-one-linux
   16 root      0:00 /bin/sh
   22 root      0:00 ps

/ # ls -lF /badger
total 0
drwx------    5 root     root           160 Feb 18 22:49 data/
drwx------    6 root     root           192 Feb 18 22:49 key/

/ # ls -lF /badger/data
total 1044
-rw-r--r--    1 root     root     2147483646 Feb 18 22:49 000001.vlog
-rw-r--r--    1 root     root       1048576 Feb 18 22:49 DISCARD
-rw-r--r--    1 root     root             2 Feb 18 22:49 LOCK

Here the container is running as root and files are owned by root

With newer version after UID change

$ docker run \
  -e SPAN_STORAGE_TYPE=badger \
  -e BADGER_EPHEMERAL=false \
  -e BADGER_DIRECTORY_VALUE=/badger/data \
  -e BADGER_DIRECTORY_KEY=/badger/key \
  -v $HOME/dev/badger-storage:/badger \
  -p 16686:16686 \
  jaegertracing/all-in-one:1.53

$ docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED         STATUS         PORTS                                                                                                        NAMES
422d701c2cc5   jaegertracing/all-in-one:1.53   "/go/bin/all-in-one-…"   3 seconds ago   Up 2 seconds   4317-4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 14268/tcp, 6831-6832/udp, 0.0.0.0:16686->16686/tcp   crazy_leakey

$ docker exec -it 422d701c2cc5 /bin/sh
/ $ ps
PID   USER     TIME  COMMAND
    1 10001     0:02 /go/bin/all-in-one-linux
   16 10001     0:00 /bin/sh
   22 10001     0:00 ps

/ $ ls -lF /badger
total 0
drwx------    6 10001    root           192 Feb 18 22:51 data/
drwx------    7 10001    root           224 Feb 18 22:51 key/

/ $ ls -lF /badger/data
total 1048
-rw-r--r--    1 10001    root            20 Feb 18 22:51 000001.vlog
-rw-r--r--    1 10001    root     2147483646 Feb 18 22:51 000002.vlog
-rw-r--r--    1 10001    root       1048576 Feb 18 22:49 DISCARD
-rw-r--r--    1 10001    root             2 Feb 18 22:51 LOCK

/ $ whoami
whoami: unknown uid 10001

/ $ touch /badger/data/test-file
/ $ ls -lF /badger/data/test-file
-rw-r--r--    1 10001    root             0 Feb 18 22:53 /badger/data/test-file

Here the Jaeger process is running as 10001 and files are owned by 10001 (I am not sure how the latter happens). The directory is writable by that UID.

@ivanallen
Copy link

ivanallen commented Feb 19, 2024

@yurishkuro

I enter the docker container(on rocky linux):

docker run -it  -e SPAN_STORAGE_TYPE=badger   -e BADGER_EPHEMERAL=false   -e BADGER_DIRECTORY_VALUE=/badger/data   -e BADGER_DIRECTORY_KEY=/badger/key   -v /mnt/badger:/badger   -p 16686:16686 --entrypoint ""  jaegertracing/all-in-one:1.54 /bin/sh

all files' owned by root:root. but whoami show 10001.

image

@yurishkuro
Copy link
Member

@ivanallen did you try to chown it to the new user?

@ivanallen
Copy link

ivanallen commented Feb 19, 2024

@yurishkuro

No, I didn't do anything. I executed the command in a completely new environment (rocky Linux).
I tried it on my macOS and it was working fine. The problem may depend on the environment.

I temporarily rolled back to jaegertracing/all-in-one:1.49, which works fine in rocky Linux.

@MarkusGeigerDev
Copy link

While you can chown existing dirs from earlier installations, how would you do that if you start from scratch? I may be missing something, but isn't that some kind of a hen-and-egg problem? Shouldn't there be a /bagder/data directory which is already owned by 10001 in the container image from the beginning?

@yurishkuro
Copy link
Member

@MarkusGeigerDev if you are starting fresh without previous data there should be no issues at all. If you are still getting permission issues it's likely the target directory does not allow writes by the container irrespective of which user id is used inside the container.

Please provide reproducible example if you think otherwise, ie which OS, which containers runtime, which user it runs as, container command with volume mounting, and existing permissions on the parent directory on the host.

@MarkusGeigerDev
Copy link

MarkusGeigerDev commented Mar 11, 2024

@yurishkuro Thanks for getting back to me!
This is the relevant snippet from my docker-compose (running in WSL2/Windows)

  jaeger:
    image: jaegertracing/all-in-one:latest
    command: 
      - "--badger.ephemeral=false"
      - "--badger.directory-key=/badger/keys"
      - "--badger.directory-value=/badger/values"
      - "--badger.span-store-ttl=72h0m0s" # limit storage to 72hrs
    environment:
      - SPAN_STORAGE_TYPE=badger
    volumes:
      - jaeger_badger_data:/badger
    ports:
      - "16686:16686"
      - "14250"
      - "4317"
 #   user: root

volumes:
  jaeger_badger_data:

The problem seems to be that the user 10001 can't even create the internal directories /badger/keys and /badger/values on startup:

Failed to init storage factory","error":"Error Creating Dir: \"/badger/keys\" error: mkdir /badger/keys: permission denied

When I set --badger.ephemeral=true the service starts up (obviously) and I am able to docker exec -it into it.
I looked around if there is any data path owned by 10001 already set up but I neither found one, nor was I able to create one.

When I remove the #-comment from the user: root line, everything works fine - but of course that was only to prove my point, I really don't want to revert the otherwise good decision of running jaeger as non-root.

@yurishkuro
Copy link
Member

Did you create jaeger_badger_data before running? What permissions does it have?

I would recommend running a shell inside the container and trying to create files (perhaps using different users), to see what's causing the permissions issue.

NB: this could be Windows specific.

# user: root

What if you try some other user that is palatable to Windows? The default 10001 in Jaeger Dockerfile is just "a suggestion", you don't have to run with it. But it would be nice to get to the bottom of this and add documentation.

@MarkusGeigerDev
Copy link

@yurishkuro:
I think I found a solution:

I added another service to my docker-compose that sets up everything before jaeger starts up:

version: "3.9"

services:
[...]
  jaeger:
    image: jaegertracing/all-in-one:latest
    command: 
      - "--badger.ephemeral=false"
      - "--badger.directory-key=/badger/data/keys"
      - "--badger.directory-value=/badger/data/values"
      - "--badger.span-store-ttl=72h0m0s" # limit storage to 72hrs
    environment:
      - SPAN_STORAGE_TYPE=badger
    volumes:
      - jaeger_badger_data:/badger
    ports:
      - "16686:16686"
      - "14250"
      - "4317"
    depends_on:
      init-jaeger:
        condition: service_completed_successfully

  init-jaeger:
    user: root
    image: busybox
    command: "/bin/sh -c 'mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data'"
    volumes:
      - jaeger_badger_data:/badger

volumes:
  jaeger_badger_data:

The init-jaeger service still runs as root (to be able to chown something), but since it will be already terminated when jaeger starts up, it shouldn't add anything to the attack surface of the system. If you like the solution, feel free to add it to the docs. Perhaps you may want to exchange busybox for whatever operating system base layer jaeger is using to avoid the additional download.(reuse of the already pulled layer), but on the other hand, busybox isn't that big either.

@yurishkuro
Copy link
Member

@MarkusGeigerDev thanks! I suspect you can use the jaeger image itself, it has shell in it. Also, did you try to create just the /badger dir? I think it should be sufficient and less error prone (your solution seems to rely on some understanding of internal organization of badger sub-dirs)

@MarkusGeigerDev
Copy link

MarkusGeigerDev commented Mar 12, 2024

@yurishkuro

I suspect you can use the jaeger image itself, it has shell in it.

Haha, you're right, this works as well :)

  init-jaeger:
    user: root
    image: jaegertracing/all-in-one:latest
    entrypoint: "/bin/sh -c 'mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data'"
    volumes:
      - jaeger_badger_data:/badger

Also, did you try to create just the /badger dir? I think it should be sufficient and less error prone (your solution seems to rely on some understanding of internal organization of badger sub-dirs)

/badger is really just the mount point. It corresponds to the root dir in the volume, which I didn't want to change.
I have no further knowledge about badger, besides what I took from the jaeger CLI docs which seem to suggest that badger needs two directories, one for the keys and one for the values. Since I was too lazy to chown those two individually, I introduced /data/ as a common root for both, keys and values, and used the --badger.directory-key and --badger.directory-value CLI switches to push badger that one level down.

EDIT:

Oh, BTW, I totally forgot to answer your other suggestions - sorry for that:

What if you try some other user that is palatable to Windows? The default 10001 in Jaeger Dockerfile is just "a suggestion", you don't have to run with it. But it would be nice to get to the bottom of this and add documentation.

You cannot easily map users from a Linux container to Windows users. Windows uses a very different technical scheme for users and groups called SIDs. For example, the "root" group ("Administrators", actually) is S-1-5-32-544.
But in the end, all that doesn't matter, since we use the Linux dockerd anyways. All containers physically run in the Windows Subsystem for Linux, on a real Linux kernel and with an EXT4 file system. That's why in my solution the chown -R 10001:10001 works.

@yurishkuro
Copy link
Member

Documented in #5282

@genfemme
Copy link

genfemme commented May 7, 2024

@MarkusGeigerDev @yurishkuro
I am getting the same error when doing the install using helm.

This is my values.yaml file -

query:
  initContainers:
    name: init-jaeger
    image: jaegertracing/all-in-one:latest
    command: ["/bin/sh", "-c"]
    args: ["mkdir -p /badger/data && touch /badger/data/.initialized && chown -R 10001:10001 /badger/data"]
    volumeMounts:
      - name: jaeger-badger-data
        mountPath: /badger
    securityContext:
      fsGroup: 10001
volumes:
  - name: jaeger-badger-data
    emptyDir: {}`

helm chart - https://raw.githubusercontent.com/hansehe/jaeger-all-in-one/master/helm/charts
version - 0.1.12

I have tried using the root user as well and nothing changed.

Error details -

{"level":"fatal","ts":1715098835.9268656,"caller":"all-in-one/main.go:114","msg":"Failed to init storage factory","error":"Error Creating Dir: \"/badger/key\" error: mkdir /badger/key: permission denied","stacktrace":"main.main.func1\n\tgithub.com/jaegertracing/jaeger/cmd/all-in-one/main.go:114\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/[email protected]/command.go:983\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/[email protected]/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/[email protected]/command.go:1039\nmain.main\n\tgithub.com/jaegertracing/jaeger/cmd/all-in-one/main.go:249\nruntime.main\n\truntime/proc.go:271"}`

What is the issue here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants