sudo: unknown uid 1012: who are you? #475
Comments
Hi,

The long explanation is that there is a mismatch between how cwltool approaches a Docker image and how we approached Docker images when we were running PCAWG. cwltool sets the uid of the user inside the container to match the uid of the user outside the container. This works in most cases: if you run an unprivileged tool, it does not matter which user you are inside the container.

For PCAWG, many of the workflows used existing workflow systems that were created before Docker, such as SeqWare or Roddy. Our strategy was to use the USER keyword (https://docs.docker.com/engine/reference/builder/#/user) in a Dockerfile to set an executing user and then give that user the privileges necessary to configure services like SeqWare, Roddy, or even SGE. In this specific workflow, it looks like we're just configuring file permissions with sudo.

The conflict arises when you use both: cwltool overrides the user with a seemingly random uid, and then the way we've set up the sudoers file probably fails. That's the long answer, and we haven't implemented a solution yet. @briandoconnor @tetron feel free to chime in if I've mischaracterized anything.

The short answer is that we've been testing these workflows by running in a brand new Ubuntu VM on OpenStack or AWS. This seems to work, probably because the uid of the default "ubuntu" user matches the uid that cwltool overrides with.

See also: common-workflow-language/cwltool#47 |
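The "unknown uid" failure described above comes down to a passwd lookup that finds nothing: the uid handed to the container has no entry in the image's /etc/passwd, so sudo cannot resolve it to a user. A minimal Python sketch of that lookup follows. The function names are illustrative only and are not part of cwltool or sudo.

```python
import os
import pwd


def lookup_user(uid):
    """Return the login name for uid, or None if the passwd database has no entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # This is the situation sudo reports as "unknown uid ...: who are you?"
        return None


def explain(uid):
    """Describe whether uid resolves to a user on this system."""
    name = lookup_user(uid)
    if name is None:
        return f"unknown uid {uid}: no passwd entry"
    return f"uid {uid} is {name}"


if __name__ == "__main__":
    # Run inside the container to see whether the current uid is known there.
    print(explain(os.getuid()))
```

Running this inside the container with the uid that the runner passes in would show quickly whether the image knows about that user.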
A fix on the cwltool side may be to switch from matching the user id to setting the setgid bit on the output dir. This lets the runner process manipulate files written by a different user inside the container through group ownership. |
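As a rough illustration of the proposal above (the bit in question is the setgid bit on the directory), the sketch below marks an output directory setgid and group-writable so that files created in it inherit the directory's group. This is an assumption about how such a fix could look, not cwltool's actual code. Whether it is sufficient also depends on the umask used inside the container.

```python
import os
import stat


def setgid_group_mode(mode):
    """Return mode with the setgid bit and group read/write/execute added."""
    return mode | stat.S_ISGID | stat.S_IRWXG


def share_output_dir(path):
    """Apply the setgid/group-writable mode to path and return the resulting mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, setgid_group_mode(current))
    return stat.S_IMODE(os.stat(path).st_mode)
```

For example, a directory with mode 0o755 would end up as 0o2775.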
@tetron Yeah, I think that would be a possibility as well. |
Reopening, seems that while the
Above it's a simple
Any hints on what could be going wrong with this run? Will continue digging on my own, but feedback/debugging tips on seqware/CGP/dockstore/cwltool are greatly appreciated. |
Hi, If the workflow successfully started, then you should see generated log files. That's one approach. Another approach is that earlier in the first log above, cwltool should spit out the |
One additional clue. |
@brainstorm
You'll want to modify that to something like
Then you'll be able to examine the environment, run the workflow, and see what is going on inside the container at your leisure. |
Thanks a lot @denis-yuen for the help and feedback! I've been looking at this problem for a while today and:
So I'll have to think/dig deeper and continue on this, but a bit unsure on what's really going on now. |
Hmmm, I'm kind of puzzled as well. Can you describe your environment as thoroughly as possible and we can try to reproduce the issue on our OpenStack cluster. In particular:
A futex java process sounds like the Java portion of the workflow inside the container is having trouble creating or writing to a file and the OS is making it wait forever. I'm not clear on why that would happen though since the user is root. |
Yes, in the cluster we are running:
I know, a bit legacy, we'll hopefully upgrade as soon as we can :/ The command lines that you asked for are:
|
@brainstorm Thanks, I think the last bit of info that would be useful, what was the generated |
Sorry, where can I see that command? It's not showing up after issuing the dockstore command...
|
Hi, However, I think the bigger problem is that we've seen problems like this in the past on a different project (PCAWG): file locking issues and other oddities on the container file system with older versions of Docker and/or older kernels. https://docs.docker.com/engine/installation/binaries/#/check-kernel-dependencies indicates that you need Linux kernel 3.10 or later, but it looks like you're running 2.6.32. https://docs.docker.com/engine/installation/linux/centos/ indicates you can probably get that kernel with CentOS 7. Could you maybe try again on a newer CentOS version before coming back to CentOS 6 to confirm? |
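A quick way to check the kernel requirement mentioned above is to compare the running kernel's release string against a minimum version. The sketch below assumes a minimum of 3.10, which is what Docker's documentation stated at the time. The helper names are made up for illustration, and the minimum is a parameter so it can be adjusted.

```python
import platform
import re


def parse_kernel(release):
    """Extract (major, minor) from a kernel release string such as '2.6.32'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=(3, 10)):
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_kernel(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    print(release, "ok" if kernel_at_least(release) else "too old for Docker")
```

Comparing tuples of integers rather than strings matters here: as text, "3.1" and "3.10" are easy to confuse, while (3, 1) and (3, 10) are not.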
We'll be able to do so soon since we'll be migrating anyway, I'll keep you guys posted. Thanks a ton for all your support! |
For the issue in the original post, we also released a new version of the workflow ( https://github.com/ICGC-TCGA-PanCancer/CGP-Somatic-Docker/releases/tag/2.0.3 ) which uses gosu which should help with the unknown user issue. Feel free to re-open this issue or create a new one when you look at CentOS again for the second issue. |
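This thread does not show how release 2.0.3 invokes gosu, so the following is only a hypothetical sketch of one common entrypoint pattern: start as root, look up who owns the mounted working directory, and re-run the real command as that uid:gid through gosu. The `gosu user-spec command` form is gosu's documented usage. The function name and the choice of the directory owner as the target user are assumptions.

```python
import os


def gosu_argv(workdir, command):
    """Build a gosu command line that runs command as the owner of workdir."""
    info = os.stat(workdir)
    # gosu accepts a numeric "uid:gid" user-spec, so no passwd entry is needed.
    return ["gosu", f"{info.st_uid}:{info.st_gid}", *command]


if __name__ == "__main__":
    # Hypothetical usage; an entrypoint would hand this list to os.execvp.
    print(gosu_argv(".", ["id"]))
```

Because the user is given numerically, this avoids the name lookup that fails under sudo when the uid is missing from the image.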
Moving on from issue #469 et al, I'm hitting a wall now with docker uid: ... from what I can tell, it does not have to do with the DOCKSTORE_ROOT env var at all, but I assume docker's --privileged flag might help here? Seems like cwltool is passing my host's UID to the container, and of course that UID cannot be found in the running docker container.
@denis-yuen Thanks a ton for following this up with me, any hints with this one? How are you mapping UIDs between host and container now? Or are you just running it all as root?