Deadlocks using the gitlab runner docker executor on the docker-compatible socket API #10090

Closed
thmo opened this issue Apr 20, 2021 · 44 comments
Labels
kind/bug, locked - please file new issue/PR, stale-issue


@thmo

thmo commented Apr 20, 2021

/kind bug

Description

Trying to use gitlab-ci with the docker executor and the podman 3.x docker-compatible API socket very frequently leads to deadlock-like situations. These manifest as GitLab CI pipelines hanging for a while and, in addition, podman ps and other CLI commands hanging indefinitely. When this happens, only a systemctl restart podman helps.

The setup uses podman 3.1.0 from the container-tools module stream of CentOS 8. The GitLab runner itself runs as a podman container, using the latest docker.io/gitlab/gitlab-runner:alpine image, with /run/docker.sock mounted inside the runner. The socket is also mounted into a traefik container on the same machine.

As a side note, intermediate containers created by the CI are stopped, but not cleaned up.

Looking at the podman journal, I repeatedly see some messages like this:

Request Failed(Not Found): no container with name or ID "..." found: no such container
Request Failed(Internal Server Error): can only kill running containers. ... is in state stopped: container state improper
Request Failed(Internal Server Error): can only attach to created or running containers - currently in state stopped: container state improper
Request Failed(Internal Server Error): container ... does not exist in database: no such container

I am not sure whether these are really related.

Output of podman version:

Version:      3.1.0-dev
API Version:  3.1.0-dev
Go Version:   go1.16.1
Built:        Fri Mar 26 19:32:03 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.19.8
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.27-1.module_el8.5.0+733+9bb5dffa.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: dc08a6edf03cc2dadfe803eac14b896b44cc4721'
  cpus: 16
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: file
  hostname: *****
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.18.0-301.1.el8.x86_64
  linkmode: dynamic
  memFree: 52090458112
  memTotal: 67455647744
  ociRuntime:
    name: crun
    package: crun-0.18-1.module_el8.5.0+733+9bb5dffa.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.18
      commit: 808420efe3dc2b44d6db9f1a3fac8361dde42a95
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_NET_RAW,CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 20000530432
  swapTotal: 20000530432
  uptime: 89h 5m 49.06s (Approximately 3.71 days)
registries:
  search:
  - registry.access.redhat.com
  - registry.redhat.io
  - docker.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 22
    paused: 0
    running: 22
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 26
  runRoot: /var/run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.1.0-dev
  Built: 1616783523
  BuiltTime: Fri Mar 26 19:32:03 2021
  GitCommit: ""
  GoVersion: go1.16.1
  OsArch: linux/amd64
  Version: 3.1.0-dev

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.1.0-0.13.module_el8.5.0+733+9bb5dffa.x86_64
@openshift-ci-robot added the kind/bug label Apr 20, 2021
@vrothberg
Member

Thanks for reaching out, and apologies for the long silence; we've been very busy in the past weeks.

@mheon, do you have a suspicion on this one?

@mheon
Member

mheon commented May 4, 2021

No. We need further debug information - full output of podman system service --log-level=debug with the executor run against it would be a good start.

@thmo
Author

thmo commented May 15, 2021

OK, here we go: this is a log while running exactly one build in the GitLab pipeline.

The pipeline eventually finished. However, podman ps starts to hang around 18:17:02 - I ran it in a loop once a second, and the last output is from that time. It still hangs after the GitLab pipeline has finished, i.e. at the end of the log file. Running podman ps with strace shows that it is waiting on a futex. A systemctl restart podman helps.

The containers 8d97d4de82b9, 3e596be4b232, 05c563522297, 1adba860a5c9, and 518ad874ec98 are still there, in Exited state. They should have been automatically cleaned up instead.

@thmo
Author

thmo commented Jun 1, 2021

The problem persists with podman-3.1.2-1.el8.2.16.x86_64 from the Kubic project.

@vrothberg
Member

OK, here we go: this is a log while running exactly one build in the GitLab pipeline.

Apologies for not getting back to this issue earlier. The archive seems to be empty. I unpacked it, but there's just an empty directory.

Do you still have the logs?

@thmo
Author

thmo commented Jun 2, 2021

The archive seems to be empty. I unpacked it, but there's just an empty directory.

It's a single packed file. This worked for me:

curl -LO https://github.com/containers/podman/files/6487924/podman_log.gz
gunzip podman_log.gz

@vrothberg
Member

Thanks! I'll look into it.

@vrothberg
Member

What I see in the logs below is that the containers in question are not getting deleted. At least, the handlers do not finish deletion within the two minutes.

May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=debug msg="IdleTracker 0xc000010320:active 8m+0h/67t connection(s)"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=info msg="APIHandler(b0c6ba44-50e6-4f92-b4b2-99ee40bb82a3) -- DELETE /v1.25/containers/05c5635222973035c274c8d77e3c77e522c1e84907bc7c7a714664a3b5e67414?force=1&v=1 BEGIN"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=debug msg="APIHandler(b0c6ba44-50e6-4f92-b4b2-99ee40bb82a3) -- Header: User-Agent=[Go-http-client/1.1]"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=debug msg="IdleTracker 0xc000010328:active 9m+0h/67t connection(s)"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=info msg="APIHandler(6db5a07e-7cc8-4c9b-a22a-a225aa2e2d59) -- DELETE /v1.25/containers/1adba860a5c9ba053b9fd5f8e385af017dfec503dad1f9ba77b93ca74a32db17?force=1&v=1 BEGIN"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=debug msg="APIHandler(6db5a07e-7cc8-4c9b-a22a-a225aa2e2d59) -- Header: User-Agent=[Go-http-client/1.1]"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=info msg="APIHandler(b33fb8df-b560-4494-aade-f566f4cbbcf4) -- DELETE /v1.25/containers/3e596be4b2323601646067248de40783def65ca9dae260bbd407ec49a9b9ce31?force=1&v=1 BEGIN"
May 15 18:19:02 hostname podman[262957]: time="2021-05-15T18:19:02+02:00" level=debug msg="APIHandler(b33fb8df-b560-4494-aade-f566f4cbbcf4) -- Header: User-Agent=[Go-http-client/1.1]"
[...]
May 15 18:21:08 hostname podman[270006]: healthy

@vrothberg
Member

It also seems that SIGKILLs to the container are failing:

level=info msg="Request Failed(Internal Server Error): error sending signal to container 8d97d4de82b9df8deb88f5ab74e241833596a49388a13a9f2d13a2986cfae710: `/usr/bin/crun kill 8d97d4de82b9df8deb88f5ab74e241833596a49388a13a9f2d13a2986cfae710 9` failed: exit status 1"
level=debug msg="APIHandler(e5a95f68-af0f-43c4-90f8-a490862625e0) -- POST /v1.25/containers/8d97d4de82b9df8deb88f5ab74e241833596a49388a13a9f2d13a2986cfae710/kill?signal=SIGKILL END"

Before the deadlock, I see 5 consecutive requests to /networks causing quite some noise:

May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(0a3caf4f-9eee-4b14-87a6-e103b8840118) -- GET /v1.25/networks BEGIN"
May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(0e98c86a-5821-43b6-ab62-141374ceda5b) -- GET /v1.25/networks BEGIN"
May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(a2f15a00-da33-4d7d-828e-25d15e98d1d9) -- GET /v1.25/networks BEGIN"
May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(d6fd8ae6-d3ec-4d02-84e0-dae694b555f2) -- GET /v1.25/networks BEGIN"
May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(814f6766-3365-4e6e-b244-b97a3f3ae97d) -- GET /v1.25/networks BEGIN"

But only two of these requests return:

May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=info msg="APIHandler(92161cca-75c3-45c7-9f60-c732fcd77fca) -- GET /v1.24/containers/json?limit=0 BEGIN"
[...]
May 15 18:17:03 hostname podman[262957]: time="2021-05-15T18:17:03+02:00" level=debug msg="APIHandler(d6fd8ae6-d3ec-4d02-84e0-dae694b555f2) -- GET /v1.25/networks END"
[...]
May 15 18:17:03 hostname podman[262957]: time="2021-05-15T18:17:03+02:00" level=info msg="APIHandler(338d452c-3e95-4f10-be98-3eb90b8f412f) -- DELETE /v1.25/containers/518ad874ec9822776e6fbc0424b9b0037f5e285d70991608677f4da9efecba7e?force=1&v=1 BEGIN"
[...]
May 15 18:17:03 hostname podman[262957]: time="2021-05-15T18:17:03+02:00" level=debug msg="APIHandler(a2f15a00-da33-4d7d-828e-25d15e98d1d9) -- GET /v1.25/networks END"

The others do not return.

A lot to unpack.

@thmo
Author

thmo commented Jun 22, 2021

What I see in the logs below is that the containers in question are not getting deleted. At least, the handlers do not finish deletion within the two minutes.

That corresponds to the observation that a bunch of temporary containers created by the CI are not cleaned up (not even after the two minutes) and have to be removed manually.

@vrothberg
Member

May 15 18:17:02 hostname podman[262957]: time="2021-05-15T18:17:02+02:00" level=debug msg="netNames: "podman""

It only shows up 3 times, but I expect to see 5.

@vrothberg
Member

It seems like some logs are not displayed. The "shares network namespace, retrieving network info of container" messages suggest that all 5 network listings ran through.

@vrothberg
Member

It seems like some logs are not displayed. The "shares network namespace, retrieving network info of container" messages suggest that all 5 network listings ran through.

Yet, looking at the logs, the only theory I have come up with so far is that the network listings ran into some deadlock. @mheon @Luap99 is there anything in the networking code that would support this theory? Some functions look to be recursive (e.g., getContainerNetworkInfo()).

@Luap99
Member

Luap99 commented Jun 22, 2021

There is definitely a lot of runtime locking, but I do not see anything obvious where it could deadlock.
However, I noticed that netNsCtr.syncContainer() is called without locking netNsCtr first:

func (c *Container) getContainerNetworkInfo() (*define.InspectNetworkSettings, error) {
	if c.config.NetNsCtr != "" {
		netNsCtr, err := c.runtime.GetContainer(c.config.NetNsCtr)
		if err != nil {
			return nil, err
		}
		// Have to sync to ensure that state is populated
		if err := netNsCtr.syncContainer(); err != nil {
			return nil, err
		}
		logrus.Debugf("Container %s shares network namespace, retrieving network info of container %s", c.ID(), c.config.NetNsCtr)
		return netNsCtr.getContainerNetworkInfo()
	}

According to the comment, syncContainer() should only be called with the container locked. @mheon Could this be the cause?
// Sync this container with on-disk state and runtime status
// Should only be called with container lock held
// This function should suffice to ensure a container's state is accurate and
// it is valid for use.
func (c *Container) syncContainer() error {
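
For illustration, here is a small, self-contained Go sketch of the hazard (the container type and lock are hypothetical stand-ins, not the libpod types): one goroutine follows the documented contract and locks before syncing, the other syncs without the lock, and go run -race reports the resulting data race.

package main

import "sync"

// Hypothetical stand-in for a container whose state is guarded by a lock;
// this is NOT the libpod Container type, only a sketch of the pattern.
type container struct {
	lock  sync.Mutex
	state string
}

// syncContainer refreshes the state; by contract the caller must hold c.lock.
func (c *container) syncContainer() {
	c.state = "refreshed"
}

func main() {
	c := &container{state: "created"}
	var wg sync.WaitGroup
	wg.Add(2)

	// Goroutine A follows the contract: lock, sync, unlock.
	go func() {
		defer wg.Done()
		c.lock.Lock()
		c.syncContainer()
		c.lock.Unlock()
	}()

	// Goroutine B mirrors the call discussed above: it syncs without locking.
	// `go run -race` flags this as a data race with goroutine A.
	go func() {
		defer wg.Done()
		c.syncContainer()
	}()

	wg.Wait()
}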

@mheon
Member

mheon commented Jun 22, 2021

That probably won't deadlock, but it is definitely undefined behavior - could cause really weird results if 2 operations were happening to the container at the same time. Since our API handlers are multithreaded, that could well be happening.

@vrothberg
Member

That probably won't deadlock, but it is definitely undefined behavior - could cause really weird results if 2 operations were happening to the container at the same time. Since our API handlers are multithreaded, that could well be happening.

I think that's the best shot we have so far. This code is executed concurrently (see logs) and can yield undefined behavior, which would match the picture I got from staring at the logs. Maybe the deadlock is a symptom of that undefined behavior/state?

@Luap99, mind opening a fix for that?

@Luap99
Member

Luap99 commented Jun 22, 2021

PR #10754
@thmo Can you build this and test if it solves this issue?

@thmo
Author

thmo commented Jun 22, 2021

I copied this static binary over /usr/bin/podman from Kubic's podman-3.2.1-1.el8.4.1.x86_64; it doesn't change the overall situation (stray containers, hanging podman ps).

podman version 
Version:      3.3.0-dev
API Version:  3.3.0-dev
Go Version:   go1.16.4
Git Commit:   b34f5be344d3b6b9014bf71b8f49d232057277a3-dirty
Built:        Tue Jan  1 01:00:00 1980
OS/Arch:      linux/amd64

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jul 23, 2021

@mheon @vrothberg @Luap99 Looks like this is still an issue.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Aug 23, 2021

@mheon @vrothberg @Luap99 any progress?

@rhatdan rhatdan closed this as completed Oct 25, 2021
@J-Sorenson

Be advised that this bug is a blocker to getting Podman supported by the GitLab Pipeline runner. Is there at least a work-around that could be suggested?
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119

@rhatdan rhatdan reopened this Oct 28, 2021
@rhatdan
Member

rhatdan commented Oct 28, 2021

Can you get us the stack trace? BTW We have had a release in the last month, has anyone tried this on podman 3.4?

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@runiq

runiq commented Nov 29, 2021

Can you get us the stack trace? BTW We have had a release in the last month, has anyone tried this on podman 3.4?

I hope I'll be able to try this today at work.

@2xB

2xB commented Dec 7, 2021

I observed such deadlock-like behavior once recently but couldn't reproduce it. However, I also saw two other issues that might be linked to this while I was using Podman inside a GitLab Runner custom executor:

  • GitLab Runner provided a subfolder of /tmp as TMPDIR; podman used this to build and frequently ran out of memory during committing. I could imagine that, if this happened at an unfortunate moment, it could seemingly freeze. A solution was to execute podman commands with a custom TMPDIR, like TMPDIR=/custom/dir podman build [...].
  • When building with Podman inside a custom GitLab Runner executor, storage fills up quite quickly, since for some reason ghost containers accumulate over time that do not show up in podman container list --external and are therefore not cleaned up by podman container prune, as discussed in rmi: image in use by nonexistent container #12353. As I understand it, this is also what this issue already marks as a potentially related side note. I am not aware of such storage-accumulating behavior in Docker, so this might also be a source of issues when using the Docker executor of GitLab Runner.

@github-actions

github-actions bot commented Jan 7, 2022

A friendly reminder that this issue had no activity for 30 days.

@github-actions

github-actions bot commented Feb 7, 2022

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Feb 7, 2022

@thmo @2xB Still seeing this issue?

@2xB

2xB commented Feb 8, 2022

@rhatdan I haven't seen the deadlock again, but I am also now very aware of the two issues I mentioned above that lead to running out of storage. So if my hypothesis is right that the freeze came from too little storage space, I haven't yet had an opportunity to see it again, since I now frequently run buildah rm --all to remove all accumulated containers, including those "hidden" ones from cancelled podman builds (#12353), and take care that the TMPDIR set by GitLab Runner is large enough or specify a custom TMPDIR. Cleaning containers manually, especially using buildah to ensure that all containers, including those hidden ones from cancelled podman builds, are removed, may be a bit of a home-grown solution. But at least in my use case of a custom GitLab executor running podman build in a known environment, it works nicely.

Now that I think about it, the GitLab Docker executor @thmo is using probably never calls podman build, so the accumulated containers @thmo observes in their side note - which probably also stem from cancelled GitLab CI builds - are probably not the unintuitive-to-clean Buildah containers I observed, and can already be removed by simply executing e.g. podman container prune --force from time to time. That would still need some thought on how to do it with the GitLab Docker executor, though, since as far as I know nothing equivalent is needed for Docker.

@thmo
Author

thmo commented Feb 21, 2022

@thmo @2xB Still seeing this issue?

Yes, at least with podman-3.4.1-3.module_el8.6.0+954+963caf36.x86_64 (waiting for 4.0 on c8s).

Attached is a SIGQUIT dump from the podman system service process in the situation where podman ps hangs, in the hope that this will be helpful.

(NB: I am pretty sure I do not have a disk-full problem.)

@Luap99
Member

Luap99 commented Feb 22, 2022

If this was really an issue with the network endpoints, then I am confident this is fixed in 4.0.

However, if I interpret your dump correctly, goroutine 4331 [syscall, 9 minutes, locked to thread] is waiting for a container lock, so maybe it is not network related.
This one, goroutine 4076 [semacquire, 9 minutes], also hangs waiting for a container during a container rm call.

This part is suspicious: it hangs in container_inspect because it is waiting for the runtime lock.

Feb 21 23:33:19 thehostname podman[53872]: goroutine 4405 [semacquire, 9 minutes, locked to thread]:
Feb 21 23:33:19 thehostname podman[53872]: runtime.gopark(0x563c854413d0, 0x563c860a2020, 0xc002591912, 0x4)
Feb 21 23:33:19 thehostname podman[53872]:         /usr/lib/golang/src/runtime/proc.go:336 +0xe6 fp=0xc000bcc4d0 sp=0xc000bcc4b0 pc=0x563c837b8166
Feb 21 23:33:19 thehostname podman[53872]: runtime.goparkunlock(...)
Feb 21 23:33:19 thehostname podman[53872]:         /usr/lib/golang/src/runtime/proc.go:342
Feb 21 23:33:19 thehostname podman[53872]: runtime.semacquire1(0xc000134dc8, 0x0, 0x3, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /usr/lib/golang/src/runtime/sema.go:144 +0x1a7 fp=0xc000bcc530 sp=0xc000bcc4d0 pc=0x563c837ca007
Feb 21 23:33:19 thehostname podman[53872]: sync.runtime_SemacquireMutex(0xc000134dc8, 0x203000, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /usr/lib/golang/src/runtime/sema.go:71 +0x49 fp=0xc000bcc560 sp=0xc000bcc530 pc=0x563c837e9ce9
Feb 21 23:33:19 thehostname podman[53872]: sync.(*RWMutex).RLock(...)
Feb 21 23:33:19 thehostname podman[53872]:         /usr/lib/golang/src/sync/rwmutex.go:63
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/libpod.(*Runtime).GetContainer(0xc000134c40, 0xc000ac5500, 0x40, 0x0, 0x0, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/libpod/runtime_ctr.go:902 +0x14e fp=0xc000bcc5b8 sp=0xc000bcc560 pc=0x563c848a20ee
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/libpod.(*Container).getContainerNetworkInfo(0xc000c83130, 0x0, 0x0, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/libpod/networking_linux.go:1006 +0x9f fp=0xc000bcc9b8 sp=0xc000bcc5b8 pc=0x563c8485ef5f
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/libpod.(*Container).getContainerInspectData(0xc000c83130, 0xc000142800, 0xc000461140, 0x40, 0xc000461140, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/libpod/container_inspect.go:163 +0xb4f fp=0xc000bcccc8 sp=0xc000bcc9b8 pc=0x563c84808b0f
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/libpod.(*Container).inspectLocked(0xc000c83130, 0x0, 0x0, 0x37ad1ec09debdee, 0xc001776828)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/libpod/container_inspect.go:36 +0x21d fp=0xc000bccd58 sp=0xc000bcccc8 pc=0x563c84807c7d
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/libpod.(*Container).Inspect(0xc000c83130, 0xc00093ad00, 0x0, 0x0, 0x0)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/libpod/container_inspect.go:50 +0x65 fp=0xc000bccda8 sp=0xc000bccd58 pc=0x563c84807ec5
Feb 21 23:33:19 thehostname podman[53872]: github.com/containers/podman/pkg/api/handlers/compat.getNetworkResourceByNameOrID(0xc000bdcca0, 0x6, 0xc000134c40, 0xc000a21dd0, 0xc000bcd288, 0x1, 0x1)
Feb 21 23:33:19 thehostname podman[53872]:         /builddir/build/BUILD/containers-podman-c15c154/_build/src/github.com/containers/podman/pkg/api/handlers/compat/networks.go:130 +0x4d3 fp=0xc000bcd1c8 sp=0xc000bccda8

Because it is already inside inspect, it should hold a container lock. Maybe an ABBA deadlock between a container lock and the runtime lock?
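
To make the suspected pattern concrete, here is a stand-alone Go sketch of an ABBA deadlock (plain sync primitives, not the real libpod locks): one goroutine takes the container lock and then wants the runtime lock, the other takes them in the opposite order, and both park forever; the Go runtime aborts with "all goroutines are asleep - deadlock!".

package main

import (
	"sync"
	"time"
)

func main() {
	// Stand-ins for the two locks discussed above; not the libpod locks.
	var containerLock, runtimeLock sync.Mutex
	var wg sync.WaitGroup
	wg.Add(2)

	// Like inspect: holds the container lock, then needs the runtime lock.
	go func() {
		defer wg.Done()
		containerLock.Lock()
		defer containerLock.Unlock()
		time.Sleep(100 * time.Millisecond) // let the other goroutine grab its lock first
		runtimeLock.Lock()                 // blocks forever
		defer runtimeLock.Unlock()
	}()

	// Like a concurrent rm: holds the runtime lock, then needs the container lock.
	go func() {
		defer wg.Done()
		runtimeLock.Lock()
		defer runtimeLock.Unlock()
		time.Sleep(100 * time.Millisecond)
		containerLock.Lock() // blocks forever
		defer containerLock.Unlock()
	}()

	// Never returns; the runtime detects that all goroutines are blocked.
	wg.Wait()
}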

@mheon
Member

mheon commented Feb 22, 2022

Honestly, the runtime in-memory lock doesn't really protect us against anything, given that it's purely a per-process lock and so much of what we want to protect against is multiprocess. I think we could tear it out completely without negative consequence.
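
For illustration, a stand-alone Go sketch of the difference (hypothetical lock-file path, not libpod code): a sync.Mutex only serializes goroutines within one process, while an advisory flock(2) on a shared file also serializes separate processes; run two copies of this program and the second blocks until the first releases the lock.

package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	// Hypothetical lock file used only for this sketch.
	f, err := os.OpenFile("/tmp/podman-issue-10090-demo.lock", os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Unlike a per-process sync.Mutex, this advisory lock is visible to other
	// processes: a second instance of this program blocks here until the
	// first one unlocks or exits.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
		panic(err)
	}
	fmt.Println("holding the cross-process lock; press Enter to release it")
	fmt.Scanln()

	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_UN); err != nil {
		panic(err)
	}
}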

@rhatdan
Member

rhatdan commented Feb 22, 2022

Then tear it out.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@Luap99
Member

Luap99 commented Mar 25, 2022

Given that @mheon removed the runtime lock (#13311) I would think this is fixed.
Can anyone try this with the main branch?

@Luap99 Luap99 closed this as completed Mar 25, 2022
@thmo
Author

thmo commented Mar 25, 2022

In which release is this going to be?

@rhatdan
Member

rhatdan commented Mar 25, 2022

Definitely 4.1

@github-actions github-actions bot added the locked - please file new issue/PR label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023