REST API is failing with errors when listing containers after being in an inconsistent state #15526

benoitf · 2022-08-29T08:37:38Z

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

After starting/stopping/deleting containers, now I'm in an inconsistent state
When listing containers, I've the error error getting container from store

curl --unix-socket /Users/benoitf/.local/share/containers/podman/machine/podman-machine-default/podman.sock "http:/v1.41/containers/json?all=true"
{"cause":"container not known","message":"error getting container from store \"82b69ee0d0c46770aa7843332fc40a6e109d8a734bf5471411a12bb1efdfd2f1\": container not known","response":500}

Steps to reproduce the issue:

I don't know how to reproduce but it was just by doing start, stop and delete on containers and pods.

Note: Using a UI, I'm sending multiple events at the same time, so it means, start/stop/delete actions are occurring concurrently

Describe the results you received:
Error

Describe the results you expected:
No error

Additional information you deem important (e.g. issue happens only occasionally):

while the REST API is not working (throwing error)
I've podman container ps -a working

$ podman container ps -a
CONTAINER ID  IMAGE                                    COMMAND     CREATED     STATUS         PORTS       NAMES
82b69ee0d0c4  localhost/podman-pause:4.2.0-1660228937              3 days ago  Removing                   6af38dfe23d8-infra
ba85fdf27813  docker.io/library/mariadb:10             mariadbd    3 days ago  Up 3 days ago              mariadb

and if I try to inspect the infra container, I've:

$ podman container inspect 82b69ee0d0c4
Error: error getting container from store "82b69ee0d0c46770aa7843332fc40a6e109d8a734bf5471411a12bb1efdfd2f1": container not known

Output of podman version:

(paste your output here)

Output of podman info:

(paste your output here)

Package info (e.g. output of rpm -q podman or apt list podman):

(paste your output here)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes/No

Additional environment details (AWS, VirtualBox, physical, etc.):

The text was updated successfully, but these errors were encountered:

benoitf · 2022-08-29T12:12:17Z

Here is a small script to reproduce the issue on macOS

Create 10 pods
Remove containers (without force)
Remove pods (without force)

At the end my podman machine is not working

$ podman pod ls
POD ID        NAME        STATUS      CREATED        INFRA ID      # OF CONTAINERS
9f23485e890c  apps-10     Error       3 minutes ago  b4bd995ced93  2
bbf8b24c9e8d  apps-9      Degraded    3 minutes ago  7d41d7241dc0  2
1ded17911434  apps-8      Degraded    3 minutes ago  c0b65da446f2  2
f17257ff14b9  apps-7      Degraded    3 minutes ago  40e66b4075bc  2
e24df32659c9  apps-6      Degraded    3 minutes ago  54dc373ac387  2
754ce615e62f  apps-5      Degraded    3 minutes ago  05fead96e1fb  2
c8aedbd82332  apps-4      Degraded    3 minutes ago  26e328c5fe65  2
69cfdaf4ec3c  apps-3      Degraded    3 minutes ago  1c324f056286  2
14256d96bb35  apps-2      Degraded    3 minutes ago  8ec46163d487  2
5fdab12b52a5  apps-1      Degraded    3 minutes ago  638aa7515ea1  2

$ podman pod rm bbf8b24c9e8d
Error: error freeing lock for container 7d41d7241dc077ef9391ca8bf1658921543ad143f3086dd4fef4e961dbf522bf: no such file or directory

$ podman ps -a
CONTAINER ID  IMAGE                                    COMMAND     CREATED        STATUS            PORTS       NAMES
638aa7515ea1  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      5fdab12b52a5-infra
8ec46163d487  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      14256d96bb35-infra
240c931822f8  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb2
1c324f056286  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      69cfdaf4ec3c-infra
97e0a49fc072  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb3
26e328c5fe65  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      c8aedbd82332-infra
f6bc901555c6  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb4
05fead96e1fb  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      754ce615e62f-infra
c3ea227919ed  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb5
54dc373ac387  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      e24df32659c9-infra
a4f787757474  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb6
40e66b4075bc  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      f17257ff14b9-infra
7742263ebfd0  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb7
c0b65da446f2  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      1ded17911434-infra
7cc87b50b6c8  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Up 4 minutes ago              mariadb8
7d41d7241dc0  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      bbf8b24c9e8d-infra
b4bd995ced93  localhost/podman-pause:4.2.0-1660228937              4 minutes ago  Removing                      9f23485e890c-infra
118b13b4fb3d  docker.io/library/mariadb:10             mariadbd    4 minutes ago  Removing                      mariadb10

podman inspect 638aa7515ea1
Error: error getting container from store "638aa7515ea15cd1b987df24dcdbb0699a6b3f6da0be5842c4d48ce3a119e810": container not known

Here is the script

#!/bin/bash
for i in {1..10}
do
  podman run --name "mariadb${i}" --pod "new:apps-${i}" -e MYSQL_RANDOM_ROOT_PASSWORD=yes -d docker.io/library/mariadb:10
done

# try to remove all containers
all_containers=$(podman ps -a -q)
for containerId in $all_containers
do
  podman rm "${containerId}"
done

# remove all pods
all_pods=$(podman pod ls -q)
for podId in $all_pods
do
  podman pod rm "${podId}"
done


# now, list all containers calling REST API
echo "Call REST API"
curl --unix-socket "$HOME/.local/share/containers/podman/machine/podman-machine-default/podman.sock" "http:/v1.41/containers/json?all=true"

At the end it should display: container not known error when trying to list all the containers

mheon · 2022-08-29T12:44:26Z

We've handled this race condition already in CLI podman ps, but evidently the fix did not make it into the API. Should not be difficult, just need to ignore errors for containers that do not exist and exclude them from the output.

benoitf · 2022-08-29T13:03:26Z

We've handled this race condition already in CLI podman ps, but evidently the fix did not make it into the API.

I can also reproduce with one pod using instructions in a shell sequentially.

$ podman run --name mariadb --pod new:apps -e MYSQL_RANDOM_ROOT_PASSWORD=yes -d mariadb:10

$ podman ps
CONTAINER ID  IMAGE                                    COMMAND     CREATED        STATUS            PORTS       NAMES
785f98902bd4  localhost/podman-pause:4.2.0-1660228937              3 seconds ago  Up 4 seconds ago              979f1ab2c5fe-infra
558ddb906df1  docker.io/library/mariadb:10             mariadbd    3 seconds ago  Up 4 seconds ago              mariadb

$ podman pod ps
POD ID        NAME        STATUS      CREATED         INFRA ID      # OF CONTAINERS
979f1ab2c5fe  apps        Running     28 seconds ago  785f98902bd4  2

$ podman pod rm 979f1ab2c5fe
Error: cannot remove container 558ddb906df1aafe541b2f8180e3204bdd2bd20f76de8200d94ec4763eb76d26 as it is running - running or paused containers cannot be removed without force: container state improper

$ podman pod rm -f 979f1ab2c5fe
Error: error freeing lock for container 785f98902bd480bb8d4fd08593a802f58f0cabd80cae7d0a56aa442a3728f601: no such file or directory

Now, everything is broken

mheon · 2022-08-29T13:58:32Z

Removing good first issue and self-assigning, that seems very serious.

edsantiago · 2022-08-29T16:00:52Z

Looks like #15367

mheon · 2022-08-29T17:38:49Z

Probably unrelated @edsantiago - no pods involved there.

Remote podman pod rm -f is removing the infra container, but not any other containers...

mheon · 2022-08-29T17:40:01Z

It's removing the infra container despite dependencies on it being present. Serious bug, possibly present in non-remote Podman.

mheon · 2022-08-29T17:43:33Z

Alright, identified the cause. It's 384c235

Container removal is unordered and normal checks to make sure that dependency containers and the infra container are not removed until the pod is removed are not enforced as we are attempting to remove the pod.

Solution here is probably not fun. Going to need to restructure pod removal to work in a graph-traversal fashion.

krystalcode · 2022-09-12T04:55:35Z

@mheon in reference to #15740 , is there something that I can do to remove the pod, or do I have to wait until the release that will contain the fix?

benoitf · 2022-09-12T11:36:14Z

my only workaround is to call podman system reset twice (loosing everything) and restarting podman

mheon · 2022-09-12T13:02:57Z

It is possible that a podman system renumber after a podman pod rm may bring things back into a sane state, but I have not personally verified this.

krystalcode · 2022-09-12T15:43:03Z

podman system renumber worked in my case, thanks.

mheon · 2022-09-13T19:18:07Z

#15757 should fix, but testing would be appreciated.

Originally, during pod removal, we locked every container in the pod at once, did a number of validity checks to ensure everything was safe, and then removed all the containers in the pod. A deadlock was recently discovered with this approach. In brief, we cannot lock the entire pod (or much more than a single container at a time) without causing a deadlock. As such, we converted to an approach where we just looped over each container in the pod, removing them individually. Unfortunately, this removed a lot of the validity checking of the earlier approach, allowing for a lot of unintended bad things. Infra containers could be removed while containers in the pod still depended on them, for example. There's no easy way to do validity checks while in a simple loop, so I implemented a version of our graph-traversal logic that currently handles pod start. This version acts in the reverse order of startup: startup starts from containers which depend on nothing and moves outwards, while removal acts on containers which have nothing depend on them and moves inwards. By doing graph traversal, we can guarantee that nothing is removed while something that depends on it still exists - so the infra container should be the last thing in a pod that is removed, for example. In the (unlikely) case that a graph of the pod's containers cannot be built (most likely impossible without database editing) the old method of pod removal has been retained to ensure that even misbehaving pods can be forcibly evicted from the state. I'm fairly confident that this resolves the problem, but there are a lot of assumptions around dependency structure built into the original pod removal code and I am not 100% sure I have captured all of them. Fixes containers#15526 Signed-off-by: Matthew Heon <[email protected]>

djnotes · 2022-10-05T13:42:47Z

Is this fully solved?
I'm still getting empty list of containers on Windows with the latest pre-release version.
Tried Reload and Force Reload with no success.

main ↪️ PluginSystem: received dom-ready event from the UI
2index.ts:835 main ↪️ error in engine Podman Error: (HTTP code 500) server error - error getting container from store "8c8946fd410672133bb499554bc401b6ad6c395920ed9119b147a9f018b58e2a": container not known 
    at C:\Users\user\AppData\Local\Programs\podman-desktop\resources\app.asar\packages\main\dist\index.js:20:172262
    at c (C:\Users\user\AppData\Local\Programs\podman-desktop\resources\app.asar\packages\main\dist\index.js:20:172593)
    at ha.buildPayload (C:\Users\user\AppData\Local\Programs\podman-desktop\resources\app.asar\packages\main\dist\index.js:20:172234)
    at IncomingMessage.<anonymous> (C:\Users\user\AppData\Local\Programs\podman-desktop\resources\app.asar\packages\main\dist\index.js:20:171784)
    at IncomingMessage.emit (node:events:539:35)
    at endReadableNT (node:internal/streams/readable:1345:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:83:21)

openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 29, 2022

benoitf added the podman-desktop label Aug 29, 2022

mheon added Good First Issue This issue would be a good issue for a first time contributor to undertake. HTTP API Bug is in RESTful API labels Aug 29, 2022

benoitf added the pods label Aug 29, 2022

mheon removed the Good First Issue This issue would be a good issue for a first time contributor to undertake. label Aug 29, 2022

benoitf mentioned this issue Aug 29, 2022

Containers list is empty podman-desktop/podman-desktop#429

Closed

pohlt mentioned this issue Sep 5, 2022

Interrupting a podman pod rm -f PODNAME corrupts pod/container database #15629

Closed

mheon self-assigned this Sep 6, 2022

mheon mentioned this issue Sep 12, 2022

Pod stuck in error/removing state #15740

Closed

mheon mentioned this issue Sep 12, 2022

Introduce graph-based pod container removal #15757

Merged

openshift-merge-robot closed this as completed in #15757 Sep 15, 2022

mheon mentioned this issue Sep 19, 2022

Cannot remove pod after force remove of container within it #15847

Closed

This was referenced Sep 20, 2022

Error: error freeing lock for container podman-desktop/podman-desktop#483

Closed

RFE: Expose system reset to the API / remote CLI #15886

Closed

MemoryShadow mentioned this issue Oct 15, 2022

When running "podman pod rm -f " an error message: Error: error freeing lock for container xxxx no such file or directory #16187

Closed

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 13, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

REST API is failing with errors when listing containers after being in an inconsistent state #15526

REST API is failing with errors when listing containers after being in an inconsistent state #15526

benoitf commented Aug 29, 2022 •

edited

Loading

benoitf commented Aug 29, 2022

mheon commented Aug 29, 2022

benoitf commented Aug 29, 2022 •

edited

Loading

mheon commented Aug 29, 2022

edsantiago commented Aug 29, 2022

mheon commented Aug 29, 2022

mheon commented Aug 29, 2022

mheon commented Aug 29, 2022

krystalcode commented Sep 12, 2022

benoitf commented Sep 12, 2022

mheon commented Sep 12, 2022

krystalcode commented Sep 12, 2022

mheon commented Sep 13, 2022

djnotes commented Oct 5, 2022 •

edited

Loading

REST API is failing with errors when listing containers after being in an inconsistent state #15526

REST API is failing with errors when listing containers after being in an inconsistent state #15526

Comments

benoitf commented Aug 29, 2022 • edited Loading

benoitf commented Aug 29, 2022

mheon commented Aug 29, 2022

benoitf commented Aug 29, 2022 • edited Loading

mheon commented Aug 29, 2022

edsantiago commented Aug 29, 2022

mheon commented Aug 29, 2022

mheon commented Aug 29, 2022

mheon commented Aug 29, 2022

krystalcode commented Sep 12, 2022

benoitf commented Sep 12, 2022

mheon commented Sep 12, 2022

krystalcode commented Sep 12, 2022

mheon commented Sep 13, 2022

djnotes commented Oct 5, 2022 • edited Loading

benoitf commented Aug 29, 2022 •

edited

Loading

benoitf commented Aug 29, 2022 •

edited

Loading

djnotes commented Oct 5, 2022 •

edited

Loading