
buildah images very slow #3369

Closed
kwshi opened this issue Jul 9, 2021 · 24 comments

@kwshi
Contributor

kwshi commented Jul 9, 2021

Description

My containers store holds a lot of large images: ~40-50 tagged images totaling ~1TB. For some reason, this causes buildah images (as well as podman image ls) to run very slowly:

> time buildah images
... output ...
real	1m33.037s
user	2m23.364s
sys	0m4.053s

I'm not sure what the exact cause is, but it seems abnormal for the performance of merely listing images to deteriorate at sizes as small as ~50 images and ~1TB. Perhaps I'm doing something wrong?

Steps to reproduce the issue:

  1. Have a lot of images, maybe?
  2. time buildah images or time podman image ls (a rough sketch of both steps is shown below)
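
A rough sketch of those two steps (the image names here are arbitrary examples; this only mimics the shape of the report, not the ~1TB of data):

$ # step 1: populate the store with a number of images
$ for img in alpine busybox fedora ubuntu debian; do podman pull docker.io/library/$img; done
$ # step 2: time the listing with both tools
$ time buildah images > /dev/null
$ time podman image ls > /dev/null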

Describe the results you received:
Listing images takes ~2 minutes.

> time buildah images
... output ...
real	1m33.037s
user	2m23.364s
sys	0m4.053s

(Perhaps relatedly, podman system df is also kind of slow, but much less so--only takes about 15s.)

Describe the results you expected:
Listing images should happen approximately instantly.

Output of rpm -q buildah or apt list buildah (here: xbps-query buildah):

architecture: x86_64
changelog: https://github.com/containers/buildah/blob/master/CHANGELOG.md
filename-sha256: 25806183e98deae0653cf8c2faa61e61426343e5cb50fb3ca6f3088e61d2d6a3
filename-size: 13MB
homepage: https://github.com/containers/buildah
installed_size: 29MB
license: Apache-2.0
maintainer: Cameron Nemo <[email protected]>
pkgname: buildah
pkgver: buildah-1.20.1_1
repository: https://alpha.us.repo.voidlinux.org/current
run_depends:
	runc>=0
	containers.image>=0
	glibc>=2.32_1
	libgpgme>=1.12.0_2
	libassuan>=2.0.1_1
	libgpg-error>=1.6_1
	device-mapper>=2.02.110_1
shlib-requires:
	libpthread.so.0
	libgpgme.so.11
	libassuan.so.0
	libgpg-error.so.0
	libdl.so.2
	libdevmapper.so.1.02
	libc.so.6
short_desc: Dockerfile compatible OCI image building tool
source-revisions: buildah:b177108fa1

Output of buildah version:

Version:         1.20.1
Go Version:      go1.16.3
Image Spec:      1.0.1-dev
Runtime Spec:    1.0.2-dev
CNI Spec:        0.4.0
libcni Version:
image Version:   5.10.5
Git Commit:
Built:           Wed Dec 31 16:00:00 1969
OS/Arch:         linux/amd64

Output of podman version if reporting a podman build issue:

Version:      3.2.1
API Version:  3.2.1
Go Version:   go1.16.5
Built:        Wed Dec 31 16:00:00 1969
OS/Arch:      linux/amd64

Output of cat /etc/*release:

NAME="void"
ID="void"
DISTRIB_ID="void"
PRETTY_NAME="void"

Output of uname -a:

Linux vic 5.12.14_1 #1 SMP 1625102948 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf: no conf, using defaults. Here's buildah info instead:

{
    "host": {
        "CgroupVersion": "v1",
        "Distribution": {
            "distribution": "\"void\"",
            "version": "unknown"
        },
        "MemFree": 58960666624,
        "MemTotal": 67415769088,
        "OCIRuntime": "runc",
        "SwapFree": 0,
        "SwapTotal": 0,
        "arch": "amd64",
        "cpus": 12,
        "hostname": "vic",
        "kernel": "5.12.14_1",
        "os": "linux",
        "rootless": true,
        "uptime": "27m 38.37s"
    },
    "store": {
        "ContainerStore": {
            "number": 1
        },
        "GraphDriverName": "overlay",
        "GraphOptions": [
            "overlay.mount_program=/usr/bin/fuse-overlayfs"
        ],
        "GraphRoot": "/home/kshi/.local/share/containers/storage",
        "GraphStatus": {
            "Backing Filesystem": "extfs",
            "Native Overlay Diff": "false",
            "Supports d_type": "true",
            "Using metacopy": "false"
        },
        "ImageStore": {
            "number": 176
        },
        "RunRoot": "/var/tmp/containers-user-1000/containers"
    }
}
@rhatdan
Member

rhatdan commented Jul 9, 2021

@vrothberg PTAL
The only thing I can think of is that we are looking at size?

@kwshi
Contributor Author

kwshi commented Jul 9, 2021

For reference, here is the actual output of buildah images: http://ix.io/3stl

@vrothberg
Member

Thanks for reaching out. Starting with v1.21.0, Buildah uses a new backend for managing images which is much faster.

Can you run time podman images and share the results? Podman v3.2.x should be much faster.

@kwshi
Contributor Author

kwshi commented Jul 10, 2021

Side by side:

> time buildah images
...
real	1m23.743s
user	2m8.243s
sys	0m3.705s

> time podman images
...
real	0m35.534s
user	1m4.349s
sys	0m2.419s

It is indeed substantially fast-er, but overall still quite a bit slower than preferable. Is it going through image contents in order to compute this metadata listing?

@kwshi
Contributor Author

kwshi commented Jul 10, 2021

The weird thing is that this behavior isn't consistent: sometimes, after messing around with buildah enough (not sure what specifically--a couple of buildah images runs, then maybe making a container, mounting it, doing some random stuff), it suddenly becomes fast:

real	0m1.032s
user	0m0.943s
sys	0m0.479s

As if there's a cache that needs to get warmed up...
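
One generic way to test that warm-cache hypothesis (my suggestion, not something reported in this thread) is to time a warm run and then force the kernel to drop its page cache before timing again:

$ time buildah images > /dev/null                        # warm run
$ sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'   # drop page/dentry/inode caches
$ time buildah images > /dev/null                        # should be slow again if it is purely a cache effect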

@vrothberg
Member

Thanks for providing the data, @kwshi!

Is it going through image contents in order to compute this metadata listing?

Yes, mostly for computing the size of an image. The remainder should be fast.

It is indeed substantially fast-er, but overall still quite a bit slower than preferable.

Your observation aligns with the ones made in containers/podman#6288. However, we never managed to get a reliable reproducer to figure out why image listing is suddenly slow. Ultimately, it seems more like an I/O performance issue outside of Buildah/Podman.
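
If size computation is indeed the dominant cost, a rough way to check (assuming sizes are only computed when the output format actually asks for them, which may not hold on every version) is to compare the default listing against a --format that never references the size:

$ time podman images > /dev/null
$ time podman images --format '{{.Repository}}:{{.Tag}} {{.ID}}' > /dev/null   # no {{.Size}} requested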

@vrothberg
Member

That being said: unless we know what causes the I/O degradation, I do not know what to do. A reliable reproducer would be ideal.

@kwshi
Contributor Author

kwshi commented Jul 20, 2021

Update: I still don't have a reliable reproducer for the bad IO, but I've noticed that one seemingly reliable way to suddenly speed image listing up is to buildah unshare, buildah from, and buildah mount some random image, then unmount and remove. After doing that, buildah images is suddenly much faster.
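
For concreteness, the workaround described above looks roughly like this (the image name is just an arbitrary placeholder):

$ buildah unshare                                  # enter the rootless user namespace
# ctr=$(buildah from docker.io/library/alpine)     # create a working container from some image
# buildah mount "$ctr"                             # mount it ...
# buildah umount "$ctr"                            # ... then unmount it
# buildah rm "$ctr"                                # and remove the container
# exit
$ time buildah images > /dev/null                  # reportedly much faster afterwards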

@vrothberg
Member

@rhatdan, could you talk with Vivek about this issue? There seems to be a pattern, but we haven't been able to pin it down yet.

@rhatdan
Member

rhatdan commented Jul 21, 2021

@rhvgoyal WDYT?

@rhvgoyal

I am assuming you are using stat()/fstat() to get the file size, so that should not slow down significantly because of file size.

There is a small possibility that the cloud provider is doing some I/O throttling. What else is going on on the host? If you have done a lot of I/O already and have run out of credits, then the cloud provider might throttle and delay I/O.

Or it is possible that this command is waiting on some lock, and that lock is held by another code path which is doing I/O and has been throttled. So effectively the simple action of listing images is blocked.

I think one needs to patch the buildah/podman code, see what we are waiting on, and narrow it down a bit first.
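
Short of patching the code, a non-invasive first pass (my suggestion, not something anyone in this thread ran) is to let strace summarize where the wall-clock time goes, e.g. whether it is spent in file-metadata syscalls or blocked in futex/lock waits:

$ strace -f -c buildah images > /dev/null     # per-syscall time summary (printed to stderr on exit)
$ strace -f -T -e trace=openat,newfstatat,read,flock,fcntl buildah images > /dev/null   # -T shows time spent in each call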

@kwshi
Contributor Author

kwshi commented Jul 21, 2021

There is a small possibility that cloud provider is doing some I/O throttling. I mean what else is going on host? If you have done lot of I/O already and have run out of credits, then cloud provider might throttle and delay I/O.

I'll add that, in my own situation at least, the container store is simply located on a separate local partition on an SSD, so cloud-provider throttling shouldn't be a factor.
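
Two generic checks that can help rule local I/O in or out (again my suggestion, not from this thread): confirm which device actually backs the graph root shown in the buildah info output above, and watch device utilization while a slow listing runs:

$ findmnt --target ~/.local/share/containers/storage   # which partition/device backs the graph root
$ iostat -x 1                                          # run in another terminal during a slow 'buildah images'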

@septatrix

Your observation aligns with the ones made in containers/podman#6288. However, we never managed to get a reliable reproducer to figure out why image listing is suddenly slow. Ultimately, it seems more like an I/O performance issue outside of Buildah/Podman.

I do not know what counts as slow today, but I consistently get ~7s real time for 50 images (~12s user time). Using --all shaves off ~1s real (~3s user).

@vrothberg
Member

@septatrix, which version of Buildah are you running?

@vrothberg
Member

vrothberg commented Aug 3, 2021

@edsantiago, FYI. Maybe something to look out for. If you ever find something that may contribute to slow(er) storage access, feel free to drop it here.

@vrothberg
Member

One thing I notice is that Buildah is surprisingly faster (sometimes half the time) than Podman (reproducer: time buildah images > /dev/null; time podman images > /dev/null). I have 93 images in the storage (did some podman pull --all on alpine, busybox and fedora).

@septatrix

septatrix commented Aug 3, 2021

@septatrix, which version of Buildah are you running?

I should have been more precise. My timings are for podman and not for buildah. (And I am using podman 3.2.3)

@github-actions

github-actions bot commented Sep 3, 2021

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Sep 7, 2021

@kwshi We believe this issue is solved; reopen if I am mistaken.

@rhatdan rhatdan closed this as completed Sep 7, 2021
@vrothberg
Member

I don't believe this is fixed. See my previous comment (#3369 (comment)): there are still things we need to investigate.

@vrothberg vrothberg reopened this Sep 8, 2021
@kwshi
Contributor Author

kwshi commented Sep 8, 2021

I agree that this is not yet fixed; I just downloaded the latest release (v1.22.3) and am still having the same problem. I am still working on finding a reliable reproducer/pinning down the exact cause of the slowness.

@vrothberg
Member

I found the cause (Podman issue containers/podman#11997) and am working on a fix.

@vrothberg
Member

Here is a fix: containers/common#811

@vrothberg
Member

Closing, will be implicitly fixed with the next vendor of c/common.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 3, 2023