`build_mapping_recurse` is very slow, causes ostree container builds to take a long time
#4880
Comments
Here's something fun:
I filed this bug for the "does Budgie really need all those icons?" side of this.
See #4768
No,
Ahh, I see.
A test run of an Onyx container build with #4768 applied completed in 13 minutes(!) in a mock chroot on my own system. I haven't yet run a build without the patch to completion locally, but it was definitely going to take a lot longer than that. I've started one running overnight and will see how long it took when I get up tomorrow, but that PR is definitely a huge improvement.
time to build an Onyx ostree container (in a mock chroot on my laptop) with #4768:
without it:
While toying around with building my own custom FCOS builds, I noticed that running `cosa build container` with a package set similar to Silverblue's resulted in ~2hr builds, the vast majority of which was in the "Building package mapping" task. After this change, the runtime on my build shrank to ~15 mins.

`$ time cosa build container`

**Before**

```
real 10m47.769s
user 52m14.763s
sys 46m38.546s
```

**After**

```
real 15m37.333s
user 2m38.751s
sys 0m14.410s
```

The speedup is accomplished by avoiding the need to query the rpmdb for every file. Instead the rpmdb is walked to build a cache of the files to providing packages, so that when the ostree filesystem is walked later it can just check the cache.

The cache is structured similarly to rpm's internals, where paths are maintained as separate basename and dirname entries. Additionally, like rpm, the paths are considered equivalent if the dirnames resolve to the same path (rpm uses `stat` to compare inodes, this implementation resolves the symlinks). This results in output that is effectively equivalent to the previous implementation while being substantially faster.

To minimize memory overhead maintaining the file mapping, a simple string cache is also added.

Closes: coreos#4880
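The caching approach described in that commit message can be sketched roughly as follows. This is a minimal Python illustration, not the actual Rust change: the class `FileOwnerCache` and its methods are hypothetical names, the package file lists are passed in by hand rather than read from a real rpmdb, and `sys.intern` stands in for the "simple string cache". It keeps dirname and basename as separate keys and treats two paths as equivalent when their dirnames resolve to the same directory.

```python
import os
import sys
from collections import defaultdict


class FileOwnerCache:
    """Hypothetical cache: (resolved dirname, basename) -> owning package names."""

    def __init__(self, root):
        self.root = os.path.realpath(root)
        self._resolved = {}  # dirname -> resolved dirname, memoized
        self._owners = defaultdict(lambda: defaultdict(set))

    def _resolve_dir(self, dirname):
        # Resolve symlinks in the directory part only, relative to self.root.
        # Caveat (assumption of this sketch): absolute symlink targets would be
        # resolved against the host "/", which real code must handle.
        hit = self._resolved.get(dirname)
        if hit is None:
            full = os.path.realpath(os.path.join(self.root, dirname.lstrip("/")))
            hit = sys.intern(full)  # interning avoids storing duplicate strings
            self._resolved[dirname] = hit
        return hit

    def add(self, package, path):
        # Called once per (package, file) while walking the package database.
        dirname, basename = os.path.split(path)
        names = self._owners[self._resolve_dir(dirname)]
        names[sys.intern(basename)].add(sys.intern(package))

    def owners(self, path):
        # Called while walking the filesystem; no database query needed.
        dirname, basename = os.path.split(path)
        names = self._owners.get(self._resolve_dir(dirname))
        return set(names.get(basename, ())) if names else set()
```

With a tree where `bin` is a relative symlink to `usr/bin`, a file registered as `/usr/bin/bash` is also found when looked up as `/bin/bash`, because both dirnames resolve to the same directory.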
Describe the bug
This function seems to be a single-threaded, non-optimized recursive walk of the entire filesystem which checksums every single file it hits, and inexplicably checksums some of them twice. Is it really needed? If so, can it be optimized?
Reproduction steps
Expected behavior
A much shorter period of sad screen staring
Actual behavior
Literal hours of it, in the case of Onyx (which seems to have an inordinately large number of icons that bog it down for ages)
System details
Additional information
@nirik and I both noticed that the slowest thing in Fedora composes currently is the ostree_container phase. In today's F40 compose it took 10704 seconds (just about exactly three hours). In today's Rawhide it took 11562 seconds. No other phase in Rawhide took more than 4098 seconds (image_build).
Poking into it a bit more, we noticed that the Onyx container seems to be an outlier; it takes around an hour and a half longer than the one that takes the second longest (Kinoite). Running it interactively, we found it spends a very long time at a point logged as "Building package mapping...". I dug into the code a bit and found that this is just running the `build_mapping_recurse` function (which seems to be copylib'ed from ostree-rs-ext, kinda, although the versions are somewhat different). This function seems to be a single-threaded, non-optimized recursive walk of the entire filesystem which checksums every single file it hits, and inexplicably checksums some of them twice, here:
Why do we redo the checksum in the 'occupied' case, there? We just did it!
Anyhow, this seems like a pretty slow thing to have holding up all our composes. Is there any way this function could be optimized, reduced, nuked from orbit, or all of the above? I can't quite figure out why we need a mapping of file checksums to owning packages, anyway, and the commit message from what seems to be the initial version of this thing, over in ostree-rs-ext, doesn't really enlighten me.
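For illustration, the kind of walk described above can be written so that each file is checksummed exactly once, with the digest reused whether or not the map already has an entry for it. This is a minimal Python sketch, not the actual Rust function: `checksum_file` and `build_checksum_index` are hypothetical names, and SHA-256 over file contents is an assumed stand-in for whatever checksum the real code computes.

```python
import hashlib
import os


def checksum_file(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks (SHA-256 is an assumption of this sketch)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_checksum_index(root):
    """Walk root and map each content digest to the set of relative paths having it."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks and special files
            digest = checksum_file(full)  # computed once per file
            # setdefault covers both the "vacant" and "occupied" cases
            # without hashing the file a second time.
            index.setdefault(digest, set()).add(os.path.relpath(full, root))
    return index
```

Even with the duplicate checksum removed, a walk like this still reads every file once, so it only addresses the redundant work and not the overall cost of hashing the whole tree.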