internal/dnfjson: fix size calculation bug with multidir caches #246
bc27278 to 3220d73
Something to note here: I'm trying to test the difference between main and this PR with some extreme cases and while I know there's a difference, I'm not seeing an issue with the existing things in
You'll see that some distro repos are deleted completely (because the limit is so low) and others (like RHEL 9.3 x86) stay at 4.6 MiB. It also gets a bit more interesting if you set the limit to 17 MiB: some distro caches reach 20-30 MiB and are then trimmed to 14-15 MiB after the depsolve is done. To me that means the cleanup is actually working (but with the per-distro limit), so I wonder what went wrong originally.
Actually,
The issue is that the service workers set an empty distribution: https://github.com/osbuild/osbuild-composer/blob/a1e428fc539271333d71902672dff2407947ae62/cmd/osbuild-worker/jobimpl-depsolve.go#L23 So these caches are not stored in a distribution subdirectory but directly in the cache root, which breaks the whole algorithm. Maybe the fix could be to disallow passing an empty distribution (or use an
If we use Given all this, it makes me wonder if the per-repo cache roots are even necessary in the end. Maybe it's fine to use a single cache for everything? A solution that fixes the current issue without changing behaviour is to drop the
803c885 to 9883885
9883885 to 1838a06
1838a06 to 54a72ee
LGTM. A few non-functional nitpicks and questions :)
One more thing though:
I think we should remove the distro here:
images/internal/dnfjson/dnfjson.go
Line 72 in 54a72ee
func (bs *BaseSolver) NewWithConfig(modulePlatformID, releaseVer, arch, distro string) *Solver {
and instead use the other components to make a cache subdirectory here:
images/internal/dnfjson/dnfjson.go
Lines 123 to 130 in 54a72ee
func (s *Solver) GetCacheDir() string {
	b := filepath.Base(s.distro)
	if b == "." || b == "/" {
		return s.cache.root
	}
	return filepath.Join(s.cache.root, s.distro)
}
The other components (platform ID, os version, architecture) are mandatory for depsolving, so they can't be empty strings or the depsolver will fail. That way, we avoid running into any issues (like we already did) with the solver being initialised with an empty distro string, essentially pushing the distro caches one level up, and we still get distro-specific caches.
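To make the failure mode concrete, here is a minimal sketch of the quoted logic as a free function, plus one possible way to derive the subdirectory from the mandatory components. `cacheDir`, `cacheKey`, the separator choice and the example root path are hypothetical names and values picked for illustration; they are not taken from the PR.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// cacheDir mirrors the quoted GetCacheDir logic as a standalone function
// (hypothetical helper, for illustration only).
func cacheDir(root, distro string) string {
	b := filepath.Base(distro)
	// filepath.Base("") returns ".", so an empty distro falls through to the root.
	if b == "." || b == "/" {
		return root
	}
	return filepath.Join(root, distro)
}

// cacheKey is a hypothetical subdirectory name built from the components that
// are mandatory for depsolving. ":" and "/" are replaced so that the result is
// a single path element.
func cacheKey(modulePlatformID, releaseVer, arch string) string {
	r := strings.NewReplacer(":", "-", "/", "-")
	parts := []string{r.Replace(modulePlatformID), r.Replace(releaseVer), r.Replace(arch)}
	return strings.Join(parts, "-")
}

func main() {
	root := "/var/cache/osbuild" // example path, assumption

	// An empty distro puts the cache directly in the root.
	fmt.Println(cacheDir(root, ""))
	// A non-empty distro gets its own subdirectory.
	fmt.Println(cacheDir(root, "rhel-93"))
	// Deriving the subdirectory from mandatory fields never yields the root.
	fmt.Println(filepath.Join(root, cacheKey("platform:el9", "9", "x86_64")))
}
```

With a key like this, two solvers configured with the same platform ID, release version and architecture would share a directory, which is exactly the case to check before adopting it.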
I don't know if there might be a conflict between RHEL 9 and CS9 (if osversion of rhel 9 is 9, then they both get el9 platform ID and os version 9).
Another alternative is to make the distro mandatory and fail if it's an empty string.
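The "mandatory distro" alternative could look roughly like the sketch below. `distroCacheDir` and `errEmptyDistro` are hypothetical names, and returning an error from the cache-dir lookup is an assumption about where the check would live, not something the PR does.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// errEmptyDistro is a hypothetical sentinel error for this sketch.
var errEmptyDistro = errors.New("distro must not be empty")

// distroCacheDir refuses distro values that would resolve to the cache root
// instead of silently falling back to it.
func distroCacheDir(root, distro string) (string, error) {
	b := filepath.Base(distro)
	if b == "." || b == "/" {
		return "", fmt.Errorf("%w: got %q", errEmptyDistro, distro)
	}
	return filepath.Join(root, distro), nil
}

func main() {
	if _, err := distroCacheDir("/var/cache/osbuild", ""); err != nil {
		fmt.Println("rejected:", err)
	}
	dir, err := distroCacheDir("/var/cache/osbuild", "fedora-39")
	fmt.Println(dir, err)
}
```

Failing early like this would surface callers such as the worker that pass an empty string, at the cost of breaking them until they are updated.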
I have some comments and a recommendation, but approving since they're not critical and can be done in a follow-up if we need this PR to be merged quickly.
Part of the fix for COMPOSER-2068
Update test data so that TestMultiDirCacheRead fails before the fix and succeeds after it. Go's os.ReadDir() returns entries sorted by filename, and in our test data that alphabetical order matched the mtime-based order, so the tests passed even with no sorting at all. This change breaks the alphabetical order of the data.
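A small standalone sketch of why the old test data could not catch the bug: it builds a temporary directory where name order and mtime order disagree, then compares what `os.ReadDir` returns with an explicit mtime sort. The entry names, timestamps and helper functions are made up for this example and are not the repository's test data.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"time"
)

// makeFixture creates a temp dir whose entries are oldest-to-newest "c", "b", "a",
// i.e. the reverse of their alphabetical order.
func makeFixture() (string, error) {
	dir, err := os.MkdirTemp("", "cache-order")
	if err != nil {
		return "", err
	}
	base := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	for i, name := range []string{"c", "b", "a"} {
		p := filepath.Join(dir, name)
		if err := os.Mkdir(p, 0o755); err != nil {
			return "", err
		}
		t := base.Add(time.Duration(i) * time.Hour)
		if err := os.Chtimes(p, t, t); err != nil {
			return "", err
		}
	}
	return dir, nil
}

// orderings returns the entry names as os.ReadDir yields them and sorted by mtime.
func orderings() (byName, byMTime []string, err error) {
	dir, err := makeFixture()
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(dir)

	entries, err := os.ReadDir(dir) // sorted by filename
	if err != nil {
		return nil, nil, err
	}
	mtimes := make(map[string]time.Time, len(entries))
	for _, e := range entries {
		info, err := e.Info()
		if err != nil {
			return nil, nil, err
		}
		byName = append(byName, e.Name())
		mtimes[e.Name()] = info.ModTime()
	}
	byMTime = append(byMTime, byName...)
	sort.Slice(byMTime, func(i, j int) bool {
		return mtimes[byMTime[i]].Before(mtimes[byMTime[j]])
	})
	return byName, byMTime, nil
}

func main() {
	byName, byMTime, err := orderings()
	if err != nil {
		panic(err)
	}
	fmt.Println(byName)  // [a b c]
	fmt.Println(byMTime) // [c b a]
}
```

Only a fixture where the two orders differ can tell "sorted by mtime" apart from "not sorted at all".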
instead of distro which is sometimes empty (on brew workers)
54a72ee to 379f614
LGTM. Great work thanks for fixing this :)
COMPOSER-2068