Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/deploy: Use fallocate for early prune space check #2866

Merged
merged 5 commits into from
May 30, 2023

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented May 29, 2023

The f_bfree member of the statvfs struct is documented as the
"number of free blocks". However, different filesystems have different
interpretations of this. E.g. on XFS, this is truly the number of blocks
free for allocating data. On ext4 however, it includes blocks that
are actually reserved by the filesystem and cannot be used for file
data. (Note this is separate from the distinction between f_bfree and
f_bavail which isn't relevant to us here since we're privileged.)

If a kernel and initrd is sized just right so that it's still within the
f_bfree limit but above what we can actually allocate, the early prune
code won't kick in since it'll think that there is enough space. So we
end up hitting ENOSPC when we actually copy the files in.

Rework the early prune code to instead use fallocate which guarantees
us that a file of a certain size can fit on the filesystem. fallocate
requires filesystem support, but all the filesystems we care about for
the bootfs support it (including even FAT).

(There's technically a TOCTOU race here that existed also with the
statvfs code where free space could change between when we check
and when we copy. Ideally we'd be able to pass down that fd to the
copying bits, but anyway in practice the bootfs is pretty much owned by
libostree and one doesn't expect concurrent writes during a finalization
operation.)

jlebon added 5 commits May 27, 2023 10:38
Classic case of analysis getting confused by variables initialized by
a function.
Noticed this diagnostic in my editor with clangd hooked up.
`size_to_remove` looks cryptic in contrast to
`new_new_bootcsum_dirs_total_size`. Rename it in the style of the latter
for easier reading.
The `f_bfree` member of the `statvfs` struct is documented as the
"number of free blocks". However, different filesystems have different
interpretations of this. E.g. on XFS, this is truly the number of blocks
free for allocating data. On ext4 however, it includes blocks that
are actually reserved by the filesystem and cannot be used for file
data. (Note this is separate from the distinction between `f_bfree` and
`f_bavail` which isn't relevant to us here since we're privileged.)

If a kernel and initrd is sized just right so that it's still within the
`f_bfree` limit but above what we can actually allocate, the early prune
code won't kick in since it'll think that there is enough space. So we
end up hitting `ENOSPC` when we actually copy the files in.

Rework the early prune code to instead use `fallocate` which guarantees
us that a file of a certain size can fit on the filesystem. `fallocate`
requires filesystem support, but all the filesystems we care about for
the bootfs support it (including even FAT).

(There's technically a TOCTOU race here that existed also with the
`statvfs` code where free space could change between when we check
and when we copy. Ideally we'd be able to pass down that fd to the
copying bits, but anyway in practice the bootfs is pretty much owned by
libostree and one doesn't expect concurrent writes during a finalization
operation.)
/* This is a roundabout but more trustworthy way of doing a space check than
* relying on statvfs's f_bfree when you know the size of the objects. */
static gboolean
dfd_fallocate_check (int dfd, __off_t len, gboolean *out_passed, GError **error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what would be more efficient and cleaner here is to allocate a temporary fd per file we're going to copy, and then actually use the them to do the writes.

But that's also a lot more code changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I mentioned passing down the fd in the commit message, but yeah... churn.

@cgwalters cgwalters merged commit f903d6a into ostreedev:main May 30, 2023
@dustymabe
Copy link
Contributor

Backport to Fedora RPMs: https://src.fedoraproject.org/rpms/ostree/pull-request/33#

dustymabe added a commit to dustymabe/fedora-coreos-config that referenced this pull request May 31, 2023
It has a fix for an autopruning corner case.
ostreedev/ostree#2866
dustymabe added a commit to coreos/fedora-coreos-config that referenced this pull request May 31, 2023
It has a fix for an autopruning corner case.
ostreedev/ostree#2866
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
It has a fix for an autopruning corner case.
ostreedev/ostree#2866
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
It has a fix for an autopruning corner case.
ostreedev/ostree#2866
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants