Introduce new type PAYLOAD_LINK #1443
Force-pushed from 6b696c5 to 7eadeb1.
I wonder if it'd be simpler to add a special xattr that we know about and also filter out.
How could we use the xattr to look up the payload checksum?
Yeah, that's an issue. We could scan all objects, but that'd get slow unless amortized. So...hmm. A big picture question here in my mind is the (still unsettled?) degree to which libostree provides low-level APIs for the OCI case, versus doing things at a higher level. If we provide high level APIs, we're more free to change/optimize the implementation details later. One other random thought here - how about only doing this for say files over 5MB (or some configurable threshold) instead? Trading off indexing overhead versus space savings. Another issue here is I think it needs to be opt-in; otherwise we're taking up extra space for "pure libostree" users who aren't doing containers/rpms/whatever on top.
Force-pushed from 7eadeb1 to bb1f3a6.
Yes, good point. I've added a new commit that implements a minimum-threshold configuration: the payload link will be created only for files whose size is greater than or equal to the threshold. Are you fine with a default value of 3 MiB? If you prefer disabling this feature by default, we can set a much bigger default value, which has the same effect. About OCI, we should probably have a bigger discussion around it. If we deal with it directly in OSTree, we would end up duplicating a lot of functionality that is already in One feature I particularly like is: In addition to copying images to the OSTree storage, we can also copy them back to other storages, like to the Docker engine or to a registry. The OSTree storage part of
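To make the threshold rule concrete, here is a minimal C sketch of the size check described in the comment above. The function name `payload_link_wanted` is hypothetical and not part of libostree; only the greater-or-equal comparison comes from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not libostree API): decide whether an object of
 * `size` bytes should get a payload link for the configured threshold. */
bool
payload_link_wanted (uint64_t size, uint64_t threshold)
{
  /* "Greater or equal", as stated in the comment above. */
  return size >= threshold;
}
```

With this comparison, a very large threshold means that in practice no file qualifies, which is how a big default value would behave like "disabled".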
Any feedback on the design?
But that's still imposing an unnecessary cost on everyone who is doing "pure ostree" embedded devices, etc. We could have it be a repo flag like
or something. And for Project Atomic systems we'd enable that flag. Or perhaps have an But the configuration entrypoint aside, my main concern here is we're basically doubling the number of files we create; on this workstation:
And like I said before I'm already not totally happy with how many small metadata files we have. In fact it'd be interesting to investigate not hardlinking for small objects at all. Is it really worth it to do hardlinks for e.g. < 1k files? Less pressure on the filesystem journal to do inode updates? Another concern is that now pruning objects isn't atomic; it looks like you made the commit process ignore "dangling" payload links, but it should probably delete them so they can be recreated.
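As an illustration of the "delete dangling links" idea, the following C sketch removes a symlink whose target no longer exists. It uses plain POSIX calls rather than libostree or libglnx helpers, and `remove_if_dangling` is a made-up name; how the real commit or prune code would walk the repository is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not libostree API).
 * Returns 1 if `link_path` was a dangling symlink and has been removed,
 * 0 if it was left alone, -1 on error (errno is set). */
int
remove_if_dangling (const char *link_path)
{
  struct stat st;

  assert (link_path != NULL);

  /* lstat() inspects the link itself, without following it. */
  if (lstat (link_path, &st) != 0)
    return -1;
  if (!S_ISLNK (st.st_mode))
    return 0;                   /* not a symlink: nothing to do */

  /* stat() follows the link; ENOENT means the target is gone. */
  if (stat (link_path, &st) == 0)
    return 0;                   /* target still exists */
  if (errno != ENOENT)
    return -1;

  return unlink (link_path) == 0 ? 1 : -1;
}
```

Deleting the stale link, rather than ignoring it, leaves room for a later commit to recreate it for a new object with the same payload.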
Also big picture, we know this is an issue for SELinux systems, but what about SMACK/AppArmor? IIRC SMACK uses xattrs but I don't know how aggressive it is with distinct labels like SELinux is. TBH I don't think it's too worth your time digging into SMACK/AppArmor, but there are definitely ostree users who don't use SELinux, and in that case again we don't need the context indexing, right?
The number of files >3MiB is a small subset of all the files; this is what I have in my local repository:
but anyway, it still requires support from the file system to be useful, so I will change the default to Yes, I am not happy with the pruning algorithm either; do you think we should recalculate everything? I didn't implement it this way as it looks quite expensive (prune would be as expensive as fsck); maybe I can delete them all and recreate only those for files that are still present in the repository after the pruning.
Oh wow, you're right...that's actually pretty amazing; on my current workstation the ratio of "3MB+" objects is
And on a stock
OK so there's a higher level issue here too I just thought of: this code only has an effect for commits created locally (as we do when importing OCI). When we're pulling objects directly via libostree ( So we'd have to either add this to pull (which in archive fetches would double the number of http requests, probably a non-starter), or add it to some lookaside data (messy). We can include it in static deltas easily enough though, and it should be straightforward to teach jigdo how to do it too.
Though hmm...since we're computing checksums at pull time now anyways usually, we could just do two SHA-256 checksums simultaneously.
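One way to picture computing both digests in a single pass: each chunk read from the object stream updates a full-object digest, while a second digest only sees the bytes past the header. The sketch below is only an illustration in C under stated assumptions: it uses FNV-1a as a small stand-in for SHA-256 so that it stays self-contained, and all names (`Hash`, `hash_update`, `dual_update`) are hypothetical rather than libostree API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a (64-bit), used here only as a stand-in for SHA-256. */
typedef struct { uint64_t state; } Hash;

void
hash_init (Hash *h)
{
  h->state = 14695981039346656037ULL;   /* FNV offset basis */
}

void
hash_update (Hash *h, const unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    {
      h->state ^= buf[i];
      h->state *= 1099511628211ULL;     /* FNV prime */
    }
}

/* Feed one chunk of the object stream to both digests.
 * `*header_left` is the number of header bytes not yet consumed;
 * the payload digest skips those, the full digest sees everything. */
void
dual_update (Hash *full, Hash *payload, size_t *header_left,
             const unsigned char *buf, size_t len)
{
  size_t skip = *header_left < len ? *header_left : len;

  assert (full != NULL && payload != NULL && header_left != NULL);

  hash_update (full, buf, len);
  hash_update (payload, buf + skip, len - skip);
  *header_left -= skip;
}
```

The data is read once and hashed twice, so the extra cost is CPU time for the second digest rather than additional I/O or HTTP requests.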
Force-pushed from 9105146 to 7b4815d.
I checked "ostree pull" and this code path also creates the I am still chasing down the failure in the CI, although I've done some changes in the PR:
Force-pushed from 2bf8fea to 6242cc5.
Tests pass again ✌️
src/libostree/ostree-repo-commit.c
Outdated
    file_input = (GInputStream*)checksum_input;
  else
    {
      checksum_payload_input = ot_checksum_instream_new ((GInputStream*)checksum_input, G_CHECKSUM_SHA256);
It's probably worth a comment here like:
/* The payload checksum-input reads from the full object checksum-input; this
* means it skips the header.
*/
added
src/libostree/ostree-repo-prune.c
Outdated
@@ -233,6 +260,7 @@ repo_prune_internal (OstreeRepo *self,
   g_autoptr(GHashTable) reachable_owned = g_hash_table_ref (options->reachable);
   data.reachable = reachable_owned;

Spurious extra newline?
dropped in the new version
Yeah, I had it backwards; it's the "trusted" cases that aren't; basically whenever we aren't redoing the SHA-256. So for example
Force-pushed from 6242cc5 to 5c6862f.
Force-pushed from 1dd6833 to 9995c8e.
☔ The latest upstream changes (presumably 733c049) made this pull request unmergeable. Please resolve the merge conflicts.
@cgwalters are you fine to merge this?
The test is being skipped right now though: https://s3.amazonaws.com/aos-ci/ghprb/ostreedev/ostree/14c00f4e8527d8b81ffc6d06cca854b3fbd187e8.0.1520269933286777277/artifacts/gdtr-results/libostree_test-payload-link.sh.test.txt I know our current testing setup is very confusing. I'm working on addressing that (well, some things will be more confusing, others less so) in #1462. Anyways, so what you want is to add a case to
The "dev flow" I have for the
It was removed with:
commit 8609cb0
Author: Colin Walters <[email protected]>
Date: Thu Apr 21 15:14:51 2016 -0400
repo: Simplify internal has_object() lookup code
Signed-off-by: Giuseppe Scrivano <[email protected]>
It will be used by successive commits to keep track of the payload checksum for objects stored in the repository. The goal is that files having the same payload but different xattrs can take advantage of reflinks where supported. Signed-off-by: Giuseppe Scrivano <[email protected]>
I've pushed a new version with the installed test script. I hope it will pass the CI :-)
Force-pushed from c2ddd8f to e9ecbdc.
When a new object is added to the repository, create a $PAYLOAD-SHA256.payload-link symlink file as well. The target of the symlink is the checksum of the object that was added to the repository. Whenever we add a new object file, in addition to looking up whether a file with the same checksum is already present, we also check whether an object with the same payload is in the repository. If a file with the same payload is already present in the repository, we copy it with `glnx_regfile_copy_bytes`, which internally attempts to create a reflink (ioctl (..., FICLONE, ..)) to the target file if the file system supports it. This makes it possible to have objects that share the payload but have a different inode and xattrs. By default the payload-link-threshold value is G_MAXUINT64, which disables the feature. Signed-off-by: Giuseppe Scrivano <[email protected]>
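A rough sketch of the bookkeeping this commit message describes, using plain POSIX `symlink` and `readlink`. The helper names, the flat directory layout, and the use of the bare object checksum as the link target are assumptions for illustration only; the actual on-disk layout in libostree may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create "<dir>/<payload_sha>.payload-link" whose
 * target is `object_sha`.  Returns 0 on success, -1 on error. */
int
payload_link_create (const char *dir, const char *payload_sha,
                     const char *object_sha)
{
  char path[4096];
  int n;

  assert (dir != NULL && payload_sha != NULL && object_sha != NULL);

  n = snprintf (path, sizeof path, "%s/%s.payload-link", dir, payload_sha);
  if (n < 0 || (size_t) n >= sizeof path)
    return -1;
  return symlink (object_sha, path);
}

/* Hypothetical helper: look up the object checksum recorded for
 * `payload_sha`.  Returns 0 and fills `out` on a hit, -1 if no link
 * exists or on error.  Targets longer than out_len - 1 are truncated. */
int
payload_link_lookup (const char *dir, const char *payload_sha,
                     char *out, size_t out_len)
{
  char path[4096];
  ssize_t len;
  int n;

  assert (dir != NULL && payload_sha != NULL && out != NULL);

  n = snprintf (path, sizeof path, "%s/%s.payload-link", dir, payload_sha);
  if (n < 0 || (size_t) n >= sizeof path || out_len == 0)
    return -1;

  /* readlink() does not NUL-terminate, so reserve one byte for that. */
  len = readlink (path, out, out_len - 1);
  if (len < 0)
    return -1;
  out[len] = '\0';
  return 0;
}
```

On a lookup hit, the caller would then copy the existing object's content (ideally as a reflink) instead of writing the payload again.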
@cgwalters finally managed to get all the tests happy again :-)
cd ${test_tmpdir}

touch a
if cp --reflink a b; then
I'm debating a bit if we should instead just assert that this works; I mean if we give reflink=1 to XFS it had better support it. But eh. We can tweak that later.
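The `cp --reflink` probe in the test has a C counterpart in the `FICLONE` ioctl mentioned in the commit message. The following Linux-only sketch tries a reflink first and falls back to a plain byte copy if the ioctl fails for any reason. It is not the `glnx_regfile_copy_bytes` implementation; `clone_or_copy` is a hypothetical name, and the fallback policy (and the lack of EINTR handling) is a simplification.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>           /* FICLONE */
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper.  Returns 1 if `dst` was created as a reflink of
 * `src`, 0 if the data was copied byte by byte, -1 on error. */
int
clone_or_copy (const char *src, const char *dst)
{
  char buf[65536];
  ssize_t n = 0;
  int ret = -1;
  int in, out;

  assert (src != NULL && dst != NULL);

  in = open (src, O_RDONLY | O_CLOEXEC);
  if (in < 0)
    return -1;
  out = open (dst, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
  if (out < 0)
    {
      close (in);
      return -1;
    }

  if (ioctl (out, FICLONE, in) == 0)
    ret = 1;                    /* extents are now shared */
  else
    {
      /* Reflink unavailable (unsupported fs, cross-device, ...): copy. */
      while ((n = read (in, buf, sizeof buf)) > 0)
        {
          char *p = buf;
          while (n > 0)
            {
              ssize_t w = write (out, p, (size_t) n);
              if (w < 0)
                goto done;
              p += w;
              n -= w;
            }
        }
      if (n == 0)
        ret = 0;                /* reached EOF without errors */
    }

done:
  close (in);
  if (close (out) != 0)
    ret = -1;
  return ret;
}
```

A return value of 1 roughly corresponds to `cp --reflink a b` succeeding; a test that wants to assert reflink support, as suggested above, could require exactly that value.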
Thanks for all of your work on this!
⚡ Test exempted: merge already tested.
It will be used by successive commits to keep track of the payload checksum for objects stored in the repository. The goal is that files having the same payload but different xattrs can take advantage of reflinks where supported. Signed-off-by: Giuseppe Scrivano <[email protected]> Closes: #1443 Approved by: cgwalters
When a new object is added to the repository, create a $PAYLOAD-SHA256.payload-link symlink file as well. The target of the symlink is the checksum of the object that was added to the repository. Whenever we add a new object file, in addition to looking up whether a file with the same checksum is already present, we also check whether an object with the same payload is in the repository. If a file with the same payload is already present in the repository, we copy it with `glnx_regfile_copy_bytes`, which internally attempts to create a reflink (ioctl (..., FICLONE, ..)) to the target file if the file system supports it. This makes it possible to have objects that share the payload but have a different inode and xattrs. By default the payload-link-threshold value is G_MAXUINT64, which disables the feature. Signed-off-by: Giuseppe Scrivano <[email protected]> Closes: #1443 Approved by: cgwalters
This regressed flatpak; see #1524 for the fix.
It is used to keep track of the payload checksum for files stored in the repository.
The goal is that files having the same payload but different xattrs can take advantage of reflinks where supported.
More details here: https://mail.gnome.org/archives/ostree-list/2018-January/msg00012.html