Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ostree admin fails if /ostree/deploy has mtime 0 #3022

Closed
fwilhe opened this issue Aug 30, 2023 · 6 comments · Fixed by #3243
Closed

ostree admin fails if /ostree/deploy has mtime 0 #3022

fwilhe opened this issue Aug 30, 2023 · 6 comments · Fixed by #3243

Comments

@fwilhe
Copy link
Contributor

fwilhe commented Aug 30, 2023

Hi everyone, I have an issue that seems hard to describe, I'll do my best to explain what I did:

I'm building a debian-based OSTree system.
I'm using the ostree package from debian testing.

My system does boot successfully and it looks good on the first sight.

I can run ostree commands like:

root@e931861b90bf:~# ostree refs
gardenlinux/today/arm64
ostree/1/1/0
root@e931861b90bf:~# ostree log gardenlinux/today/arm64
commit 54d0898411d9a709685bc1ff6b3b4fafbdc16402c8c4e014ad84488b7e39fb59
ContentChecksum:  9a7c454a3150e9d42e9ccf1743c5b51c4e588799423e13235d4c95882f4bc993
Date:  2023-08-28 14:42:53 +0000

    Gardenlinux build Mon Aug 28 14:42:50 UTC 2023

root@e931861b90bf:~# ostree ls 54d0898411d9a709685bc1ff6b3b4fafbdc16402c8c4e014ad84488b7e39fb59
d00755 0 0      0 /
l00777 0 0      0 /bin -> usr/bin
l00777 0 0      0 /home -> var/home
l00777 0 0      0 /lib -> usr/lib
l00777 0 0      0 /ostree -> sysroot/ostree
l00777 0 0      0 /root -> var/roothome
l00777 0 0      0 /sbin -> usr/sbin
... (output abbreviated)

But as soon as I'm trying to run any of the ostree admin subcommands, I get an error because of a failed assertion:

root@e931861b90bf:~# ostree admin status
**
OSTree:ERROR:src/libostree/ostree-sysroot.c:1333:ostree_sysroot_get_deployments: assertion failed: (self->loadstate == OSTREE_SYSROOT_LOAD_STATE_LOADED)
Bail out! OSTree:ERROR:src/libostree/ostree-sysroot.c:1333:ostree_sysroot_get_deployments: assertion failed: (self->loadstate == OSTREE_SYSROOT_LOAD_STATE_LOADED)
Aborted

This does seem to happen with any sub-command of ostree admin.

root@e931861b90bf:~# ostree admin cleanup
**
OSTree:ERROR:src/libostree/ostree-sysroot.c:1284:ostree_sysroot_get_booted_deployment: assertion failed: (self->loadstate == OSTREE_SYSROOT_LOAD_STATE_LOADED)
Bail out! OSTree:ERROR:src/libostree/ostree-sysroot.c:1284:ostree_sysroot_get_booted_deployment: assertion failed: (self->loadstate == OSTREE_SYSROOT_LOAD_STATE_LOADED)
Aborted

It looks like loadstate is only assigned the OSTREE_SYSROOT_LOAD_STATE_LOADED value in a single location in ostree-sysroot.c

I'm not sure where this is supposed to be called.
I've tried to trace the function calls, and the most likely chain seems to be this:

ot_admin_instutil_builtin_set_kargs
	ostree_sysroot_deployment_set_kargs
		ostree_sysroot_write_deployments
			ostree_sysroot_write_deployments_with_options
				write_deployments_finish
				ostree_sysroot_stage_tree_with_options
					ostree_sysroot_load
						ostree_sysroot_load_if_changed
							sysroot_load_from_bootloader_configs
								self->loadstate = OSTREE_SYSROOT_LOAD_STATE_LOADED

I would expect this to happen at image build time when I call ostree admin deploy, but it does not look like this.

Verbose output of ostree admin deploy does not show any hint of an issue, here is an abbreviated output of that:

OT: Keeping needed object cc8a8101ce21826e2e05412343c2f5a47390d7af242efed38857766dc8d207d7.file
OT: Pushing lock non-blocking with timeout 30
OT: Push lock: state=exclusive, depth=2
OT: Repo already locked exclusive, maintaining state
OT: Popping lock non-blocking with timeout 30
OT: Pop lock: state=exclusive, depth=3
OT: Maintaining lock state as exclusive
OT: Popping lock non-blocking with timeout 30
OT: Pop lock: state=exclusive, depth=2
OT: Maintaining lock state as exclusive
OT: Popping lock non-blocking with timeout 30
OT: Pop lock: state=exclusive, depth=1
OT: Unlocking repo

Can someone help me figure out why this assertion is failing? At what point is loadstate supposed to be set to OSTREE_SYSROOT_LOAD_STATE_LOADED?

@fwilhe
Copy link
Contributor Author

fwilhe commented Aug 30, 2023

Maybe relevant to mention: It's booting using systemd-boot (despite this not being officially supported, as documented here).
This might be an issue when loading the repository, I guess.

@fwilhe
Copy link
Contributor Author

fwilhe commented Sep 19, 2023

I've made some progress in debugging this:

inside ostree_sysroot_load_if_changed there is an early exit based on timestamps: self->loaded_ts.tv_sec == stbuf.st_mtim.tv_sec && self->loaded_ts.tv_nsec == stbuf.st_mtim.tv_nsec. This condition seems to be true in my case, when I rebuild the package with this block disabled, I can run any ostree admin command. I don't know why this is true in my case when it should not be true.

@jlebon
Copy link
Member

jlebon commented Sep 19, 2023

What does stat /ostree/deploy say? I'm not sure if that's what's happening here, but regardless we should probably tweak this logic so it can handle the case where the timestamp of /ostree/deploy is epoch. (Likely from being normalized at build time or something.)

@fwilhe
Copy link
Contributor Author

fwilhe commented Sep 19, 2023

Hi @jlebon, oh, yeah, has not occurred to me.

root@garden:~# stat /ostree/deploy
  File: /ostree/deploy
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 254,2	Inode: 28          Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1970-01-01 00:00:00.000000000 +0000
Modify: 1970-01-01 00:00:00.000000000 +0000
Change: 1970-01-01 00:00:00.000000000 +0000
 Birth: 2023-09-19 14:33:31.000000000 +0000

I think this is because in Garden Linux we want images to be reproducible which is why the timestamp is set to 0 on purpose when creating the filesystems.. Probably that's why the check is true..

fwilhe added a commit to fwilhe/ostree-debian-builder that referenced this issue Sep 20, 2023
@fwilhe
Copy link
Contributor Author

fwilhe commented Sep 20, 2023

Okay, I've verified that this is the issue. I've learned that our tooling for making filesystems can set an arbitrary timestamp, and if I set it to a non-zero timestamp the issue does not occur.

This is the commit that mitigates the issue

The commit before that allows to reproduce the issue, if that is relevant to anyone.

That's good enough for my poc use-case, but maybe we should still leave this issue open if it makes sense to make this condition more robust for such cases?

fwilhe added a commit to gardenlinux/gardenlinux that referenced this issue Sep 20, 2023
@jlebon jlebon changed the title assertion failed: (self->loadstate == OSTREE_SYSROOT_LOAD_STATE_LOADED) when running any ostree admin command ostree admin fails if /ostree/deploy has mtime 0 Sep 20, 2023
@jlebon
Copy link
Member

jlebon commented Sep 20, 2023

That's good enough for my poc use-case, but maybe we should still leave this issue open if it makes sense to make this condition more robust for such cases?

It makes sense to me. I've retitled the issue.

cgwalters added a commit to cgwalters/ostree that referenced this issue May 15, 2024
Ironically we break if the timestamp there is zero.

Closes: ostreedev#3022
Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/ostree that referenced this issue May 15, 2024
Ironically we break if the timestamp there is zero.

Closes: ostreedev#3022
Signed-off-by: Colin Walters <[email protected]>
cgwalters added a commit to cgwalters/ostree that referenced this issue May 15, 2024
Ironically we break if the timestamp there is zero.

Closes: ostreedev#3022
Signed-off-by: Colin Walters <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants