[BUG] Composing tree with --unified-core flag adds additional folder #3270

Open
Topfi opened this issue Dec 13, 2021 · 12 comments

@Topfi

Topfi commented Dec 13, 2021

I have previously opened bug reports concerning this issue at both the Anaconda Bugzilla and the lorax repository. However, based on all the information I have collected since first encountering it, the issue is most likely related to rpm-ostree compose rather than to either of those two projects.
Describe the bug
I am using a script (see below) to build a bootable .iso image based on a repo composed using the official fedora-silverblue.yaml files. This should yield an identical image to the official Fedora Silverblue 35 release. The resulting .iso image can indeed be booted.

However, during installation, Anaconda crashes displaying the error "The command ‘cp -r -r /mnt/sysroot/usr/lib/ostree-boot/loader /mnt/sysimage/boot’ exited with the code 1."

This seems to be related to some addition being made during "rpm-ostree compose" as there is no attempt by Anaconda to copy this specific directory ("/mnt/sysroot/usr/lib/ostree-boot/loader") when using an official released Silverblue image (also detailed in the Anaconda bugreport).

This is the only significant difference in the Anaconda log files between an official image and an image based on a locally built repo.

This behaviour also does not occur when using an rpm-ostree repo hosted online. These repos are built using the same fedora-silverblue.yaml file and can be installed without encountering this bug.

tl;dr: Repos built locally with "rpm-ostree compose" lead to Anaconda trying to copy a non-existent folder, causing a crash. Using an online hosted repo based on the same source files does not cause this behaviour.

To Reproduce
I have attached a script I use to automate the described steps below.
Steps to reproduce the behavior:

  1. Install Lorax and RPM-OSTree via dnf or rpm on a fresh installation of Fedora 35 Workstation
  2. Clone https://pagure.io/workstation-ostree-config.git and https://pagure.io/fedora-lorax-templates.git into a local folder
  3. Initialize and compose a new repo based on a .yaml file inside the workstation-ostree-config folder (this appears to be where the issue is caused, as online hosted repos based on the same .yaml file do not cause the error)
  4. Execute Lorax with ostree_install_repo and ostree_update_repo pointing to the created local repository
  5. After Lorax has finished the build process, try installing the created .iso in a hypervisor or on a physical device
  6. Encounter error (“The command ‘cp -r -r /mnt/sysroot/usr/lib/ostree-boot/loader /mnt/sysimage/boot’ exited with the code 1.”, see also screenshot below)

Expected behavior
The .iso should install as expected when basing a new image on a repository generated via the official ostree-config files.

Script
The following script performs all of the steps described above, aside from installing OSTree and Lorax, and builds the .iso based on the official Fedora 35 Silverblue GNOME edition. It leads to the described behavior.

mkdir iso && cd iso
git clone -b f35 https://pagure.io/workstation-ostree-config.git
git clone -b f35 https://pagure.io/fedora-lorax-templates.git
mkdir ostree_repo
ostree init --repo=ostree_repo
rpm-ostree compose --unified-core tree --repo=$(pwd)/ostree_repo $(pwd)/workstation-ostree-config/fedora-silverblue.yaml

exec lorax  --product=Fedora \
                --version=35 \
                --release=20211104 \
                --source=https://kojipkgs.fedoraproject.org/compose/35/latest-Fedora-35/compose/Everything/x86_64/os/ \
                --variant=Silverblue \
                --nomacboot \
                --volid=Fedora-SB-ostree-x86_64-35 \
                --add-template=$(pwd)/fedora-lorax-templates/ostree-based-installer/lorax-configure-repo.tmpl \
                --add-template=$(pwd)/fedora-lorax-templates/ostree-based-installer/lorax-embed-repo.tmpl \
                --add-template=$(pwd)/fedora-lorax-templates/ostree-based-installer/lorax-embed-flatpaks.tmpl \
                --add-template-var=ostree_install_repo=file://$(pwd)/ostree_repo \
                --add-template-var=ostree_update_repo=file://$(pwd)/ostree_repo \
                --add-template-var=ostree_osname=fedora \
                --add-template-var=ostree_oskey=fedora-35-primary \
                --add-template-var=ostree_contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist \
                --add-template-var=ostree_install_ref=fedora/35/x86_64/silverblue \
                --add-template-var=ostree_update_ref=fedora/35/x86_64/silverblue \
                --add-template-var=flatpak_remote_name=fedora \
                --add-template-var=flatpak_remote_url=oci+https://registry.fedoraproject.org \
                --add-template-var=flatpak_remote_refs="runtime/org.fedoraproject.Platform/x86_64/f35 app/org.gnome.gedit/x86_64/stable" \
                --logfile=$(pwd)/lorax.log \
                --tmp=$(pwd)/temp \
                --rootfs-size=8 \
                $(pwd)/finished

Screenshots

[Two screenshots from the Anaconda installer; images not included]

@jlebon
Member

jlebon commented Dec 15, 2021

In the first screenshot, the command which failed is findmnt, and looks non-fatal. In the second screenshot, the command which failed is cp. It might be in the program-log tab instead.

A major difference between official FSB composes and your script is the use of --unified-core, which FSB does not yet use. Worth trying without that, though I don't think it's the root issue.

@Topfi
Author

Topfi commented Dec 15, 2021

Thank you so much.

Concerning the screenshots: sorry, I did not attach the full Anaconda logs in the initial post. This file combines the information from all tabs: https://bugzilla-attachments.redhat.com/attachment.cgi?id=1840905

Both you and Dusty Mabe pointed me towards looking at --unified-core and, as it turns out, that is the culprit causing the appearance of the mysterious "loader" directory.

Here are the contents of the "ostree-boot" directory when composing with --unified-core:

$ ostree --repo='/home/user/build/iso/ostree_repo'  ls fedora/35/x86_64/silverblue /usr/lib/ostree-boot
d00755 0 0      0 /usr/lib/ostree-boot
d00700 0 0      0 /usr/lib/ostree-boot/efi
d00700 0 0      0 /usr/lib/ostree-boot/grub2
d00755 0 0      0 /usr/lib/ostree-boot/loader

And here without it:

$ ostree --repo='/home/user/build/iso/ostree_repo2'  ls fedora/35/x86_64/silverblue /usr/lib/ostree-boot
d00555 0 0      0 /usr/lib/ostree-boot
d00700 0 0      0 /usr/lib/ostree-boot/efi
d00700 0 0      0 /usr/lib/ostree-boot/grub2

Turns out, --unified-core is the culprit. I was under the impression that --unified-core simply uses newer but functionally compatible code for compose and should serve as a drop-in replacement for the old code path. Is my understanding incorrect, and are there more significant differences between the two that cause this?

Also, where could the "loader" directory be coming from?
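
The check in the two listings above can be scripted. The following is a minimal sketch and not part of the original report: has_loader_entry is a hypothetical helper name. It reads `ostree ls` output on standard input and succeeds when an entry for /usr/lib/ostree-boot/loader is present. It assumes the path is the fifth whitespace-separated field, as in the listings above, so paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the `ostree ls` listing on stdin
# contains an entry for /usr/lib/ostree-boot/loader, fail (exit 1) otherwise.
has_loader_entry() {
    # Field 5 is the path column in the listings shown above.
    awk '$5 == "/usr/lib/ostree-boot/loader" { found = 1 } END { exit !found }'
}

# Usage (repo path and ref as used earlier in this thread):
#   ostree --repo=ostree_repo ls fedora/35/x86_64/silverblue /usr/lib/ostree-boot \
#       | has_loader_entry && echo "loader directory present"
```

Running this before handing the repo to Lorax would surface the problem at compose time rather than during installation.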

@dustymabe
Member

I'm still not really sure why --unified-core is causing this, though. IIUC FCOS is built with --unified-core and doesn't have /usr/lib/ostree-boot/loader, though it's a different package set.

@Topfi
Author

Topfi commented Dec 19, 2021

I decided to look into this a bit further to find out whether the differences between a repo built using the old and new code paths are limited to this one folder. Oddly enough, that seems to be the case. I can't explain its origins, but the issue seems to be limited to this one "loader" folder; the rest of /usr/lib, /usr/bin, etc. appear identical between the two repos.
I have changed the title of this bug report to better reflect the specific issue and added a quick summary for reference.

Summary:

Bug is caused when composing a repository using rpm-ostree with the "unified-core" flag. This causes the addition of a folder (/usr/lib/ostree-boot/loader) that should not be added.
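
A comparison like the one described in this comment could be done along the following lines. This is only a sketch under stated assumptions, not the method actually used: paths_only_in_first is a hypothetical helper, the two input files are assumed to hold saved `ostree ls -R` output, and only the path column (field 5) is compared, so differences in mode or ownership are ignored and paths containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print paths that appear in listing file $1
# but not in listing file $2 (both saved `ostree ls -R` output).
paths_only_in_first() {
    first_paths=$(mktemp)
    second_paths=$(mktemp)
    # Extract the path column and sort it so comm can compare the two sets.
    awk '{ print $5 }' "$1" | LC_ALL=C sort -u > "$first_paths"
    awk '{ print $5 }' "$2" | LC_ALL=C sort -u > "$second_paths"
    # -23 suppresses lines unique to the second file and lines common to both.
    LC_ALL=C comm -23 "$first_paths" "$second_paths"
    rm -f "$first_paths" "$second_paths"
}

# Usage (repo paths and ref as used earlier in this thread):
#   ostree --repo=ostree_repo  ls -R fedora/35/x86_64/silverblue /usr/lib > unified.txt
#   ostree --repo=ostree_repo2 ls -R fedora/35/x86_64/silverblue /usr/lib > legacy.txt
#   paths_only_in_first unified.txt legacy.txt
```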

@Topfi Topfi changed the title [BUG] Repos build locally with rpm-ostree cause failure during Anaconda install [BUG] Composing tree with --unified-core flag adds additional folder Dec 19, 2021
@nothingneko

Hey, we're experiencing the same issue, any updates?

@Topfi
Author

Topfi commented Jan 3, 2022

As it stands, the best course of action is to compose the tree without the "--unified-core" flag applied.

@nothingneko

As it stands, the best course of action is to compose the tree without the "--unified-core" flag applied.

That's what we ended up doing. Everything is working nicely.

@goshansp

Running rpm-ostree compose tree on workstation-ostree-config/fedora-silverblue.yaml has stopped working for me without --unified-core, and I get: error: Finalizing rootfs: Converting /var to tmpfiles.d: Processing var content /var/lib/gssproxy/rcache: Reading symlink: a path led outside of the filesystem

PS: my repo is online and I also get "The command ‘cp -r -r /mnt/sysroot/usr/lib/ostree-boot/loader /mnt/sysimage/boot’ exited with the code 1." when using --unified-core.

Can someone please confirm that composing without --unified-core no longer works as a workaround for the installation issue?

/var/lib/gssproxy/rcache is provided by gssproxy and required by nfs-utils.

Workaround: commenting out nfs-utils in fedora-common-ostree-pkgs.yaml
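
The workaround above could be applied with a small helper like the one below. This is a sketch only: comment_out_package is a hypothetical name, and it assumes the package appears as a YAML list item of the form "- nfs-utils" on its own line, which may not match the actual layout of fedora-common-ostree-pkgs.yaml. The package name is used as a literal inside a regular expression, so names containing regex metacharacters would need escaping.

```shell
#!/bin/sh
# Hypothetical helper: comment out the list item "- PKG" in a YAML file,
# keeping its indentation. Entries that merely start with PKG are left alone.
comment_out_package() {
    pkg=$1
    file=$2
    tmp=$(mktemp)
    # Match the whole line so that e.g. "- nfs-utils-foo" is not touched.
    sed "s/^\([[:space:]]*\)- ${pkg}[[:space:]]*\$/\1# - ${pkg}/" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage:
#   comment_out_package nfs-utils workstation-ostree-config/fedora-common-ostree-pkgs.yaml
```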

@cgwalters
Member

error: Finalizing rootfs: Converting /var to tmpfiles.d: Processing var content /var/lib/gssproxy/rcache: Reading symlink: a path led outside of the filesystem

Should be fixed by #3771 (shipped in e.g. https://bodhi.fedoraproject.org/updates/FEDORA-2022-bae08a2e98 )

@cgwalters cgwalters reopened this Jul 13, 2022
@goshansp

@cgwalters thank you so much! On rpm-ostree-2022.10-3.fc36.x86_64 I am once again able to compose without --unified-core. Your help was crucial, as libvirt requires gssproxy and without it I would not have been able to restore my dev environment.

Is there a way to remove the conflicting boot/loader directory from the tree with some post-compose hack like ostree --repo='/home/user/build/iso/ostree_repo' rm fedora/35/x86_64/silverblue /usr/lib/ostree-boot/loader? Or how can the grub2-efi package be made compatible with rpm-ostree/--unified-core/Anaconda install?

@jcdickinson
Contributor

I am successfully working around this issue by using: https://coreos.github.io/rpm-ostree/compose-server/#granular-tree-compose-with-installpostprocesscommit

Between postprocess and commit, my build script simply runs: rm -rvf "$SYSROOT_PATH/rootfs/usr/lib/ostree-boot/loader"
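
A slightly more defensive version of that removal step might look like the sketch below. remove_loader_dir is a hypothetical helper name, and the rootfs location is whatever the install stage of the granular compose produced ("$SYSROOT_PATH/rootfs" in the comment above). The guard on the argument is an addition meant to avoid running rm against an empty or mistyped path.

```shell
#!/bin/sh
# Hypothetical helper: remove usr/lib/ostree-boot/loader from an unpacked
# rootfs directory, if it exists. Returns 2 when the rootfs argument is
# missing or is not a directory.
remove_loader_dir() {
    rootfs=$1
    if [ -z "$rootfs" ] || [ ! -d "$rootfs" ]; then
        echo "usage: remove_loader_dir ROOTFS_DIR" >&2
        return 2
    fi
    loader="$rootfs/usr/lib/ostree-boot/loader"
    # Only the loader directory is removed; efi and grub2 are left in place.
    if [ -d "$loader" ]; then
        rm -rf -- "$loader"
    fi
}

# Usage, between the postprocess and commit stages:
#   remove_loader_dir "$SYSROOT_PATH/rootfs"
```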

@champtar

champtar commented Jan 3, 2024

The Fedora fix is to just delete the 'loader' folder in postprocessing: https://pagure.io/workstation-ostree-config/c/9dc9105ddbcfe18c83bf4b0c6a75a2d3e794b385
