Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rhel10 and daily-iso timeout with "WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1" #1261

Open
rvykydal opened this issue Jul 22, 2024 · 14 comments
Labels
rhel10 test flake A test flake worth tracking (for more investigation, potenital bug, or to see if more come...)

Comments

@rvykydal
Copy link
Contributor

rvykydal commented Jul 22, 2024

First occurence 07/20/2024

23:22:46,919 WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1
23:22:46,922 INFO anaconda:program: Running... chvt 1
23:22:46,948 DEBUG anaconda:program: Return code: 0
23:22:46,949 DEBUG anaconda:anaconda: anaconda: changing display mode from GUI to TUI
23:22:47,054 INFO systemd:Starting rtkit-daemon.service - RealtimeKit Scheduling Policy Service...
23:22:47,080 DEBUG rtkit-daemon:Successfully called chroot.
23:22:47,081 DEBUG rtkit-daemon:Successfully dropped privileges.
23:22:47,082 DEBUG rtkit-daemon:Successfully limited resources.
23:22:47,082 DEBUG rtkit-daemon:Running.
23:22:47,082 INFO systemd:Started rtkit-daemon.service - RealtimeKit Scheduling Policy Service.
23:22:47,084 DEBUG rtkit-daemon:Watchdog thread running. 
23:22:47,084 DEBUG rtkit-daemon:Canary thread running.
23:22:47,123 INFO dbus-broker-launch:avc:  denied  { send_msg } for  scontext=system_u:system_r:kernel_t:s0 tcontext=unconfined_u:unconfined_r:unconfined_t:s0 tclass=dbus permissive=1
23:22:49,179 DEBUG anaconda:simpleline: Scheduling screen AskRDSpoke 
23:22:49,181 DEBUG anaconda:simpleline: New signal RenderScreenSignal enqueued with source ScreenScheduler
23:22:49,181 DEBUG anaconda:simpleline: Starting main loop
23:22:49,182 DEBUG anaconda:simpleline: Processing signal RenderScreenSignal
23:22:49,183 DEBUG anaconda:simpleline: Processing screen ScreenData(AskRDSpoke,None,False)
23:22:49,187 DEBUG anaconda:simpleline: Input is required by ScreenData(AskRDSpoke,None,False) screen

3 occurrences last week
virt-install.log
virt-install.log
virt-install.log

@rvykydal rvykydal added test flake A test flake worth tracking (for more investigation, potenital bug, or to see if more come...) rhel10 labels Jul 22, 2024
@rvykydal
Copy link
Contributor Author

lorax-packages.log diff (compared to daily 2 days before):

< zlib-ng-compat-2.1.6-2.el10.x86_64
< libuuid-2.40.1-4.el10.x86_64
---
> zlib-ng-compat-2.1.6-3.el10.x86_64
> libuuid-2.40.2-4.el10.x86_64
127,129c127,128
< libblkid-2.40.1-4.el10.x86_64
< xz-libs-5.4.6-2.el10.x86_64
< libzstd-1.5.5-5.el10.x86_64
---
> libblkid-2.40.2-4.el10.x86_64
> xz-libs-5.6.2-1.el10.x86_64
130a130
> libzstd-1.5.5-5.el10.x86_64
133c133
< sqlite-libs-3.45.1-2.el10.x86_64
---
> sqlite-libs-3.45.1-3.el10.x86_64
135c135
< libxkbcommon-1.6.0-2.el10.x86_64
---
> libxkbcommon-1.7.0-3.el10.x86_64
138d137
< libxcrypt-4.4.36-6.el10.x86_64
139a139
> libxcrypt-4.4.36-9.el10.x86_64
144c144
< nspr-4.35.0-20.el10.x86_64
---
> nspr-4.35.0-23.el10.x86_64
148c148
< nss-util-3.97.0-1.el10.x86_64
---
> nss-util-3.101.0-2.el10.x86_64
162c162
< libnghttp2-1.61.0-1.el10.x86_64
---
> libnghttp2-1.61.0-2.el10.x86_64
167,168c167,168
< libselinux-3.7-1.el10.x86_64
< libmount-2.40.1-4.el10.x86_64
---
> libselinux-3.7-2.el10.x86_64
> libmount-2.40.2-4.el10.x86_64
173d172
< lz4-libs-1.9.4-7.el10.x86_64
177c176
< libfdisk-2.40.1-4.el10.x86_64
---
> libfdisk-2.40.2-4.el10.x86_64
181c180
< libsmartcols-2.40.1-4.el10.x86_64
---
> libsmartcols-2.40.2-4.el10.x86_64
206c205
< systemd-libs-255.3-1.el10.x86_64
---
> systemd-libs-256-4.el10.x86_64
209c208
< util-linux-core-2.40.1-4.el10.x86_64
---
> util-linux-core-2.40.2-4.el10.x86_64
219,221c218,220
< nettle-3.9.1-10.el10.x86_64
< gnutls-3.8.5-5.el10.x86_64
< glib2-2.80.2-1.el10.x86_64
---
> nettle-3.9.1-11.el10.x86_64
> gnutls-3.8.5-7.el10.x86_64
> glib2-2.80.4-1.el10.x86_64
230c229
< libnotify-0.8.3-4.el10.x86_64
---
> libnotify-0.8.3-5.el10.x86_64
243c242
< libselinux-utils-3.7-1.el10.x86_64
---
> libselinux-utils-3.7-2.el10.x86_64
254c253
< xz-5.4.6-2.el10.x86_64
---
> xz-5.6.2-1.el10.x86_64
262c261
< efivar-libs-39-1.el10.x86_64
---
> efivar-libs-39-2.el10.x86_64
276c275,276
< pciutils-libs-3.13.0-2.el10.x86_64
---
> lz4-libs-1.9.4-7.el10.x86_64
> pciutils-libs-3.13.0-3.el10.x86_64
282,284c282,284
< ibus-libs-1.5.30-6.el10.x86_64
< iso-codes-4.16.0-3.el10.noarch
< vulkan-loader-1.3.283.0-2.el10.x86_64
---
> ibus-libs-1.5.30-8.el10.x86_64
> iso-codes-4.16.0-5.el10.noarch
> vulkan-loader-1.3.283.0-3.el10.x86_64
291c291
< libxkbcommon-x11-1.6.0-2.el10.x86_64
---
> libxkbcommon-x11-1.7.0-3.el10.x86_64
308,309c308,309
< nss-softokn-freebl-3.97.0-1.el10.x86_64
< nss-softokn-3.97.0-1.el10.x86_64
---
> nss-softokn-freebl-3.101.0-2.el10.x86_64
> nss-softokn-3.101.0-2.el10.x86_64
315c315
< kexec-tools-2.0.28-13.el10.x86_64
---
> kexec-tools-2.0.28-14.el10.x86_64
319c319
< webrtc-audio-processing-0.3.1-12.el10.x86_64
---
> webrtc-audio-processing-1.3-4.el10.x86_64
324c324
< iproute-6.7.0-2.el10.x86_64
---
> iproute-6.7.0-4.el10.x86_64
366c366
< oniguruma-6.9.9-3.el10.x86_64
---
> oniguruma-6.9.9-5.el10.x86_64
377d376
< libreport-filesystem-2.17.14-2.el10.noarch
406c405
< gnome-control-center-filesystem-46.2-3.el10.noarch
---
> gnome-control-center-filesystem-46.3-1.el10.noarch
409c408
< coreutils-common-9.4-7.el10.x86_64
---
> coreutils-common-9.5-2.el10.x86_64
412,413c411,412
< openssl-libs-3.2.2-6.el10.x86_64
< coreutils-9.4-7.el10.x86_64
---
> openssl-libs-3.2.2-7.el10.x86_64
> coreutils-9.5-2.el10.x86_64
417c416
< krb5-libs-1.21.2-7.el10.x86_64
---
> krb5-libs-1.21.3-2.el10.x86_64
422d420
< cairo-gobject-1.18.0-4.el10.x86_64
423a422
> cairo-gobject-1.18.0-4.el10.x86_64
448c447
< libwinpr-3.5.1-2.el10.x86_64
---
> libwinpr-3.6.2-1.el10.x86_64
455c454
< freerdp-libs-3.5.1-2.el10.x86_64
---
> freerdp-libs-3.6.2-1.el10.x86_64
475,476c474
< libcurl-8.6.0-6.el10.x86_64
< librepo-1.17.1-2.el10.x86_64
---
> libcurl-8.6.0-8.el10.x86_64
478c476
< curl-8.6.0-6.el10.x86_64
---
> curl-8.6.0-8.el10.x86_64
482c480
< cryptsetup-libs-2.7.3-1.el10.x86_64
---
> cryptsetup-libs-2.7.3-2.el10.x86_64
489,491c487,489
< systemd-pam-255.3-1.el10.x86_64
< util-linux-2.40.1-4.el10.x86_64
< systemd-255.3-1.el10.x86_64
---
> systemd-pam-256-4.el10.x86_64
> util-linux-2.40.2-4.el10.x86_64
> systemd-256-4.el10.x86_64
498,499c496,497
< initscripts-service-10.21-2.el10.noarch
< iputils-20240117-5.el10.x86_64
---
> initscripts-service-10.25-1.el10.noarch
> iputils-20240117-6.el10.x86_64
513,514c511,512
< pipewire-1.0.7-2.el10.x86_64
< pipewire-libs-1.0.7-2.el10.x86_64
---
> pipewire-1.0.7-3.el10.x86_64
> pipewire-libs-1.0.7-3.el10.x86_64
524c522
< systemd-udev-255.3-1.el10.x86_64
---
> systemd-udev-256-4.el10.x86_64
528,529c526,527
< kernel-modules-core-6.10.0-0.rc5.12.el10.x86_64
< kernel-core-6.10.0-0.rc5.12.el10.x86_64
---
> kernel-modules-core-6.10.0-15.el10.x86_64
> kernel-core-6.10.0-15.el10.x86_64
532c530
< kernel-modules-6.10.0-0.rc5.12.el10.x86_64
---
> kernel-modules-6.10.0-15.el10.x86_64
539c537
< mdadm-4.3-2.el10.1.x86_64
---
> mdadm-4.3-2.el10.2.x86_64
553c551
< pciutils-3.13.0-2.el10.x86_64
---
> pciutils-3.13.0-3.el10.x86_64
558,559c556,557
< perl-Data-Dumper-2.188-504.el10.x86_64
< perl-libnet-3.15-504.el10.noarch
---
> perl-Data-Dumper-2.189-504.el10.x86_64
> perl-libnet-3.15-505.el10.noarch
585c583
< perl-Pod-Perldoc-3.28.01-504.el10.noarch
---
> perl-Pod-Perldoc-3.28.01-505.el10.noarch
601c599
< perl-parent-0.241-503.el10.noarch
---
> perl-parent-0.241-504.el10.noarch
603c601
< perl-Getopt-Long-2.57-4.el10.noarch
---
> perl-Getopt-Long-2.58-1.el10.noarch
612,613c610,611
< python3-3.12.4-2.el10.x86_64
< python3-libs-3.12.4-2.el10.x86_64
---
> python3-3.12.4-3.el10.x86_64
> python3-libs-3.12.4-3.el10.x86_64
618c616
< python3-libselinux-3.7-1.el10.x86_64
---
> python3-libselinux-3.7-2.el10.x86_64
624,625c622,623
< brlapi-0.8.5-13.el10.x86_64
< brltty-6.6-13.el10.x86_64
---
> brlapi-0.8.5-15.el10.x86_64
> brltty-6.6-15.el10.x86_64
627,628c625,626
< nss-sysinit-3.97.0-1.el10.x86_64
< nss-3.97.0-1.el10.x86_64
---
> nss-sysinit-3.101.0-2.el10.x86_64
> nss-3.101.0-2.el10.x86_64
642c640
< gstreamer1-plugins-bad-free-libs-1.22.12-2.el10.x86_64
---
> gstreamer1-plugins-bad-free-libs-1.22.12-3.el10.x86_64
648d645
< python3-librepo-1.17.1-2.el10.x86_64
650c647
< gnutls-dane-3.8.5-5.el10.x86_64
---
> gnutls-dane-3.8.5-7.el10.x86_64
668c665
< python3-libmount-2.40.1-4.el10.x86_64
---
> python3-libmount-2.40.2-4.el10.x86_64
679,680c676,677
< initscripts-rename-device-10.21-2.el10.x86_64
< openssl-3.2.2-6.el10.x86_64
---
> initscripts-rename-device-10.25-1.el10.x86_64
> openssl-3.2.2-7.el10.x86_64
688c685
< mokutil-0.6.0-9.el10.x86_64
---
> mokutil-0.6.0-10.el10.x86_64
691a689
> librepo-1.18.0-2.el10.x86_64
693c691
< libdnf-0.73.1-4.el10.x86_64
---
> libdnf-0.73.1-5.el10.x86_64
696,699c694,697
< selinux-policy-40.13.4-1.el10.noarch
< selinux-policy-targeted-40.13.4-1.el10.noarch
< python3-libdnf-0.73.1-4.el10.x86_64
< python3-hawkey-0.73.1-4.el10.x86_64
---
> selinux-policy-40.13.5-1.el10.noarch
> selinux-policy-targeted-40.13.5-1.el10.noarch
> python3-libdnf-0.73.1-5.el10.x86_64
> python3-hawkey-0.73.1-5.el10.x86_64
706,707c704,706
< createrepo_c-libs-1.1.2-2.el10.x86_64
< createrepo_c-1.1.2-2.el10.x86_64
---
> python3-librepo-1.18.0-2.el10.x86_64
> createrepo_c-libs-1.1.2-3.el10.x86_64
> createrepo_c-1.1.2-3.el10.x86_64
735,736c734,735
< gnome-desktop3-44.0-12.el10.x86_64
< gnome-desktop4-44.0-12.el10.x86_64
---
> gnome-desktop3-44.0-16.el10.x86_64
> gnome-desktop4-44.0-16.el10.x86_64
758c757
< initscripts-10.21-2.el10.x86_64
---
> initscripts-10.25-1.el10.x86_64
763c762
< nss-tools-3.97.0-1.el10.x86_64
---
> nss-tools-3.101.0-2.el10.x86_64
767,768c766,767
< kernel-6.10.0-0.rc5.12.el10.x86_64
< kernel-modules-extra-6.10.0-0.rc5.12.el10.x86_64
---
> kernel-6.10.0-15.el10.x86_64
> kernel-modules-extra-6.10.0-15.el10.x86_64
781,782c780,781
< cryptsetup-2.7.3-1.el10.x86_64
< nmap-ncat-7.93-8.el10.x86_64
---
> cryptsetup-2.7.3-2.el10.x86_64
> nmap-ncat-7.93-7.el10.x86_64

lorax-packages.diff.txt

@M4rtinK
Copy link
Contributor

M4rtinK commented Jul 22, 2024

So I've checked the logs & the main issue is that there is not much to work with - there is a slog message about Wayland being started & then a bit later a message about Wayland filing to start. :P

Hard to do something about this unless we have a reproducer - but I have a theory. Maybe this is related to the issue we are seeing for quite a while, where the output from GNOME Kiosk clobbers the TTY1 console. Maybe that is where all the relevant error message send up in this case ?

Might be a good idea to re-check if we can't run the GNOME tools via systemd units again, as that could possibly result in all the log messages ending up in Journal instead of all over the place on TTY1. :)

@rvykydal
Copy link
Contributor Author

0 occurrences last week.

@rvykydal
Copy link
Contributor Author

4 occurences 06-08-2024 - 12-08-2024

@rvykydal
Copy link
Contributor Author

0 occurrences last week.

Actually there were one occurence 08-05-2024

@rvykydal
Copy link
Contributor Author

After porting the Wayland migration to Fedora (rhinstaller/anaconda#5829) we are seeing the flake also on daily-iso.

@rvykydal rvykydal changed the title rhel10 timeout with "WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1" rhel10 and daily-iso timeout with "WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1" Oct 11, 2024
@jkonecny12
Copy link
Member

Hi @jexposit, any idea how we could debug this Wayland failure?

@jexposit
Copy link

There is not much information in the log. It looks like run-in-new-session starts correctly:

23:22:40,314 INFO anaconda:program: Running... /usr/libexec/anaconda/run-in-new-session --user root --service anaconda --session-type wayland --session-class user --vt 6 gnome-kiosk --sm-disable --wayland --no-x11 --wayland-display wl-sysinstall-0

A systemd warning is reported:

23:22:46,919 WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1

But GNOME Kiosk seems to start:

23:23:28,143 NOTICE rtkit-daemon:Successfully demoted thread 2927 of process 2916 (/usr/bin/gnome-kiosk).

I'm not sure if there could be more information in the TTY or if there is a way to check it.

The only errors I see in the logs are:

23:22:18,681 INFO systemd:Starting udisks2.service - Disk Manager...
23:22:18,779 NOTICE udisksd:udisks daemon version 2.10.1 starting
23:22:18,869 WARNING udisksd:Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned:#0120000: 70 00 05 00  00 00 00 0a  00 58 00 01  21 04 00 00    p........X..!...#0120010: 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00    ................#012 (g-io-error-quark, 0)
23:22:18,869 WARNING udisksd:Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr1': Unexpected sense data returned:#0120000: 70 00 05 00  00 00 00 0a  00 58 00 01  21 04 00 00    p........X..!...#0120010: 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00    ................#012 (g-io-error-quark, 0)
23:22:18,883 NOTICE udisksd:Acquired the name org.freedesktop.UDisks2 on the system message bus
23:22:18,895 INFO systemd:Started udisks2.service - Disk Manager.
23:22:18,957 NOTICE udisksd:Loading module iscsi ...
23:22:18,966 WARNING udisksd:Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr0': Unexpected sense data returned:#0120000: 70 00 05 00  00 00 00 0a  00 58 00 01  21 04 00 00    p........X..!...#0120010: 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00    ................#012 (g-io-error-quark, 0)
23:22:18,966 WARNING udisksd:Error probing device: Error sending ATA command IDENTIFY PACKET DEVICE to '/dev/sr1': Unexpected sense data returned:#0120000: 70 00 05 00  00 00 00 0a  00 58 00 01  21 04 00 00    p........X..!...#0120010: 00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00    ................#012 (g-io-error-quark, 0)
23:22:18,983 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:blivet:failed to get initiator name from iscsi firmware: Failed to call GetFirmwareInitiatorName method on /org/freedesktop/UDisks2/Manager with None arguments: GDBus.Error:org.freedesktop.UDisks2.Error.ISCSI.NoFirmware: No firmware found
23:22:19,026 WARNING org.fedoraproject.Anaconda.Modules.Storage:WARNING:py.warnings:/usr/lib/python3.12/site-packages/gi/overrides/__init__.py:108: DeprecationWarning: 'pkgutil.get_loader' is deprecated and slated for removal in Python 3.14; use importlib.util.find_spec() instead

But I'm not sure if those could be causing the problem. Checking if that error is present in a successful log could help.

I checked in a local VM, latests master and Rawhide image and I couldn't find those logs, but the environment is probably different.

@jkonecny12
Copy link
Member

Hi @vojtechtrefny, I have a feeling I saw these errors regularly but I might be wrong. Do you know something about the udisksd errors?

@jkonecny12
Copy link
Member

jkonecny12 commented Oct 16, 2024

@jexposit based on the code we are getting this error here:
https://github.com/rhinstaller/anaconda/blob/master/pyanaconda/display.py#L365-L367

because socket for Wayland wasn't created:
https://github.com/rhinstaller/anaconda/blob/master/pyanaconda/display.py#L365-L367

which means that the Wayland socket wasn't created.

I don't think udisks is can have much of an impact to this. Is there a way how to get debug data from this? We can create Anaconda PR and run all the tests there to find out the cause. What we can do is to enabled some debug mode when inst.debug is used or something like that for these cases.

@vojtechtrefny
Copy link
Contributor

Hi @vojtechtrefny, I have a feeling I saw these errors regularly but I might be wrong. Do you know something about the udisksd errors?

Yes, that's a 6.10 kernel bug. We also have a workaround in the latest udisks: storaged-project/udisks#732

@jexposit
Copy link

@vojtechtrefny could that kernel bug prevent the creation of files? In this case, it looks like the Wayland socket can not be created.

@vojtechtrefny
Copy link
Contributor

@vojtechtrefny could that kernel bug prevent the creation of files? In this case, it looks like the Wayland socket can not be created.

No. This is a problem with response to some ATA commands and technically speaking the kernel is not broken, it just changed the format of the answer which the standard allows, but some userspace tools (like udisks) were not ready for that. It has no effect on actually reading/writing data from/to the disk.

@jexposit
Copy link

Is there a way how to get debug data from this? We can create Anaconda PR and run all the tests there to find out the cause. What we can do is to enabled some debug mode when inst.debug is used or something like that for these cases.

@jkonecny12 mutter includes several debug options:
https://gitlab.gnome.org/GNOME/mutter/-/blob/main/doc/debugging.md

From a quick look at src/core/util.c -> meta_debug_keys:

static const GDebugKey meta_debug_keys[] = {
  { "focus", META_DEBUG_FOCUS },
  { "workarea", META_DEBUG_WORKAREA },
  { "stack", META_DEBUG_STACK },
  { "sm", META_DEBUG_SM },
  { "events", META_DEBUG_EVENTS },
  { "window-state", META_DEBUG_WINDOW_STATE },
  { "window-ops", META_DEBUG_WINDOW_OPS },
  { "geometry", META_DEBUG_GEOMETRY },
  { "placement", META_DEBUG_PLACEMENT },
  { "ping", META_DEBUG_PING },
  { "keybindings", META_DEBUG_KEYBINDINGS },
  { "sync", META_DEBUG_SYNC },
  { "startup", META_DEBUG_STARTUP },
  { "prefs", META_DEBUG_PREFS },
  { "groups", META_DEBUG_GROUPS },
  { "resizing", META_DEBUG_RESIZING },
  { "shapes", META_DEBUG_SHAPES },
  { "edge-resistance", META_DEBUG_EDGE_RESISTANCE },
  { "dbus", META_DEBUG_DBUS },
  { "input", META_DEBUG_INPUT },
  { "wayland", META_DEBUG_WAYLAND },
  { "kms", META_DEBUG_KMS },
  { "screen-cast", META_DEBUG_SCREEN_CAST },
  { "remote-desktop", META_DEBUG_REMOTE_DESKTOP },
  { "backend", META_DEBUG_BACKEND },
  { "render", META_DEBUG_RENDER },
  { "color", META_DEBUG_COLOR },
  { "input-events", META_DEBUG_INPUT_EVENTS },
  { "eis", META_DEBUG_EIS },
  { "kms-deadline", META_DEBUG_KMS_DEADLINE },
  { "session-management", META_DEBUG_SESSION_MANAGEMENT },
};

I think that startup (MUTTER_DEBUG="startup") could be useful. If you manage to collect more information, ask upstream. You'll find more help there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rhel10 test flake A test flake worth tracking (for more investigation, potenital bug, or to see if more come...)
Projects
None yet
Development

No branches or pull requests

5 participants