shell: Use the "load" event instead of polling to catch ready frames #21061

mvollmer · 2024-10-01T14:48:13Z

TestPages flakes with Firefox

mvollmer · 2024-10-10T09:42:39Z

The two testFrameReload tests are suspicious: I don't know what requirement or feature they are supposed to test. They still pass, but maybe only by accident.

What they do:

1. modify the DOM directly to remove the "data-ready" attribute of an iframe
2. modify the DOM to change the "src" attribute of the iframe
3. wait for the "data-ready" attribute to come back

Step 2) might simulate the code inside the iframe changing its own location, but that is not something Cockpit supports beyond changing the hash. Both the old and new Shell will just react to the "load" event on the iframe, run their hash tracking stuff, and then put the correct "src" attribute back. The old Shell would run its polling loop, but the new Shell just brings the DOM into sync with its own reality.

So, without knowing what testFrameReload is actually testing, I don't know whether the new Shell broke something.

martinpitt · 2024-10-10T10:18:52Z

FTR, I'm afraid I don't know what that data-ready hackery in the test should achieve. It goes way back to 2015:

commit 4b391a4 (PR test: Fix race in check-pages and check-multi-machine #2856) fixed some race
commit 8dc6fce fixed frame unloading
commit 37aedec is the original one that introduced it

It seems to me that we should test this using the "Active pages" killer in the session menu (with Alt) and then loading it back from the menu -- otherwise it's just internal hackery and meaningless IMHO.

mvollmer · 2024-10-10T10:25:56Z

> It seems to me that we should test this using the "Active pages" killer in the session menu (with Alt) and then loading it back from the menu -- otherwise it's just internal hackery and meaningless IMHO.

Hmm, there is also the "Reload this frame" action in the browser menu. I wonder if we survive that.

37aedec is worried about the router, and that a page needs to be unregistered when it is reloaded (so that the channels opened for the previous incarnation of the frame are closed). I think this happens now via the "unload" event that the router listens to. I'll experiment a bit with "Reload this frame" as well.

mvollmer · 2024-10-10T10:50:59Z

> I'll experiment a bit with "Reload this frame" as well.

Alright, reloading the frame would correctly unregister it from the router, but when it was loaded again, we didn't do the necessary setup, like attaching the event handler for the idle timeout.

So let's attach our own permanent "load" and "unload" handlers and do the setup on every load. There is now also a teardown that unsets the ready flag, which helps with avoiding flicker in dark mode.

The testFrameReload tests should not touch the data-ready attribute, I'd say, but should instead check that open channels are closed properly. It would be best to reload the frame explicitly instead of messing with the "src" attribute. I would guess bidi can do that, right?

martinpitt · 2024-10-10T11:02:13Z

@mvollmer: Yes, bidi has a reload() API, we already use it in testlib.py's Browser.reload() -- except that this always does self.switch_to_top() first, i.e. it always reloads the entire page. You could just add an option to it to not do that, and then just have it reload the current frame.

mvollmer · 2024-10-10T11:37:33Z

> You could just add an option to it to not do that, and then just have it reload the current frame.

I tried, but it seems to reload everything always. Also, I got testlib.Error: unknown error: navigation canceled. Kicking the "src" attribute is not wrong, according to StackOverflow, so...

test/verify/check-shell-multi-machine

mvollmer · 2024-10-10T11:39:25Z

Let's get this in as a followup to #21012.

mvollmer · 2024-10-11T06:20:28Z

pkg/shell/frames.jsx

+            if (frame.ready) {
+                frame.ready = false;
+                state.update();


Hmm.. they should. I'll check.

mvollmer · 2024-10-14T07:34:36Z

> Step 2) might simulate the code inside the iframe changing its own location, but that is not something Cockpit supports beyond changing the hash.

Just to clarify: step 2 is very likely only supposed to reload the frame without changing the url, not simulate loading a different url inside the frame.

And only show the frame once it is loaded. This side-steps the problems with showing a white not-yet-loaded iframe in dark mode. The two testFrameReload tests have been changed to more directly check for what we want: The channels opened by the previous incarnation of the frame should be closed when it is reloaded. This also means we don't need the "data-ready" attribute anymore in the DOM.

mvollmer · 2024-10-15T09:35:55Z

pkg/shell/frames.jsx

+            if (frame.ready) {
+                frame.ready = false;
+                state.update();


Should be fixed. We need to add the "unload" handler once on every "load" event, since it sits on the contentWindow, which is a different one on every load.
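
The handler lifecycle described here can be modeled in a few lines. This is a toy Python sketch, not the code in pkg/shell/frames.jsx: FakeWindow, FakeIframe and Frame are hypothetical stand-ins for the iframe's contentWindow, the iframe element and the Shell's bookkeeping. It assumes only what the comments state: the element survives reloads, while every load gets a fresh window.

```python
class FakeWindow:
    """Stand-in for iframe.contentWindow; a new one exists after every load."""

    def __init__(self):
        self.unload_handlers = []

    def fire_unload(self):
        for handler in self.unload_handlers:
            handler()


class FakeIframe:
    """Stand-in for the iframe element, which survives reloads."""

    def __init__(self):
        self.load_handlers = []
        self.content_window = None

    def load(self):
        # A (re)load tears down the old window, then creates a new one.
        if self.content_window is not None:
            self.content_window.fire_unload()
        self.content_window = FakeWindow()
        for handler in self.load_handlers:
            handler()


class Frame:
    """Hypothetical bookkeeping object that tracks the ready flag."""

    def __init__(self, iframe):
        self.iframe = iframe
        self.ready = False
        self.setups = 0
        self.teardowns = 0
        # Permanent handler on the element: attached once, runs on every load.
        iframe.load_handlers.append(self.on_load)

    def on_load(self):
        self.setups += 1
        self.ready = True
        # The "unload" handler sits on the window, so it has to be
        # attached again for each new window.
        self.iframe.content_window.unload_handlers.append(self.on_unload)

    def on_unload(self):
        if self.ready:
            self.ready = False
            self.teardowns += 1
```

Calling load() twice on a FakeIframe runs the setup twice and the teardown once. If on_load attached the "unload" handler only the first time, the second window would have no handler and a later reload would skip the teardown.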

This might remove a flake, and it also removed the OSTree special case, somehow.

mvollmer · 2024-10-16T06:33:36Z

TestPages.testBasic and TestPages.testHistory are flaking pretty consistently on Firefox now. I'll check this out more.

mvollmer · 2024-10-16T06:34:37Z

> TestPages.testBasic and TestPages.testHistory are flaking pretty consistently on Firefox now. I'll check this out more.

Also testMenuSearch. I hope we have only uncovered existing races in the tests.

martinpitt · 2024-10-16T08:06:12Z

TestPages.testHistory likely runs into https://bugzilla.mozilla.org/show_bug.cgi?id=941146.

-> we need a naughty for this.

Perhaps just skip the test (or that part of the test) on Firefox?

mvollmer · 2024-10-16T08:23:34Z

Same fix as for testBasic also works for testMenuSearch: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21061-0c1cea44-20241016-075719-fedora-40-firefox-expensive/log.html

mvollmer · 2024-10-16T08:26:29Z

> Perhaps just skip the test (or that part of the test) on Firefox?

Nah, it's a good test for tricky business (getting browser history right) and we really should run it on Firefox. It does reach the end successfully. There is just the occasional oops. We could allow oopses, but that would be too broad.

We could invent machinery à la allow_journal_messages that allows specific oopses, but this here really is a bug in Firefox that might conceivably be fixed some day.

mvollmer · 2024-10-16T08:43:43Z

Amplified testHistory run: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21061-6ffdfcd3-20241016-082854-fedora-40-firefox-expensive/log.html

martinpitt · 2024-10-16T09:16:35Z

The naughty landed -- so a retry of the amplified run should go green now, with the naughty hitting?

martinpitt · 2024-10-16T09:22:54Z

test/verify/check-pages

+        b.go("/system")
+        b.enter_page("/system")
+        b.switch_to_top()


That is really weird -- can you please mark this as # HACK? It's really not obvious why this happens. open_lang_modal() already does a switch_to_top(), and that is an entirely trivial operation (just setting an internal variable), so that line should go. More importantly, it's not clear why the language switcher would work from one frame but not the other one. Is that some "mouse click gets blocked by some element" situation again? What does it look like?

Good point about switch_to_top, I'll remove that.

The problem was doing the go("/system") after changing the language. It sometimes didn't have any effect. Changing the language always works.

I can debug this further, but I could never reproduce it locally, so that's going to be a bit of a pain...

(Ahh, I think I know. The router starts working a bit later since the shell rewrite, because I thought there can be no messages before we create iframes. But tests actually send messages to implement b.go, and if they come too early after a reload, we'll miss them. So let's wait for the shell to be fully loaded again before navigating.)
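
The race and the fix can be illustrated with a small Python sketch. It does not use the real testlib API: FakeShell, post_jump, wait_for and go_when_ready are hypothetical names, and the only behavior assumed is the one described above, namely that a navigation message sent before the router listens is silently lost.

```python
import time


def wait_for(cond, timeout=5.0, interval=0.01):
    """Poll cond() until it returns true, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while not cond():
        if time.monotonic() > deadline:
            raise TimeoutError("condition did not become true in time")
        time.sleep(interval)


class FakeShell:
    """Hypothetical shell whose router starts listening some time after a reload."""

    def __init__(self):
        self.router_ready = False
        self.location = "/"

    def post_jump(self, path):
        # A message that arrives before the router is listening is dropped.
        if self.router_ready:
            self.location = path


def go_when_ready(shell, path, timeout=5.0):
    # Wait for the shell to be fully loaded, then navigate.
    wait_for(lambda: shell.router_ready, timeout=timeout)
    shell.post_jump(path)
```

Sending post_jump("/system") directly after a reload leaves the location unchanged in this model, which matches the symptom of go("/system") sometimes having no effect. Waiting first turns the lost message into either a successful navigation or a visible timeout.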

Success: https://cockpit-logs.us-east-1.linodeobjects.com/pull-21061-e8608f32-20241016-095240-fedora-40-firefox-expensive/log.html

This removes a flake.

mvollmer · 2024-10-16T10:19:18Z

try to debug test: Disable preload to stabilize TestPages.testBasic #18766 once more

Moved to #21125

mvollmer · 2024-10-16T10:21:49Z

> The naughty landed -- so a retry of the amplified run should go green now, with the naughty hitting?

In principle yes, but the naughty pattern is too tight and doesn't trigger on a test named testHistory6...

martinpitt

A work of art! 🙇

mvollmer · 2024-10-16T11:44:07Z

Hmm, TestPages.testMenuSearch on ubuntu also flakes. Some service errors show up later in the test although they have been cleared at the beginning. Let's fix that as well...

mvollmer · 2024-10-16T12:01:28Z

> Hmm, TestPages.testMenuSearch on ubuntu also flakes. Some service errors show up later in the test although they have been cleared at the beginning. Let's fix that as well...

It's fwupd-refresh.service that runs on a timer. It probably fails because it has no network connection.

Let's just reset the failure state immediately before the screenshot, that should make the time window impossibly small to hit...

On ubuntu-stable, the fwupd-refresh service would otherwise sometimes manage to fail between resetting the failures at the start of the test and the time we take the screenshot.

martinpitt · 2024-10-16T12:05:39Z

Thanks for tracking that down. That also seems fine to disable in test/vm.install.

mvollmer · 2024-10-16T12:07:20Z

> That also seems fine to disable in test/vm.install.

Yeah, but then it's a different service that fails later on...

martinpitt · 2024-10-16T12:09:59Z

> Let's just reset the failure state immediately

Ah, you meant a global systemctl reset-failed. Yes, that seems right.
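
As a minimal sketch of that ordering, assuming a machine object with an execute() method: RecordingMachine and screenshot_with_clean_units are hypothetical names used for illustration, and the screenshot call is passed in as a plain callable rather than the real test helper.

```python
class RecordingMachine:
    """Hypothetical stand-in for a test machine; it only records commands."""

    def __init__(self):
        self.commands = []

    def execute(self, command):
        self.commands.append(command)


def screenshot_with_clean_units(machine, take_screenshot):
    # Without a unit argument, "systemctl reset-failed" resets the
    # failed state of all units.
    machine.execute("systemctl reset-failed")
    # Take the screenshot right away, so that a timer-driven unit has
    # almost no time to fail again in between.
    return take_screenshot()
```

Resetting globally instead of disabling fwupd-refresh.service avoids chasing whichever unit happens to fail next, which is the concern raised earlier in the thread.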

mvollmer · 2024-10-16T13:12:19Z

Oh no, the naughty pattern doesn't match, probably because the text it matches against doesn't actually contain the # testHistory header... Without that header, the pattern is very unspecific, hmm.

mvollmer · 2024-10-16T14:44:14Z

> Oh no, the naughty pattern doesn't match, probably because the text it matches against doesn't actually contain the # testHistory header... Without that header, the pattern is very unspecific, hmm.

Luckily @martinpitt figured it out. Extra whitespace, fixed.
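
Both pattern problems mentioned in this thread (a pattern too tight for testHistory6, and stray whitespace) are easy to reproduce with glob-style matching. The sketch below assumes that a naughty pattern is matched as a glob against the log text; the real matcher may work differently, and naughty_matches as well as the sample log strings are hypothetical.

```python
import fnmatch


def naughty_matches(pattern, log):
    """Return True if the glob pattern occurs anywhere in the log text."""
    return fnmatch.fnmatchcase(log, "*" + pattern + "*")


# Hypothetical log excerpts, not copied from a real test run.
LOG = "# testHistory (TestPages)\ntestlib.Error: Cockpit shows an Oops"
LOG_AMPLIFIED = "# testHistory6 (TestPages)\ntestlib.Error: Cockpit shows an Oops"

# Whitespace is matched literally, so one extra blank breaks the match.
exact = naughty_matches("Error: Cockpit shows an Oops", LOG)
extra_space = naughty_matches("Error:  Cockpit shows an Oops", LOG)

# A header pattern without a wildcard is too tight for amplified test names.
tight = naughty_matches("# testHistory (", LOG_AMPLIFIED)
loose = naughty_matches("# testHistory* (", LOG_AMPLIFIED)
```

Here exact and loose are True, while extra_space and tight are False. A mismatch caused by whitespace is hard to spot by eye because the pattern and the log look identical when printed.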

martinpitt

Ship it! 🚀

mvollmer added the no-test label Oct 1, 2024

mvollmer mentioned this pull request Oct 1, 2024

shell: Make it all reacty #21012

Merged

mvollmer force-pushed the shell-no-frame-polling branch from 73cab72 to b919586 Compare October 8, 2024 13:26

mvollmer removed the no-test label Oct 10, 2024

mvollmer force-pushed the shell-no-frame-polling branch 3 times, most recently from 28f9e82 to 2d23583 Compare October 10, 2024 09:36

mvollmer force-pushed the shell-no-frame-polling branch from 2d23583 to 7455a9e Compare October 10, 2024 11:35

github-advanced-security bot found potential problems Oct 10, 2024

test/verify/check-shell-multi-machine Fixed

mvollmer added the blocked label Oct 10, 2024

mvollmer marked this pull request as ready for review October 10, 2024 11:38

mvollmer force-pushed the shell-no-frame-polling branch 2 times, most recently from 341b8d4 to 06995fd Compare October 10, 2024 13:06

mvollmer removed the blocked label Oct 10, 2024

mvollmer commented Oct 11, 2024

mvollmer force-pushed the shell-no-frame-polling branch from 6e6015b to 196edc0 Compare October 15, 2024 09:34

mvollmer commented Oct 15, 2024

test: Wait for everything to be there before keyboard navigation

0bdc149

This might remove a flake, and it also removed the OSTree special case, somehow.

mvollmer force-pushed the shell-no-frame-polling branch from 196edc0 to 0bdc149 Compare October 15, 2024 13:01

mvollmer force-pushed the shell-no-frame-polling branch from 0c1cea4 to 6ffdfcd Compare October 16, 2024 08:28

martinpitt reviewed Oct 16, 2024

test: Wait for Shell to be initialized after changing language

f21b647

This removes a flake.

mvollmer force-pushed the shell-no-frame-polling branch from 6ffdfcd to e8608f3 Compare October 16, 2024 09:52

mvollmer removed the no-test label Oct 16, 2024

mvollmer force-pushed the shell-no-frame-polling branch from e8608f3 to f21b647 Compare October 16, 2024 10:02

martinpitt previously approved these changes Oct 16, 2024

mvollmer mentioned this pull request Oct 16, 2024

test: Remove preload workaround in TestPages.testBasic #21125

Merged

test: Clear service errors right before taking the screenshot

54a3ccd

On ubuntu-stable, the fwupd-refresh service would otherwise sometimes manage to fail between resetting the failures at the start of the test and the time we take the screenshot.

mvollmer added the no-test label Oct 16, 2024

mvollmer dismissed martinpitt’s stale review via 36bf06f October 16, 2024 12:06

mvollmer removed the no-test label Oct 16, 2024

mvollmer force-pushed the shell-no-frame-polling branch from 36bf06f to 54a3ccd Compare October 16, 2024 12:17

mvollmer requested a review from martinpitt October 16, 2024 14:44

martinpitt approved these changes Oct 16, 2024

martinpitt merged commit fe19608 into cockpit-project:main Oct 16, 2024
80 of 82 checks passed
