
[cookie-store] Improve cleanup logic #14364

Closed

Conversation

jugglinmike
Contributor

While doing research for a WPT infrastructure project, I found these tests behaving erratically. In a handful of cases, the cleanup invocation was not properly awaited. Since we've been meaning to take advantage of the new "async cleanup" functionality, I took the opportunity to use that API. I'm submitting this in two commits to make the bug fix easier to identify.
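For concreteness, here is a minimal sketch of the pattern in question (illustrative only, not the actual diff; the cookie name and assertion are made up). Returning the promise from the cleanup function lets testharness.js's asynchronous cleanup support await it before the test is considered complete:

    promise_test(async t => {
      // Register cleanup first so it runs even if the test body throws.
      // Returning the promise is the fix: testharness.js awaits it
      // before marking the test complete.
      t.add_cleanup(() => cookieStore.delete('example-cookie'));

      await cookieStore.set('example-cookie', 'example-value');
      const cookie = await cookieStore.get('example-cookie');
      assert_equals(cookie.value, 'example-value');
    }, 'hypothetical cookie-store test with awaited async cleanup');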

To validate this change set, I ran the tests in Chromium (70.0.3538.77) on master (3960fff) and on this branch via the command

 $ xvfb-run --auto-servernum ./wpt run --no-pause --log-mach - --log-mach-level DEBUG --channel dev --no-manifest chrome cookie-store/

The results were identical:

On master:

web-platform-test
~~~~~~~~~~~~~~~~~
Ran 338 checks (35 tests, 303 subtests)
Expected results: 328
Unexpected results: 10
  subtest: 10 (10 fail)

With patch applied:

web-platform-test
~~~~~~~~~~~~~~~~~
Ran 338 checks (35 tests, 303 subtests)
Expected results: 328
Unexpected results: 10
  subtest: 10 (10 fail)

Commit 1:
Ensure that the test is not considered "complete" until after the
cleanup logic has been executed.

Commit 2:
Use the recently-implemented asynchronous cleanup functionality of
testharness.js to ensure that cleanup logic executes regardless of the
passing/failing status of the test.

Rejected promises during cleanup will not prompt test failure, but they
will cause the harness to report an error.
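As a rough illustration of that last point (a hypothetical test, not code from this change set): a cleanup promise that rejects does not flip the test's own result, but testharness.js reports a harness-level error for the file:

    promise_test(async t => {
      // Hypothetical failing cleanup: the returned promise rejects.
      t.add_cleanup(() => Promise.reject(new Error('cleanup failed')));
      assert_true(true);
    }, 'this test passes; the rejected cleanup becomes a harness error');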
@jugglinmike
Contributor Author

It looks like these tests are still unstable even with my patch applied. I tried to reproduce the instability locally using the following command:

./wpt run --no-pause --log-mach - --channel dev --verify chrome cookie-store

...but that didn't catch anything on my local system (Chromium 70.0.3538.77) or in the Docker image used by Taskcluster (Google Chrome 72.0.3622.0 dev). Even the exact command reported in the Taskcluster logs (too lengthy to post here) didn't demonstrate the instability. Those CRASHes are suspicious, though. I don't think it's a memory limitation, since memory utilization is recorded in the logs:

2018-12-05 01:37:06,210 - tc-run - INFO - Identifying tests affected in range 'HEAD^'...
mem avail: 29462 MiB (96 %), swap free: 0 MiB (0 %)
2018-12-05 01:37:34,339 - tc-run - INFO - Identified 17 affected tests

[...]

PID 1140 | Only local connections are allowed.
mem avail: 27550 MiB (89 %), swap free: 0 MiB (0 %)
PID 1452 | Starting ChromeDriver 2.44.609551 (5d576e9a44fe4c5b6a07e568f1ebc753f1214634) on port 4444

[...]

PID 10044 | Only local connections are allowed.
mem avail: 28787 MiB (93 %), swap free: 0 MiB (0 %)
PID 10320 | Starting ChromeDriver 2.44.609551 (5d576e9a44fe4c5b6a07e568f1ebc753f1214634) on port 4444

It's strange that the container has no swap memory, but I'm not sure that's a problem. It doesn't seem as though we'd need to do any paging with 28 gigabytes of free memory. Unfortunately, I can't replicate that condition via Docker running on my system due to kernel limitations.

@Hexcles do you have thoughts on any of this?

@Hexcles
Member

Hexcles commented Dec 5, 2018

Right, our Docker containers don't have swap, which shouldn't have any negative effects. We should never use anything even close to the total available memory (guaranteed to be at least 8 GB).

@pwnall
Contributor

LGTM.

Thank you very much for the cleanup! I don't know what's up with the bots, but this looks like a step forward for the tests.

@jugglinmike
Contributor Author

This was addressed in Chromium directly and merged to WPT via gh-17949. I've verified that the patch addressed all of the instances of asynchronous cleanup that we identified here. It's good that the change eventually made it through stability checks!
