Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow patial update to wpt manifest #47588

Merged
merged 2 commits into from
Jan 8, 2025
Merged

Conversation

WeizhongX
Copy link
Contributor

@WeizhongX WeizhongX commented Aug 13, 2024

To run WPTs locally, developers usually only run a few of tests each time. Updating WPT manifest for the whole WPT Repo is not necessary and took ~3 seconds on a performant develop machine. And this can take forever if the developer is using a remote file system.

This change allows updating WPT manifest for tests only intended to be run. This brings down the test time for a few tests to ~0.5 seconds.

With this patch applied, we did not see a time increase when updating manifest for the entire repo.

Additional work is needed to plug in this into the test runner.

@WeizhongX
Copy link
Contributor Author

Folks, can you take a look?

Tests are running fine in chromium (https://chromium-review.googlesource.com/c/chromium/src/+/5991965).

Copy link
Contributor

@jonathan-j-lee jonathan-j-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think this lines up with @jgraham's feedback in #47350.

@@ -79,6 +81,9 @@ def create_parser() -> argparse.ArgumentParser:
parser.add_argument(
"--no-parallel", dest="parallel", action="store_false", default=True,
help="Do not parallelize building the manifest")
parser.add_argument('tests',
nargs='*',
help='Paths of test files or directories to update.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
help='Paths of test files or directories to update.')
help=('Test files or directories to update. '
'Omit to update all items under the test root.'))

for filename, path_stat in filenames:
path = os.path.join(subdir,
os.path.join(dirpath, filename).decode("utf8"))
if mtime_cache is None or mtime_cache.updated(path, path_stat):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Put the shared file case in a internal helper (e.g., yield self._make_file_info(path, path_stat))?

jgraham
jgraham previously requested changes Nov 7, 2024
Copy link
Contributor

@jgraham jgraham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs some tests, and some profiling to indicate no perf regression in the case where we don't pass in any paths.

tools/manifest/vcs.py Show resolved Hide resolved
@@ -79,6 +81,9 @@ def create_parser() -> argparse.ArgumentParser:
parser.add_argument(
"--no-parallel", dest="parallel", action="store_false", default=True,
help="Do not parallelize building the manifest")
parser.add_argument('tests',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably I'm missing something, but I don't see where this command line argument is being used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said in the description, Additional work is needed to plug in this into the test runner. This change only make patial update in wpt manifest command possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably I'm missing something, but I don't see where this command line argument is being used.

Line 49, above:

paths_to_update=kwargs['tests'],

@WeizhongX
Copy link
Contributor Author

I think this needs some tests, and some profiling to indicate no perf regression in the case where we don't pass in any paths.

Is there a reason why this could cause perf regression? I can get the data though.

@WeizhongX WeizhongX changed the title Allow patially update to wpt manifest Allow patial update to wpt manifest Nov 7, 2024
@WeizhongX WeizhongX force-pushed the partial-update-manifest branch 2 times, most recently from f3bc9aa to b4aa684 Compare November 13, 2024 21:19
@WeizhongX
Copy link
Contributor Author

I tried to run time ./wpt manifest -v --no-download --tests-root=. --url-base=/ with and without this change, and get results as below:

before:
real 0m9.799s
user 1m36.979s
sys 0m13.197s

After:
real 0m9.919s
user 1m36.533s
sys 0m13.205s

So real time increased about 0.12s. If this is consistent, this probably is due to the instroduce of _make_file_info, which is called inside the loop, but guess this is OK?

I think this needs some tests, and some profiling to indicate no perf regression in the case where we don't pass in any paths.

Is there a reason why this could cause perf regression? I can get the data though.

@@ -79,6 +81,9 @@ def create_parser() -> argparse.ArgumentParser:
parser.add_argument(
"--no-parallel", dest="parallel", action="store_false", default=True,
help="Do not parallelize building the manifest")
parser.add_argument('tests',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably I'm missing something, but I don't see where this command line argument is being used.

Line 49, above:

paths_to_update=kwargs['tests'],

@@ -45,6 +46,7 @@ def update_from_cli(**kwargs: Any) -> None:
manifest.load_and_update(tests_root,
path,
kwargs["url_base"],
paths_to_update=kwargs['tests'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The rest of this file seems to use double-quotes. Please match the local style.

manifest_path: Optional[Text] = None,
rebuild: bool = False) -> None:
self.tests_root = tests_root
self.url_base = url_base
self.paths_to_update = paths_to_update or ['']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting to [''] seems slightly surprising. Do we not want to default to [self.tests_root]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is relevant to the other comments in __iter__. Similar to the response there, I'd like to keep this relative to the tests root.


def __iter__(self) -> Iterator[Tuple[Text, Optional[Text], bool]]:
for path_to_update in self.paths_to_update:
path = os.path.join(self.tests_root, path_to_update)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to be joining to tests_root, or do we want to be joining to the current working directory? Certainly they'll frequently be the same, but this feels maybe slightly surprising?

From the CLI, I'd expect to be able to pass an absolute path in here and have it work — or give an error if it wasn't within the test root.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like path_to_update to be relative to tests_root.

If we want to accept absolute paths from CLI, I think we can handle that when process the command line arguments. But after that we only need to keep the part relative to tests_root, as paths out of test root can not be handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We eventually want to yield something that is relative to tests_root, so keep path_to_update relative to the tests_root seems more natural?

def __iter__(self) -> Iterator[Tuple[Text, Optional[Text], bool]]:
def _make_file_info(self,
path: Text,
path_stat) -> Tuple[Text, Optional[Text], bool]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
path_stat) -> Tuple[Text, Optional[Text], bool]:
path_stat: stat_result) -> Tuple[Text, Optional[Text], bool]:

To run WPTs locally, developers usually only run a few of tests
each time. Updating WPT manifest for the whole WPT Repo is not
necessary and took ~3 seconds on a performant develop machine.
And this can take forever if the developer is using a remote file
system.

This change allows updating WPT manifest for tests only intended
to be run. This brings down the test time for a few tests to ~0.5
seconds.

Additional work is needed to plug in this into the test runner.
@WeizhongX WeizhongX force-pushed the partial-update-manifest branch from b4aa684 to bd7e8be Compare December 31, 2024 00:16
@WeizhongX WeizhongX force-pushed the partial-update-manifest branch from bd7e8be to f367c07 Compare January 2, 2025 18:47
@WeizhongX
Copy link
Contributor Author

I think this needs some tests

I tried to use pyfakefs, then test manifest.load_and_update. This turns out to be difficult because pyfakefs always returns str (or whatever is used when creating the fake files), but the real code uses both str and bytes.

Do you have any suggestion on this? @jgraham

@WeizhongX WeizhongX requested a review from jgraham January 6, 2025 19:02
@WeizhongX
Copy link
Contributor Author

@gsnedders can you take a look at the updates? Thanks!

@WeizhongX WeizhongX dismissed jgraham’s stale review January 8, 2025 17:43

Coments addressed

@WeizhongX WeizhongX merged commit 947ece2 into master Jan 8, 2025
33 checks passed
@WeizhongX WeizhongX deleted the partial-update-manifest branch January 8, 2025 17:44
sadym-chromium pushed a commit that referenced this pull request Jan 14, 2025
* Allow patially update to wpt manifest

To run WPTs locally, developers usually only run a few of tests
each time. Updating WPT manifest for the whole WPT Repo is not
necessary and took ~3 seconds on a performant develop machine.
And this can take forever if the developer is using a remote file
system.

This change allows updating WPT manifest for tests only intended
to be run. This brings down the test time for a few tests to ~0.5
seconds.

Additional work is needed to plug in this into the test runner.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants