Running a directory of tests is slow to start up. #337
Comments
45 seconds is a lot longer than I've ever seen, even on large codebases. Pub has 460 separate test files, and it only takes about 10s for tests to start executing there. What do you see in the time before tests start executing? At what point does it first print Since it looks like this is something specific to your setup, you may need to do some debugging yourself. I know sky has some special sauce for loading tests that works around the lack of isolates, so maybe @abarth can help out? |
Most of the time is taken without printing anything. /me is happy to help out. |
That suggests that this might be an issue with the VM starting up pub and/or the test runner very slowly for some reason. Essentially everything the test runner does—including invoking the loader plugin—is done after the first print. |
Maybe the issue is that we're invoking via |
Is that coming from a path dependency? That could be a big part of the issue. When you're using an executable from an immutable dependency (hosted or git), pub precompiles it to a snapshot which saves a lot of overhead. But if it's a path dependency where its contents may change, that's not safe, so pub starts it from source which is substantially slower. |
It should be the hosted version. The sky_tools package is in a separate git repo and we depend on it via pub.dartlang.org. |
You can check whether the precompiled executable exists by looking for |
Maybe @apwilson's setup is different. |
Can you reproduce the 45s startup time? |
It's faster for me:
|
I think I figured it out. When it analyzes the directory, it follows the packages symlink. If you add followLinks:false to Directory.listSync, it finishes in five seconds or so. |
Turns out followLinks defaults to true. |
runner/loader.dart:73
|
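To illustrate the effect described above, here is a minimal sketch rather than the test runner's actual code. It assumes a current Dart SDK and a POSIX-like system where creating symlinks needs no special privileges. The helper name countEntries and the temporary directory layout are hypothetical, made up for this demonstration. It lists the same directory twice through Directory.listSync, once with followLinks left at its default of true and once with followLinks: false.

```dart
import 'dart:io';

/// Hypothetical helper: counts everything a recursive listing of [root]
/// returns, with or without traversing symlinked directories.
int countEntries(Directory root, {required bool followLinks}) =>
    root.listSync(recursive: true, followLinks: followLinks).length;

void main() {
  // Throwaway layout: test/ holds one test file plus a `packages` symlink
  // that points at a directory containing many files.
  final tmp = Directory.systemTemp.createTempSync('follow_links_demo');
  try {
    final cache = Directory('${tmp.path}/cache')..createSync();
    for (var i = 0; i < 100; i++) {
      File('${cache.path}/lib_$i.dart').createSync();
    }
    final testDir = Directory('${tmp.path}/test')..createSync();
    File('${testDir.path}/foo_test.dart').createSync();
    Link('${testDir.path}/packages').createSync(cache.path);

    // Default behaviour: the symlink is traversed, so the linked files
    // are part of the listing.
    print('followLinks: true  -> '
        '${countEntries(testDir, followLinks: true)} entries');
    // The symlink is reported as a Link and not descended into.
    print('followLinks: false -> '
        '${countEntries(testDir, followLinks: false)} entries');
  } finally {
    tmp.deleteSync(recursive: true);
  }
}
```

The first count should include the files behind the packages link, and the second should not. With nine symlinked packages directories, as in the sky/unit/test tree discussed later in the thread, that difference is multiplied accordingly.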
That's tricky. In general, we do want to follow symlinks—it's specifically package symlinks that we don't want to dive into. Given that we're going to remove package symlinks in the near future (dart-lang/sdk#24112), I don't think this is worth adding a complex workaround for in test. |
So you're saying this issue will just go away soon w/o any code changes? |
Where "soon" is a measurement of a few months, yeah. Once Dart 1.13 is released you'll be able to pass |
Oh, so can we add followLinks:false for now? Or do you think someone is relying on links? If so, as far as I can tell this slows down everyone for the benefit of the few. |
I don't mind making a pull request if you think it's an acceptable fix/workaround. |
It's very hard to tell if someone's relying on non-package symlinks. I don't want to take the risk of breaking people who are using something we have supported in the past and intend to support in the future. I think keeping that functional is worth some short-term slowness. A possible workaround that I would be okay with would be to write a directory lister that recurses manually, but that would be a fair amount of work. |
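A rough sketch of what such a manually recursing lister could look like follows. This is an illustration under stated assumptions, not the implementation that was used in the test package. The function name listSkippingPackages and the demo layout are hypothetical. The sketch assumes a current Dart SDK and a POSIX-like system. It lists one directory level at a time, follows symlinks in general, and refuses to descend only into symlinks named packages.

```dart
import 'dart:io';

/// Hypothetical helper: yields every file under [dir], following symlinks
/// in general but never descending into a symlink named `packages`.
Iterable<File> listSkippingPackages(Directory dir,
    [Set<String>? visited]) sync* {
  // Manual recursion has no built-in loop detection, so remember the
  // resolved path of each directory and walk it only once.
  final seen = visited ?? <String>{};
  if (!seen.add(dir.resolveSymbolicLinksSync())) return;

  // List a single level so we can decide per entry whether to recurse.
  for (final entity in dir.listSync(recursive: false, followLinks: true)) {
    final name = entity.path.split(Platform.pathSeparator).last;
    if (entity is Directory) {
      // Skip only symlinked `packages` directories; real ones are kept.
      if (name == 'packages' && FileSystemEntity.isLinkSync(entity.path)) {
        continue;
      }
      yield* listSkippingPackages(entity, seen);
    } else if (entity is File) {
      yield entity;
    }
  }
}

void main() {
  final tmp = Directory.systemTemp.createTempSync('manual_lister_demo');
  try {
    final cache = Directory('${tmp.path}/cache')..createSync();
    File('${cache.path}/dep.dart').createSync();
    final testDir = Directory('${tmp.path}/test')..createSync();
    final sub = Directory('${testDir.path}/sub')..createSync();
    File('${testDir.path}/a_test.dart').createSync();
    File('${sub.path}/b_test.dart').createSync();
    // Each folder gets its own `packages` symlink, as in the report.
    Link('${testDir.path}/packages').createSync(cache.path);
    Link('${sub.path}/packages').createSync(cache.path);

    for (final file in listSkippingPackages(testDir)) {
      print(file.path);
    }
  } finally {
    tmp.deleteSync(recursive: true);
  }
}
```

The cost of this approach is one extra isLinkSync call per directory named packages and one path resolution per visited directory, which should be small next to enumerating every file behind the links. Non-package symlinks keep working, which is the compatibility concern raised in this comment.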
@apwilson How many files (including those symlinked to) are in your |
I was running it on sky_engine's unit tests: https://github.com/domokit/sky_engine/tree/master/sky/unit/test 'find test | wc -l' says around 71 files |
@nex3 I also noticed just now that each of the sub-folders in sky/unit/test has its own symlinked packages directory, so it's not just that I'm running through packages once, I'm running through it nine times. |
@nex3 'find -L test | wc -l' shows that it ends up being 773927 files. |
On my machine, running the test runner over 100,000 files only takes about 8 seconds to finish listing—which isn't that much slower than running |
@nex3 'time find -L test | wc -l' takes about 3.3 seconds real time. Of course, that's not listing them, as I only output one line as a result. Printing each line instead takes 221 seconds. This is all from my SSD, though I believe the symlinks are pointing to files on my HDD where my pub cache is. So I guess it's a matter of what you do with each file as well as access time. If it only takes 30us to process each of the 700k files, you're up to the 21 seconds I see on my SSD. |
@nex3 also if it's 8 seconds for 100k files, is it 56 seconds for 700k files? If so then you're worse off than I am with my 45 seconds for 700k files :) |
Internally this is still printing every line, though, since they have to get from
This includes time to print to the terminal, which makes it not a great metric.
This might be part of the problem. Cross-volume linking can cause strange performance. If it's fast for
We do essentially nothing with the vast majority of them—we notice that they're in
Sorry, that was a typo. I was actually looking at 1M files. |
Cross volume performance is actually faster for me than having everything
|
I think you are incurring the I/O costs regardless, as the enumeration of the directory gathers each and every file and folder in the packages directory before it even offers you their paths to filter out the package ones. In order to get all those paths, I think it would have to read them all, incurring that I/O cost even if you later discard them. Plus you're getting an entire FileSystemEntity, not just a path to the file, so there had to be some I/O per file just to fill in the FileSystemEntity with the appropriate metadata about the file. |
@nex3 Do you think it's feasible to make following links configurable? |
The |
We're definitely incurring a cost for every file, no question. But my point was that the cost is inherent to
A
I really don't want to make an API-visible change that won't make sense in two releases. The only options I like here are manually recursing or waiting until symlinks go away. |
sky_engine's run_tests tool will run all the tests in its unit/test directory (see https://github.com/domokit/sky_engine/tree/master/sky/unit/test).
Running sky's tests can take around 45 seconds on a non-SSD before any tests start executing.
On an SSD the time before tests are run is about 22 seconds.
I traced the problem to executable.dart but didn't proceed further.