Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running a directory of tests is slow to start up. #337

Closed
apwilson opened this issue Sep 11, 2015 · 32 comments
Closed

Running a directory of tests is slow to start up. #337

apwilson opened this issue Sep 11, 2015 · 32 comments
Labels
closed-not-planned Closed as we don't intend to take action on the reported issue

Comments

@apwilson
Copy link

sky_engine's run_tests tool will run all the tests in it's unit/test directory (See https://github.com/domokit/sky_engine/tree/master/sky/unit/test).

Running sky's tests can around 45 seconds on a non-SSD before any tests start executing.

On an SSD the time before tests are run is about 22 seconds.

I traced the problem to executable.dart but didn't proceed further.

@nex3 nex3 added type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) needs-info Additional information needed from the issue author labels Sep 11, 2015
@nex3
Copy link
Member

nex3 commented Sep 11, 2015

45 seconds is a lot longer than I've ever seen, even on large codebases. Pub has 460 separate test files, and it only takes about 10s for tests to start executing there.

What do you see in the time before tests start executing? At what point does it first print loading path/to/test.dart...?

Since it looks like this is something specific to your setup, you may need to do some debugging yourself. I know sky has some special sauce for loading tests that works around the lack of isolates, so maybe @abarth can help out?

@abarth
Copy link
Contributor

abarth commented Sep 11, 2015

Most of the time is taken without printing anything. /me is happy to help out.

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

That suggests that this might be an issue with the VM starting up pub and/or the test runner very slowly for some reason. Essentially everything the test runner does—including invoking the loader plugin—is done after the first print.

@abarth
Copy link
Contributor

abarth commented Sep 11, 2015

Maybe the issue is that we're invoking via pub run sky_tools:sky_test and that's not set up properly....

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

Is that coming from a path dependency? That could be a big part of the issue. When you're using an executable from an immutable dependency (hosted or git), pub precompiles it to a snapshot which saves a lot of overhead. But if it's a path dependency where its contents may change, that's not safe, so pub starts it from source which is substantially slower.

@abarth
Copy link
Contributor

abarth commented Sep 11, 2015

It should be the hosted version. The sky_tools package is in a separate git repo and we depend on it via pub.dartlang.org.

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

You can check whether the precompiled executable exists by looking for .pub/bin/sky_tools/sky_test.dart.snapshot.

@abarth
Copy link
Contributor

abarth commented Sep 11, 2015

abarth@abarth3:/src/domokit/sky_engine/src/sky/unit$ ls -larth .pub/bin/sky_tools/sky_test.dart.snapshot
-rw-r----- 1 abarth eng 2.1M Sep  9 17:26 .pub/bin/sky_tools/sky_test.dart.snapshot

Maybe @apwilson 's setup is different.

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

Can you reproduce the 45s startup time?

@abarth
Copy link
Contributor

abarth commented Sep 11, 2015

It's faster for me:

abarth@abarth3:/src/domokit/sky_engine/src$ time ./sky/tools/run_tests 
00:02 +88: All tests passed!                                                                       

real    0m5.105s
user    0m45.321s
sys 0m8.270s

@apwilson
Copy link
Author

I think I figured it out. When it analyzes the directory it follows the packages symlink

If you add followLinks:false to Directory.listSync it finishes in five seconds or so.

@apwilson
Copy link
Author

turns out followLinks defaults to true.

@apwilson
Copy link
Author

runner/loader.dart:73

return mergeStreams(new Directory(dir).listSync(recursive: true, followLinks:false)

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

That's tricky. In general, we do want to follow symlinks—it's specifically package symlinks that we don't want to dive into.

Given that we're going to remove package symlinks in the near future (dart-lang/sdk#24112), I don't think this is worth adding a complex workaround for in test.

@nex3 nex3 closed this as completed Sep 11, 2015
@apwilson
Copy link
Author

So you're saying this issue will just go away soon w/o any code changes?

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

Where "soon" is a measurement of a few months, yeah. Once Dart 1.13 is released you'll be able to pass --no-package-symlinks to pub get and the test package will still work; in 1.14, the symlinks will no longer be generated by default.

@apwilson
Copy link
Author

Oh, so can we add followLinks:false for now? Or do you think someone is relying on links? If so, as far as I can tell this slows down everyone for the benefit of the few.

@apwilson
Copy link
Author

I don't mind making a pull request if you thinks its an acceptable fix/workaround

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

It's very hard to tell if someone's relying on non-package symlinks. I don't want to take the risk of breaking people who are using something we have supported in the past and intend to support in the future. I think keeping that functional is worth some short-term slowness.

A possible workaround that I would be okay with would be to write a directory lister that recurses manually, but that would be a fair amount of work.

@nex3 nex3 added closed-not-planned Closed as we don't intend to take action on the reported issue and removed type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) needs-info Additional information needed from the issue author labels Sep 11, 2015
@nex3
Copy link
Member

nex3 commented Sep 11, 2015

@apwilson How many files (included those symlinked to) are in your test/ directory? It seems strange that it would take dart:io 45 seconds to iterate over a directory.

@apwilson
Copy link
Author

I was running it on sky_engine's unit tests:

https://github.com/domokit/sky_engine/tree/master/sky/unit/test

'find test | wc -l' says around 71 files

@apwilson
Copy link
Author

@nex3 I also noticed just now that each of the sub-folders in sky/unit/test have their own symlinked package directories so it's not just that I'm running through packages once, I'm running through it nine times.

@apwilson
Copy link
Author

@nex3 'find -L test | wc -l' shows that ends up being 773927 files in the end.

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

On my machine, running the test runner over 100,000 files only takes about 8 seconds to finish listing—which isn't that much slower than running find explicitly. Maybe there's something else going on? Or is your system resource-constrained? How long does it take you to run find?

@apwilson
Copy link
Author

@nex3 ' time find -L test | wc -l' takes about 3.3 seconds real time - of course that's not listing them as I only output one line as a result.

Printing each line instead takes 221 seconds.

This is all from my SSD - though I believe the symlinks are pointing to files on my HDD where my pub cache is.

So I guess its a matter of what you do with each file as well as access time. If it only take 30us to process each of the 700k files you're up to the 21 seconds I see on my SSD.

@apwilson
Copy link
Author

@nex3 also if it's 8 seconds for 100k files, is it 56 seconds for 700k files? If so then you're worse off than I am with my 45 seconds for 700k files :)

@nex3
Copy link
Member

nex3 commented Sep 11, 2015

' time find -L test | wc -l' takes about 3.3 seconds real time - of course that's not listing them as I only output one line as a result.

Internally this is still printing every line, though, since they have to get from find to wc.

Printing each line instead takes 221 seconds.

This includes time to print to the terminal, which makes it not a great metric.

This is all from my SSD - though I believe the symlinks are pointing to files on my HDD where my pub cache is.

This might be part of the problem. Cross-volume linking can cause strange performance. If it's fast for find it should be fast for Dart, though.

So I guess its a matter of what you do with each file as well as access time. If it only take 30us to process each of the 700k files you're up to the 21 seconds I see on my SSD.

We do essentially nothing with the vast majority of them—we notice that they're in /packages/ directories and discard them immediately. Presumably any in-process string checks are dwarfed by IO costs.

also if it's 8 seconds for 100k files, is it 56 seconds for 700k files? If so then you're worse off than I am with my 45 seconds for 700k files :)

Sorry, that was a typo. I was actually looking at 1M files.

@apwilson
Copy link
Author

Cross volume performance is actually faster for me than having everything
on the hdd - the 45 second metric was for a single hdd volume. 21 seconds
is the mixed ssd and hdd.
On Sep 11, 2015 4:36 PM, "Natalie Weizenbaum" [email protected]
wrote:

' time find -L test | wc -l' takes about 3.3 seconds real time - of course
that's not listing them as I only output one line as a result.

Internally this is still printing every line, though, since they have to
get from find to wc.

Printing each line instead takes 221 seconds.

This includes time to print to the terminal, which makes it not a great
metric.

This is all from my SSD - though I believe the symlinks are pointing to
files on my HDD where my pub cache is.

This might be part of the problem. Cross-volume linking can cause strange
performance. If it's fast for find it should be fast for Dart, though.

So I guess its a matter of what you do with each file as well as access
time. If it only take 30us to process each of the 700k files you're up to
the 21 seconds I see on my SSD.

We do essentially nothing with the vast majority of them—we notice that
they're in /packages/ directories and discard them immediately.
Presumably any in-process string checks are dwarfed by IO costs.

also if it's 8 seconds for 100k files, is it 56 seconds for 700k files? If
so then you're worse off than I am with my 45 seconds for 700k files :)

Sorry, that was a typo. I was actually looking at 1M files.


Reply to this email directly or view it on GitHub
#337 (comment).

@apwilson
Copy link
Author

@nex3

We do essentially nothing with the vast majority of them—we notice that they're in /packages/ directories and discard them immediately. Presumably any in-process string checks are dwarfed by IO costs.

I think you are incurring the I/O costs regardless as the enumeration of the directory gathers each and every file and folder in the packages directory before it even offers you their paths to filter out the package ones.

In order to get all those paths I think it would have to read them all, incurring that IO cost even if you later discard them.

Plus you're getting an entire FileSystemEntity not just a path to the file so there had to be some I/O per file just to fill in the FileSystemEntity with the appropriate metadata about the file.

@apwilson
Copy link
Author

@nex3 Do you think it's feasible to make following links configurable?

@zoechi
Copy link

zoechi commented Sep 14, 2015

The packages symlinks are about to go away anyway. dart-lang/sdk#24112

@nex3
Copy link
Member

nex3 commented Sep 14, 2015

I think you are incurring the I/O costs regardless as the enumeration of the directory gathers each and every file and folder in the packages directory before it even offers you their paths to filter out the package ones.

In order to get all those paths I think it would have to read them all, incurring that IO cost even if you later discard them.

We're definitely incurring a cost for every file, no question. But my point was that the cost is inherent to Directory.listSync, not to the code that's using the results.

Plus you're getting an entire FileSystemEntity not just a path to the file so there had to be some I/O per file just to fill in the FileSystemEntity with the appropriate metadata about the file.

A FileSystemEntity has no metadata other than the file path and its own type; everything else is read dynamically from the filesystem.

Do you think it's feasible to make following links configurable?

I really don't want to make an API-visible change that won't make sense in two releases. The only options I like here are manually recursing or waiting until symlinks go away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-not-planned Closed as we don't intend to take action on the reported issue
Projects
None yet
Development

No branches or pull requests

4 participants