Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pub: Add option to disable precompiling apps in dependencies #21765

Closed
mkustermann opened this issue Dec 2, 2014 · 19 comments
Closed

Pub: Add option to disable precompiling apps in dependencies #21765

mkustermann opened this issue Dec 2, 2014 · 19 comments
Labels
type-enhancement A request for a change that isn't a bug

Comments

@mkustermann
Copy link
Member

There are use cases where one would like to have "pub get --offline" to be as fast as possible (assuming the pub cache has been already populated).

Currently "pub get" seems to be pre-compiling "pub run"-able scripts in dependent packages. This will increase the optimistic runtime of "pub get --offline" significantly.

There are several options:

a) Pre-compile binaries into the pub-cache:
  * Having N applications using package:intl at the same version will currently pre-compile intl:extract_to_arb & intl:generate_from_arb N times.
    => This is clearly wasteful work!
    => One could pre-compile bin/*.dart apps when their packages get added to the PUB_CACHE.

This might be not an option, because the N applications might use the same version of package:intl, but different transitive dependencies of package:intl.
On the other hand, a package specifies it's dependencies and as long as these dependencies are met, the command line apps in bin/*.dart should behave the same way.

[I assume the pub team is not in favor of option a)?]

b) Adding an "--no-precompile" option to "pub get".

@munificent
Copy link
Member

Currently "pub get" seems to be pre-compiling "pub run"-able scripts in dependent packages. This will increase the optimistic runtime of "pub get --offline" significantly.

You say "will" here. Is this speculative, or do you have real-world data where the perf is unsatisfactory? Can you give us some details of which packages are slow to pub get because of this?

This might be not an option, because the N applications might use the same version of package:intl, but different transitive dependencies of package:intl.

Correct, that's why we don't do a). We could possibly cache it if the set of transitive dependencies are all the same version, but calculating that is tricky and it's not clear how often you'd get to take advantage of sharing after doing that.

b) Adding an "--no-precompile" option to "pub get".

That's certainly an easy thing to do. We try to minimize the number of command-line options because they easily overwhelm users and it's very hard to remove them once added. (For a concrete example, see the mess that is git's command-line UI). But, if we get consistent feedback from users that this is a problem, I'm not opposed to adding this.


Removed Type-Defect, Priority-Unassigned labels.
Added Type-Enhancement, Priority-Medium labels.

@mkustermann
Copy link
Member Author

The use case we have right now is running server applications inside a docker container. Every time the server application changes, a file system watcher picks it up and rebuilds a docker image which is then run.

Building docker images works in a layered fashion and layers can be cached in certain situations. In our situation we can cache the external package layer. So every time the server code changes a new docker image is built. Which involves invoking "pub get --offline".

Obviously the developer cycle is very dependent on how long it takes to execute commands to build the image. Which includes pub.

The pre-compiled entry points are useless for this use case, but we still pay the price/time for them.

Here's an example

$ cat pubspec.yaml
name: foo
version: 0.1.0
author: bar
dependencies:
  archive: any
  googleapis_auth: any
  intl: any

$ time pub get --offline
Resolving dependencies...
Got dependencies!
Precompiling dependencies...
Loading source assets...
Precompiled googleapis_auth.
Precompiling executables...
Loading source assets...
Precompiled archive:tar.
Precompiled intl:extract_to_arb.
Precompiled intl:generate_from_arb.
pub get --offline 2.04s user 0.25s system 163% cpu 1.399 total

$ time pub get --offline
Resolving dependencies...
Got dependencies!
pub get --offline 0.40s user 0.06s system 100% cpu 0.461 total

a) Notice that the "pub get" time is nearly 3 times as long as without pre compilation!

b) Notice that it "Precompiled googleapis_auth" - which is completely non-sensical as far as I can tell.
=> Filed Issue #21777 for this. Maybe I'm overlooking something here?

There might be ways to work around this for us and we are looking into alternatives for making the development cycle faster, but "pub get --offline" contributes sometimes significantly to it ATM.

As a side note:
In my opinion it should be possible to use our package manager to manage package dependencies without any fancy extras one might not care about. So maybe the precompilation should've been an opt-in mechanism (e.g. pub get --precompile / pub precompile) or it could be done in a lazy manner (i.e. only precompile on first invocation), which is not such a bad idea, given that every "pub run" could trigger a re-compilation anyway (e.g. if one updated the Dart SDK).

@DartBot
Copy link

DartBot commented Dec 3, 2014

This comment was originally written by @zoechi


I'm currently working on finding a way around this issue (same scenario Docker for Managed VMs)

As far as I understand this, the pub get --offline command is run to fix the symlinks which should go away anyway (see http://dartbug.com/21749).

I don't think getting rid of symlinks would make this issue redundant in general but maybe for the mentioned use case. (see http://darbug.com/21777)

Sidenote:
My current workaround is to have a startup script that generates a file containing a list of the symlinks in the packages directory in a form that when "sourcing" (bash) it, it creates the same symlinks in the packages directory inside the Docker image. I ADD it to the image and execute it. So far this seems only necessary as a workaround for 21749)

@munificent
Copy link
Member

So every time the server code changes a new docker image is built. Which involves invoking "pub get --offline".

Ah, interesting. This context helps. In general, we didn't design pub get (offline or not) to be part of a developer's regular iteration loop. Can you give me some more details on why it has to be run every time? Is this because of symlinks?

it could be done in a lazy manner (i.e. only precompile on first invocation), which is not such a bad idea, given that every "pub run" could trigger a re-compilation anyway (e.g. if one updated the Dart SDK).

I like this idea. What do you think, Natalie?

@nex3
Copy link
Member

nex3 commented Dec 3, 2014

it could be done in a lazy manner (i.e. only precompile on first invocation), which is not such a bad idea, given that every "pub run" could trigger a re-compilation anyway (e.g. if one updated the Dart SDK).

I like this idea. What do you think, Natalie?

I'm not a big fan. We know on "pub get" that the executables won't change until the next "pub get" (or SDK upgrade), and we often already have a barback context spun up to precompile source snapshots.

If "pub get" is just being used to regenerate symlinks, would it be possible to use "pub run" instead of running the dart executable directly? It works without symlinks.

@DartBot
Copy link

DartBot commented Dec 3, 2014

This comment was originally written by @zoechi


I think pub run wouldn't make sense in this case because it is for development and I assume this would make it hard to use the debugger.

Just running the app worked without the symlinks (as far as I remember) but when I connected the debugger the app crashed because it couldn't access the source files in the packages directories (see also http://dartbug.com/21749)

I still think that mounting the source directory into the Docker image would be a way better solution - currently not supported by gcloud preview app ...though.

@mkustermann
Copy link
Member Author

Can you give me some more details on why it has to be run every time? Is this because of symlinks?

Yes, one reason is symlinks. You can look at the simple Dockerfile our users can use (see here: https://github.com/dart-lang/dart_docker/blob/master/runtime/Dockerfile):

  FROM google/dart-runtime-base

  WORKDIR /app

  ONBUILD ADD pubspec.* /app/
  ONBUILD RUN pub get
  ONBUILD ADD . /app/
  ONBUILD RUN pub get --offline

Every line in this file creates a new docker image layer. It can be cached if none of the files added before it changed:
Step 1) - "ADD pubspec.* /app" - will add the pubspec.{lock,yaml} files (the result will be cached as long as these two files don't change)
Step 2) - "RUN pub get" - will fetch all external dependencies and put them into the pub cache. (Very Important => Result will be cached as long as pubspec.* doesn't change)
Step 3) - "ADD . /app" - will add all application files (this will change every time the developer changes code, but is very fast)
Step 4) - "RUN pub get --offline" - will create symlinks for all entry points inside e.g. /app/bin

If "pub get" is just being used to regenerate symlinks, would it be possible to use "pub run" instead of running the dart executable directly?

Theoretically, yes. Practically, as Guenter says, we need to be able to control Dart VM flags, debugger port, observatory, checked mode, ... I don't know whether 'pub run' runs apps in a subprocess / in a different isolate / ..., but I guess it doesn't support this kind of control?

@nex3
Copy link
Member

nex3 commented Dec 4, 2014

"RUN pub get --offline" - will create symlinks for all entry points inside e.g. /app/bin

I'm a little confused here. Why is this "pub get" recompiling the snapshots? It shouldn't do so if the pubspec and lockfile are unchanged from the previous run.

Practically, as Guenter says, we need to be able to control Dart VM flags, debugger port, observatory, checked mode, ... I don't know whether 'pub run' runs apps in a subprocess / in a different isolate / ..., but I guess it doesn't support this kind of control?

Currently "pub run" runs a subprocess, but we want to switch to an isolate at some point for various reasons. If passing flags to it is important, though, we could make that happen; feel free to file an issue. I've also filed issue #21791 to track adding the ability to spawn an isolate with different flags.

@mkustermann
Copy link
Member Author

"RUN pub get --offline" - will create symlinks for all entry points inside e.g. /app/bin

I'm a little confused here. Why is this "pub get" recompiling the snapshots?
It shouldn't do so if the pubspec and lockfile are unchanged from the previous run.

I haven't implemented pub/barback, so I don't know how it works and cannot fully answer this question. But I can tell you why it might recompile it: Step 3) in the above base image Dockerfile adds the whole application folder with everything inside it (i.e. including pubspec.). If the "pub get" pre-compilation logic is based on timestamps, then the overriding of the pubspec. files with the same content might cause a newer timestamp and trigger a recompilation.

Now you might argue, it's our problem, because we don't keep the timestamps the same, but there is no good alternative. The workaround is to make several commands for all folders/files one wants to add in Step 3) (except pubspec.*) but this information is not known to us. Our base docker image is therefore not able to use "ONBUILD " instructions.

=> In short: This puts the burden of doing this to our users. And not just that, it is also very subtle and error prone and can cause a lot of wasted time (I speak from experience).

... I've also filed issue #21791 to track adding the ability to spawn an isolate with different flags. ...

That might help at some point, but:

@mkustermann
Copy link
Member Author

"RUN pub get --offline" - will create symlinks for all entry points inside e.g. /app/bin

I'm a little confused here. Why is this "pub get" recompiling the snapshots?
It shouldn't do so if the pubspec and lockfile are unchanged from the previous run.

... If the "pub get" pre-compilation logic is based on timestamps, ...

Actually, I have played with this a bit more, and it doesn't look like this is the case.

$ docker build .
Sending build context to Docker daemon 2.206 MB
Sending build context to Docker daemon
Step 0 : FROM google/dart-runtime

Executing 5 build triggers

Trigger 0, ADD pubspec.yaml /app/
Step 0 : ADD pubspec.yaml /app/
 ---> Using cache
Trigger 1, ADD pubspec.lock /app/
Step 0 : ADD pubspec.lock /app/
 ---> Using cache
Trigger 2, RUN pub get
Step 0 : RUN pub get
 ---> Using cache
Trigger 3, ADD . /app/
Step 0 : ADD . /app/
Trigger 4, RUN pub get --offline
Step 0 : RUN pub get --offline
 ---> Running in 1862a3ba6500
Resolving dependencies...
Got dependencies!
Precompiling executables...
Loading source assets...
Precompiled intl:extract_to_arb.
Precompiled intl:generate_from_arb.
 ---> 76d663ecf5a4
Removing intermediate container 070edd469493
Removing intermediate container 1862a3ba6500
Successfully built 76d663ecf5a4

The reason seems to be that the "ADD . /app" will also add the ".pub" to the docker image, which overrides the old previously cached files.
To prevent that, one needs to put a ".pub" or "app/.pub" into a ".dockerignore" file. One more step everybody needs to do, to get a faster developer cycle.

@guenther:
Could you try putting ".pub" and/or "app/.pub" into a ".dockerignore" file and see whether your precompile steps go away?

@DartBot
Copy link

DartBot commented Dec 5, 2014

This comment was originally written by @zoechi


@martin
I reverted to the current google/dart-runtime image and now (on startup) the first pub get takes about 30 sec and the second 5 sec. When I then modify the source while gcloud/Docker is running the second pub get takes about 5 sec (and about 5 sec delay until pub get starts running) ... (10 - 15 sec for each source modification)

... no matter whether I have .pub in .dockerignore or not.

I still get this error:

Error on line 20, column 3 of ../root/.pub-cache/hosted/pub.dartlang.org/bwu_polymer_routing-0.1.0/pubspec.yaml: Error loading transformer: Invalid arguments(s): sdkDirectory must be provided.

  • di
      ^^^

I had to simplify my project to make it work with the provided image (remove path dependency, ...)
In my custom Dockerfile I can use
RUN ln -s /usr/lib/dart /usr/lib/dart/bin/dart-sdk to prevent above error message but that isn't possible with the provided image because the added RUN ... is only executed after the ONBUILD ... statements.

I'll investigate further with a custom image while using the provided image as starting point ...

Not directly related:
I didn't get the mentioned port already in use error but instead most of the time the image/container recreation only works only once when I modify a source file, then I have to restart gcloud to make it work again.

@nex3
Copy link
Member

nex3 commented Dec 9, 2014

The reason seems to be that the "ADD . /app" will also add the ".pub" to the docker image, which overrides the old previously cached files.
To prevent that, one needs to put a ".pub" or "app/.pub" into a ".dockerignore" file. One more step everybody needs to do, to get a faster developer cycle.

This is all just a stop-gap anyway, right, until issue #15103 is fixed? I don't think this is an onerous step.

Alternately, you could provide a simple executable that just symlinks packages directories throughout the application and does nothing else. This may be cleaner than relying on a side effect of "pub get --offline".

@mkustermann
Copy link
Member Author

This is all just a stop-gap anyway, right, until issue #15103 is fixed? I don't think this is an onerous step.

Yes, it's unfortunately always about working around issues.

Alternately, you could provide a simple executable that just symlinks packages directories

This is not an option IMHO - seems like a too big hack for me. It's "pub get" s job to fetch dependencies and configure a package-root which the VM can understand.

This may be cleaner than relying on a side effect of "pub get --offline".

It's not a side-effect, but half of "pub get" s job ATM (when Issue #15103 is fixed, the job of "pub get" will be to create a packages.json file [or something like that] I assume)

@nex3
Copy link
Member

nex3 commented Dec 9, 2014

This is not an option IMHO - seems like a too big hack for me. It's "pub get" s job to fetch dependencies and configure a package-root which the VM can understand.

And it does do those things, but it's not "pub get"'s job to be fast enough for the critical development loop. I feel like using pub in a situation where you actively don't want it to do dependency resolution is as much of a hack as using a custom executable. You're entitled to your own sense of aesthetics, of course, but I don't think that's compelling enough to warrant a change in pub when better-scoped alternatives exist.

@DartBot
Copy link

DartBot commented Dec 9, 2014

This comment was originally written by @zoechi


I also have troubles seeing to create the symlinks with another tools as a good alternative. It's what I'm currently doing but it's an ugly hack.
To do this properly it would probably need to parse the pubspeck.lock and interpret all possible values like path, git, hosted dependencies, and probably other things too.
All things that are already done and tested and maintained in pub.

@nex3
Copy link
Member

nex3 commented Dec 9, 2014

Pub provides a handy command for this: "pub list-package-dirs" will print a JSON-formatted map of package names to locations on disk where every dependency resides. Unlike "pub get", it is intended to be used in a development loop, and is accordingly fast.

@DartBot
Copy link

DartBot commented Dec 10, 2014

This comment was originally written by @zoechi


That is very helpful, thanks!

@DartBot
Copy link

DartBot commented Dec 11, 2014

This comment was originally written by @zoechi


Precompiles are now gone in my setup for restarts after source modifications (after the initial startup)
Sorry, I don't know if this is because of installed updates or changes I made in the used Dockerfiles.

@DartBot
Copy link

DartBot commented Jun 5, 2015

This issue has been moved to dart-lang/pub#1199.

@DartBot DartBot closed this as completed Jun 5, 2015
@kevmoo kevmoo added type-enhancement A request for a change that isn't a bug and removed triaged labels Mar 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-enhancement A request for a change that isn't a bug
Projects
None yet
Development

No branches or pull requests

5 participants