Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable classloader caching for macros and plugins in 2.13 ? #548

Open
smarter opened this issue Aug 27, 2018 · 5 comments
Open

Enable classloader caching for macros and plugins in 2.13 ? #548

smarter opened this issue Aug 27, 2018 · 5 comments
Milestone

Comments

@smarter
Copy link
Member

smarter commented Aug 27, 2018

In scala/scala#6412 the caching that was added in 2.12.5 was turned off by default:

This PR switches the default in order to be conservative in a 2.12 minor release

Can we reenable this for 2.13 now? The benefits look pretty good, from scala/scala#6314:

In short, this change seems to yield compile-time performance improvements ranging from 16% to 35% for projects using compiler plugins.

/cc @adriaanm @retronym @jvican

@smarter smarter added this to the 2.13.0-RC1 milestone Aug 27, 2018
@jvican
Copy link
Member

jvican commented Aug 27, 2018

The blocker for this change is cache invalidation. In retronym/scala#27, I propose a generic way of getting rid of this (arguably complex) problem in the compiler. However, I'm skeptical we'll be able to enable this feature by default without the help/intervention of build tools.

The general idea is to have a --hash-classpath compiler argument that the users of the compiler can pass to the compiler. This hash classpath would contain a hash per compilation entry telling the compiler whether an entry has changed or not. Based on this information, the compiler could invalidate the caches for those entries in an opaque way.

This is a powerful and generic way of solving this problem. In all the cases I know, the clients of the compiler already have the information about what things changed or not in the classpath (or in the build DAG), and this information is the source of truth.

With this approach, we can support classloader caching "by default" in all Scala codebases with some tweaks to Zinc (while Bazel can instead pass the hashes themselves to the compiler). Zinc already reasons about changes to classpath because it needs to invalidate changes in external projects, so it can trivially pass this information to Scala's 2.13 compiler bridge.

In the linked ticket, I also propose an akin mechanism to invalidate symbols that changed in the symbol table, but that's partially unrelated to this topic as it has more to do with the invalidation mechanism of resident compilation.

@jvican
Copy link
Member

jvican commented Aug 27, 2018

I feel I haven't done a good job at fully selling all the benefits of this approach. To me, the most important one is the removal of duplication.

If we have a big classpath for a build of a 100 modules and we implement the invalidation logic inside the compiler, the compiler will need to read the timestamps of every classpath entry in every compilation of a project. That is a lot of work. By letting the build tool tell the compiler which things changed and which didn't, we solve this problem as the state that keeps track of the invalidation comes from outside the compiler and is global to the build tool knowledge.

If I have such hook, bloop and other build tools can implement an efficient way of reading timestamps or computing checksums of compiler inputs in a lazy way (context: scalacenter/bloop#32).

Here's a better way a build tool could implement this than the status quo:

  1. Have a global oracle in the build tool running on an IO thread pool that watches changes in compiler inputs (like classpath entries).
  2. When the global oracle detects a change, it lazily reads its timestamp or computes it checksum and exposes it to the rest of the build tool that needs this information.

When does the build tools needs this information? It turns out that it does in several cases:

  1. Detect a change in the file system. The file watcher can get spurious OS notifications when no content has been changed. To get reliable notifications, file watchers like https://github.com/gmethvin/directory-watcher/ hash the contents of the watches files (I have no idea why timestamps are not enough here).
  2. Detect changes in sources and classpath in Zinc. Zinc currently computes checksums for both classpath entries and sources.
  3. Detect changes in classpath inside the compiler.

All these use cases happen on a per-project basis, so to reiterate having a global oracle that provides this information to the use sites directly is a big win because most of the classpath entries are shared in the projects of a build.

@eatkins
Copy link

eatkins commented Aug 27, 2018

@jvican and I talked about this in https://gitter.im/sbt/zinc-contrib. I have an imminent PR to sbt and io that does something quite similar to what is proposed in the above comment.

@SethTisue
Copy link
Member

@retronym are you working on this for 2.13.0-RC1?

@retronym
Copy link
Member

retronym commented Feb 15, 2019

Not for RC1, unfortunately. I do some more foundational parts merged or queued which make it safer/easier to use, namely:

  • cached classloaders are discarded after a short timeout, which means you can choose the new "always" cache policy to ignore invalidation, sidestepping the problem of invalidating directory which are harder than JARs to timestamp).
    • caveat: $BUILDTOOL needs to start calling Global.close to make this work
  • exposing the -Ymacroclasspath setting to let you curate a smaller classpath for macros, which will increase the hit ratio for the cache.

@SethTisue SethTisue modified the milestones: 2.13.0-RC1, 2.13.1 Feb 15, 2019
@szeiger szeiger modified the milestones: 2.13.1, 2.13.2 Sep 9, 2019
@SethTisue SethTisue modified the milestones: 2.13.2, 2.13.3 Feb 6, 2020
@SethTisue SethTisue modified the milestones: 2.13.3, 2.13.4 May 15, 2020
@dwijnand dwijnand modified the milestones: 2.13.4, Backlog Oct 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants