From 73d662af9c05e9729a40345a802bfcaeb4704fed Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 20:47:46 +0800 Subject: [PATCH 1/9] . --- docs/modules/ROOT/nav.adoc | 1 + .../ROOT/pages/comparisons/java-compile.adoc | 303 ++++++++++++++++++ .../modules/ROOT/pages/comparisons/maven.adoc | 33 +- .../ROOT/pages/comparisons/why-mill.adoc | 10 +- 4 files changed, 324 insertions(+), 23 deletions(-) create mode 100644 docs/modules/ROOT/pages/comparisons/java-compile.adoc diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc index dc3fbf12c11..8c0aab4f358 100644 --- a/docs/modules/ROOT/nav.adoc +++ b/docs/modules/ROOT/nav.adoc @@ -36,6 +36,7 @@ ** xref:comparisons/gradle.adoc[] ** xref:comparisons/sbt.adoc[] ** xref:comparisons/unique.adoc[] +** xref:comparisons/java-compile.adoc[] * The Mill CLI ** xref:cli/installation-ide.adoc[] ** xref:cli/flags.adoc[] diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc new file mode 100644 index 00000000000..c5a5a27dbe9 --- /dev/null +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -0,0 +1,303 @@ +# How Fast Should Java Compile? + +include::partial$gtag-config.adoc[] + +Java compiles have the reputation for being slow, but that reputation does +not match today's reality. The Java compiler can compile "typical" Java code at over +100,000 lines a second on a single core. That means that even a million line project +should take more than 10s to compile in a single-threaded fashion, and _should_ take +even less time when taking advantage of parallelism. 
+ +Doing some ad-hoc benchmarks on the time taken for common build tools to compile Java, +we find that although the compiler is blazing fast, all build tools add significant +overhead over compiling Java directly: + +|=== +| Compiling Netty Common (41,601 lines) | Time | Multiplier | Compiling Mockito Core (29,712 lines) | Time | Multiplier +| Javac Hot | 0.29s | 1.0x | Javac Hot | 0.36s | 1.0x +| Javac Cold | 1.62s | 5.6x | Javac Cold | 1.29s | 4.4x +| Maven | 4.89s | 16.9x | Gradle | 4.41s | 15.2x +| Mill | 1.11s | 3.8x | Mill | 1.20s | 4.1x +|=== + +Although Mill does the best in these benchmarks, all build tools fall short of how fast +compiling Java _should_ be. This post explores how these numbers were arrived at, and +what that means for the future of Java build tooling. + +## Mockito Core + +To begin to understand the problem, lets consider the codebase of the popular Mockito project: + +* https://github.com/mockito/mockito + +Mockito is a medium-sized Java project with a few dozen sub-modules and about ~100,000 lines +of code. To give us a simple reproducible scenario, let's consider the root mockito module +with sources in `src/main/java/`, on which all the downstream module and tests depend on. + +Mockito is built using Gradle. It's not totally trivial to extract the compilation classpath +from Gradle, but the following stackoverflow answer gives us some tips:; + +* https://stackoverflow.com/a/50639444/871202[How do I print out the Java classpath in gradle?] 
+ +This gives us the following classpath: + +``` +export MY_CLASSPATH=/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy/1.14.18/81e9b9a20944626e6757b5950676af901c2485/byte-buddy-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy-agent/1.14.18/417558ea01fe9f0e8a94af28b9469d281c4e3984/byte-buddy-agent-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.2/8ac9e16d933b6fb43bc7f576336b8f4d7eb5ba12/junit-4.13.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest-core/2.2/3f2bd07716a31c395e2837254f37f21f0f0ab24b/hamcrest-core-2.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.opentest4j/opentest4j/1.3.0/152ea56b3a72f655d4fd677fc0ef2596c3dd5e6e/opentest4j-1.3.0.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.objenesis/objenesis/3.3/1049c09f1de4331e8193e579448d0916d75b7631/objenesis-3.3.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.2/1820c0968dba3a11a1b30669bb1f01978a91dedc/hamcrest-2.2.jar +``` + +We can then pass this classpath into `javac -cp`, together with `src/main/java/**/*.java`, +to perform the compilation outside of Gradle using `javac` directly. Running this a few +times gives us the timings below: + +```bash +> time javac -cp $MY_CLASSPATH src/main/java/**/*.java +1.290s +1.250s +1.293s +``` + +To give us an idea of how many lines of code we are compiling, we can run: + +```bash +> find src/main/java | grep \\.java | xargs wc -l +... +41601 total +``` + +Combining this information, we find that 41601 lines of code compiled in ~1.29 seconds +(taking the median of the three runs above), which suggests that `javac` compiles ~32,000 +lines of code per second. + +These benchmarks were run ad-hoc on my laptop, an M1 10-core MacBook Pro, with OpenJDK +Corretto 17.0.6. The numbers would differ on different Java versions, hardware, operating systems, +and filesystems.
Nevertheless, the overall trend is strong enough that you should be +able to reproduce the results despite variations in the benchmarking environment. + +Compiling 32,000 lines of code per second is not bad. But it is nowhere near how fast the +Java compiler _can_ run. Any software engineer with JVM experience would know the next +obvious optimization for us to explore. + +## Hot JVMs + +One issue with the above benchmark is that it uses `javac` as a sub-process. The Java +compiler runs on the Java Virtual Machine, and like any JVM application, it has a slow +startup time, takes time to warm up, but then has good steady-state performance. +Running `javac` from the command line is thus the _worst possible_ way of using +the Java compiler, so ~32,000 lines/sec is the worst possible performance you could +get out of the Java compiler on this Java codebase. + +To get good performance out of `javac`, like any other JVM application, we need to keep it +long-lived so it has a chance to warm up. While running `javac` in a long-lived Java +program is not commonly taught, neither is it particularly difficult. Here is a complete +`Bench.java` file that does this, repeatedly running Java compilation in a loop where it +has a chance to warm up, using the same `MY_CLASSPATH` and source files we saw earlier.
+We print the output statistics to the terminal so we can see how fast Java compilation can
+occur once things have a chance to warm up:
+
+```java
+// Bench.java
+import javax.tools.*;
+import java.io.IOException;
+import java.io.OutputStreamWriter;
+import java.nio.file.*;
+import java.util.List;
+import java.util.stream.Collectors;
+
+public class Bench {
+    public static void main(String[] args) throws Exception {
+        // Loop forever so the JVM, and with it the compiler, has a chance to warm up
+        while (true) {
+            long now = System.currentTimeMillis();
+            String classpath = System.getenv("MY_CLASSPATH");
+            Path sourceFolder = Paths.get("src/main/java");
+
+            // Wrap every .java file under the source folder as a compiler input
+            List<JavaFileObject> files = Files.walk(sourceFolder)
+                .filter(p -> p.toString().endsWith(".java"))
+                .map(p ->
+                    new SimpleJavaFileObject(p.toUri(), JavaFileObject.Kind.SOURCE) {
+                        @Override
+                        public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
+                            return Files.readString(p);
+                        }
+                    }
+                )
+                .collect(Collectors.toList());
+
+            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
+
+            StandardJavaFileManager fileManager = compiler
+                .getStandardFileManager(null, null, null);
+
+            // Run the compiler
+            JavaCompiler.CompilationTask task = compiler.getTask(
+                new OutputStreamWriter(System.out),
+                fileManager,
+                null,
+                List.of("-classpath", classpath),
+                null,
+                files
+            );
+
+            System.out.println("Compile Result: " + task.call());
+            long end = System.currentTimeMillis();
+            // Count source lines after stopping the clock so it does not skew the timing
+            long lineCount = Files.walk(sourceFolder)
+                .filter(p -> p.toString().endsWith(".java"))
+                .map(p -> {
+                    try { return Files.readAllLines(p).size(); }
+                    catch (Exception e) { throw new RuntimeException(e); }
+                })
+                .reduce(0, (x, y) -> x + y);
+            System.out.println("Lines: " + lineCount);
+            System.out.println("Duration: " + (end - now));
+            // Duration is in milliseconds, so scale up by 1000 to get lines per second
+            System.out.println("Lines/second: " + lineCount * 1000 / (end - now));
+        }
+    }
+}
+```
+
+Running this using `java Bench.java` in the Mockito repo root, eventually we see it
+settle on approximately the following numbers:
+
+```java
+359ms
+378ms
+353ms
+```
+
+The codebase hasn't
changed, so we are still compiling `41601` lines of code. +But now it only takes ~359ms, which tells us that using a long-lived warm Java compiler +we can compile approximately *116,000* lines of Java a second on a single core. + +## Double-checking our results on Netty Common + +Compiling 116,000 lines of Java per second is very fast. That means we should expect +a million-line Java codebase to compile in about 9 seconds, _on a single thread_. That +may seem surprisingly fast, and you may be forgiven if you find it hard to believe. To +double-check our results, we can pick another codebase to run some ad-hoc benchmarks. +For this I will use the Netty codebase: + +- https://github.com/netty/netty + +Netty is a large-ish Java project: ~500,000 lines of code. Again, to pick a somewhat +easily-reproducible benchmark, we want a decently-sized module that's relatively +standalone within the project: `netty-common` is a perfect fit. Again, we can use `find | grep | xargs` +to see how many lines of code we are looking at: + +```bash +$ find common/src/main/java | grep \\.java | xargs wc -l +29712 total +``` + +As with Gradle, Maven doesn't make it easy to show the classpath needed to call `javac` ourselves, +but the following stackoverflow answer gives us a hint on how to do so: + +- https://stackoverflow.com/a/16655088/871202[In Maven, how output the classpath being used?]
+ +```bash +> ./mvnw clean; time ./mvnw -e -X -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true install +``` + +If you grep the output for `-classpath`, we see: + +```bash +-classpath /Users/lihaoyi/Github/netty/common/target/classes:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/svm/19.3.6/svm-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/sdk/graal-sdk/19.3.6/graal-sdk-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/objectfile/19.3.6/objectfile-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/pointsto/19.3.6/pointsto-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-nfi/19.3.6/truffle-nfi-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-api/19.3.6/truffle-api-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/compiler/compiler/19.3.6/compiler-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/jctools/jctools-core/4.0.5/jctools-core-4.0.5.jar:/Users/lihaoyi/.m2/repository/org/jetbrains/annotations-java5/23.0.0/annotations-java5-23.0.0.jar:/Users/lihaoyi/.m2/repository/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar:/Users/lihaoyi/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-1.2-api/2.17.2/log4j-1.2-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-api/2.17.2/log4j-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/io/projectreactor/tools/blockhound/1.0.6.RELEASE/blockhound-1.0.6.RELEASE.jar +``` + +Again, we can `export MY_CLASSPATH` and start using `javac` from the command line: + +```bash +> javac -cp $MY_CLASSPATH common/src/main/java/**/*.java +1.624s +1.757s +1.606s +``` + +Or programmatically using the `Bench.java` program we saw earlier: + +```bash +294ms +282ms +285ms +``` + +Again taking the median number for the hot-in-memory benchmark above (285ms), and +the lines of code in Netty Common (29712), this tells us that `netty-common` compiles +at 
*~104,000 lines/second* + +## What About Build Tools? + +Although the Java Compiler is blazing fast - compiling code at >100k lines/second and +completing both Mockito-Core and Netty-Common in ~300ms - the experience of using Java +build tools is nowhere near as snappy. Consider the benchmark of clean-compiling the +Mockito-Core codebase using Gradle or Mill: + +```bash +$ ./gradlew clean; time ./gradlew :classes --no-build-cache +4.14s +4.41s +4.41s + +$ ./mill clean; time ./mill compile +1.20s +1.12s +1.30s +``` + +Or the benchmark of clean-compiling the Netty-Common codebase using Maven or Mill: + +```bash +$ ./mvnw clean; time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true -Dmaven.test.skip=true install +4.85s +4.96s +4.89s + +$ ./mill clean common; time ./mill common.compile +1.10s +1.12s +1.11s +``` + +We explore the comparison between xref:comparisons/gradle.adoc[Gradle vs Mill] +or xref:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages. +For this article, the important thing is not comparing the build tools against each other, +but comparing the build tools against what how fast they _could_ be if they just used +the `javac` Java compiler directly: + +|=== +| Compiling Netty Common (41,601 lines) | Time | Multiplier | Compiling Mockito Core (29,712 lines) | Time | Multiplier +| Javac Hot | 0.29s | 1.0x | Javac Hot | 0.36s | 1.0x +| Javac Cold | 1.62s | 5.6x | Javac Cold | 1.29s | 4.4x +| Maven | 4.89s | 16.9x | Gradle | 4.41s | 15.2x +| Mill | 1.11s | 3.8x | Mill | 1.20s | 4.1x +|=== + +## Conclusion + +From this study we can see the paradox: the Java _compiler and runtime_ is blazing fast, +while Java _build tools_ are dreadfully slow. Something that _should_ compile in a fraction +of a second using a warm `javac` takes several seconds (15-16x longer) to +compile using Maven or Gradle. 
Mill does better, but even it adds 4x overhead and falls +short of the snappiness you would expect from a compiler that takes ~0.3s to compile these +30-40kLOC Java codebases. + +These benchmarks were run ad-hoc and on my laptop on arbitrary codebases, and the details +will obviously differ depending on environment and the code in question. Running it on an +entire codebase, rather than a single module, will give different results. Nevertheless, the +results are clear: "typical" Java code _should_ compile at ~100,000 lines/second. Anything +less is purely build-tool overhead from Maven, Gradle, or Mill. + +Build tools do a lot more than the Java compiler. They do dependency management, parallelism, +caching and invalidation, and all sorts of other auxiliary tasks. But in the common case where +someone edits code and then compiles it, any time spent doing other things rather than _actually +compiling Java_ is pure overhead. Checking for cache invalidation _shouldn't_ take 15-16x +as long as actually compiling your code. It obviously does _today_, but it _shouldn't_. + +The Mill build tool goes to great lengths to try and minimize overhead: keeping long-lived +worker processes to let the compiler warm up, using a fast-launching minimal client to connect +to it, repeated bouts of benchmarking and optimization. Nevertheless, we can see that Mill +has a long way to go to catch up to the raw performance of `javac`. Mill's Java client adds +100s of milliseconds of startup overhead, connecting to the server over a +https://github.com/kohlschutter/junixsocket[JUnixSocket] takes 100s of milliseconds more, +checking dependency resolution caches takes considerable time. These are the things that +makes Mill take >1s to compile a Java module that should take 0.3x to compile. + +Improving Mill is a work in progress, and until compiling a 30-40kLOC Java codebase from +the command line takes 0.3 seconds, it will continue to be a work in progress.
\ No newline at end of file diff --git a/docs/modules/ROOT/pages/comparisons/maven.adoc b/docs/modules/ROOT/pages/comparisons/maven.adoc index 51a28de54ec..53d60c3f7f6 100644 --- a/docs/modules/ROOT/pages/comparisons/maven.adoc +++ b/docs/modules/ROOT/pages/comparisons/maven.adoc @@ -73,9 +73,10 @@ Mill's performance compares to Maven: |=== | Benchmark | Maven | Mill | Speedup + | <> | 1m 38.80s | 23.14s | 4.3x | <> | 0m 48.92s | 0m 08.79s | 5.6x -| <> | 0m 08.46s | 0m 01.94s | 4.4x +| <> | 0m 04.89s | 0m 01.11s | 4.4x | <> | 0m 06.82s | 0m 00.54s | 12.6x | <> | 0m 05.25s | 0m 00.47s | 11.2x |=== @@ -180,30 +181,26 @@ when performing a clean build of the Netty repository. === Clean Compile Single-Module ```bash -$ ./mvnw clean; time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true install -0m 08.46s -0m 08.90s -0m 08.30s - -$ ./mill clean common; time ./mill common.test.compile -0m 01.99s -0m 01.81s -0m 01.94s +$ ./mvnw clean; time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true -Dmaven.test.skip=true install +4.85s +4.96s +4.89s + +$ ./mill clean common; time ./mill common.compile +1.10s +1.12s +1.11s ``` -This exercise limits the comparison to compiling a single module, in this case `common/`. -`./mvnw -pl common install` compiles both the `main/` and `test/` sources, whereas -`./mill common.compile` would only compile the `main/` sources, and we need to explicitly -reference `common.test.compile` to compile both (because `common.test.compile` depends on -`common.compile`, `common.compile` gets run automatically) +This exercise limits the comparison to compiling a single module, in this case `common/`, +ignoring test sources. Again, we can see a significant speedup of Mill v.s. Maven remains even when compiling a -single module: a clean compile of `common/` is about 9x faster with Mill than with Maven!
-Again, `common/` is about 40,000 lines of Java source code, so at 10,000-50,000 lines per +single module: a clean compile of `common/` is about 4x faster with Mill than with Maven! +Again, `common/` is about 30,000 lines of Java source code, so at 10,000-50,000 lines per second we would expect it to compile in about 1-4s. That puts Mill's compile times right at what you would expect, whereas Maven's has a significant overhead. - === Incremental Compile Single-Module ```bash diff --git a/docs/modules/ROOT/pages/comparisons/why-mill.adoc b/docs/modules/ROOT/pages/comparisons/why-mill.adoc index a4acbe052e9..1e71d48a6e6 100644 --- a/docs/modules/ROOT/pages/comparisons/why-mill.adoc +++ b/docs/modules/ROOT/pages/comparisons/why-mill.adoc @@ -62,11 +62,11 @@ both parallel and sequential, and for many modules or for a single module: |=== | Benchmark | Maven | Mill | Speedup -| xref:comparisons/maven.adoc#_sequential_clean_compile_all[Sequential Clean Compile All] | 1m 38.80s | 23.14s | 4.3x -| xref:comparisons/maven.adoc#_parallel_clean_compile_all[Parallel Clean Compile All] | 0m 48.92s | 0m 08.79s | 5.6x -| xref:comparisons/maven.adoc#_clean_compile_single_module[Clean Compile Single Module] | 0m 08.46s | 0m 01.94s | 4.4x -| xref:comparisons/maven.adoc#_incremental_compile_single_module[Incremental Compile Single Module] | 0m 06.82s | 0m 00.54s | 12.6x -| xref:comparisons/maven.adoc#_no_op_compile_single_module[No-Op Compile Single Module] | 0m 05.25s | 0m 00.47s | 11.2x +| xref:comparisons/maven.adoc#_sequential_clean_compile_all[Sequential Clean Compile All] | 98.80s | 23.14s | 4.3x +| xref:comparisons/maven.adoc#_parallel_clean_compile_all[Parallel Clean Compile All] | 48.92s | 8.79s | 5.6x +| xref:comparisons/maven.adoc#_clean_compile_single_module[Clean Compile Single Module] | 4.89s | 1.11s | 4.4x +| xref:comparisons/maven.adoc#_incremental_compile_single_module[Incremental Compile Single Module] | 6.82s | 0.54s | 12.6x +| 
xref:comparisons/maven.adoc#_no_op_compile_single_module[No-Op Compile Single Module] | 05.25s | 0.47s | 11.2x |=== First, let's look at *Parallel Clean Compile All*. From dd0c40116f53d96da931b1dd40219af69d6351c5 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 20:58:58 +0800 Subject: [PATCH 2/9] . --- .../ROOT/pages/comparisons/java-compile.adoc | 15 +++++---------- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index c5a5a27dbe9..33c7edbfa7d 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -281,8 +281,8 @@ short of the snappiness you would expect from a compiler that takes ~0.3s to com These benchmarks were run ad-hoc and on my laptop on arbitrary codebases, and the details will obviously differ depending on environment and the code in question. Running it on an entire codebase, rather than a single module, will give different results. Nevertheless, the -results are clear: "typical" Java code _should_ compile at ~100,000 lines/second. Anything -less is purely build-tool overhead from Maven, Gradle, or Mill. +results are clear: "typical" Java code _should_ compile at ~100,000 lines/second on a single +thread. Anything less is purely build-tool overhead from Maven, Gradle, or Mill. Build tools do a lot more than the Java compiler. They do dependency management, parallelism, caching and invalidation, and all sorts of other auxiliary tasks. But in the common case where @@ -293,11 +293,6 @@ as long as actually compiling your code. It obviously does _today_, but it _shou The Mill build tool goes to great lengths to try and minimize overhead: keeping long-lived worker processes to let the compiler warm up, using a fast-launching minimal client to connect to it, repeated bouts of benchmarking and optimization. 
Nevertheless, we can see that Mill -has a long way to go to catch up to the raw performance of `javac`. Mill's Java client adds -100s of milliseconds of startup overhead, connecting to the server over a -https://github.com/kohlschutter/junixsocket[JUnixSocket] takes 100s of milliseconds more, -checking dependency resolution caches takes considerable time. These are the things that -makes Mill take >1s to compile a Java module that should take 0.3x to compile. - -Improving Mill is a work in progress, and until compiling a 30-40kLOC Java codebase from -the command line takes 0.3 seconds, it will continue to be a work in progress. \ No newline at end of file +has a long way to go to catch up to the raw performance of `javac`. Improving Mill is a work +in progress, and until compiling a 30-40kLOC Java codebase from the command line takes ~0.3 +seconds, it will continue to be a work in progress. \ No newline at end of file From d8ac03925ec719d0a78c987d5a7d3e08dd8b6aed Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 21:19:08 +0800 Subject: [PATCH 3/9] . --- docs/modules/ROOT/pages/comparisons/java-compile.adoc | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index 33c7edbfa7d..6631ae19845 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -2,6 +2,10 @@ include::partial$gtag-config.adoc[] +On modern hardware and software, Java source code should compile at >100,000 lines per +second, i.e. a million-line Java codebase should clean compile in <10 seconds on a single +thread. Anything less than that is due to overhead from your build system and other tooling. + Java compiles have the reputation for being slow, but that reputation does not match today's reality. The Java compiler can compile "typical" Java code at over 100,000 lines a second on a single core. 
That means that even a million line project From 82dd1176f219e683edb86f28f7911c92eac66113 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 21:21:09 +0800 Subject: [PATCH 4/9] . --- docs/modules/ROOT/pages/comparisons/java-compile.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index 6631ae19845..cb390b302ff 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -1,4 +1,4 @@ -# How Fast Should Java Compile? +# How Fast Does Java Compile? include::partial$gtag-config.adoc[] From e84b56efcbaa549ee95769e460a3452f2a2273cc Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 22:00:10 +0800 Subject: [PATCH 5/9] . --- .../modules/ROOT/pages/comparisons/maven.adoc | 64 +++++++++---------- .../ROOT/pages/comparisons/why-mill.adoc | 2 +- 2 files changed, 33 insertions(+), 33 deletions(-) diff --git a/docs/modules/ROOT/pages/comparisons/maven.adoc b/docs/modules/ROOT/pages/comparisons/maven.adoc index 53d60c3f7f6..c583a5c1b4e 100644 --- a/docs/modules/ROOT/pages/comparisons/maven.adoc +++ b/docs/modules/ROOT/pages/comparisons/maven.adoc @@ -74,11 +74,11 @@ Mill's performance compares to Maven: | Benchmark | Maven | Mill | Speedup -| <> | 1m 38.80s | 23.14s | 4.3x -| <> | 0m 48.92s | 0m 08.79s | 5.6x -| <> | 0m 04.89s | 0m 01.11s | 4.4x -| <> | 0m 06.82s | 0m 00.54s | 12.6x -| <> | 0m 05.25s | 0m 00.47s | 11.2x +| <> | 98.80s | 23.14s | 4.3x +| <> | 48.92s | 8.79s | 5.6x +| <> | 4.89s | 1.11s | 4.4x +| <> | 6.82s | 0.54s | 12.6x +| <> | 5.25s | 0.47s | 11.2x |=== The column on the right shows the speedups of how much faster Mill is compared to the @@ -90,14 +90,14 @@ we can explain the difference in performing the same task with the two different ```bash $ ./mvnw clean; time ./mvnw -Pfast -Dcheckstyle.skip -Denforcer.skip=true -DskipTests install -1m 
38.80s -1m 36.14s -1m 39.95s +98.80s +96.14s +99.95s $ ./mill clean; time ./mill -j1 __.compile -0m 23.99s -0m 23.14s -0m 22.68s +23.99s +23.14s +22.68s ``` This benchmark exercises the simple "build everything from scratch" workflow, with all remote @@ -144,9 +144,9 @@ Maven's overhead adds to the clean build: ```bash $ ./mill clean; time ./mill -j1 __.jar -0m 32.58s -0m 24.90s -0m 23.35s +32.58s +24.90s +23.35s ``` From this benchmark, we can see that although both Mill and Maven are doing the same work, @@ -159,14 +159,14 @@ whereas Mill directly uses the classfiles generated on disk to bypass all that w ```bash $ ./mvnw clean; time ./mvnw -T 10 -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true install -0m 48.92s -0m 48.41s -0m 49.50s +48.92s +48.41s +49.50s $ ./mill clean; time ./mill __.compile -0m 09.07s -0m 08.79s -0m 07.93s +9.07s +8.79s +7.93s ``` This example compares Maven v.s. Mill, when performing the clean build on 10 threads. @@ -209,18 +209,18 @@ $ time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=t Compiling 174 source files to /Users/lihaoyi/Github/netty/common/target/classes Compiling 60 source files to /Users/lihaoyi/Github/netty/common/target/test-classes -0m 06.89s -0m 06.34s -0m 06.82s +6.89s +6.34s +6.82s $ echo "" >> common/src/main/java/io/netty/util/AbstractConstant.java $ time ./mill common.test.compile compiling 1 Java source to /Users/lihaoyi/Github/netty/out/common/compile.dest/classes ... -0m 00.78s -0m 00.54s -0m 00.51s +0.78s +0.54s +0.51s ``` This benchmark explores editing a single file and re-compiling `common/`. 
@@ -305,14 +305,14 @@ the same thing in Maven ```bash $ time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true install -0m 05.08s -0m 05.25s -0m 05.26s +5.08s +5.25s +5.26s $ time ./mill common.test.compile -0m 00.49s -0m 00.47s -0m 00.45s +0.49s +0.47s +0.45s ``` This last benchmark explores the boundaries of Maven and Mill: what happens if diff --git a/docs/modules/ROOT/pages/comparisons/why-mill.adoc b/docs/modules/ROOT/pages/comparisons/why-mill.adoc index 1e71d48a6e6..df1ad4a46d7 100644 --- a/docs/modules/ROOT/pages/comparisons/why-mill.adoc +++ b/docs/modules/ROOT/pages/comparisons/why-mill.adoc @@ -66,7 +66,7 @@ both parallel and sequential, and for many modules or for a single module: | xref:comparisons/maven.adoc#_parallel_clean_compile_all[Parallel Clean Compile All] | 48.92s | 8.79s | 5.6x | xref:comparisons/maven.adoc#_clean_compile_single_module[Clean Compile Single Module] | 4.89s | 1.11s | 4.4x | xref:comparisons/maven.adoc#_incremental_compile_single_module[Incremental Compile Single Module] | 6.82s | 0.54s | 12.6x -| xref:comparisons/maven.adoc#_no_op_compile_single_module[No-Op Compile Single Module] | 05.25s | 0.47s | 11.2x +| xref:comparisons/maven.adoc#_no_op_compile_single_module[No-Op Compile Single Module] | 5.25s | 0.47s | 11.2x |=== First, let's look at *Parallel Clean Compile All*. From 62756076de55dbf9355c65c458bf34481d444550 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 22:54:40 +0800 Subject: [PATCH 6/9] . 
--- .../ROOT/pages/comparisons/java-compile.adoc | 26 ++++++++++++++----- .../ROOT/pages/comparisons/why-mill.adoc | 2 +- 2 files changed, 20 insertions(+), 8 deletions(-) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index cb390b302ff..c57c4a3f34d 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -259,11 +259,7 @@ $ ./mill clean common; time ./mill common.compile 1.11s ``` -We explore the comparison between xref:comparisons/gradle.adoc[Gradle vs Mill] -or xref:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages. -For this article, the important thing is not comparing the build tools against each other, -but comparing the build tools against what how fast they _could_ be if they just used -the `javac` Java compiler directly: +Tabulating this all together gives us the table we saw at the start of this page: |=== | Compiling Netty Common (41,601 lines) | Time | Multiplier | Compiling Mockito Core (29,712 lines) | Time | Multiplier @@ -273,6 +269,22 @@ the `javac` Java compiler directly: | Mill | 1.11s | 3.8x | Mill | 1.20s | 4.1x |=== +We explore the comparison between xref:comparisons/gradle.adoc[Gradle vs Mill] +and xref:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages. +For this article, the important thing is not comparing the build tools against each other, +but comparing the build tools against how fast they _could_ be if they just used +the `javac` Java compiler directly. + +One thing worth calling out is that the overhead of the various build tools does not +appear to go down in larger builds.
Although this `Clean Compile Single-Module` benchmark only +deals with compiling a single small module, the `Sequential Clean Compile` benchmarks which +compile entire projects show similar build-tool slowdowns: +xref:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s] +and xref:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s], +with Mill reaching ~25,000 lines/s, all of which are far below the 100,000 lines/s +that we should expect from Java compilation. + + ## Conclusion From this study we can see the paradox: the Java _compiler and runtime_ is blazing fast, @@ -298,5 +310,5 @@ The Mill build tool goes to great lengths to try and minimize overhead: keeping worker processes to let the compiler warm up, using a fast-launching minimal client to connect to it, repeated bouts of benchmarking and optimization. Nevertheless, we can see that Mill has a long way to go to catch up to the raw performance of `javac`. Improving Mill is a work -in progress, and until compiling a 30-40kLOC Java codebase from the command line takes ~0.3 -seconds, it will continue to be a work in progress. +in progress, and will remain so until compiling a 30-40kLOC Java codebase from the command line +takes ~0.3 seconds.
\ No newline at end of file diff --git a/docs/modules/ROOT/pages/comparisons/why-mill.adoc b/docs/modules/ROOT/pages/comparisons/why-mill.adoc index df1ad4a46d7..2bd8eb7b044 100644 --- a/docs/modules/ROOT/pages/comparisons/why-mill.adoc +++ b/docs/modules/ROOT/pages/comparisons/why-mill.adoc @@ -93,7 +93,7 @@ across the various workflows: |=== | Benchmark | Gradle | Mill | Speedup -| xref:comparisons/maven.adoc#_sequential_clean_compile_all[Sequential Clean Compile All] | 17.6s | 5.40s | 3.3x +| xref:comparisons/maven.adoc#_sequential_clean_compile_all[Sequential Clean Compile All] | 1s | 5.40s | 3.3x | xref:comparisons/maven.adoc#_parallel_clean_compile_all[Parallel Clean Compile All] | 12.3s | 3.57s | 3.4x | xref:comparisons/maven.adoc#_clean_compile_single_module[Clean Compile Single Module] | 4.41s | 1.20s | 3.7x | xref:comparisons/maven.adoc#_incremental_compile_single_module[Incremental Compile Single Module] | 1.37s | 0.51s | 2.7x From 7d06f6d29ad3d759a187dff5abde1af0676621d7 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Tue, 19 Nov 2024 23:27:12 +0800 Subject: [PATCH 7/9] . --- .../ROOT/pages/comparisons/java-compile.adoc | 89 ++++++++++--------- 1 file changed, 49 insertions(+), 40 deletions(-) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index c57c4a3f34d..71fd473338b 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -2,31 +2,28 @@ include::partial$gtag-config.adoc[] -On modern hardware and software, Java source code should compile at >100,000 lines per -second, i.e. a million-line Java codebase should clean compile in <10 seconds on a single -thread. Anything less than that is due to overhead from your build system and other tooling. - Java compiles have the reputation for being slow, but that reputation does -not match today's reality. 
The Java compiler can compile "typical" Java code at over +not match today's reality. Nowadays the Java compiler can compile "typical" Java code at over 100,000 lines a second on a single core. That means that even a million line project -should take more than 10s to compile in a single-threaded fashion, and _should_ take -even less time when taking advantage of parallelism. +should take more than 10s to compile in a single-threaded fashion, and should be even +faster in the presence of parallelism -Doing some ad-hoc benchmarks on the time taken for common build tools to compile Java, -we find that although the compiler is blazing fast, all build tools add significant -overhead over compiling Java directly: +Doing some ad-hoc benchmarks on the time taken for common build tools to compile Java +on a single core, we find that although the compiler is blazing fast, all build tools +add significant overhead over compiling Java directly: |=== -| Compiling Netty Common (41,601 lines) | Time | Multiplier | Compiling Mockito Core (29,712 lines) | Time | Multiplier -| Javac Hot | 0.29s | 1.0x | Javac Hot | 0.36s | 1.0x -| Javac Cold | 1.62s | 5.6x | Javac Cold | 1.29s | 4.4x -| Maven | 4.89s | 16.9x | Gradle | 4.41s | 15.2x -| Mill | 1.11s | 3.8x | Mill | 1.20s | 4.1x +| Mockito Core | Time | Compiler lines/s | Multiplier | Netty Common | Time | Compiler lines/s | Multiplier +| Javac Hot | 0.36s | 115,600 | 1.0x | Javac Hot | 0.29s | 102,500 | 1.0x +| Javac Cold | 1.29s | 32,200 | 4.4x | Javac Cold | 1.62s | 18,300 | 5.6x +| Mill | 1.20s | 34,700 | 4.1x | Mill | 1.11s | 26,800 | 3.8x +| Gradle | 4.41s | 9,400 | 15.2x | Maven | 4.89s | 6,100 | 16.9x |=== -Although Mill does the best in these benchmarks, all build tools fall short of how fast -compiling Java _should_ be. This post explores how these numbers were arrived at, and -what that means for the future of Java build tooling. 
+Although Mill does the best in these benchmarks among the build tools, all build tools +fall short of how fast compiling Java _should_ be. This post explores how these numbers +were arrived at, and what that means in un-tapped potential for Java build tooling to +become truly great. ## Mockito Core @@ -39,16 +36,24 @@ of code. To give us a simple reproducible scenario, let's consider the root mock with sources in `src/main/java/`, on which all the downstream module and tests depend on. Mockito is built using Gradle. It's not totally trivial to extract the compilation classpath -from Gradle, but the following stackoverflow answer gives us some tips:; +from Gradle, but the following stackoverflow answer gives us some tips: * https://stackoverflow.com/a/50639444/871202[How do I print out the Java classpath in gradle?] +```bash +> ./gradlew clean && ./gradlew :classes --no-build-cache --debug | grep "classpath " +``` + This gives us the following classpath: ``` export MY_CLASSPATH=/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy/1.14.18/81e9b9a20944626e6757b5950676af901c2485/byte-buddy-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy-agent/1.14.18/417558ea01fe9f0e8a94af28b9469d281c4e3984/byte-buddy-agent-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.2/8ac9e16d933b6fb43bc7f576336b8f4d7eb5ba12/junit-4.13.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest-core/2.2/3f2bd07716a31c395e2837254f37f21f0f0ab24b/hamcrest-core-2.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.opentest4j/opentest4j/1.3.0/152ea56b3a72f655d4fd677fc0ef2596c3dd5e6e/opentest4j-1.3.0.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.objenesis/objenesis/3.3/1049c09f1de4331e8193e579448d0916d75b7631/objenesis-3.3.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.2/1820c0968dba3a11a1b30669bb1f01978a91dedc/hamcrest-2.2.jar ``` +Note that 
for this benchmark, all third-party dependencies have already been resolved +and downloaded from Maven Central. We can thus simply reference the jars on disk directly, +which we do above. + We can then pass this classpath into `javac -cp`, together with `src/main/java/**/*.java`, to perform the compilation outside of Gradle using `javac` directly. Running this a few times gives us the timings below: @@ -166,7 +171,7 @@ settle on approximately the following numbers: 353ms ``` -The codebase hasn't changed, so compiling we are still compiling `41601` lines of code. +The codebase hasn't changed, so compiling we are still compiling 41,601 lines of code. But now it only takes ~359ms, which tells us that using a long-lived warm Java compiler we can compile approximately *116,000* lines of Java a second on a single core. @@ -222,14 +227,13 @@ Or programmatically using the `Bench.java` program we saw earlier: 285ms ``` -Again taking the median number for the hot-in-memory benchmark above (285ms), and -the lines of code in Netty Common (29712), this tells us that `netty-common` compiles -at *~104,000 lines/second* +Taking 285ms for a hot-in-memory compile of 29,712 lines of code, `netty-common` +therefore compiles at *~104,000 lines/second*. ## What About Build Tools? Although the Java Compiler is blazing fast - compiling code at >100k lines/second and -completing both Mockito-Core and Netty-Common in ~300ms - the experience of using Java +completing both Mockito-Core and Netty-Common in ~300ms - the experience of using typical Java build tools is nowhere near as snappy. 
Consider the benchmark of clean-compiling the Mockito-Core codebase using Gradle or Mill: @@ -262,32 +266,35 @@ $ ./mill clean common; time ./mill common.compile Tabulating this all together gives us the table we saw at the start of this page: |=== -| Compiling Netty Common (41,601 lines) | Time | Multiplier | Compiling Mockito Core (29,712 lines) | Time | Multiplier -| Javac Hot | 0.29s | 1.0x | Javac Hot | 0.36s | 1.0x -| Javac Cold | 1.62s | 5.6x | Javac Cold | 1.29s | 4.4x -| Maven | 4.89s | 16.9x | Gradle | 4.41s | 15.2x -| Mill | 1.11s | 3.8x | Mill | 1.20s | 4.1x +| Mockito Core | Time | Compiler lines/s | Multiplier | Netty Common | Time | Compiler lines/s | Multiplier +| Javac Hot | 0.36s | 115,600 | 1.0x | Javac Hot | 0.29s | 102,500 | 1.0x +| Javac Cold | 1.29s | 32,200 | 4.4x | Javac Cold | 1.62s | 18,300 | 5.6x +| Mill | 1.20s | 34,700 | 4.1x | Mill | 1.11s | 26,800 | 3.8x +| Gradle | 4.41s | 9,400 | 15.2x | Maven | 4.89s | 6,100 | 16.9x |=== We explore the comparison between xref:comparisons/gradle.adoc[Gradle vs Mill] or xref:comparisons/maven.adoc[Maven vs Mill] in more detail on their own dedicated pages. For this article, the important thing is not comparing the build tools against each other, but comparing the build tools against what how fast they _could_ be if they just used -the `javac` Java compiler directly: +the `javac` Java compiler directly. And it's clear that compared to the actual work +done by `javac` to actually compile your code, build tools add a frankly absurd amount +of overhead ranging from ~4x for Mill to 15-16x for Maven and Gradle! One thing worth calling out is that the overhead of the various build tools does not -appear to go down in larger builds. Although this `Clean Compile Single-Module` only -deals with compiling a single small module, the `Sequential Clean Compile` benchmarks which -compile entire projects show similar build-tool slowdowns: +appear to go down in larger builds. 
Although this *Clean Compile Single-Module* benchmark +we explored above only deals with compiling a single small module, a similar *Sequential +Clean Compile* benchmarks which compile the entire Mockito and Netty projects shows similar build-tool slowdowns: xref:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s] and xref:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s], -with Mill reaching ~25,000 lines/s, all of which are far below the 100,000 lines/s -that we should expect from Java compilation. +with Mill reaching ~25,000 lines/s. All of these are far below the 100,000 lines/s +that we should expect from Java compilation, and roughly line up with the numbers measured +above. ## Conclusion -From this study we can see the paradox: the Java _compiler and runtime_ is blazing fast, +From this study we can see the paradox: the Java _compiler_ is blazing fast, while Java _build tools_ are dreadfully slow. Something that _should_ compile in a fraction of a second using a warm `javac` takes several seconds (15-16x longer) to compile using Maven or Gradle. Mill does better, but even it adds 4x overhead and falls @@ -302,13 +309,15 @@ thread. Anything less is purely build-tool overhead from Maven, Gradle, or Mill. Build tools do a lot more than the Java compiler. They do dependency management, parallelism, caching and invalidation, and all sorts of other auxiliary tasks. But in the common case where -someone edits code and then compiles it, any time doing other things and not spent _actually +someone edits code and then compiles it, and all your dependencies are already downloaded and +cached locally, any time doing other things and not spent _actually compiling Java_ is pure overhead. Checking for cache invalidation in _shouldn't_ take 15-16x -as long as actually compiling your code. It obviously does _today_, but it _shouldn't_. 
+as long as actually compiling your code. I mean it obviously does _today_, but it _shouldn't_. The Mill build tool goes to great lengths to try and minimize overhead: keeping long-lived worker processes to let the compiler warm up, using a fast-launching minimal client to connect to it, repeated bouts of benchmarking and optimization. Nevertheless, we can see that Mill has a long way to go to catch up to the raw performance of `javac`. Improving Mill is a work -in progress, and will reain so until compiling a 30-40kLOC Java codebase from the command line -takes ~0.3 seconds. \ No newline at end of file +in progress, and will remain so until compiling a 30-40kLOC Java codebase from the command line +takes ~0.3 seconds. If Java build and compile times are something that matter to you, you +should try out the Mill build tool and get involved! \ No newline at end of file From 6786216b8736728fc293c97934abb4fe015927b7 Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Wed, 20 Nov 2024 09:31:51 +0800 Subject: [PATCH 8/9] . 
--- docs/modules/ROOT/nav.adoc | 1 - .../pages/cli/alternate-installation.adoc | 120 ----------------- .../ROOT/pages/cli/installation-ide.adoc | 121 ++++++++++++++++++ .../ROOT/pages/comparisons/java-compile.adoc | 97 ++++++++------ 4 files changed, 181 insertions(+), 158 deletions(-) delete mode 100644 docs/modules/ROOT/pages/cli/alternate-installation.adoc diff --git a/docs/modules/ROOT/nav.adoc b/docs/modules/ROOT/nav.adoc index 8c0aab4f358..2e4b7b13712 100644 --- a/docs/modules/ROOT/nav.adoc +++ b/docs/modules/ROOT/nav.adoc @@ -42,7 +42,6 @@ ** xref:cli/flags.adoc[] ** xref:cli/builtin-commands.adoc[] ** xref:cli/query-syntax.adoc[] -** xref:cli/alternate-installation.adoc[] * Migrating to Mill ** xref:migrating/maven.adoc[] // This section gives a tour of the various user-facing features of Mill: diff --git a/docs/modules/ROOT/pages/cli/alternate-installation.adoc b/docs/modules/ROOT/pages/cli/alternate-installation.adoc deleted file mode 100644 index 0f46fc9e839..00000000000 --- a/docs/modules/ROOT/pages/cli/alternate-installation.adoc +++ /dev/null @@ -1,120 +0,0 @@ -= Other installation methods - -CAUTION: The installation methods listed below are maintained outside of Mill and may not have -the same features as the xref:cli/installation-ide.adoc#_bootstrap_scripts[bootstrap scripts]. You can try using them, -but the officially supported way to use Mill is via the bootstrap script above, so the Mill -maintainers may be unable to help you if you have issues with some alternate installation method. - -CAUTION: Some of the installations via package managers install a fixed version of Mill and -do not support project-specific selection of the preferred Mill version. If you want to use -the `MILL_VERSION` environment variable or need support for `.mill-version` or -`.config/mill-version` files to control the actual used Mill version, please use -a xref:cli/installation-ide.adoc#_bootstrap_scripts[bootstrap script] instead. 
- -== OS X - -Installation via https://github.com/Homebrew/homebrew-core/blob/master/Formula/m/mill.rb[homebrew]: - -[source,sh] ----- -brew install mill ----- - - -== Arch Linux - -Arch Linux has an https://archlinux.org/packages/extra/any/mill/[Extra package for mill]: - -[source,bash] ----- -pacman -S mill - ----- - -== FreeBSD - -Installation via http://man.freebsd.org/pkg/8[pkg(8)]: - -[source,sh] ----- -pkg install mill - ----- - -== Gentoo Linux - -[source,sh] ----- -emerge dev-java/mill-bin - ----- - -== Windows - -To get started, download Mill from -{mill-github-url}/releases/download/{mill-last-tag}/{mill-last-tag}-assembly[Github releases], and save it as `mill.bat`. - -If you're using https://scoop.sh[Scoop] you can install Mill via - -[source,bash] ----- -scoop install mill ----- - -== WSL / MSYS2 / Cycgin / Git-Bash - -Mill also works on "sh" environments on Windows (e.g., -https://www.msys2.org[MSYS2], -https://www.cygwin.com[Cygwin], -https://gitforwindows.org[Git-Bash], -https://docs.microsoft.com/en-us/windows/wsl[WSL]); to get started, follow the instructions in the <<_manual>> -section. Note that: - -* In some environments (such as WSL), Mill might have to be run without a server (using `-i`, `--interactive`, or `--no-server`.) 
- -* On Cygwin, run the following after downloading mill: - -[source,bash] ----- -sed -i '0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}; 0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}' /usr/local/bin/mill ----- - -== Docker - -You can download and run -a https://hub.docker.com/r/nightscape/scala-mill/["Docker image containing OpenJDK, Scala and Mill"] using - -[source,bash] ----- -docker pull nightscape/scala-mill -docker run -it nightscape/scala-mill ----- - -== Manual - -To get started, download Mill and install it into your HOME ".local/bin" via the following -`curl`/`chmod` command: - -[source,bash,subs="verbatim,attributes"] ----- -sh -c "curl -L {mill-github-url}/releases/download/{mill-last-tag}/{mill-last-tag} > ~/.local/bin/mill && chmod +x ~/.local/bin/mill" ----- - -== Coursier (unsupported) - -Installing mill via `coursier` or `cs` is currently not officially supported. -There are various issues, especially with interactive mode. - -== Asdf (unsupported) - -You can install and manage Mill via the Multiple Runtime Version Manager - https://asdf-vm.com/[`asdf`]. - -Support by `asdf` is currently possible by using the https://github.com/asdf-community/asdf-mill[`asdf-mill` plugin]: - -.Steps to install the `mill` plugin and Mill with `asdf` -[source,bash] ---- -asdf plugin add mill -asdf install mill latest -asdf global mill latest ---- diff --git a/docs/modules/ROOT/pages/cli/installation-ide.adoc b/docs/modules/ROOT/pages/cli/installation-ide.adoc index fa8c42b3ccf..f2c271c0e61 100644 --- a/docs/modules/ROOT/pages/cli/installation-ide.adoc +++ b/docs/modules/ROOT/pages/cli/installation-ide.adoc @@ -256,3 +256,124 @@ The easiest way to use a development release is to use one of the `MILL_VERSION` environment variable or a `.mill-version` or `.config/mill-version` file. 
+ +== Other installation methods + +CAUTION: The installation methods listed below are maintained outside of Mill and may not have +the same features as the xref:cli/installation-ide.adoc#_bootstrap_scripts[bootstrap scripts]. You can try using them, +but the officially supported way to use Mill is via the bootstrap script above, so the Mill +maintainers may be unable to help you if you have issues with some alternate installation method. + +CAUTION: Some of the installations via package managers install a fixed version of Mill and +do not support project-specific selection of the preferred Mill version. If you want to use +the `MILL_VERSION` environment variable or need support for `.mill-version` or +`.config/mill-version` files to control the actual used Mill version, please use +a xref:cli/installation-ide.adoc#_bootstrap_scripts[bootstrap script] instead. + +=== OS X + +Installation via https://github.com/Homebrew/homebrew-core/blob/master/Formula/m/mill.rb[homebrew]: + +[source,sh] +---- +brew install mill +---- + + +=== Arch Linux + +Arch Linux has an https://archlinux.org/packages/extra/any/mill/[Extra package for mill]: + +[source,bash] +---- +pacman -S mill + +---- + +=== FreeBSD + +Installation via http://man.freebsd.org/pkg/8[pkg(8)]: + +[source,sh] +---- +pkg install mill + +---- + +=== Gentoo Linux + +[source,sh] +---- +emerge dev-java/mill-bin + +---- + +=== Windows + +To get started, download Mill from +{mill-github-url}/releases/download/{mill-last-tag}/{mill-last-tag}-assembly[Github releases], and save it as `mill.bat`. 
+ +If you're using https://scoop.sh[Scoop] you can install Mill via + +[source,bash] +---- +scoop install mill +---- + +=== WSL / MSYS2 / Cycgin / Git-Bash + +Mill also works on "sh" environments on Windows (e.g., +https://www.msys2.org[MSYS2], +https://www.cygwin.com[Cygwin], +https://gitforwindows.org[Git-Bash], +https://docs.microsoft.com/en-us/windows/wsl[WSL]); to get started, follow the instructions in the <<_manual>> +section. Note that: + +* In some environments (such as WSL), Mill might have to be run without a server (using `-i`, `--interactive`, or `--no-server`.) + +* On Cygwin, run the following after downloading mill: + +[source,bash] +---- +sed -i '0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}; 0,/-cp "\$0"/{s/-cp "\$0"/-cp `cygpath -w "\$0"`/}' /usr/local/bin/mill +---- + +=== Docker + +You can download and run +a https://hub.docker.com/r/nightscape/scala-mill/["Docker image containing OpenJDK, Scala and Mill"] using + +[source,bash] +---- +docker pull nightscape/scala-mill +docker run -it nightscape/scala-mill +---- + +=== Manual + +To get started, download Mill and install it into your HOME ".local/bin" via the following +`curl`/`chmod` command: + +[source,bash,subs="verbatim,attributes"] +---- +sh -c "curl -L {mill-github-url}/releases/download/{mill-last-tag}/{mill-last-tag} > ~/.local/bin/mill && chmod +x ~/.local/bin/mill" +---- + +=== Coursier (unsupported) + +Installing mill via `coursier` or `cs` is currently not officially supported. +There are various issues, especially with interactive mode. + +=== Asdf (unsupported) + +You can install and manage Mill via the Multiple Runtime Version Manager - https://asdf-vm.com/[`asdf`]. 
+ +Support by `asdf` is currently possible by using the https://github.com/asdf-community/asdf-mill[`asdf-mill` plugin]: + +.Steps to install the `mill` plugin and Mill with `asdf` +[source,bash] +--- +asdf plugin add mill +asdf install mill latest +asdf global mill latest +--- diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index 71fd473338b..e1beff799ec 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -8,9 +8,8 @@ not match today's reality. Nowadays the Java compiler can compile "typical" Java should take more than 10s to compile in a single-threaded fashion, and should be even faster in the presence of parallelism -Doing some ad-hoc benchmarks on the time taken for common build tools to compile Java -on a single core, we find that although the compiler is blazing fast, all build tools -add significant overhead over compiling Java directly: +Doing some ad-hoc benchmarks, we find that although the compiler is blazing fast, all +build tools add significant overhead over compiling Java directly: |=== | Mockito Core | Time | Compiler lines/s | Multiplier | Netty Common | Time | Compiler lines/s | Multiplier @@ -20,10 +19,10 @@ add significant overhead over compiling Java directly: | Gradle | 4.41s | 9,400 | 15.2x | Maven | 4.89s | 6,100 | 16.9x |=== -Although Mill does the best in these benchmarks among the build tools, all build tools -fall short of how fast compiling Java _should_ be. This post explores how these numbers -were arrived at, and what that means in un-tapped potential for Java build tooling to -become truly great. +Although Mill does the best in these benchmarks among the build tools (Maven, Gradle, and Mill), +all build tools fall short of how fast compiling Java _should_ be. 
This post explores how +these numbers were arrived at, and what that means in un-tapped potential for Java build +tooling to become truly great. ## Mockito Core @@ -86,21 +85,21 @@ Compiling 32,000 lines of code per second is not bad. But it is nowhere near how Java compiler _can_ run. Any software experience with JVM experience would know the next obvious optimization for us to explore. -## Hot JVMs +## Keeping the JVM Hot One issue with the above benchmark is that it uses `javac` as a sub-process. The Java compiler runs on the Java Virtual Machine, and like any JVM application, it has a slow startup time, takes time warming-up, but then has good steady-state performance. -Running `javac` from the command line is thus the _worst possible_ way of using -the Java compiler, so compiling ~32,000 lines/sec is the worst possible performance you could -get out of the Java compiler on this Java codebase. +Running `javac` from the command line and compiling ~32,000 lines/sec is thus the _worst_ +possible performance you could get out of the Java compiler on this Java codebase. To get good performance out of `javac`, like any other JVM application, we need to keep it long-lived so it has a chance to warm up. While running the `javac` in a long-lived Java program is not commonly taught, neither is it particularly difficult. Here is a complete `Bench.java` file that does this, repeatedly running java compilation in a loop where it -has a chance to warm up, using the same `MY_CLASSPATH` and source files we saw earlier. -We print the output statistics to the terminal so we can see how fast Java compilation can +has a chance to warm up, to emulate the long lived JVM process that a build tool like Mill +may spawn and manage. 
We use the same `MY_CLASSPATH` and source files we saw earlier and +print the output statistics to the terminal so we can see how fast Java compilation can occur once things have a chance to warm up: ```java @@ -156,7 +155,7 @@ public class Bench { .reduce(0, (x, y) -> x + y); System.out.println("Lines: " + lineCount); System.out.println("Duration: " + (end - now)); - System.out.println("Lines/second: " + lineCount / ((end - now) * 1000)); + System.out.println("Lines/second: " + lineCount * 1000 / (end - now)); } } } @@ -165,22 +164,27 @@ public class Bench { Running this using `java Bench.java` in the Mockito repo root, eventually we see it settle on approximately the following numbers: -```java +```bash 359ms 378ms 353ms ``` -The codebase hasn't changed, so compiling we are still compiling 41,601 lines of code. -But now it only takes ~359ms, which tells us that using a long-lived warm Java compiler +The codebase hasn't changed - we are still compiling 41,601 lines of code - +but now it only takes ~359ms. That tells us that using a long-lived warm Java compiler we can compile approximately *116,000* lines of Java a second on a single core. -## Double-checking our results on Netty Common Compiling 116,000 lines of Java per second is very fast. That means we should expect a million-line Java codebase to compile in about 9 seconds, _on a single thread_. That -may seem surprisingly fast, and you may be forgiven if you find it hard to believe. To -double-check our results, we can pick another codebase to run some ad-hoc benchmarks. +may seem surprisingly fast, and you may be forgiven if you find it hard to believe. As +mentioned earlier, this number is expected to vary based on the codebase being compiled; +could it be that Mockito-Core just happens to be a very simple Java module that compiles +quickly? + +## Double-checking Our Results + +To double-check our results, we can pick another codebase to run some ad-hoc benchmarks. 
For this I will use the Netty codebase: - https://github.com/netty/netty @@ -230,6 +234,14 @@ Or programmatically using the `Bench.java` program we saw earlier: Taking 285ms for a hot-in-memory compile of 29,712 lines of code, `netty-common` therefore compiles at *~104,000 lines/second*. +Although the choice of project is arbitrary, Mockito-Core and Netty-Common are decent +examples of Java code found "out in the wild". They aren't synthetic fake codebases generated +for the purpose of benchmarks, nor are they particularly unusual or idiosyncratic. They follow +most Java best practices and adhere to many of the most common Java linters (although those +were disabled for this performance benchmark). This is Java code that looks just like +any Java code you may write in your own projects, and it effortlessly compiles at +>100,000 lines/second. + ## What About Build Tools? Although the Java Compiler is blazing fast - compiling code at >100k lines/second and @@ -281,16 +293,25 @@ the `javac` Java compiler directly. And it's clear that compared to the actual w done by `javac` to actually compile your code, build tools add a frankly absurd amount of overhead ranging from ~4x for Mill to 15-16x for Maven and Gradle! -One thing worth calling out is that the overhead of the various build tools does not -appear to go down in larger builds. Although this *Clean Compile Single-Module* benchmark -we explored above only deals with compiling a single small module, a similar *Sequential -Clean Compile* benchmarks which compile the entire Mockito and Netty projects shows similar build-tool slowdowns: -xref:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s] -and xref:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s], -with Mill reaching ~25,000 lines/s. 
All of these are far below the 100,000 lines/s -that we should expect from Java compilation, and roughly line up with the numbers measured -above. +## Whole Project Compile Speed +One thing worth calling out is that the overhead of the various build tools does not +appear to go down in larger builds. This *Clean Compile Single-Module* benchmark +we explored above only deals with compiling a single small module. But a similar *Sequential +Clean Compile* benchmark which compiles the entire Mockito and Netty projects on +a single core shows similar numbers for the various build tools: + +* xref:comparisons/gradle.adoc#_sequential_clean_compile_all[Gradle compiling 100,000 lines of Java at ~5,600 lines/s] +* xref:comparisons/maven.adoc#_sequential_clean_compile_all[Maven compiling 500,000 lines of Java at ~5,100 lines/s] +* Mill compiling at ~25,000 lines/s on both the above whole-project benchmarks + +All of these are far below the 100,000 lines/s that we should expect from Java compilation, +but they roughly line up with the numbers measured above. Again, these benchmarks are ad-hoc, +on arbitrary hardware and JVM versions. They do include small amounts of other work, such +as compiling C/C++ code in Netty or doing ad-hoc file operations in Mockito. However, +most of the time is still spent in compilation, and this reinforces the earlier finding +that build tools (especially older ones like Maven or Gradle) are indeed adding huge +amounts of overhead on top of the extremely-fast Java compiler. ## Conclusion @@ -298,8 +319,8 @@ From this study we can see the paradox: the Java _compiler_ is blazing fast, while Java _build tools_ are dreadfully slow. Something that _should_ compile in a fraction of a second using a warm `javac` takes several seconds (15-16x longer) to compile using Maven or Gradle. Mill does better, but even it adds 4x overhead and falls -short of the snappiness you would expect from a compiler that takes ~0.3s to compile these -30-40kLOC Java codebases. 
+short of the snappiness you would expect from a compiler that takes ~0.3s to compile the +30-40kLOC Java codebases we experimented with. These benchmarks were run ad-hoc and on my laptop on arbitrary codebases, and the details will obviously differ depending on environment and the code in question. Running it on an @@ -312,12 +333,14 @@ caching and invalidation, and all sorts of other auxiliary tasks. But in the com someone edits code and then compiles it, and all your dependencies are already downloaded and cached locally, any time doing other things and not spent _actually compiling Java_ is pure overhead. Checking for cache invalidation in _shouldn't_ take 15-16x -as long as actually compiling your code. I mean it obviously does _today_, but it _shouldn't_. +as long as actually compiling your code. I mean it obviously does _today_, but it _shouldn't_! The Mill build tool goes to great lengths to try and minimize overhead: keeping long-lived worker processes to let the compiler warm up, using a fast-launching minimal client to connect -to it, repeated bouts of benchmarking and optimization. Nevertheless, we can see that Mill -has a long way to go to catch up to the raw performance of `javac`. Improving Mill is a work -in progress, and will remain so until compiling a 30-40kLOC Java codebase from the command line -takes ~0.3 seconds. If Java build and compile times are something that matter to you, you -should try out the Mill build tool and get involved! \ No newline at end of file +to it, repeated bouts of benchmarking and optimization. Mill already gets +xref:comparisons/why-mill.adoc#_performance[~4x faster builds] than Maven or Gradle on +real-world projects like Mockito or Netty across a wide range of workflows, but it +still has a long way to go to catch up to the raw performance of `javac`. Improving Mill +is a work in progress, and will remain so until compiling a 30-40kLOC Java codebase from the +command line takes ~0.3 seconds. 
If Java build and compile times are things you find important, +you should try out Mill and get involved! \ No newline at end of file From 6a980207d54bcf57a65955062aec7bf9da46343b Mon Sep 17 00:00:00 2001 From: Li Haoyi Date: Wed, 20 Nov 2024 11:25:03 +0800 Subject: [PATCH 9/9] . --- .../ROOT/pages/comparisons/java-compile.adoc | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/docs/modules/ROOT/pages/comparisons/java-compile.adoc b/docs/modules/ROOT/pages/comparisons/java-compile.adoc index e1beff799ec..63d9ec3fed8 100644 --- a/docs/modules/ROOT/pages/comparisons/java-compile.adoc +++ b/docs/modules/ROOT/pages/comparisons/java-compile.adoc @@ -335,12 +335,9 @@ cached locally, any time doing other things and not spent _actually compiling Java_ is pure overhead. Checking for cache invalidation in _shouldn't_ take 15-16x as long as actually compiling your code. I mean it obviously does _today_, but it _shouldn't_! -The Mill build tool goes to great lengths to try and minimize overhead: keeping long-lived -worker processes to let the compiler warm up, using a fast-launching minimal client to connect -to it, repeated bouts of benchmarking and optimization. Mill already gets +The Mill build tool goes to great lengths to try and minimize overhead, and already gets xref:comparisons/why-mill.adoc#_performance[~4x faster builds] than Maven or Gradle on -real-world projects like Mockito or Netty across a wide range of workflows, but it -still has a long way to go to catch up to the raw performance of `javac`. Improving Mill -is a work in progress, and will remain so until compiling a 30-40kLOC Java codebase from the -command line takes ~0.3 seconds. If Java build and compile times are things you find important, -you should try out Mill and get involved! \ No newline at end of file +real-world projects like Mockito or Netty. 
But there is still a long way to go to give Java +developers the fast, snappy experience that the underlying Java platform can provide. If +Java build and compile times are things you find important, you should try out Mill on +your own projects and get involved in the effort! \ No newline at end of file