How Fast Does Java Compile?
Li Haoyi, 29 November 2024
Java compiles have the reputation for being slow, but that reputation does not match today’s reality. Nowadays the Java compiler can compile "typical" Java code at over 100,000 lines a second on a single core. That means that even a million line project should take more than 10s to compile in a single-threaded fashion, and should be even faster in the presence of parallelism
Doing some ad-hoc benchmarks, we find that although the compiler is blazing fast, all build tools add significant overhead over compiling Java directly:
Mockito Core |
Time |
Compiler lines/s |
Slowdown |
Netty Common |
Time |
Compiler lines/s |
Slowdown |
Javac Hot |
0.36s |
115,600 |
1.0x |
Javac Hot |
0.29s |
102,500 |
1.0x |
Javac Cold |
1.29s |
32,200 |
4.4x |
Javac Cold |
1.62s |
18,300 |
5.6x |
Mill |
1.20s |
34,700 |
4.1x |
Mill |
1.11s |
26,800 |
3.8x |
Gradle |
4.41s |
9,400 |
15.2x |
Maven |
4.89s |
6,100 |
16.9x |
Although Mill does the best in these benchmarks among the build tools (Maven, Gradle, and Mill), all build tools fall short of how fast compiling Java should be. This post explores how these numbers were arrived at, and what that means in un-tapped potential for Java build tooling to become truly great.
Mockito Core
To begin to understand the problem, lets consider the codebase of the popular Mockito project:
Mockito is a medium-sized Java project with a few dozen sub-modules and about ~100,000 lines
of code. To give us a simple reproducible scenario, let’s consider the root mockito module
with sources in src/main/java/
, on which all the downstream module and tests depend on.
Mockito is built using Gradle. It’s not totally trivial to extract the compilation classpath from Gradle, but the following stackoverflow answer gives us some tips:
> ./gradlew clean && ./gradlew :classes --no-build-cache --debug | grep "classpath "
This gives us the following classpath:
export MY_CLASSPATH=/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy/1.14.18/81e9b9a20944626e6757b5950676af901c2485/byte-buddy-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy-agent/1.14.18/417558ea01fe9f0e8a94af28b9469d281c4e3984/byte-buddy-agent-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.2/8ac9e16d933b6fb43bc7f576336b8f4d7eb5ba12/junit-4.13.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest-core/2.2/3f2bd07716a31c395e2837254f37f21f0f0ab24b/hamcrest-core-2.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.opentest4j/opentest4j/1.3.0/152ea56b3a72f655d4fd677fc0ef2596c3dd5e6e/opentest4j-1.3.0.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.objenesis/objenesis/3.3/1049c09f1de4331e8193e579448d0916d75b7631/objenesis-3.3.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.2/1820c0968dba3a11a1b30669bb1f01978a91dedc/hamcrest-2.2.jar
Note that for this benchmark, all third-party dependencies have already been resolved and downloaded from Maven Central. We can thus simply reference the jars on disk directly, which we do above.
We can then pass this classpath into javac -cp
, together with src/main/java/*/.java
,
to perform the compilation outside of Gradle using javac
directly. Running this a few
times gives us the timings below:
> time javac -cp $MY_CLASSPATH src/main/java/**/*.java
1.290s
1.250s
1.293s
To give us an idea of how many lines of code we are compiling, we can run:
> find src/main/java | grep \\.java | xargs wc -l
...
41601 total
Combining this information, we find that 41601 lines of code compiled in ~1.29 seconds
(taking the median of the three runs above) suggests that javac
compiles about ~32,000
lines of code per second.
These benchmarks were run ad-hoc on my laptop, an M1 10-core Macbook Pro, with OpenJDK Corretto 17.0.6. The numbers would differ on different Java versions, hardware, operating systems, and filesystems. Nevertheless, the overall trend is strong enough that you should be able to reproduce the results despite variations in the benchmarking environment.
Compiling 32,000 lines of code per second is not bad. But it is nowhere near how fast the Java compiler can run. Any software experience with JVM experience would know the next obvious optimization for us to explore.
Keeping the JVM Hot
One issue with the above benchmark is that it uses javac
as a sub-process. The Java
compiler runs on the Java Virtual Machine, and like any JVM application, it has a slow
startup time, takes time warming-up, but then has good steady-state performance.
Running javac
from the command line and compiling ~32,000 lines/sec is thus the worst
possible performance you could get out of the Java compiler on this Java codebase.
To get good performance out of javac
, like any other JVM application, we need to keep it
long-lived so it has a chance to warm up. While running the javac
in a long-lived Java
program is not commonly taught, neither is it particularly difficult. Here is a complete
Bench.java
file that does this, repeatedly running java compilation in a loop where it
has a chance to warm up, to emulate the long lived JVM process that a build tool like Mill
may spawn and manage. We use the same MY_CLASSPATH
and source files we saw earlier and
print the output statistics to the terminal so we can see how fast Java compilation can
occur once things have a chance to warm up:
// Bench.java
import javax.tools.*;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Collectors;
public class Bench {
public static void main(String[] args) throws Exception {
while (true) {
long now = System.currentTimeMillis();
String classpath = System.getenv("MY_CLASSPATH");
Path sourceFolder = Paths.get("src/main/java");
List<JavaFileObject> files = Files.walk(sourceFolder)
.filter(p -> p.toString().endsWith(".java"))
.map(p ->
new SimpleJavaFileObject(p.toUri(), JavaFileObject.Kind.SOURCE) {
public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
return Files.readString(p);
}
}
)
.collect(Collectors.toList());
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
StandardJavaFileManager fileManager = compiler
.getStandardFileManager(null, null, null);
// Run the compiler
JavaCompiler.CompilationTask task = compiler.getTask(
new OutputStreamWriter(System.out),
fileManager,
null,
List.of("-classpath", classpath),
null,
files
);
System.out.println("Compile Result: " + task.call());
long end = System.currentTimeMillis();
long lineCount = Files.walk(sourceFolder)
.filter(p -> p.toString().endsWith(".java"))
.map(p -> {
try { return Files.readAllLines(p).size(); }
catch(Exception e){ throw new RuntimeException(e); }
})
.reduce(0, (x, y) -> x + y);
System.out.println("Lines: " + lineCount);
System.out.println("Duration: " + (end - now));
System.out.println("Lines/second: " + lineCount / ((end - now) / 1000));
}
}
}
Running this using java Bench.java
in the Mockito repo root, eventually we see it
settle on approximately the following numbers:
359ms
378ms
353ms
The codebase hasn’t changed - we are still compiling 41,601 lines of code - but now it only takes ~359ms. That tells us that using a long-lived warm Java compiler we can compile approximately 116,000 lines of Java a second on a single core.
Compiling 116,000 lines of Java per second is very fast. That means we should expect a million-line Java codebase to compile in about 9 seconds, on a single thread. That may seem surprisingly fast, and you may be forgiven if you find it hard to believe. As mentioned earlier, this number is expected to vary based on the codebase being compiled; could it be that Mockito-Core just happens to be a very simple Java module that compiles quickly?
Double-checking Our Results
To double-check our results, we can pick another codebase to run some ad-hoc benchmarks. For this I will use the Netty codebase:
Netty is a large-ish Java project: ~500,000 lines of code. Again, to pick a somewhat
easily-reproducible benchmark, we want a decently-sized module that’s relatively
standalone within the project: netty-common
is a perfect fit. Again, we can use find | grep | xargs
to see how many lines of code we are looking at:
$ find common/src/main/java | grep \\.java | xargs wc -l
29712 total
Again, Maven doesn’t make it easy to show the classpath used to call javac
ourselves,
but the following stackoverflow answer gives us a hint in how to do so:
> ./mvnw clean; time ./mvnw -e -X -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true install
If you grep the output for -classpath
, we see:
-classpath /Users/lihaoyi/Github/netty/common/target/classes:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/svm/19.3.6/svm-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/sdk/graal-sdk/19.3.6/graal-sdk-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/objectfile/19.3.6/objectfile-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/pointsto/19.3.6/pointsto-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-nfi/19.3.6/truffle-nfi-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-api/19.3.6/truffle-api-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/compiler/compiler/19.3.6/compiler-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/jctools/jctools-core/4.0.5/jctools-core-4.0.5.jar:/Users/lihaoyi/.m2/repository/org/jetbrains/annotations-java5/23.0.0/annotations-java5-23.0.0.jar:/Users/lihaoyi/.m2/repository/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar:/Users/lihaoyi/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-1.2-api/2.17.2/log4j-1.2-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-api/2.17.2/log4j-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/io/projectreactor/tools/blockhound/1.0.6.RELEASE/blockhound-1.0.6.RELEASE.jar
Again, we can export MY_CLASSPATH
and start using javac
from the command line:
> javac -cp $MY_CLASSPATH common/src/main/java/**/*.java
1.624s
1.757s
1.606s
Or programmatically using the Bench.java
program we saw earlier:
294ms
282ms
285ms
Taking 285ms for a hot-in-memory compile of 29,712 lines of code, netty-common
therefore compiles at ~104,000 lines/second.
Although the choice of project is arbitrary, Mockito-Core and Netty-Common are decent examples of Java code found "out in the wild". They aren’t synthetic fake codebases generated for the purpose of benchmarks, nor are they particularly unusual or idiosyncratic. They follow most Java best practices and adhere to many of the most common Java linters (although those were disabled for this performance benchmark). This is Java code that looks just like any Java code you may write in your own projects, and it effortlessless compiles at >100,000 lines/second.
What About Build Tools?
Although the Java Compiler is blazing fast - compiling code at >100k lines/second and completing both Mockito-Core and Netty-Common in ~300ms - the experience of using typical Java build tools is nowhere near as snappy. Consider the benchmark of clean-compiling the Mockito-Core codebase using Gradle or Mill:
$ ./gradlew clean; time ./gradlew :classes --no-build-cache
4.14s
4.41s
4.41s
$ ./mill clean; time ./mill compile
1.20s
1.12s
1.30s
Or the benchmark of clean-compiling the Netty-Common codebase using Maven or Mill:
$ ./mvnw clean; time ./mvnw -pl common -Pfast -DskipTests -Dcheckstyle.skip -Denforcer.skip=true -Dmaven.test.skip=true install
4.85s
4.96s
4.89s
$ ./mill clean common; time ./mill common.compile
1.10s
1.12s
1.11s
These benchmarks are run in similar conditions as those we saw earlier: ad-hoc on my M1 Macbook Pro, with the metadata and jars of all third-party dependencies already downloaded and cached locally. So the time we are seeing above is purely the Java compilation + the overhead of the build tool realizing it doesn’t need to do anything except compile the Java source code using the dependencies we already have on disk.
Tabulating this all together gives us the table we saw at the start of this page:
Mockito Core |
Time |
Compiler lines/s |
Slowdown |
Netty Common |
Time |
Compiler lines/s |
Slowdown |
Javac Hot |
0.36s |
115,600 |
1.0x |
Javac Hot |
0.29s |
102,500 |
1.0x |
Javac Cold |
1.29s |
32,200 |
4.4x |
Javac Cold |
1.62s |
18,300 |
5.6x |
Mill |
1.20s |
34,700 |
4.1x |
Mill |
1.11s |
26,800 |
3.8x |
Gradle |
4.41s |
9,400 |
15.2x |
Maven |
4.89s |
6,100 |
16.9x |
We explore the comparison between Gradle vs Mill
or Maven vs Mill in more detail on their own dedicated pages.
For this article, the important thing is not comparing the build tools against each other,
but comparing the build tools against what how fast they could be if they just used
the javac
Java compiler directly. And it’s clear that compared to the actual work
done by javac
to actually compile your code, build tools add a frankly absurd amount
of overhead ranging from ~4x for Mill to 15-16x for Maven and Gradle!
Whole Project Compile Speed
One thing worth calling out is that the overhead of the various build tools does not appear to go down in larger builds. This Clean Compile Single-Module benchmark we explored above only deals with compiling a single small module. But a similar Sequential Clean Compile benchmarks which compiles the entire Mockito and Netty projects on a single core shows similar numbers for the various build tools:
-
Mill compiling at ~25,000 lines/s on both the above whole-project benchmarks
All of these are far below the 100,000 lines/s that we should expect from Java compilation, but they roughly line up with the numbers measured above. Again, these benchmarks are ad-hoc, on arbitrary hardware and JVM versions. They do include small amounts of other work, such as compiling C/C++ code in Netty or doing ad-hoc file operations in Mockito. However, most of the time is still spent in compilation, and this reinforces the early finding that build tools (especially older ones like Maven or Gradle) are indeed adding huge amounts of overhead on top of the extremely-fast Java compiler.
Conclusion
From this study we can see the paradox: the Java compiler is blazing fast,
while Java build tools are dreadfully slow. Something that should compile in a fraction
of a second using a warm javac
takes several seconds (15-16x longer) to
compile using Maven or Gradle. Mill does better, but even it adds 4x overhead and falls
short of the snappiness you would expect from a compiler that takes ~0.3s to compile the
30-40kLOC Java codebases we experimented with.
These benchmarks were run ad-hoc and on my laptop on arbitrary codebases, and the details will obviously differ depending on environment and the code in question. Running it on an entire codebase, rather than a single module, will give different results. Nevertheless, the results are clear: "typical" Java code should compile at ~100,000 lines/second on a single thread. Anything less is purely build-tool overhead from Maven, Gradle, or Mill.
Build tools do a lot more than the Java compiler. They do dependency management, parallelism, caching and invalidation, and all sorts of other auxiliary tasks. But in the common case where someone edits code and then compiles it, and all your dependencies are already downloaded and cached locally, any time doing other things and not spent actually compiling Java is pure overhead. Checking for cache invalidation in shouldn’t take 15-16x as long as actually compiling your code. I mean it obviously does today, but it shouldn’t!
The Mill build tool goes to great lengths to try and minimize overhead, and already gets ~4x faster builds than Maven or Gradle on real-world projects like Mockito or Netty. But there still is a long way to go give Java developers the fast, snappy experience that the underlying Java platform can provide. If Java build and compile times are things you find important, you should try out Mill on your own projects and get involved in the effort!