How Fast Does Java Compile?

Li Haoyi, 29 November 2024

Java compiles have the reputation for being slow, but that reputation does not match today’s reality. Nowadays the Java compiler can compile "typical" Java code at over 100,000 lines a second on a single core. That means that even a million line project should take more than 10s to compile in a single-threaded fashion, and should be even faster in the presence of parallelism

Doing some ad-hoc benchmarks, we find that although the compiler is blazing fast, all build tools add significant overhead over compiling Java directly:

Mockito Core

Time

Compiler lines/s

Slowdown

Netty Common

Time

Compiler lines/s

Slowdown

Javac Hot

0.36s

115,600

1.0x

Javac Hot

0.29s

102,500

1.0x

Javac Cold

1.29s

32,200

4.4x

Javac Cold

1.62s

18,300

5.6x

Mill

1.20s

34,700

4.1x

Mill

1.11s

26,800

3.8x

Gradle

4.41s

9,400

15.2x

Maven

4.89s

6,100

16.9x

Although Mill does the best in these benchmarks among the build tools (Maven, Gradle, and Mill), all build tools fall short of how fast compiling Java should be. This post explores how these numbers were arrived at, and what that means in un-tapped potential for Java build tooling to become truly great.

Mockito Core

To begin to understand the problem, lets consider the codebase of the popular Mockito project:

Mockito is a medium-sized Java project with a few dozen sub-modules and about ~100,000 lines of code. To give us a simple reproducible scenario, let’s consider the root mockito module with sources in src/main/java/, on which all the downstream module and tests depend on.

Mockito is built using Gradle. It’s not totally trivial to extract the compilation classpath from Gradle, but the following stackoverflow answer gives us some tips:

> ./gradlew clean && ./gradlew :classes --no-build-cache --debug | grep "classpath "

This gives us the following classpath:

export MY_CLASSPATH=/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy/1.14.18/81e9b9a20944626e6757b5950676af901c2485/byte-buddy-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/net.bytebuddy/byte-buddy-agent/1.14.18/417558ea01fe9f0e8a94af28b9469d281c4e3984/byte-buddy-agent-1.14.18.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/junit/junit/4.13.2/8ac9e16d933b6fb43bc7f576336b8f4d7eb5ba12/junit-4.13.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest-core/2.2/3f2bd07716a31c395e2837254f37f21f0f0ab24b/hamcrest-core-2.2.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.opentest4j/opentest4j/1.3.0/152ea56b3a72f655d4fd677fc0ef2596c3dd5e6e/opentest4j-1.3.0.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.objenesis/objenesis/3.3/1049c09f1de4331e8193e579448d0916d75b7631/objenesis-3.3.jar:/Users/lihaoyi/.gradle/caches/modules-2/files-2.1/org.hamcrest/hamcrest/2.2/1820c0968dba3a11a1b30669bb1f01978a91dedc/hamcrest-2.2.jar

Note that for this benchmark, all third-party dependencies have already been resolved and downloaded from Maven Central. We can thus simply reference the jars on disk directly, which we do above.

We can then pass this classpath into javac -cp, together with src/main/java/*/.java, to perform the compilation outside of Gradle using javac directly. Running this a few times gives us the timings below:

> time javac -cp $MY_CLASSPATH src/main/java/**/*.java
1.290s
1.250s
1.293s

To give us an idea of how many lines of code we are compiling, we can run:

> find src/main/java | grep \\.java | xargs wc -l
...
41601 total

Combining this information, we find that 41601 lines of code compiled in ~1.29 seconds (taking the median of the three runs above) suggests that javac compiles about ~32,000 lines of code per second.

These benchmarks were run ad-hoc on my laptop, an M1 10-core Macbook Pro, with OpenJDK Corretto 17.0.6. The numbers would differ on different Java versions, hardware, operating systems, and filesystems. Nevertheless, the overall trend is strong enough that you should be able to reproduce the results despite variations in the benchmarking environment.

Compiling 32,000 lines of code per second is not bad. But it is nowhere near how fast the Java compiler can run. Any software experience with JVM experience would know the next obvious optimization for us to explore.

Keeping the JVM Hot

One issue with the above benchmark is that it uses javac as a sub-process. The Java compiler runs on the Java Virtual Machine, and like any JVM application, it has a slow startup time, takes time warming-up, but then has good steady-state performance. Running javac from the command line and compiling ~32,000 lines/sec is thus the worst possible performance you could get out of the Java compiler on this Java codebase.

To get good performance out of javac, like any other JVM application, we need to keep it long-lived so it has a chance to warm up. While running the javac in a long-lived Java program is not commonly taught, neither is it particularly difficult. Here is a complete Bench.java file that does this, repeatedly running java compilation in a loop where it has a chance to warm up, to emulate the long lived JVM process that a build tool like Mill may spawn and manage. We use the same MY_CLASSPATH and source files we saw earlier and print the output statistics to the terminal so we can see how fast Java compilation can occur once things have a chance to warm up:

// Bench.java
import javax.tools.*;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.file.*;
import java.util.List;
import java.util.stream.Collectors;

public class Bench {
    public static void main(String[] args) throws Exception {
        while (true) {
            long now = System.currentTimeMillis();
            String classpath = System.getenv("MY_CLASSPATH");
            Path sourceFolder = Paths.get("src/main/java");

            List<JavaFileObject> files = Files.walk(sourceFolder)
                .filter(p -> p.toString().endsWith(".java"))
                .map(p ->
                    new SimpleJavaFileObject(p.toUri(), JavaFileObject.Kind.SOURCE) {
                        public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
                            return Files.readString(p);
                        }
                    }
                )
                .collect(Collectors.toList());

            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();

            StandardJavaFileManager fileManager = compiler
                .getStandardFileManager(null, null, null);

            // Run the compiler
            JavaCompiler.CompilationTask task = compiler.getTask(
                new OutputStreamWriter(System.out),
                fileManager,
                null,
                List.of("-classpath", classpath),
                null,
                files
            );

            System.out.println("Compile Result: " + task.call());
            long end = System.currentTimeMillis();
            long lineCount = Files.walk(sourceFolder)
                .filter(p -> p.toString().endsWith(".java"))
                .map(p -> {
                    try { return Files.readAllLines(p).size(); }
                    catch(Exception e){ throw new RuntimeException(e); }
                })
                .reduce(0, (x, y) -> x + y);
            System.out.println("Lines: " + lineCount);
            System.out.println("Duration: " + (end - now));
            System.out.println("Lines/second: " + lineCount / ((end - now) / 1000));
        }
    }
}

Running this using java Bench.java in the Mockito repo root, eventually we see it settle on approximately the following numbers:

359ms
378ms
353ms

The codebase hasn’t changed - we are still compiling 41,601 lines of code - but now it only takes ~359ms. That tells us that using a long-lived warm Java compiler we can compile approximately 116,000 lines of Java a second on a single core.

Compiling 116,000 lines of Java per second is very fast. That means we should expect a million-line Java codebase to compile in about 9 seconds, on a single thread. That may seem surprisingly fast, and you may be forgiven if you find it hard to believe. As mentioned earlier, this number is expected to vary based on the codebase being compiled; could it be that Mockito-Core just happens to be a very simple Java module that compiles quickly?

Double-checking Our Results

To double-check our results, we can pick another codebase to run some ad-hoc benchmarks. For this I will use the Netty codebase:

Netty is a large-ish Java project: ~500,000 lines of code. Again, to pick a somewhat easily-reproducible benchmark, we want a decently-sized module that’s relatively standalone within the project: netty-common is a perfect fit. Again, we can use find | grep | xargs to see how many lines of code we are looking at:

$ find common/src/main/java | grep \\.java | xargs wc -l
29712 total

Again, Maven doesn’t make it easy to show the classpath used to call javac ourselves, but the following stackoverflow answer gives us a hint in how to do so:

> ./mvnw clean; time ./mvnw -e -X -pl common -Pfast -DskipTests  -Dcheckstyle.skip -Denforcer.skip=true install

If you grep the output for -classpath, we see:

-classpath /Users/lihaoyi/Github/netty/common/target/classes:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/svm/19.3.6/svm-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/sdk/graal-sdk/19.3.6/graal-sdk-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/objectfile/19.3.6/objectfile-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/nativeimage/pointsto/19.3.6/pointsto-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-nfi/19.3.6/truffle-nfi-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/truffle/truffle-api/19.3.6/truffle-api-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/graalvm/compiler/compiler/19.3.6/compiler-19.3.6.jar:/Users/lihaoyi/.m2/repository/org/jctools/jctools-core/4.0.5/jctools-core-4.0.5.jar:/Users/lihaoyi/.m2/repository/org/jetbrains/annotations-java5/23.0.0/annotations-java5-23.0.0.jar:/Users/lihaoyi/.m2/repository/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar:/Users/lihaoyi/.m2/repository/commons-logging/commons-logging/1.2/commons-logging-1.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-1.2-api/2.17.2/log4j-1.2-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/org/apache/logging/log4j/log4j-api/2.17.2/log4j-api-2.17.2.jar:/Users/lihaoyi/.m2/repository/io/projectreactor/tools/blockhound/1.0.6.RELEASE/blockhound-1.0.6.RELEASE.jar

Again, we can export MY_CLASSPATH and start using javac from the command line:

> javac -cp $MY_CLASSPATH common/src/main/java/**/*.java
1.624s
1.757s
1.606s

Or programmatically using the Bench.java program we saw earlier:

294ms
282ms
285ms

Taking 285ms for a hot-in-memory compile of 29,712 lines of code, netty-common therefore compiles at ~104,000 lines/second.

Although the choice of project is arbitrary, Mockito-Core and Netty-Common are decent examples of Java code found "out in the wild". They aren’t synthetic fake codebases generated for the purpose of benchmarks, nor are they particularly unusual or idiosyncratic. They follow most Java best practices and adhere to many of the most common Java linters (although those were disabled for this performance benchmark). This is Java code that looks just like any Java code you may write in your own projects, and it effortlessless compiles at >100,000 lines/second.

What About Build Tools?

Although the Java Compiler is blazing fast - compiling code at >100k lines/second and completing both Mockito-Core and Netty-Common in ~300ms - the experience of using typical Java build tools is nowhere near as snappy. Consider the benchmark of clean-compiling the Mockito-Core codebase using Gradle or Mill:

$ ./gradlew clean; time ./gradlew :classes --no-build-cache
4.14s
4.41s
4.41s

$ ./mill clean; time ./mill compile
1.20s
1.12s
1.30s

Or the benchmark of clean-compiling the Netty-Common codebase using Maven or Mill:

$ ./mvnw clean; time ./mvnw -pl common -Pfast -DskipTests  -Dcheckstyle.skip -Denforcer.skip=true -Dmaven.test.skip=true install
4.85s
4.96s
4.89s

$ ./mill clean common; time ./mill common.compile
1.10s
1.12s
1.11s

These benchmarks are run in similar conditions as those we saw earlier: ad-hoc on my M1 Macbook Pro, with the metadata and jars of all third-party dependencies already downloaded and cached locally. So the time we are seeing above is purely the Java compilation + the overhead of the build tool realizing it doesn’t need to do anything except compile the Java source code using the dependencies we already have on disk.

Tabulating this all together gives us the table we saw at the start of this page:

Mockito Core

Time

Compiler lines/s

Slowdown

Netty Common

Time

Compiler lines/s

Slowdown

Javac Hot

0.36s

115,600

1.0x

Javac Hot

0.29s

102,500

1.0x

Javac Cold

1.29s

32,200

4.4x

Javac Cold

1.62s

18,300

5.6x

Mill

1.20s

34,700

4.1x

Mill

1.11s

26,800

3.8x

Gradle

4.41s

9,400

15.2x

Maven

4.89s

6,100

16.9x

We explore the comparison between Gradle vs Mill or Maven vs Mill in more detail on their own dedicated pages. For this article, the important thing is not comparing the build tools against each other, but comparing the build tools against what how fast they could be if they just used the javac Java compiler directly. And it’s clear that compared to the actual work done by javac to actually compile your code, build tools add a frankly absurd amount of overhead ranging from ~4x for Mill to 15-16x for Maven and Gradle!

Whole Project Compile Speed

One thing worth calling out is that the overhead of the various build tools does not appear to go down in larger builds. This Clean Compile Single-Module benchmark we explored above only deals with compiling a single small module. But a similar Sequential Clean Compile benchmarks which compiles the entire Mockito and Netty projects on a single core shows similar numbers for the various build tools:

All of these are far below the 100,000 lines/s that we should expect from Java compilation, but they roughly line up with the numbers measured above. Again, these benchmarks are ad-hoc, on arbitrary hardware and JVM versions. They do include small amounts of other work, such as compiling C/C++ code in Netty or doing ad-hoc file operations in Mockito. However, most of the time is still spent in compilation, and this reinforces the early finding that build tools (especially older ones like Maven or Gradle) are indeed adding huge amounts of overhead on top of the extremely-fast Java compiler.

Conclusion

From this study we can see the paradox: the Java compiler is blazing fast, while Java build tools are dreadfully slow. Something that should compile in a fraction of a second using a warm javac takes several seconds (15-16x longer) to compile using Maven or Gradle. Mill does better, but even it adds 4x overhead and falls short of the snappiness you would expect from a compiler that takes ~0.3s to compile the 30-40kLOC Java codebases we experimented with.

These benchmarks were run ad-hoc and on my laptop on arbitrary codebases, and the details will obviously differ depending on environment and the code in question. Running it on an entire codebase, rather than a single module, will give different results. Nevertheless, the results are clear: "typical" Java code should compile at ~100,000 lines/second on a single thread. Anything less is purely build-tool overhead from Maven, Gradle, or Mill.

Build tools do a lot more than the Java compiler. They do dependency management, parallelism, caching and invalidation, and all sorts of other auxiliary tasks. But in the common case where someone edits code and then compiles it, and all your dependencies are already downloaded and cached locally, any time doing other things and not spent actually compiling Java is pure overhead. Checking for cache invalidation in shouldn’t take 15-16x as long as actually compiling your code. I mean it obviously does today, but it shouldn’t!

The Mill build tool goes to great lengths to try and minimize overhead, and already gets ~4x faster builds than Maven or Gradle on real-world projects like Mockito or Netty. But there still is a long way to go give Java developers the fast, snappy experience that the underlying Java platform can provide. If Java build and compile times are things you find important, you should try out Mill on your own projects and get involved in the effort!