The Mill Build Engineering Blog
Welcome to the Mill build engineering blog! This is the home for articles on technical topics related to JVM platform tooling and language-agnostic build tooling, some specific to the Mill build tool but mostly applicable to anyone working on build tooling for large codebases in JVM and non-JVM languages.
Understanding JVM Garbage Collector Performance
Li Haoyi, 10 January 2025
Garbage collectors are a core part of many programming languages. While they generally work well, on occasion when they go wrong they can fail in very unintuitive ways. This article will discuss the fundamental design of how garbage collectors work, and tie it to real benchmarks of how GCs perform on the Java Virtual Machine. You should come away with a deeper understanding of how the JVM garbage collector works and concrete ways you can work to improve its performance in your own real-world projects.
How JVM Executable Assembly Jars Work
Li Haoyi, 2 January 2025
One feature of the Mill JVM build tool is that the assembly jars it creates are directly executable:
> ./mill show foo.assembly # generate the assembly jar
"ref:v0:bd2c6c70:/Users/lihaoyi/test/out/foo/assembly.dest/out.jar"
> out/foo/assembly.dest/out.jar # run the assembly jar directly
Hello World
Other JVM build tools can also generate assemblies, but most need you to run them via java -jar or java -cp, or require you to use jlink or jpackage, which are much more heavyweight and troublesome to set up. Mill automates that, and while not groundbreaking, it is a nice convenience that makes your JVM code built with Mill fit more nicely into the command-line centric workflows common in modern software systems.
This article will discuss how Mill’s executable assemblies are implemented, so perhaps other build tools and toolchains will be able to provide the same convenience.
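As a rough illustration of one common way to make a jar directly executable (an assumption on our part, not a description confirmed by this summary): zip archives keep their index at the end of the file, so a small shell launcher can be prepended without stopping zip readers from opening it. The sketch below uses Python's standard zipfile module as a stand-in for a jar reader; the LAUNCHER text and the helper names make_zip and prepend_launcher are hypothetical.

```python
import io
import zipfile

# Hypothetical launcher stub: when the file is run from a shell, this line
# re-invokes the JVM on the very same file ("$0"), forwarding all arguments.
LAUNCHER = b'#!/usr/bin/env sh\nexec java -jar "$0" "$@"\n'


def make_zip(entries: dict[str, bytes]) -> bytes:
    """Build an in-memory zip archive, standing in for an assembly jar."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(name, data)
    return buf.getvalue()


def prepend_launcher(jar_bytes: bytes, launcher: bytes = LAUNCHER) -> bytes:
    """Return the launcher script followed by the unmodified archive bytes."""
    return launcher + jar_bytes


# Illustrative contents only; a real assembly jar would also hold class files.
jar = make_zip({"META-INF/MANIFEST.MF": b"Main-Class: foo.Main\n"})
executable = prepend_launcher(jar)

# The archive is still readable even though it no longer starts with zip data.
with zipfile.ZipFile(io.BytesIO(executable)) as zf:
    names = zf.namelist()
    manifest = zf.read("META-INF/MANIFEST.MF")
```

In practice the resulting file would also need its executable bit set (for example with chmod +x) before a shell would run it directly.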
How To Manage Flaky Tests in your CI Workflows
Li Haoyi, 1 January 2025
Many projects suffer from the problem of flaky tests: tests that pass or fail non-deterministically. These cause confusion, slow development cycles, and endless arguments between individuals and teams in an organization.
This article dives deep into working with flaky tests, from the perspective of someone who built the first flaky test management systems at Dropbox and Databricks and maintained the related build and CI workflows over the past decade. The issue of flaky tests can be surprisingly unintuitive, with many "obvious" approaches being ineffective or counterproductive. But it turns out there are right and wrong answers to many of these issues, and we will discuss both so you can better understand what managing flaky tests is all about.
Faster CI with Selective Testing
Li Haoyi, 24 December 2024
Selective testing is a key technique necessary for working with any large codebase or monorepo: picking which tests to run to validate a change or pull-request, because running every test every time is costly and slow. This blog post will explore what selective testing is all about and the different approaches you can take, based on my experience working on developer tooling and CI for the last decade at Dropbox and Databricks. Lastly, we will discuss how the Mill build tool supports selective testing.
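To make the idea concrete, here is a minimal sketch of one possible approach: given a module dependency graph and the set of modules touched by a change, test only those modules and everything downstream of them. This is a generic illustration rather than how Mill or any particular tool implements it; the function name affected_modules and the example module names are hypothetical.

```python
from collections import deque


def affected_modules(deps: dict[str, list[str]], changed: list[str]) -> set[str]:
    """Return the changed modules plus every module that transitively depends on them.

    `deps` maps each module to the modules it depends on directly.
    """
    # Invert the graph so we can walk from a module to its dependents.
    downstream: dict[str, set[str]] = {}
    for module, upstreams in deps.items():
        downstream.setdefault(module, set())
        for upstream in upstreams:
            downstream.setdefault(upstream, set()).add(module)

    # Breadth-first traversal starting from the changed modules.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in downstream.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical project layout: web -> api -> core, with docs standing alone.
deps = {"core": [], "api": ["core"], "web": ["api"], "docs": []}
to_test_api_change = affected_modules(deps, ["api"])
to_test_core_change = affected_modules(deps, ["core"])
```

With this layout, a change to api selects only api and web for testing, while a change to core selects core, api, and web; docs is skipped in both cases.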
Why Use a Monorepo Build Tool?
Li Haoyi, 17 December 2024
Software build tools mostly fall into two categories:
One question that comes up constantly is why people use monorepo build tools. Tools like Bazel are orders of magnitude more complicated and harder to use than tools like Poetry or Cargo, so why do people use them at all?
How Fast Does Java Compile?
Li Haoyi, 29 November 2024
Java compiles have a reputation for being slow, but that reputation does not match today’s reality. Nowadays the Java compiler can compile "typical" Java code at over 100,000 lines a second on a single core. That means that even a million line project should take only about 10s to compile in a single-threaded fashion, and should be even faster in the presence of parallelism.
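Taking the 100,000 lines per second figure at face value, the single-core estimate can be checked with a line of arithmetic. The helper name compile_seconds is hypothetical, and the rate is the rough figure quoted above rather than a measurement.

```python
def compile_seconds(lines_of_code: int, lines_per_second: int = 100_000) -> float:
    """Estimate single-threaded compile time from a flat lines-per-second rate."""
    return lines_of_code / lines_per_second


# A million-line project at 100,000 lines per second on one core.
estimate = compile_seconds(1_000_000)
```

Since the quoted rate is a lower bound ("over 100,000 lines a second"), the result of 10 seconds is best read as an upper bound on the single-threaded compile time.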