Bundled Libraries

Mill comes bundled with a set of external Open Source libraries and projects.

OS-Lib

Mill uses OS-Lib for all of its file system and subprocess operations.

Sandbox Working Directories

One thing to note about Mill’s usage of OS-Lib is that Mill sets os.pwd, the working directory for filesystem operations and subprocesses, to each task’s .dest folder, as part of its Mill Sandboxing efforts to prevent accidental interference between tasks:

build.mill
import mill.*

def task1 = Task {
  os.write(os.pwd / "file.txt", "hello")
  PathRef(os.pwd / "file.txt")
}

def task2 = Task {
  os.call(("bash", "-c", "echo 'world' >> file.txt"))
  PathRef(os.pwd / "file.txt")
}

def command = Task {
  println(task1().path)
  println(os.read(task1().path))
  println(task2().path)
  println(os.read(task2().path))
}

Although both task1 and task2 above write to os.pwd / "file.txt" (one via os.write and one via a Bash subprocess), each task gets its own working directory, which prevents the files from colliding on disk. The final command can therefore depend on both tasks and read each task’s file.txt separately without conflict.

Path Serialization

Task return values are cached as JSON, so any os.Path/PathRef in them is serialized to a string. To keep out/ independent of where the build lives on disk (so caches stay portable and can feed a remote cache), Mill rewrites paths under the workspace or home directory as relative aliases, ../mill-workspace/… and ../mill-home/…, and plants matching forwarder symlinks so they resolve back to the real files. This affects os.Path.toString, .toIO, .toNIO, and any serialized os.Path/PathRef. Within a task you never notice it: the aliases round-trip transparently.

When to reach for custom serialization

An alias only resolves from a cwd where Mill planted its forwarders, which it does for just three cwd shapes: task .dest folders and the run sandbox (which use ../mill-), and the workspace root (which uses out/mill-, since ../mill- would escape the workspace). So when serializing a path for a subprocess, the correct string depends on that subprocess’s cwd, not the current task’s, and plain .toString is often wrong. Pick the PathRef.to* helper for where the subprocess runs and what it accepts:

PathRef.toRelString(path, subprocessCwd)

The reproducible form: serializes path using the alias scheme valid at subprocessCwd (out/mill- at the workspace root, ../mill- in a .dest/sandbox), falling back to an absolute path when subprocessCwd has no forwarders. This is the only custom form that stays reproducible (it never bakes in an absolute path), so prefer it whenever the subprocess runs in a directory Mill aliases and accepts relative paths. Reach for it specifically when the subprocess’s cwd differs from the current task’s — most commonly a tool you spawn at the workspace root, whose out/mill- aliases differ from the task’s ../mill-, so the default .toString would be wrong. (workspaceRoot is the default subprocessCwd, matching that common case.)

PathRef.toAbsString(path) / PathRef.toAbsFile(path) / PathRef.toAbsNioPath(path)

The lexically absolute path — made absolute and ..-normalized textually, without touching the disk, so any symlinks are left intact — as a String, java.io.File, or java.nio.file.Path. Use these when the subprocess runs in some directory Mill does not alias (so a relative alias can’t resolve there), or treats relative and absolute paths differently (resolving relatives against a base other than its cwd, or accepting only absolute paths).

PathRef.toResolvedPathString(path) / PathRef.toResolvedOsPath(path)

The same, but additionally run through toRealPath, so symlinks are followed to the canonical on-disk location. The difference from the toAbs* forms only shows up when the path traverses a symlink (one of Mill’s mill-* forwarders, or an OS symlink such as macOS’s /tmp, which points to /private/tmp) and either the consumer canonicalizes paths itself (e.g. Node’s module resolution or native-image) or you compare two paths for equality; a merely-lexical absolute path would still contain the symlink and mismatch the real one. Falls back to the lexical absolute form when the path doesn’t exist on disk.
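The lexical-versus-resolved distinction can be illustrated without Mill at all. The sketch below uses only java.nio; the object and method names are made up for illustration and are not the PathRef helpers themselves. It assumes a platform where creating symlinks is permitted (for example macOS or Linux).

```scala
import java.nio.file.{Files, Path}

// Plain-JDK illustration; none of these names are Mill APIs.
object LexicalVsResolved {
  /** Made absolute and ..-normalized textually; symlinks stay in the path. */
  def lexical(p: Path): Path = p.toAbsolutePath.normalize()

  /** Symlinks followed to the canonical on-disk location (the path must exist). */
  def resolved(p: Path): Path = p.toRealPath()

  /** Creates real/file.txt plus a symlink link -> real, then views the file through the link. */
  def demo(): (Path, Path) = {
    // Resolve the temp root up front so OS-level symlinks do not skew the comparison
    val root = Files.createTempDirectory("alias-demo").toRealPath()
    val realDir = Files.createDirectory(root.resolve("real"))
    Files.write(realDir.resolve("file.txt"), "hello".getBytes("UTF-8"))
    val link = Files.createSymbolicLink(root.resolve("link"), realDir)
    val viaLink = link.resolve("file.txt")
    (lexical(viaLink), resolved(viaLink))
  }

  def main(args: Array[String]): Unit = {
    val (lex, res) = demo()
    println(lex)                        // ends in link/file.txt
    println(res)                        // ends in real/file.txt
    println(lex == res)                 // false: the strings differ
    println(Files.isSameFile(lex, res)) // true: same file on disk
  }
}
```

A consumer that compares paths as strings would treat the two results as different files, which is the situation the resolved forms are meant to avoid.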

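Running the command from the Sandbox Working Directories example above prints each task’s path and file contents:

> ./mill command # mac/linux
.../out/task1.dest/file.txt
hello
.../out/task2.dest/file.txt
world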

uPickle

Mill uses uPickle to cache task output to disk as JSON, and to output JSON for third-party tools to consume. The output of every Mill task must be JSON serializable via uPickle.

The uPickle-serialized return value of every Mill task is used for multiple purposes:

  • As the format for caching results on disk

  • As the output format for show, which can be used for manual inspection or piped to external tools

  • To decide whether downstream results can be read from the cache or need to be recomputed

Primitives and Collections

Most Scala primitive types (Strings, Ints, Booleans, etc.) and collection types (Seqs, Lists, Tuples, etc.) are serializable by default.

build.mill
import mill.*

def taskInt = Task { 123 }
def taskBoolean = Task { true }
def taskString = Task { "hello " + taskInt() + " world " + taskBoolean() }

> ./mill show taskInt
123

> ./mill show taskBoolean
true

> ./mill show taskString
"hello 123 world true"

def taskTuple = Task { (taskInt(), taskBoolean(), taskString()) }
def taskSeq = Task { Seq(taskInt(), taskInt() * 2, taskInt() * 3) }
def taskMap = Task { Map("int" -> taskInt().toString, "boolean" -> taskBoolean().toString) }

> ./mill show taskTuple
[
  123,
  true,
  "hello 123 world true"
]

> ./mill show taskSeq
[
  123,
  246,
  369
]

> ./mill show taskMap
{
  "int": "123",
  "boolean": "true"
}
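Types of your own are not covered by the defaults, so a task returning one needs a uPickle ReadWriter in scope. The following is a minimal sketch, assuming a Scala 3 build file (as the import mill.* syntax suggests) and uPickle's derives support; Summary and taskSummary are hypothetical names chosen for illustration.

```scala
import mill.*
import upickle.default.ReadWriter

// Hypothetical type; `derives ReadWriter` asks uPickle to generate the
// JSON serializer that Mill needs for the task's return value.
case class Summary(count: Int, label: String) derives ReadWriter

// Returns the custom type; the result is cached as JSON like any other task output.
def taskSummary = Task { Summary(3, "demo") }
```

With this in place, ./mill show taskSummary should print the case class as a JSON object with its two fields.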

Paths and PathRef

os.Paths from OS-Lib are also serializable as strings.

def taskPath = Task {
  os.write(os.pwd / "file.txt", "hello")
  os.pwd / "file.txt"
}

> ./mill show taskPath
".../out/taskPath.dest/file.txt"

Note that returning an os.Path from a task will only invalidate downstream tasks on changes to the path itself (e.g. from returning file.txt to file2.txt), and not to changes to the contents of any file or folder at that path. If you want to invalidate downstream tasks depending on the contents of a file or folder, you should return a PathRef:

def taskPathRef = Task {
  os.write(os.pwd / "file.txt", "hello")
  PathRef(os.pwd / "file.txt")
}

> ./mill show taskPathRef
"ref.../out/taskPathRef.dest/file.txt"

The serialized PathRef contains a hexadecimal hash signature of the file or folder referenced on disk, computed from its contents.

Requests-Scala

Mill bundles Requests-Scala for you to use to make HTTP requests. Requests-Scala lets you integrate your build with the world beyond your local filesystem.

Requests-Scala is mostly used in Mill for downloading files as part of your build. These can either be data files or executables, and in either case they are downloaded once and cached for later use.

Downloading Compilers and Source Code

In the example below, we download the Remote APIs source zip, download a Bazel Build Tool binary, and use Bazel to compile the Remote APIs source code as part of our build:

build.mill
import mill.*

def remoteApisZip = Task {
  println("downloading bazel-remote-apis sources...")
  os.write(
    Task.dest / "source.zip",
    requests.get("https://github.com/bazelbuild/remote-apis/archive/refs/tags/v2.2.0.zip")
  )
  PathRef(Task.dest / "source.zip")
}

def bazel = Task {
  println("downloading bazel...")
  val fileName =
    if (System.getProperty("os.name") == "Mac OS X") "bazel-5.4.1-darwin-arm64"
    else "bazel-5.4.1-linux-x86_64"

  os.write(
    Task.dest / "bazel",
    requests.get(s"https://github.com/bazelbuild/bazel/releases/download/5.4.1/$fileName")
  )
  os.perms.set(Task.dest / "bazel", "rwxrwxrwx")
  PathRef(Task.dest / "bazel")
}

def compiledRemoteApis = Task {
  val javaBuildTarget = "build/bazel/remote/execution/v2:remote_execution_java_proto"
  os.call(("unzip", remoteApisZip().path, "-d", Task.dest))
  os.call((bazel().path, "build", javaBuildTarget), cwd = Task.dest / "remote-apis-2.2.0")

  val queried = os.call(
    (bazel().path, "cquery", javaBuildTarget, "--output=files"),
    cwd = Task.dest / "remote-apis-2.2.0"
  )

  queried
    .out
    .lines()
    .map(line => PathRef(Task.dest / "remote-apis-2.2.0" / os.SubPath(line)))
}

In the execution example below, we can see the first time we ask for compiledRemoteApis, Mill downloads the Bazel build tool, downloads the Remote APIs source code, and then invokes Bazel to compile them:

> ./mill show compiledRemoteApis
downloading bazel...
downloading bazel-remote-apis sources...
Loading: ...
Analyzing: ...
...
INFO: Build completed successfully...
[
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]

However, in subsequent evaluations of compiledRemoteApis, the two downloads and the Bazel invocation are skipped and the earlier output is re-used directly and immediately:

> ./mill show compiledRemoteApis
[
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
  "ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]

The various tasks will only be re-evaluated if there are code changes in your build.mill file that affect them.

In general, using requests.get to download files as part of your build is only safe as long as the files you download are immutable. Mill cannot know whether the remote HTTP endpoint has been changed or not. However, empirically most URLs you may want to download files from do turn out to be immutable: from package repositories, artifact servers, and so on. So this works out surprisingly well in practice.
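If you would rather not rely on a URL staying immutable, one option is to pin a checksum in the build and fail when the downloaded bytes no longer match. The sketch below is plain JDK code; Checksums, sha256Hex and verified are hypothetical helpers for illustration, not part of Mill or Requests-Scala.

```scala
import java.security.MessageDigest

// Hypothetical helper for illustration; not a Mill or Requests-Scala API.
object Checksums {
  /** Lowercase hex-encoded SHA-256 digest of the given bytes. */
  def sha256Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256")
      .digest(bytes)
      .map(b => "%02x".format(b & 0xff))
      .mkString

  /** Returns the bytes unchanged if they match the pinned digest, otherwise throws. */
  def verified(bytes: Array[Byte], expectedSha256: String): Array[Byte] = {
    val actual = sha256Hex(bytes)
    require(
      actual == expectedSha256.toLowerCase,
      s"checksum mismatch: expected $expectedSha256, got $actual"
    )
    bytes
  }
}
```

Inside a download task, the bytes of the response (for example requests.get(url).bytes) could be passed through Checksums.verified before being handed to os.write, so that a changed remote file fails the build instead of being cached silently.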

MainArgs

Mill uses MainArgs to handle argument parsing for Task.Commands that are run from the command line.

build.mill
import mill.*

def commandSimple(str: String, i: Int, bool: Boolean = true) = Task.Command {
  println(s"$str $i $bool")
}

Mill uses MainArgs to let you parse most common Scala primitive types as command parameters: Strings, Ints, Booleans, etc. Single-character parameter names are treated as short arguments called with one dash - rather than two dashes --. Default values work as you would expect, and are substituted in if a value is not given at the command line:

> ./mill commandSimple --str hello -i 123
hello 123 true

os.Path

In addition to the builtin set of types that MainArgs supports, Mill also supports parsing OS-Lib os.Paths from the command line:

def commandTakingPath(path: os.Path) = Task.Command {
  println(path)
}

> ./mill commandTakingPath --path foo/bar/baz.txt
...foo/bar/baz.txt

Task

Mill allows commands to take Task[T]s as parameters anywhere they can take an unboxed T. This can be handy later on if you want to call the command as part of another task, while passing it the value of an upstream task:

def commandTakingTask(str: Task[String]) = Task.Command {
  val result = "arg: " + str()
  println(result)
  result
}

> ./mill commandTakingTask --str helloworld
arg: helloworld

def upstreamTask = Task {
  "HELLO"
}

def taskCallingCommand = Task {
  commandTakingTask(upstreamTask)()
}

> ./mill show taskCallingCommand
"arg: HELLO"

Evaluator (experimental)

Evaluator commands are experimental and subject to change. See issue #502 for details.

You can define a command that takes in the current Evaluator as an argument, which you can use to inspect the entire build or run arbitrary tasks. For example, here is a customPlanCommand command which uses the Evaluator to traverse the module tree, find the tasks named by the tasks strings, and plan out what would be necessary to run them:

import mill.api.{SelectMode, Evaluator}

def customPlanCommand(evaluator: Evaluator, tasks: String*) = Task.Command(exclusive = true) {
  val resolved = evaluator
    .resolveTasks(tasks, SelectMode.Multi)
    .get

  val plan = evaluator.plan(resolved)
    .sortedGroups
    .keys()
    .map(_.toString)
    .toArray

  plan.foreach(println)
  ()
}

We can call our customPlanCommand from the command line and pass it the taskCallingCommand we saw earlier, and it prints out the list of tasks it needs to run in the order necessary to reach taskCallingCommand:

> ./mill customPlanCommand taskCallingCommand
upstreamTask
commandTakingTask
taskCallingCommand

Many built-in tools are implemented as custom evaluator commands: inspect, resolve, show. If you want a way to run Mill commands and programmatically manipulate the tasks and outputs, you can do so with your own evaluator command.

Coursier

Coursier is the Scala application and artifact manager. Mill uses Coursier for all third-party artifact resolution and management in JVM languages (Scala, Java, etc.).
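In practice you rarely call Coursier directly: declaring a dependency on a module is enough for Mill to hand the coordinates to Coursier. The sketch below assumes a Mill version that uses the mvnDeps and mvn"..." names (older releases spell these ivyDeps and ivy"..."); the module name app and the Guava coordinates are chosen purely for illustration.

```scala
import mill.*, javalib.*

// Hypothetical module; the dependency below is only an example.
object app extends JavaModule {
  // Coursier resolves this artifact and its transitive dependencies,
  // downloading them into its local cache on first use.
  def mvnDeps = Seq(mvn"com.google.guava:guava:33.0.0-jre")
}
```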