Bundled Libraries
Mill comes bundled with a set of external Open Source libraries and projects.
OS-Lib
Mill uses OS-Lib for all of its file system and subprocess operations.
Sandbox Working Directories
One thing to note about Mill’s usage of OS-Lib is that Mill sets os.pwd, the
working directory for filesystem operations and subprocesses, to each task’s .dest folder,
as part of its Mill Sandboxing efforts to prevent accidental
interference between tasks:
import mill.*
def task1 = Task {
os.write(os.pwd / "file.txt", "hello")
PathRef(os.pwd / "file.txt")
}
def task2 = Task {
os.call(("bash", "-c", "echo 'world' >> file.txt"))
PathRef(os.pwd / "file.txt")
}
def command = Task {
println(task1().path)
println(os.read(task1().path))
println(task2().path)
println(os.read(task2().path))
}
Although both task1 and task2 above write to os.pwd / "file.txt"
(one via os.write and one via a Bash subprocess), each task gets its own
working directory, which prevents the files from colliding on disk. The final
command can therefore depend on both tasks and read each task’s file.txt separately
without conflict.
Path Serialization
Task return values are cached as JSON, so any os.Path/PathRef in them is
serialized to a string. To keep out/ independent of where the build lives on disk
(so caches stay portable and can feed a remote cache),
Mill rewrites paths under the workspace or home directory as relative aliases — ../mill-workspace/… and ../mill-home/… — and plants matching forwarder
symlinks so they resolve back to the real files. This affects os.Path.toString,
.toIO, .toNIO, and any serialized os.Path/PathRef. Within a task you never
notice it: the aliases round-trip transparently.
When to reach for custom serialization
An alias only resolves from a cwd where Mill planted its forwarders, which it does for
just three cwd shapes: task .dest folders and the run sandbox (which use ../mill-*),
and the workspace root (which uses out/mill-*, since ../mill-* would escape the
workspace). So when serializing a path for a subprocess, the correct string depends on
that subprocess’s cwd, not the current task’s, and plain .toString is often wrong.
Pick the PathRef.to* helper for where the subprocess runs and what it accepts:
PathRef.toRelString(path, subprocessCwd)
The reproducible form: serializes path using the alias scheme valid at subprocessCwd (out/mill-* at the workspace root, ../mill-* in a .dest/sandbox), falling back to an absolute path when subprocessCwd has no forwarders. This is the only custom form that stays reproducible (it never bakes in an absolute path), so prefer it whenever the subprocess runs in a directory Mill aliases and accepts relative paths. Reach for it specifically when the subprocess’s cwd differs from the current task’s: most commonly a tool you spawn at the workspace root, whose out/mill-* aliases differ from the task’s ../mill-*, so the default .toString would be wrong. (workspaceRoot is the default subprocessCwd, matching that common case.)

PathRef.toAbsString(path) / PathRef.toAbsFile(path) / PathRef.toAbsNioPath(path)
The lexically absolute path (made absolute and ..-normalized textually, without touching the disk, so any symlinks are left intact) as a String, java.io.File, or java.nio.file.Path. Use these when the subprocess runs in some directory Mill does not alias (so a relative alias can’t resolve there), or treats relative and absolute paths differently (resolving relatives against a base other than its cwd, or accepting only absolute paths).

PathRef.toResolvedPathString(path) / PathRef.toResolvedOsPath(path)
The same, but additionally run through toRealPath, so symlinks are followed to the canonical on-disk location. The difference from the toAbs* forms only shows up when the path traverses a symlink (one of Mill’s mill-* forwarders, or an OS symlink like macOS’s /tmp → /private/tmp) and either the consumer canonicalizes paths itself (e.g. Node’s module resolution or native-image) or you compare two paths for equality; a merely lexical absolute path would still contain the symlink and mismatch the real one. Falls back to the lexical absolute form when the path doesn’t exist on disk.
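The gap between a lexical absolute path and a resolved one can be reproduced with plain java.nio, independent of Mill. The sketch below is not Mill's implementation: lexicalAbs and resolvedAbs are hypothetical helpers that mimic the behaviour described above, and the temp-directory layout is made up for the demo. It assumes a filesystem and OS account that permit creating symlinks.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper: absolute and `..`-normalized textually, never touching the disk.
def lexicalAbs(p: Path): Path = p.toAbsolutePath.normalize()

// Hypothetical helper: follows symlinks via toRealPath, falling back to the
// lexical form when the path does not exist on disk.
def resolvedAbs(p: Path): Path =
  if (Files.exists(p)) p.toRealPath() else lexicalAbs(p)

@main def pathDemo(): Unit = {
  // Made-up layout: <tmp>/realDir/file.txt, plus a symlink <tmp>/link -> <tmp>/realDir.
  val base = Files.createTempDirectory("path-demo").toRealPath()
  val realDir = Files.createDirectory(base.resolve("realDir"))
  Files.writeString(realDir.resolve("file.txt"), "hello")
  val link = Files.createSymbolicLink(base.resolve("link"), realDir)

  val viaLink = link.resolve("file.txt")
  println(lexicalAbs(viaLink))  // still contains the `link` segment
  println(resolvedAbs(viaLink)) // points into `realDir`
  // Same file on disk, but the two paths do not compare equal.
  println(lexicalAbs(viaLink) == resolvedAbs(viaLink))
}
```

A consumer that canonicalizes paths internally would see the second form, which is why passing it the first form can produce mismatches.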
> ./mill command # mac/linux
.../out/task1.dest/file.txt
hello
.../out/task2.dest/file.txt
world
uPickle
Mill uses uPickle to cache task output to disk as JSON, and to output JSON for third-party tools to consume. The output of every Mill task must be JSON serializable via uPickle.
The uPickle-serialized return value of every Mill task is used for multiple purposes:
- As the format for caching things on disk
- As the output format for show, which can be used for manual inspection or piped to external tools
- To decide whether downstream results can be read from the cache or whether they need to be recomputed
Primitives and Collections
Most Scala primitive types (Strings, Ints, Booleans, etc.) and
collection types (Seqs, Lists, Tuples, etc.) are serializable by default.
import mill.*
def taskInt = Task { 123 }
def taskBoolean = Task { true }
def taskString = Task { "hello " + taskInt() + " world " + taskBoolean() }
> ./mill show taskInt
123
> ./mill show taskBoolean
true
> ./mill show taskString
"hello 123 world true"
def taskTuple = Task { (taskInt(), taskBoolean(), taskString()) }
def taskSeq = Task { Seq(taskInt(), taskInt() * 2, taskInt() * 3) }
def taskMap = Task { Map("int" -> taskInt().toString, "boolean" -> taskBoolean().toString) }
> ./mill show taskTuple
[
123,
true,
"hello 123 world true"
]
> ./mill show taskSeq
[
123,
246,
369
]
> ./mill show taskMap
{
"int": "123",
"boolean": "true"
}
Paths and PathRef
os.Paths from OS-Lib are also serializable as strings.
def taskPath = Task {
os.write(os.pwd / "file.txt", "hello")
os.pwd / "file.txt"
}
> ./mill show taskPath
".../out/taskPath.dest/file.txt"
Note that returning an os.Path from a task will only invalidate downstream
tasks on changes to the path itself (e.g. from returning file.txt to file2.txt),
and not to changes to the contents of any file or folder at that path. If you want
to invalidate downstream tasks depending on the contents of a file or folder, you
should return a PathRef:
def taskPathRef = Task {
os.write(os.pwd / "file.txt", "hello")
PathRef(os.pwd / "file.txt")
}
> ./mill show taskPathRef
"ref.../out/taskPathRef.dest/file.txt"
The serialized PathRef contains a hexadecimal hash signature of the file or
folder referenced on disk, computed from its contents.
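Since every task's return value must be serializable via uPickle, a task that returns your own data type needs a uPickle ReadWriter for it. The following is a minimal sketch rather than an official recipe: FileSummary and taskSummary are hypothetical names, and it assumes the build file is compiled as Scala 3 and that the bundled uPickle supports derives for case classes.

```scala
import mill.*
import upickle.default.ReadWriter

// Hypothetical result type for illustration; not part of Mill.
// `derives ReadWriter` asks uPickle to generate the JSON serializer.
case class FileSummary(name: String, lineCount: Int) derives ReadWriter

def taskSummary = Task {
  val file = os.pwd / "file.txt"
  os.write(file, "hello\nworld\n")
  // The returned case class is cached as JSON like any built-in type.
  FileSummary(file.last, os.read.lines(file).size)
}
```

With this in place, the task should be usable with show and by downstream tasks in the same way as the primitive and collection examples above.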
Requests-Scala
Mill bundles Requests-Scala for you to use to make HTTP requests.
Requests-Scala lets you integrate your build with the world beyond your local
filesystem.
Requests-Scala is mostly used in Mill for downloading files as part of your build. These can either be data files or executables, and in either case they are downloaded once and cached for later use.
Downloading Compilers and Source Code
In the example below, we download the Remote APIs source zip, download a Bazel Build Tool binary, and use Bazel to compile the Remote APIs source code as part of our build:
import mill.*
def remoteApisZip = Task {
println("downloading bazel-remote-apis sources...")
os.write(
Task.dest / "source.zip",
requests.get("https://github.com/bazelbuild/remote-apis/archive/refs/tags/v2.2.0.zip")
)
PathRef(Task.dest / "source.zip")
}
def bazel = Task {
println("downloading bazel...")
val fileName =
if (System.getProperty("os.name") == "Mac OS X") "bazel-5.4.1-darwin-arm64"
else "bazel-5.4.1-linux-x86_64"
os.write(
Task.dest / "bazel",
requests.get(s"https://github.com/bazelbuild/bazel/releases/download/5.4.1/$fileName")
)
os.perms.set(Task.dest / "bazel", "rwxrwxrwx")
PathRef(Task.dest / "bazel")
}
def compiledRemoteApis = Task {
val javaBuildTarget = "build/bazel/remote/execution/v2:remote_execution_java_proto"
os.call(("unzip", remoteApisZip().path, "-d", Task.dest))
os.call((bazel().path, "build", javaBuildTarget), cwd = Task.dest / "remote-apis-2.2.0")
val queried = os.call(
(bazel().path, "cquery", javaBuildTarget, "--output=files"),
cwd = Task.dest / "remote-apis-2.2.0"
)
queried
.out
.lines()
.map(line => PathRef(Task.dest / "remote-apis-2.2.0" / os.SubPath(line)))
}
In the execution example below, we can see that the first time we ask for compiledRemoteApis,
Mill downloads the Bazel build tool, downloads the Remote APIs source code, and then
invokes Bazel to compile them:
> ./mill show compiledRemoteApis
downloading bazel...
downloading bazel-remote-apis sources...
Loading: ...
Analyzing: ...
...
INFO: Build completed successfully...
[
"ref:.../bazel-out/...fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]
However, in subsequent evaluations of compiledRemoteApis, the two downloads and
the Bazel invocation are skipped and the earlier output is reused directly:
> ./mill show compiledRemoteApis
[
"ref:.../bazel-out/...fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
"ref:.../bazel-out/...fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]
The various tasks will only be re-evaluated if there are code changes in your build.mill
file that affect them.
In general, using requests.get to download files as part of your build is only safe
as long as the files you download are immutable. Mill cannot know whether the remote
HTTP endpoint has been changed or not. However, empirically most URLs you may want
to download files from do turn out to be immutable: from package repositories, artifact
servers, and so on. So this works out surprisingly well in practice.
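If you want a safeguard against a URL whose contents change, one option is to pin a checksum and fail the task when the downloaded bytes no longer match. This is a sketch of that idea using only the JDK, not something Mill does for you: sha256Hex and verified are hypothetical helper names, and the pinned digest is a value you would record yourself after a trusted download.

```scala
import java.security.MessageDigest

// Hypothetical helper: hex-encoded SHA-256 digest of the given bytes.
def sha256Hex(bytes: Array[Byte]): String =
  MessageDigest.getInstance("SHA-256").digest(bytes).map(b => "%02x".format(b)).mkString

// Hypothetical helper: returns the bytes unchanged when they match the pinned
// digest, and throws IllegalArgumentException otherwise.
def verified(bytes: Array[Byte], expectedSha256: String): Array[Byte] = {
  val actual = sha256Hex(bytes)
  require(
    actual == expectedSha256.toLowerCase,
    s"checksum mismatch: expected $expectedSha256, got $actual"
  )
  bytes
}
```

Inside a task, the downloaded bytes (for example the bytes field of the response returned by requests.get) could be passed through verified before being written with os.write, so that a changed remote file fails the build instead of being silently cached.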
MainArgs
Mill uses MainArgs to handle argument parsing for Task.Commands that
are run from the command line.
import mill.*
def commandSimple(str: String, i: Int, bool: Boolean = true) = Task.Command {
println(s"$str $i $bool")
}
Mill uses MainArgs to let you parse most common Scala primitive types as command
parameters: Strings, Ints, Booleans, etc. Single-character parameter names
are treated as short arguments called with one dash - rather than two dashes --.
Default values work as you would expect, and are substituted in if a value is not
given at the command line:
> ./mill commandSimple --str hello -i 123
hello 123 true
os.Path
In addition to the builtin set of types that MainArgs supports, Mill also
supports parsing OS-Lib os.Paths from the command line:
def commandTakingPath(path: os.Path) = Task.Command {
println(path)
}
> ./mill commandTakingPath --path foo/bar/baz.txt
...foo/bar/baz.txt
Task
Mill allows commands to take Task[T]s as parameters anywhere they can
take an unboxed T. This can be handy later on if you want to call the
command as part of another task, while passing it the value of an upstream
task:
def commandTakingTask(str: Task[String]) = Task.Command {
val result = "arg: " + str()
println(result)
result
}
> ./mill commandTakingTask --str helloworld
arg: helloworld
def upstreamTask = Task {
"HELLO"
}
def taskCallingCommand = Task {
commandTakingTask(upstreamTask)()
}
> ./mill show taskCallingCommand
"arg: HELLO"
Evaluator (experimental)
Evaluator commands are experimental and subject to change. See issue #502 for details.
You can define a command that takes in the current Evaluator as an argument,
which you can use to inspect the entire build, or run arbitrary tasks.
For example, here is a customPlanCommand command which uses the Evaluator
to traverse the module tree to find the tasks specified by the tasks strings,
and plan out what would be necessary to run them:
import mill.api.{SelectMode, Evaluator}
def customPlanCommand(evaluator: Evaluator, tasks: String*) = Task.Command(exclusive = true) {
val resolved = evaluator
.resolveTasks(tasks, SelectMode.Multi)
.get
val plan = evaluator.plan(resolved)
.sortedGroups
.keys()
.map(_.toString)
.toArray
plan.foreach(println)
()
}
We can call our customPlanCommand from the command line and pass it the
taskCallingCommand we saw earlier, and it prints out the list of tasks
it needs to run in the order necessary to reach taskCallingCommand:
> ./mill customPlanCommand taskCallingCommand
upstreamTask
commandTakingTask
taskCallingCommand