Bundled Libraries

Mill comes bundled with a set of external Open Source libraries and projects.

OS-Lib

Mill uses OS-Lib for all of its file system and subprocess operations.

Sandbox Working Directories

One thing to note about Mill’s usage of OS-Lib is that Mill sets os.pwd, the working directory used for filesystem operations and subprocesses, to each task’s .dest folder. This is part of Mill’s sandboxing efforts to prevent accidental interference between tasks:

build.mill
import mill._

def task1 = T{
  os.write(os.pwd / "file.txt", "hello")
  PathRef(os.pwd / "file.txt")
}

def task2 = T{
  os.call(("bash", "-c", "echo 'world' >> file.txt"))
  PathRef(os.pwd / "file.txt")
}

def command = T{
  println(task1().path)
  println(os.read(task1().path))
  println(task2().path)
  println(os.read(task2().path))
}

Although both task1 and task2 above write to os.pwd / "file.txt" - one via os.write and one via a Bash subprocess - each task gets its own working directory, so the files never collide on disk. As a result, the final command can depend on both tasks and read each task’s file.txt separately without conflict:

> ./mill command # mac/linux
.../out/task1.dest/file.txt
hello
.../out/task2.dest/file.txt
world

uPickle

Mill uses uPickle to cache target output to disk as JSON, and to output JSON for third-party tools to consume. The output of every Mill target must be JSON serializable via uPickle.

The uPickle-serialized return value of every Mill task is used for multiple purposes:

  • As the format for caching task results on disk

  • As the output format for show, which can be used for manual inspection or piped to external tools

  • For deciding whether downstream results can be read from the cache or need to be recomputed
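
Primitives and collections work out of the box (see below), but if a task returns one of your own classes, uPickle needs to know how to serialize it. A minimal sketch, using a hypothetical ClassInfo case class: providing an implicit upickle.default.ReadWriter in the companion object is usually enough for the type to be usable as a task result.

case class ClassInfo(name: String, fileSize: Long)
object ClassInfo {
  // Derive a JSON reader/writer so Mill can cache and `show` this type
  implicit val rw: upickle.default.ReadWriter[ClassInfo] = upickle.default.macroRW
}

def classInfoTask = T{ ClassInfo("Main.class", 1024) }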

Primitives and Collections

Most Scala primitive types (Strings, Ints, Booleans, etc.) and collection types (Seqs, Lists, Tuples, etc.) are serializable by default.

build.mill
import mill._

def taskInt = T{ 123 }
def taskBoolean = T{ true }
def taskString = T{ "hello " + taskInt() + " world " + taskBoolean() }
> ./mill show taskInt
123

> ./mill show taskBoolean
true

> ./mill show taskString
"hello 123 world true"

def taskTuple = T{ (taskInt(), taskBoolean(), taskString()) }
def taskSeq = T{ Seq(taskInt(), taskInt() * 2, taskInt() * 3) }
def taskMap = T{ Map("int" -> taskInt().toString, "boolean" -> taskBoolean().toString) }
> ./mill show taskTuple
[
  123,
  true,
  "hello 123 world true"
]

> ./mill show taskSeq
[
  123,
  246,
  369
]

> ./mill show taskMap
{
  "int": "123",
  "boolean": "true"
}

Paths and PathRef

os.Paths from OS-Lib are also serializable as strings.

def taskPath = T{
  os.write(os.pwd / "file.txt", "hello")
  os.pwd / "file.txt"
}
> ./mill show taskPath
".../out/taskPath.dest/file.txt"

Note that returning an os.Path from a task only invalidates downstream tasks when the path itself changes (e.g. from file.txt to file2.txt), not when the contents of the file or folder at that path change. If you want downstream tasks to be invalidated by changes to the contents of a file or folder, you should return a PathRef:

def taskPathRef = T{
  os.write(os.pwd / "file.txt", "hello")
  PathRef(os.pwd / "file.txt")
}
> ./mill show taskPathRef
"ref.../out/taskPathRef.dest/file.txt"

The serialized PathRef contains a hexadecimal hash signature of the file or folder referenced on disk, computed from its contents.
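
Because that hash is part of the serialized value, a downstream task that depends on a PathRef is re-evaluated whenever the referenced file’s contents change, not just when the path changes. A minimal sketch, where downstreamTask is a hypothetical task added here for illustration on top of taskPathRef above:

def downstreamTask = T{
  // Re-runs whenever the contents of taskPathRef's file.txt change,
  // because the upstream PathRef's hash changes with the contents
  os.read(taskPathRef().path).toUpperCase
}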

Requests-Scala

Mill bundles Requests-Scala for you to use to make HTTP requests. Requests-Scala lets you integrate your build with the world beyond your local filesystem.

Requests-Scala is mostly used in Mill for downloading files as part of your build. These can either be data files or executables, and in either case they are downloaded once and cached for later use.
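
A minimal sketch of that pattern, assuming a hypothetical URL and file name: the file is downloaded into the task’s .dest folder and returned as a PathRef, so it is only fetched again if the task itself is invalidated.

def sampleData = T{
  // Hypothetical URL; requests.get streams the response body into the file
  os.write(Task.dest / "data.json", requests.get("https://example.com/data.json"))
  PathRef(Task.dest / "data.json")
}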

Downloading Compilers and Source Code

In the example below, we download the Remote APIs source zip, download a Bazel build tool binary, and use Bazel to compile the Remote APIs source code as part of our build:

build.mill
import mill._

def remoteApisZip = T{
  println("downloading bazel-remote-apis sources...")
  os.write(
    Task.dest / "source.zip",
    requests.get("https://github.com/bazelbuild/remote-apis/archive/refs/tags/v2.2.0.zip")
  )
  PathRef(Task.dest / "source.zip")
}

def bazel = T{
  println("downloading bazel...")
  val fileName =
    if (System.getProperty("os.name") == "Mac OS X") "bazel-5.4.1-darwin-arm64"
    else "bazel-5.4.1-linux-x86_64"

  os.write(
    Task.dest / "bazel",
    requests.get(s"https://github.com/bazelbuild/bazel/releases/download/5.4.1/$fileName")
  )
  os.perms.set(Task.dest / "bazel", "rwxrwxrwx")
  PathRef(Task.dest / "bazel")
}

def compiledRemoteApis = T{
  val javaBuildTarget = "build/bazel/remote/execution/v2:remote_execution_java_proto"
  os.call(("unzip", remoteApisZip().path, "-d", Task.dest))
  os.call((bazel().path, "build", javaBuildTarget), cwd = Task.dest / "remote-apis-2.2.0")

  val queried = os.call(
    (bazel().path, "cquery", javaBuildTarget, "--output=files"),
    cwd = Task.dest / "remote-apis-2.2.0"
  )

  queried
    .out
    .lines()
    .map(line => PathRef(Task.dest / "remote-apis-2.2.0" / os.SubPath(line)))
}

In the execution example below, we can see that the first time we ask for compiledRemoteApis, Mill downloads the Bazel build tool, downloads the Remote APIs source code, and then invokes Bazel to compile them:

> ./mill show compiledRemoteApis
downloading bazel...
downloading bazel-remote-apis sources...
Loading: ...
Analyzing: ...
...
INFO: Build completed successfully...
[
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]

However, in subsequent evaluations of compiledRemoteApis, the two downloads and the Bazel invocation are skipped and the earlier output is re-used directly:

> ./mill show compiledRemoteApis
[
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/semver/libsemver_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libduration_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libtimestamp_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libwrappers_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_api_http_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libdescriptor_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_api_annotations_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libany_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_rpc_status_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/com_google_protobuf/libempty_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/external/googleapis/libgoogle_longrunning_operations_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/remote/execution/v2/libremote_execution_proto-speed.jar",
  "ref:.../bazel-out/darwin_arm64-fastbuild/bin/build/bazel/remote/execution/v2/remote_execution_proto-speed-src.jar"
]

The various tasks will only be re-evaluated if there are code changes in your build.mill file that affect them.

In general, using requests.get to download files as part of your build is only safe as long as the files you download are immutable, since Mill cannot know whether the remote HTTP endpoint has changed. Empirically, however, most URLs you may want to download files from do turn out to be immutable - package repositories, artifact servers, and so on - so this works out surprisingly well in practice.
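
If you want extra protection against a remote file changing silently, one option is to verify a checksum of the downloaded file. A minimal sketch, assuming a hypothetical URL and a hypothetical expected SHA-256 value, using java.security.MessageDigest from the standard library:

def verifiedDownload = T{
  val dest = Task.dest / "artifact.zip"
  os.write(dest, requests.get("https://example.com/artifact-1.0.0.zip")) // hypothetical URL

  // Compute the SHA-256 of the downloaded bytes and compare it to a known value
  val digest = java.security.MessageDigest.getInstance("SHA-256")
  val hash = digest.digest(os.read.bytes(dest)).map("%02x".format(_)).mkString
  val expected = "<known sha-256 hex string>" // hypothetical placeholder
  require(hash == expected, s"Checksum mismatch for artifact.zip: $hash")
  PathRef(dest)
}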

MainArgs

Mill uses MainArgs to handle argument parsing for Task.Commands that are run from the command line.

build.mill
import mill._

def commandSimple(str: String, i: Int, bool: Boolean = true) = Task.Command{
  println(s"$str $i $bool")
}

Mill uses MainArgs to let you parse most common Scala primitive types as command parameters: Strings, Ints, Booleans, etc. Single-character parameter names are treated as short arguments, called with one dash - rather than two dashes --. Default values work as you would expect and are substituted in if a value is not given at the command line:

> ./mill commandSimple --str hello -i 123
hello 123 true

os.Path

In addition to the built-in set of types that MainArgs supports, Mill also supports parsing OS-Lib os.Paths from the command line:

def commandTakingPath(path: os.Path) = Task.Command{
  println(path)
}
> ./mill commandTakingPath --path foo/bar/baz.txt
...foo/bar/baz.txt

Custom Main Argument Parsers

You can define your own custom types and use them as command line arguments, as long as you define an implicit mainargs.TokensReader[T] for that type.

class LettersOrDigits(val value: String)

implicit object LettersOrDigitsTokensReader
  extends mainargs.TokensReader.Simple[LettersOrDigits] {
  def shortName = "letters-or-digits"
  def read(strs: Seq[String]) = {
    if (strs.last.forall(_.isLetterOrDigit)) Right(new LettersOrDigits(strs.last))
    else Left("non-letter/digit characters")
  }
}

def commandCustomArg(custom: LettersOrDigits) = Task.Command{
  println("hello " + custom.value)
}

Above, we show an example with a custom LettersOrDigits class, where the implicit object LettersOrDigitsTokensReader does some basic validation and raises an error if the argument contains characters that are not letters or digits:

> ./mill commandCustomArg --custom 123abc
hello 123abc

> ./mill commandCustomArg --custom 123?abc
error: Invalid argument --custom <letters-or-digits> failed to parse "123?abc"...
...due to non-letter/digit characters
...

Task

Mill allows commands to take Task[T]s as parameters anywhere they can take an unboxed T. This can be handy later on if you want to call the command as part of another target, while passing it the value of an upstream target:

def commandTakingTask(str: Task[String]) = Task.Command{
  val result = "arg: " + str()
  println(result)
  result
}
> ./mill commandTakingTask --str helloworld
arg: helloworld
def upstreamTarget = T{
  "HELLO"
}

def targetCallingCommand = T{
  commandTakingTask(upstreamTarget)()
}
> ./mill show targetCallingCommand
"arg: HELLO"

Evaluator (experimental)

Evaluator commands are experimental and subject to change. See issue #502 for details.

You can define a command that takes the current Evaluator as an argument, which you can use to inspect the entire build or run arbitrary tasks. For example, here is a customPlanCommand which uses the evaluator to traverse the module tree, find the targets specified by the targets strings, and plan out what would be necessary to run them:

import mill.eval.{Evaluator, Terminal}
import mill.resolve.{Resolve, SelectMode}

def customPlanCommand(evaluator: Evaluator, targets: String*) = Task.Command {
  Resolve.Tasks.resolve(
    evaluator.rootModule,
    targets,
    SelectMode.Multi
  ) match{
    case Left(err) => Left(err)
    case Right(resolved) =>
      val (sortedGroups, _) = evaluator.plan(resolved)
      val plan = sortedGroups
        .keys()
        .collect { case r: Terminal.Labelled[_] => r.render }
        .toArray

      plan.foreach(println)
      Right(())
  }
}

We can call our customPlanCommand from the command line and pass it the targetCallingCommand we saw earlier, and it prints out the list of tasks it needs to run in the order necessary to reach targetCallingCommand:

> ./mill customPlanCommand targetCallingCommand
upstreamTarget
commandTakingTask
targetCallingCommand

Many built-in tools are implemented as custom evaluator commands: inspect, resolve, and show. If you want to run Mill commands and programmatically manipulate tasks and their outputs, you can do so with your own evaluator command.

Coursier

Coursier is the Scala application and artifact manager. Mill uses Coursier for all third-party artifact resolution and management in JVM languages (Scala, Java, etc.).
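
For example, dependencies declared in a module’s ivyDeps are fetched through Coursier from Maven Central (and any other configured repositories), then cached locally for re-use across builds. A minimal sketch, assuming a hypothetical module and hypothetical dependency versions:

import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.14"
  // Coursier resolves this artifact and its transitive dependencies,
  // caching them locally for re-use across builds
  def ivyDeps = Agg(ivy"com.lihaoyi::upickle:3.1.0")
}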