Building Python with Mill

This page contains a quick introduction to getting started with using Mill to build a simple Python program. We will walk through a series of Mill builds of increasing complexity to show you the key features and usage of the Mill build tool.

The other pages of this section on Python go into more depth on individual features, with more examples of how to use Mill for Python and more details of how the Mill build tool works. They aren’t intended to be read comprehensively top-to-bottom, but rather looked up when you have a particular interest, e.g. in testing, linting, publishing, and so on.

Simple Python Module

build.mill (download, browse)
package build
import mill._, pythonlib._

object foo extends PythonModule {

  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }

  def pythonDeps = Seq("Jinja2==3.1.4")

  object test extends PythonTests with TestModule.Unittest

}

This is a basic Mill build for a single PythonModule, with one dependency and a test suite using the unittest library.

You can download the example project using the download link above, or browse the full sources via the browse link. Ensure you have a JVM installed; the ./mill script manages all other dependencies. All examples, from simple hello-world projects on this page to advanced web build examples (coming soon) and real-world projects (coming soon), are fully executable and tested in Mill’s CI workflows.

The source code for this module lives in the foo/src/ folder. Output for this module (type-checking results, resolved dependency lists, ...) lives in out/foo/.

build.mill
foo/
    src/
        foo.py
    resources/
        ...
    test/
        src/
            test.py
out/foo/
    run.json
    run.dest/
    ...
    test/
        run.json
        run.dest/
        ...

This example project uses one dependency, Jinja2, for HTML rendering: it wraps a given input string in an HTML template with proper escaping.
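As a rough illustration of that behaviour, here is a minimal sketch of such a script. It is not the example project's actual foo.py: so that it runs without installing anything, it swaps Jinja2 for the standard library's html.escape, and the names generate_html and main, as well as the default for --text, are assumptions made for this sketch.

```python
import argparse
import html


def generate_html(text: str) -> str:
    """Wrap text in an <h1> element, escaping HTML special characters."""
    # Stand-in for rendering a Jinja2 template with autoescaping enabled.
    return f"<h1>{html.escape(text)}</h1>"


def main(argv=None) -> str:
    """Parse --text from argv (or sys.argv when argv is None) and render it."""
    parser = argparse.ArgumentParser(description="Render text as an HTML heading")
    parser.add_argument("--text", default="", help="text to wrap in <h1> tags")
    args = parser.parse_args(argv)
    return generate_html(args.text)


# Explicit arguments keep the demonstration deterministic.
print(main(["--text", "Hello Mill"]))  # prints <h1>Hello Mill</h1>
print(generate_html("<b>bold</b>"))    # prints <h1>&lt;b&gt;bold&lt;/b&gt;</h1>
```

Escaping matters because the input is untrusted text: without it, markup in the input would be interpreted as HTML instead of being displayed literally.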

Typical usage of a PythonModule is shown below:

> ./mill resolve foo._ # List what tasks are available to run
foo.bundle
...
foo.console
...
foo.run
...
foo.test
...
foo.typeCheck

> ./mill inspect foo.typeCheck  # Show documentation and inputs of a task
...
foo.typeCheck(PythonModule...)
    Run a typechecker on this module.
Inputs:
    foo.pythonExe
    foo.transitivePythonPath
    foo.sources
...

> ./mill foo.typeCheck  # Type-check the Python files and report errors
Success: no issues found in 1 source file

> ./mill foo.run --text "Hello Mill"  # Run the main script with arguments
<h1>Hello Mill</h1>

> ./mill foo.test
...
test_escaping (test.TestScript...) ... ok
test_simple (test.TestScript...) ... ok
...Ran 2 tests...
OK
...

> ./mill show foo.bundle # Create a PEX bundle for the module
".../out/foo/bundle.dest/bundle.pex"

> out/foo/bundle.dest/bundle.pex --text "Hello Mill" # Run the PEX binary outside of Mill
<h1>Hello Mill</h1>

> sed -i.bak 's/print(main())/print(maaain())/g' foo/src/foo.py

> ./mill foo.typeCheck # If we make a typo in a function name, mypy flags it
error: ...Name "maaain" is not defined...

The output of every Mill task is stored in the out/ folder under a name corresponding to the task that created it, e.g. the foo.typeCheck task puts its metadata output in out/foo/typeCheck.json, and its output files in out/foo/typeCheck.dest. You can also use show to make Mill print out the metadata output for a particular task.
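The naming convention can be sketched as a small path-mapping function. This is a simplified model based only on the directory layout shown earlier on this page (for example out/foo/run.json and out/foo/test/run.json); the function name task_output_paths is hypothetical, and Mill's real path logic may handle cases this sketch ignores.

```python
from pathlib import PurePosixPath


def task_output_paths(task: str, out_dir: str = "out"):
    """Return (metadata file, destination folder) for a dotted task name.

    Module segments become nested folders; the final segment names the task.
    """
    *modules, name = task.split(".")
    base = PurePosixPath(out_dir).joinpath(*modules)
    return base / f"{name}.json", base / f"{name}.dest"


meta, dest = task_output_paths("foo.typeCheck")
print(meta)  # out/foo/typeCheck.json
print(dest)  # out/foo/typeCheck.dest

meta, dest = task_output_paths("foo.test.run")
print(meta)  # out/foo/test/run.json
```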

You can run mill resolve __ to see a full list of the different tasks that are available, mill resolve foo._ to see the tasks within foo, mill inspect foo.typeCheck to inspect a task’s doc-comment documentation or what it depends on, or mill show foo.typeCheck to show the output of any task.

The most common tasks that Mill can run are cached tasks, such as typeCheck and bundle, and un-cached commands, such as run and test. Cached tasks do not re-evaluate unless one of their inputs changes, whereas commands re-run every time. See the documentation for Tasks for details on the different task types.
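The difference between the two kinds of task can be illustrated with a small conceptual model. This is not how Mill is implemented; the classes CachedTask and Command and the helper count_lines are hypothetical names used only to show the idea of skipping work when the hash of the inputs is unchanged.

```python
import hashlib


class CachedTask:
    """Re-runs its function only when the hash of its inputs changes."""

    def __init__(self, fn):
        self.fn = fn
        self.evaluations = 0
        self._last_key = None
        self._last_value = None

    def __call__(self, *inputs: str):
        key = hashlib.sha256("\0".join(inputs).encode()).hexdigest()
        if key != self._last_key:  # first run, or inputs changed: evaluate
            self._last_value = self.fn(*inputs)
            self._last_key = key
            self.evaluations += 1
        return self._last_value  # otherwise reuse the cached result


class Command:
    """Runs its function on every invocation, with no caching."""

    def __init__(self, fn):
        self.fn = fn
        self.evaluations = 0

    def __call__(self, *inputs: str):
        self.evaluations += 1
        return self.fn(*inputs)


def count_lines(source: str) -> int:
    return len(source.splitlines())


cached = CachedTask(count_lines)
command = Command(count_lines)

# The second input is identical to the first; the third differs.
for source in ["print(1)\n", "print(1)\n", "print(1)\nprint(2)\n"]:
    cached(source)
    command(source)

print(cached.evaluations)   # 2: the unchanged second input was skipped
print(command.evaluations)  # 3: commands run every time
```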

Custom Build Logic

Mill makes it very easy to customize your build graph, overriding portions of it with custom logic. In this example, we override the resources of our PythonModule (normally just the resources/ folder) to also include a generated text file containing the line count of all the source files in that module.

build.mill (download, browse)
package build
import mill._, pythonlib._

object foo extends PythonModule {

  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }

  /** All Python source files in this module, recursively from the source directories.*/
  def allSourceFiles: T[Seq[PathRef]] = Task {
    sources().flatMap(src => os.walk(src.path).filter(_.ext == "py").map(PathRef(_)))
  }

  /** Total number of lines in module source files */
  def lineCount = Task {
    allSourceFiles().map(f => os.read.lines(f.path).size).sum
  }

  /** Generate resources using lineCount of sources */
  override def resources = Task {
    val resourcesDir = Task.dest / "resources"
    os.makeDir.all(resourcesDir)
    os.write(resourcesDir / "line-count.txt", "" + lineCount())
    super.resources() ++ Seq(PathRef(Task.dest))
  }

  object test extends PythonTests with TestModule.Unittest
}

The addition of lineCount and resources overrides the resources task provided by PythonModule (labelled resources.super below), combining its folders with the destination folder of the new resources task, which is wired up to lineCount:

[Diagram: task graph with edges allSourceFiles -> lineCount -> resources -> ... -> run, and resources.super -> resources]

> ./mill foo.run
Line Count: 10

> ./mill show foo.lineCount
10

> ./mill inspect foo.lineCount
...
foo.lineCount(build.mill...)
    Total number of lines in module source files
Inputs:
    foo.allSourceFiles
...

> ./mill foo.test
...
test_line_count (test.TestScript...) ... ok
...Ran 1 test...
OK
...

Above, def lineCount is a new build task we define, which makes use of allSourceFiles and is in turn used in our override of resources (an existing task). The override keyword is optional in Mill. This generated file can then be loaded and used at runtime, as seen in the output of mill foo.run.
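For readers more at home in Python than in Scala, the logic of the three tasks can be restated roughly as follows. This is only an illustrative translation: the function names all_source_files, line_count and write_line_count are hypothetical, and the sketch has none of Mill's caching or task wiring.

```python
import tempfile
from pathlib import Path


def all_source_files(source_dirs):
    """Collect every .py file under the given directories, recursively."""
    return sorted(
        path
        for directory in source_dirs
        for path in Path(directory).rglob("*.py")
        if path.is_file()
    )


def line_count(files):
    """Total number of lines across the given files."""
    return sum(len(f.read_text().splitlines()) for f in files)


def write_line_count(dest, count):
    """Write the count to <dest>/resources/line-count.txt and return that path."""
    resources_dir = Path(dest) / "resources"
    resources_dir.mkdir(parents=True, exist_ok=True)
    out_file = resources_dir / "line-count.txt"
    out_file.write_text(str(count))
    return out_file


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    (src / "pkg").mkdir(parents=True)
    (src / "foo.py").write_text("a = 1\nb = 2\n")
    (src / "pkg" / "util.py").write_text("c = 3\n")
    (src / "notes.txt").write_text("not a Python file\n")

    count = line_count(all_source_files([src]))
    generated = write_line_count(Path(tmp) / "dest", count)
    print(generated.read_text())  # 3
```

In the Mill build, each of these steps is a separate task, so Mill can cache each result and re-run only the steps whose inputs changed.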

If you’re not familiar with what tasks you can override or how they are related, you can explore the existing tasks via autocomplete in your IDE, or use mill visualize.

os.read.lines and os.write come from the OS-Lib library, which is one of Mill’s Bundled Libraries. You can also import any other library you want from Maven Central using import $ivy, so you are not limited to what is bundled with Mill.

Custom user-defined tasks in Mill benefit from all the same things that built-in tasks do: automatic caching (in the out/ folder), parallelism (with the -j/--jobs flag), inspectability (via show / inspect), and so on. While these things may not matter for such a simple example that runs quickly, they ensure that custom build logic remains performant and maintainable even as the complexity of your project grows.

Multi-Module Project

build.mill (download, browse)
package build
import mill._, pythonlib._

trait MyModule extends PythonModule {
  def resources = super.resources() ++ Seq(PathRef(millSourcePath / "res"))
  object test extends PythonTests with TestModule.Unittest
}

object foo extends MyModule {
  def moduleDeps = Seq(bar)
  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
}

object bar extends MyModule {
  def mainScript = Task.Source { millSourcePath / "src" / "bar.py" }
  def pythonDeps = Seq("Jinja2==3.1.4")
}

This example contains a simple Mill build with two modules, foo and bar, which you can run tasks on such as foo.run or bar.run. You can define multiple modules the same way you define a single module, using def moduleDeps to define the relationship between them. Modules can also be nested within each other, as foo.test and bar.test are nested within foo and bar respectively.

Note that we split out the test submodule configuration common to both modules into a separate trait MyModule. This Trait Module lets us avoid the need to copy-paste common settings, while still letting us define any per-module configuration such as pythonDeps specific to a particular module. This is a common pattern within Mill builds.

The above build expects the following project layout:

build.mill
foo/
    src/
        foo.py
    test/
        src/
            test.py
bar/
    src/
        bar.py
    test/
        src/
            test.py
out/
    foo/
        run.json
        run.dest/
        ...
    bar/
        run.json
        run.dest/
        ...
        test/
            run.json
            run.dest/
            ...

Typically, both source code and output files in Mill follow the module hierarchy, so e.g. input to the foo module lives in foo/src/ and the output files of its run task live in out/foo/run.dest. You can use mill resolve to list out what tasks you can run, e.g. mill resolve __.run below lists all the run tasks:

> ./mill resolve __.run
bar.run
...
foo.run

> ./mill foo.run --foo-text hello --bar-text world
...
Foo.value: hello
Bar.value: <h1>world</h1>
...

> ./mill bar.run world
Bar.value: <h1>world</h1>

> ./mill bar.test
...
test_escaping (test.TestScript...) ... ok
test_simple (test.TestScript...) ... ok
...Ran 2 tests...
OK
...

Mill’s evaluator will ensure that the modules are built in the right order, and rebuilt as necessary when source code in each module changes. The unique path on disk that Mill automatically assigns each task also ensures you do not need to worry about choosing a path on disk to cache outputs, or filesystem collisions if multiple tasks write to the same path.
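The ordering idea can be sketched with a topological sort over the module dependencies. This is a conceptual model rather than Mill's evaluator: the dictionary module_deps mirrors the moduleDeps declared in the build above (foo depends on bar), and the helper names build_order and affected_by are hypothetical. It relies on graphlib, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules it depends on.
module_deps = {"foo": {"bar"}, "bar": set()}


def build_order(deps):
    """Return modules ordered so that dependencies come before dependents."""
    return list(TopologicalSorter(deps).static_order())


def affected_by(changed, deps):
    """Return the changed module plus every module that depends on it."""
    affected = {changed}
    for module in build_order(deps):  # dependencies are visited first
        if deps.get(module, set()) & affected:
            affected.add(module)
    return affected


print(build_order(module_deps))                 # ['bar', 'foo']
print(sorted(affected_by("bar", module_deps)))  # ['bar', 'foo']
print(sorted(affected_by("foo", module_deps)))  # ['foo']
```

A change in bar invalidates foo as well, because foo depends on it, whereas a change in foo leaves bar untouched.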

You can use wildcards and brace-expansion to select multiple tasks at once or to shorten the path to deeply nested tasks. If you provide optional task arguments and your wildcard or brace-expansion is resolved to multiple tasks, the arguments will be applied to each of the tasks.

Table 1. Wildcards and brace-expansion

Wildcard    Function
_           matches a single segment of the task path
__          matches arbitrary segments of the task path
{a,b}       is equal to specifying two tasks a and b

You can use the + symbol to add another task with optional arguments. If you need to pass a + as an argument to your task, you can escape it by preceding it with a backslash (\).

> mill foo._.typeCheck # Runs `typeCheck` for all direct sub-modules of `foo`

> mill foo.__.test # Runs `test` for all transitive sub-modules of `foo`

> mill {foo,bar}.__.testCached # Runs `testCached` for all sub-modules of `foo` and `bar`

> mill __.typeCheck + foo.__.test # Runs all `typeCheck` tasks and all tests under `foo`.
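To make the matching rules in Table 1 concrete, here is a small sketch of a selector matcher. It is a simplified model, not Mill's resolver: it assumes __ matches zero or more segments, it ignores the + separator and backslash escaping, and the function names expand_braces, matches and select, as well as the sample task list, are hypothetical.

```python
import re


def expand_braces(pattern):
    """Expand {a,b} groups: '{foo,bar}.test' -> ['foo.test', 'bar.test']."""
    group = re.search(r"\{([^{}]*)\}", pattern)
    if not group:
        return [pattern]
    head, tail = pattern[: group.start()], pattern[group.end():]
    return [
        expanded
        for alternative in group.group(1).split(",")
        for expanded in expand_braces(head + alternative + tail)
    ]


def _match(pattern_segments, task_segments):
    if not pattern_segments:
        return not task_segments
    first, rest = pattern_segments[0], pattern_segments[1:]
    if first == "__":
        # Assumption: __ may consume zero or more segments.
        return any(
            _match(rest, task_segments[i:]) for i in range(len(task_segments) + 1)
        )
    if not task_segments:
        return False
    # _ matches exactly one segment; anything else must match literally.
    return (first == "_" or first == task_segments[0]) and _match(
        rest, task_segments[1:]
    )


def matches(selector, task):
    """True if the dotted task path is selected by the selector."""
    return any(
        _match(pattern.split("."), task.split("."))
        for pattern in expand_braces(selector)
    )


def select(selector, tasks):
    return [task for task in tasks if matches(selector, task)]


tasks = ["foo.typeCheck", "foo.test.typeCheck", "bar.typeCheck", "bar.test.test"]
print(select("foo._.typeCheck", tasks))      # ['foo.test.typeCheck']
print(select("__.typeCheck", tasks))         # ['foo.typeCheck', 'foo.test.typeCheck', 'bar.typeCheck']
print(select("{foo,bar}.typeCheck", tasks))  # ['foo.typeCheck', 'bar.typeCheck']
```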

For more details on the query syntax, check out the query syntax documentation.