Building Python with Mill
This page contains a quick introduction to getting started with Mill to build a simple Python program. We will walk through a series of Mill builds of increasing complexity to show you the key features and usage of the Mill build tool.
The other pages of this section on Python go into more depth on individual features, with more examples of how to use Mill for Python and more details of how the Mill build tool works. They aren’t intended to be read comprehensively top-to-bottom, but rather looked up when you have a particular interest, e.g. in testing, linting, publishing, and so on.
Simple Python Module
package build
import mill._, pythonlib._
object foo extends PythonModule {
def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
def pythonDeps = Seq("Jinja2==3.1.4")
object test extends PythonTests with TestModule.Unittest
}
This is a basic Mill build for a single PythonModule, with one dependency and a test suite using the Unittest library.
You can download the example project using the download link above, or browse the full sources via the browse link. Ensure you have a JVM installed; the ./mill script manages all other dependencies.
All examples, from simple hello-world projects on this page to advanced web build examples (Coming Soon…) and real-world projects (Coming Soon…), are fully executable and tested in Mill’s CI workflows.
The source code for this module lives in the src/ folder. Output for this module (typechecked files, resolved dependency lists, …) lives in out/.
build.mill
foo/
    src/
        foo.py
    resources/
        ...
    test/
        src/
            test.py
out/foo/
    run.json
    run.dest/
    ...
    test/
        run.json
        run.dest/
        ...
This example project uses one dependency, Jinja2, for HTML rendering: it wraps a given input string in an HTML template with proper escaping.
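For illustration, the main script might look roughly like the following sketch. This is a hypothetical reconstruction, not the example’s actual source: the real project renders a Jinja2 template with autoescaping, while this stand-in uses only the standard library’s html.escape so it stays self-contained.

```python
import html


def generate_html(text: str) -> str:
    # The real example renders a Jinja2 template such as
    # "<h1>{{ text }}</h1>" with autoescaping enabled;
    # html.escape reproduces that behavior for this simple case.
    return "<h1>" + html.escape(text) + "</h1>"


# ./mill foo.run --text "Hello Mill" prints the wrapped string:
result = generate_html("Hello Mill")   # "<h1>Hello Mill</h1>"

# Markup in the input is escaped rather than interpreted:
escaped = generate_html("<b>bold</b>")  # "<h1>&lt;b&gt;bold&lt;/b&gt;</h1>"
```

The escaping behavior is what the example’s test_escaping test exercises.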
Typical usage of a PythonModule is shown below:
> ./mill resolve foo._ # List what tasks are available to run
foo.bundle
...
foo.console
...
foo.run
...
foo.test
...
foo.typeCheck
> ./mill inspect foo.typeCheck # Show documentation and inputs of a task
...
foo.typeCheck(PythonModule...)
Run a typechecker on this module.
Inputs:
foo.pythonExe
foo.transitivePythonPath
foo.sources
...
> ./mill foo.typeCheck # typecheck the Python files and report errors
Success: no issues found in 1 source file
> ./mill foo.run --text "Hello Mill" # run the main method with arguments
<h1>Hello Mill</h1>
> ./mill foo.test
...
test_escaping (test.TestScript...) ... ok
test_simple (test.TestScript...) ... ok
...Ran 2 tests...
OK
...
> ./mill show foo.bundle # create a self-contained PEX bundle of the project
".../out/foo/bundle.dest/bundle.pex"
> out/foo/bundle.dest/bundle.pex --text "Hello Mill" # running the PEX binary outside of Mill
<h1>Hello Mill</h1>
> sed -i.bak 's/print(main())/print(maaain())/g' foo/src/foo.py
> ./mill foo.typeCheck # if we make a typo in a method name, mypy flags it
error: ...Name "maaain" is not defined...
The output of every Mill task is stored in the out/ folder under a name corresponding to the task that created it, e.g. the typeCheck task puts its metadata output in out/foo/typeCheck.json and its output files in out/foo/typeCheck.dest. You can also use show to make Mill print out the metadata output for a particular task.
You can run mill resolve __ to see a full list of the different tasks that are available, mill resolve foo._ to see the tasks within foo, mill inspect foo.typeCheck to inspect a task’s doc-comment documentation or what it depends on, or mill show foo.typeCheck to show the output of any task.
The most common tasks that Mill can run are cached tasks, such as typeCheck, test, bundle, and run. Cached tasks do not re-evaluate unless one of their inputs changes, whereas commands re-run every time. See the documentation for Tasks for details on the different task types.
Custom Build Logic
Mill makes it very easy to customize your build graph, overriding portions of it with custom logic.
In this example, we override the resources of our PythonModule (normally the resources/ folder) to instead contain a single generated text file with the line count of all the source files in that module:
package build
import mill._, pythonlib._
object foo extends PythonModule {
def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
/** All Python source files in this module, recursively from the source directories.*/
def allSourceFiles: T[Seq[PathRef]] = Task {
sources().flatMap(src => os.walk(src.path).filter(_.ext == "py").map(PathRef(_)))
}
/** Total number of lines in module source files */
def lineCount = Task {
allSourceFiles().map(f => os.read.lines(f.path).size).sum
}
/** Generate resources using lineCount of sources */
override def resources = Task {
val resourcesDir = Task.dest / "resources"
os.makeDir.all(resourcesDir)
os.write(resourcesDir / "line-count.txt", "" + lineCount())
super.resources() ++ Seq(PathRef(Task.dest))
}
object test extends PythonTests with TestModule.Unittest
}
The addition of lineCount and resources overrides the previous resources folder provided by PythonModule (labelled resources.super below), replacing it with the destination folder of the new resources task, which is wired up to lineCount:
> ./mill foo.run
Line Count: 10
> ./mill show foo.lineCount
10
> ./mill inspect foo.lineCount
...
foo.lineCount(build.mill...)
Total number of lines in module source files
Inputs:
foo.allSourceFiles
...
> ./mill foo.test
...
test_line_count (test.TestScript...) ... ok
...Ran 1 test...
OK
...
Above, def lineCount is a new build task we define, which makes use of allSourceFiles and is in turn used in our override of resources (an existing task). The override keyword is optional in Mill. This generated file can then be loaded and used at runtime, as seen in the output of mill run.
If you’re not familiar with what tasks you can override or how they are related, you can explore the existing tasks via autocomplete in your IDE, or use mill visualize.
os.read.lines and os.write come from the OS-Lib library, which is one of Mill’s bundled libraries. You can also import any other library you want from Maven Central using import $ivy, so you are not limited to what is bundled with Mill.
Custom user-defined tasks in Mill benefit from all the same things that built-in tasks do: automatic caching (in the out/ folder), parallelism (with the -j/--jobs flag), inspectability (via show / inspect), and so on. While these things may not matter for such a simple example that runs quickly, they ensure that custom build logic remains performant and maintainable even as the complexity of your project grows.
Multi-Module Project
package build
import mill._, pythonlib._
trait MyModule extends PythonModule {
def resources = super.resources() ++ Seq(PathRef(millSourcePath / "res"))
object test extends PythonTests with TestModule.Unittest
}
object foo extends MyModule {
def moduleDeps = Seq(bar)
def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
}
object bar extends MyModule {
def mainScript = Task.Source { millSourcePath / "src" / "bar.py" }
def pythonDeps = Seq("Jinja2==3.1.4")
}
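The two modules’ Python sources might look roughly like the following sketch. This is hypothetical (function names are invented for illustration, and html.escape stands in for the Jinja2 rendering to stay self-contained), but it mirrors how foo depends on bar via moduleDeps:

```python
import html


# bar/src/bar.py: wraps text in an <h1> tag with escaping
# (the real module does this via Jinja2).
def bar_value(text: str) -> str:
    return "<h1>" + html.escape(text) + "</h1>"


# foo/src/foo.py: because foo declares moduleDeps = Seq(bar),
# its code can import and reuse bar's functions.
def foo_output(foo_text: str, bar_text: str) -> str:
    return f"Foo.value: {foo_text}\nBar.value: {bar_value(bar_text)}"
```

Running the equivalent of ./mill foo.run --foo-text hello --bar-text world would then print both values, as shown in the session below.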
This example contains a simple Mill build with two modules, foo and bar, which you can run tasks on such as foo.run or bar.run. You can define multiple modules the same way you define a single module, using def moduleDeps to define the relationship between them. Modules can also be nested within each other, as foo.test and bar.test are nested within foo and bar respectively.
Note that we split out the test submodule configuration common to both modules into a separate trait MyModule. This trait module lets us avoid the need to copy-paste common settings, while still letting us define any per-module configuration such as pythonDeps specific to a particular module. This is a common pattern within Mill builds.
The above builds expect the following project layout:
build.mill
foo/
    src/
        foo.py
    test/
        src/
            test.py
bar/
    src/
        bar.py
    test/
        src/
            test.py
out/
    foo/
        run.json
        run.dest/
        ...
    bar/
        run.json
        run.dest/
        ...
    test/
        run.json
        run.dest/
        ...
Typically, both source code and output files in Mill follow the module hierarchy, so e.g. input to the foo module lives in foo/src/ and compiled output files live in out/foo/run.dest. You can use mill resolve to list out what tasks you can run, e.g. mill resolve __.run below which lists out all the run tasks:
> ./mill resolve __.run
bar.run
...
foo.run
> ./mill foo.run --foo-text hello --bar-text world
...
Foo.value: hello
Bar.value: <h1>world</h1>
...
> ./mill bar.run world
Bar.value: <h1>world</h1>
> ./mill bar.test
...
test_escaping (test.TestScript...) ... ok
test_simple (test.TestScript...) ... ok
...Ran 2 tests...
OK
...
Mill’s evaluator will ensure that the modules are compiled in the right order, and recompiled as necessary when source code in each module changes. The unique path on disk that Mill automatically assigns each task also ensures you do not need to worry about choosing a path on disk to cache outputs, or filesystem collisions if multiple tasks write to the same path.
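The build-order guarantee can be illustrated with a small topological sort. This is hypothetical code, not Mill’s internals: each module is processed only after every module it depends on.

```python
from graphlib import TopologicalSorter

# A toy module-dependency graph mirroring the build above:
# foo depends on bar (via moduleDeps), and each test submodule
# depends on the module it tests. Illustrative only, not Mill's
# actual evaluator.
deps = {
    "foo": {"bar"},
    "foo.test": {"foo"},
    "bar": set(),
    "bar.test": {"bar"},
}

# static_order() yields a valid build order: bar before foo,
# and every module before its test submodule.
order = list(TopologicalSorter(deps).static_order())
```

Independent modules (here bar.test and foo) have no ordering constraint between them, which is also what lets Mill build them in parallel.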
You can use wildcards and brace-expansion to select multiple tasks at once or to shorten the path to deeply nested tasks. If you provide optional task arguments and your wildcard or brace-expansion is resolved to multiple tasks, the arguments will be applied to each of the tasks.
Wildcard | Function
_ | matches a single segment of the task path
__ | matches arbitrary segments of the task path
{a,b} | is equal to specifying the two tasks a and b
You can use the + symbol to add another task with optional arguments. If you need to feed a + as an argument to your task, you can mask it by preceding it with a backslash (\).
> mill foo._.typeCheck # Runs `typeCheck` for all direct sub-modules of `foo`
> mill foo.__.test # Runs `test` for all transitive sub-modules of `foo`
> mill {foo,bar}.__.testCached # Runs `testCached` for all sub-modules of `foo` and `bar`
> mill __.typeCheck + foo.__.test # Runs all `typeCheck` tasks and all tests under `foo`.
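The wildcard semantics can be sketched by translating a query into a regular expression. This is illustrative only, not Mill’s actual resolver: _ matches exactly one dot-separated segment, while __ matches any number of segments.

```python
import re


def query_to_regex(query: str) -> "re.Pattern":
    # Translate a Mill-style task query into a regex (illustrative
    # only, not Mill's actual resolver). "_" matches one segment;
    # "__" matches arbitrary segments.
    parts = []
    for segment in query.split("."):
        if segment == "__":
            parts.append(r"[\w.]+")   # any number of segments
        elif segment == "_":
            parts.append(r"\w+")      # exactly one segment
        else:
            parts.append(re.escape(segment))
    return re.compile(r"\.".join(parts))


tasks = ["foo.run", "foo.test.run", "bar.run"]

# "foo._" reaches only direct children of foo:
single = [t for t in tasks if query_to_regex("foo._").fullmatch(t)]
# "__.run" reaches run tasks at any depth:
multi = [t for t in tasks if query_to_regex("__.run").fullmatch(t)]
```

Here single contains only foo.run, while multi matches all three run tasks, mirroring the difference between foo._.typeCheck and foo.__.test above.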
For more details on the query syntax, check out the query syntax documentation.