Example: Python Support
This section demonstrates how to integrate Python support into Mill.
We will define a simple PythonModule trait that can resolve dependencies,
perform type checking on local code, and bundle an executable.
This integration is for educational purposes only: it showcases common techniques used in building language toolchains and is not intended for production use.
Basic Python Build Pipeline
First, we define a pythonExe task that creates a Python virtual environment
and installs mypy for type-checking. mypy verifies type correctness using Python’s
type hints, helping to catch errors during development.
package build
import mill._
def pythonExe: T[PathRef] = Task {
  os.call(("python3", "-m", "venv", Task.dest / "venv"))
  val python = Task.dest / "venv" / "bin" / "python3"
  os.call((python, "-m", "pip", "install", "mypy==1.13.0"))
  PathRef(python)
}
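The pythonExe task hard-codes the venv/bin/python3 layout, which is what python3 -m venv produces on POSIX systems; on Windows the interpreter lands under Scripts/python.exe instead. The sketch below restates the two steps in plain Python without running them. The helper names venv_python and pip_install_command are hypothetical and are not part of Mill or of the build above.

```python
from pathlib import Path


def venv_python(venv_dir: Path, windows: bool = False) -> Path:
    """Return where `python -m venv` places the interpreter inside venv_dir."""
    if windows:
        # Windows virtual environments use a Scripts folder.
        return venv_dir / "Scripts" / "python.exe"
    # POSIX virtual environments use a bin folder, as assumed by pythonExe.
    return venv_dir / "bin" / "python3"


def pip_install_command(python: Path, packages: list[str]) -> list[str]:
    """Build (but do not run) the pip command that pythonExe issues."""
    return [str(python), "-m", "pip", "install", *packages]


cmd = pip_install_command(venv_python(Path("out") / "venv"), ["mypy==1.13.0"])
print(cmd)
```

Pinning the mypy version, as the task does, keeps the type-checking results reproducible across machines.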
Defining our Sources
The sources task specifies the directory for Python source files (src folder).
def sources: T[PathRef] = Task.Source(millSourcePath / "src")
Type Checking
The typeCheck task runs mypy in --strict mode over the sources folder to verify that the Python code passes type checks.
def typeCheck: T[Unit] = Task {
  os.call(
    (pythonExe().path, "-m", "mypy", "--strict", sources().path),
    stdout = os.Inherit
  )
}
At this point, we have a minimal working build, with a build graph in which pythonExe and sources both feed into typeCheck.
Here is the main.py file:
import sys
def add(a: int, b: int) -> int: return a + b
def main() -> None: print("Hello, " + " ".join(sys.argv[1:]) + "!")
if __name__ == "__main__":
    main()
    print(add(5, 10)) # Error Example: add("5", 10) will cause a TypeError
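The comment in main.py points at the failure mode that typeCheck is meant to catch early. As a minimal sketch: Python does not enforce type hints at runtime, so add("5", 10) only fails once the + operator is actually evaluated, whereas mypy would report the mismatched argument type without running the program.

```python
def add(a: int, b: int) -> int:
    return a + b


caught = None
try:
    # The annotation says int, but nothing stops a str at call time;
    # the failure comes from evaluating "5" + 10 inside the function.
    add("5", 10)  # type: ignore[arg-type]
except TypeError as exc:
    caught = exc

print(f"Raised at runtime: {type(caught).__name__}")
```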
Running
The mainFileName task defines the name of the main Python script (in this case main.py).
The run task runs the main file with user-provided command-line arguments.
It uses the virtual environment’s Python interpreter to execute the script,
with output displayed in the console.
def mainFileName: T[String] = Task { "main.py" }
def run(args: mill.define.Args) = Task.Command {
  os.call(
    (pythonExe().path, sources().path / mainFileName(), args.value),
    stdout = os.Inherit
  )
}
Note that we use stdout = os.Inherit since we want to display any output to the user,
rather than capturing it for use in our command.
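The difference between inheriting and capturing a subprocess's output can be illustrated with Python's own subprocess module. This is only an analogy for the stdout = os.Inherit setting above, not the os-lib API: an inherited stream goes straight to the console and leaves nothing for the caller to inspect, while a captured stream is returned to the caller and is not shown to the user.

```python
import subprocess
import sys

script = "print('Hello from child')"

# Inherited: the child writes directly to our console; nothing is captured.
inherited = subprocess.run([sys.executable, "-c", script], check=True)

# Captured: the child's output is collected and returned to the caller.
captured = subprocess.run(
    [sys.executable, "-c", script],
    check=True,
    capture_output=True,
    text=True,
)

print(f"inherited.stdout is None: {inherited.stdout is None}")
print(f"captured.stdout: {captured.stdout.strip()!r}")
```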
Note that, as in many optionally-typed languages, the run and typeCheck tasks are
independent: you can run a Python program without needing to typecheck it first. This is
different from compiled languages like Java, which require typechecking before execution.
Below are commands that demonstrate the typechecking and running functionality of our pipeline:
> ./mill typeCheck
Success: no issues found in 1 source file
> ./mill run Mill Python
Hello, Mill Python!
15
> sed -i.bak 's/print(add(5, 10))/print(addd(5, 10))/g' src/main.py
> ./mill typeCheck # if we make a typo in a method name, mypy flags it
error: ...Name "addd" is not defined...
We have now completed a basic Python integration in Mill, as a pipeline of inter-related tasks. The next step is to turn this one-off pipeline into a reusable PythonModule.
Re-usable PythonModule
This example shows how to define a PythonModule trait for managing Python scripts
within multiple Mill objects. PythonModule takes the one-off pipeline we defined
earlier with sources, pythonExe, typeCheck, etc. and wraps it in a trait
that can be re-used.
package build
import mill._
trait PythonModule extends Module {
  def sources: T[PathRef] = Task.Source(millSourcePath / "src")
  def mainFileName: T[String] = Task { "main.py" }
  def pythonExe: T[PathRef] = Task {
    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0"))
    PathRef(python)
  }
  def typeCheck: T[Unit] = Task {
    os.call(
      (pythonExe().path, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit
    )
  }
  def run(args: mill.define.Args) = Task.Command {
    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      stdout = os.Inherit
    )
  }
}
Once the PythonModule trait has been defined, we can re-use it in
the three separate objects below:
object foo extends PythonModule {
  def mainFileName = "foo.py"
  object bar extends PythonModule {
    def mainFileName = "bar.py"
  }
}
object qux extends PythonModule {
  def mainFileName = "qux.py"
}
For this example, we have three different Python scripts,
foo/src/foo.py, foo/bar/src/bar.py, and qux/src/qux.py, one in each PythonModule.
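The contents of these scripts are not reproduced here. As a purely hypothetical sketch, a foo/src/foo.py along the following lines would be consistent with the output of the foo.run command in this example; the greeting helper is an assumption made for illustration and does not come from the original sources.

```python
import sys


def greeting(args: list[str]) -> str:
    # Hypothetical helper: joins the command-line arguments and appends
    # the module-specific suffix.
    return "Hello, " + " ".join(args) + " Foo!"


def main() -> None:
    print(greeting(sys.argv[1:]))


if __name__ == "__main__":
    main()
```

The bar.py and qux.py scripts would presumably differ only in the suffix they print.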
The following commands run each module and display their output:
> ./mill foo.run Mill
Hello, Mill Foo!
> ./mill foo.bar.run Mill
Hello, Mill Foo Bar!
> ./mill qux.run Mill
Hello, Mill Qux!
After this step, we have a build graph that looks like this:
Right now, the three PythonModules are independent. Next we will look into
how to allow them to depend on each other using moduleDeps.
PythonModule moduleDeps
This example shows how to add module dependencies to PythonModule, allowing modules
to depend on one another.
The main change is the addition of def moduleDeps to specify the inter-module dependencies.
We then use Task.traverse to run the typeCheck tasks of the upstream modules before our own,
and to gather the sources of the upstream modules so that they are available on the PYTHONPATH during run:
package build
import mill._
trait PythonModule extends Module {
  // List of module dependencies required by this module.
  def moduleDeps: Seq[PythonModule] = Nil
  def sources: T[PathRef] = Task.Source(millSourcePath / "src")
  def mainFileName: T[String] = Task { "main.py" }
  def pythonExe: T[PathRef] = Task {
    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0"))
    PathRef(python)
  }
  def typeCheck: T[Unit] = Task {
    val upstreamTypeCheck = Task.traverse(moduleDeps)(_.typeCheck)()
    val pythonVenv = pythonExe().path
    os.call(
      (pythonVenv, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit
    )
  }
  def gatherScripts(upstream: Seq[(PathRef, PythonModule)]) = {
    for ((sourcesFolder, mod) <- upstream) {
      val destinationPath = os.pwd / mod.millSourcePath.subRelativeTo(build.millSourcePath)
      os.copy.over(sourcesFolder.path / os.up, destinationPath)
    }
  }
  def run(args: mill.define.Args) = Task.Command {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))
    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )
  }
}
Now we can take the three modules defined earlier and wire them up:
qux depends on foo and foo.bar, which export their APIs for use in qux.
object foo extends PythonModule {
  def mainFileName = "foo.py"
  object bar extends PythonModule {
    def mainFileName = "bar.py"
  }
}
object qux extends PythonModule {
  def mainFileName = "qux.py"
  def moduleDeps = Seq(foo, foo.bar)
}
For this example, we define the following three files, one in each module, that depend on one another:
# foo/src/foo.py
import sys
def multiply(a: int, b: int) -> int: return a * b

# foo/bar/src/bar.py
import sys
def add(a: int, b: int) -> int: return a + b

# qux/src/qux.py
from foo.bar.src.bar import add # type: ignore
from foo.src.foo import multiply # type: ignore
import sys
def divide(a: int, b: int) -> float: return a/b
def main() -> None:
    x = int(sys.argv[1])
    y = int(sys.argv[2])
    print(f"Add: {x} + {y} = {add(x, y)} | Multiply: {x} * {y} = {multiply(x, y)} | Divide: {x} / {y} = {divide(x, y)}")
if __name__ == "__main__":
    main()
> ./mill qux.run 10 20
Add: 10 + 20 = 30 | Multiply: 10 * 20 = 200 | Divide: 10 / 20 = 0.5
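The dotted import paths in qux.py mirror the folder layout that gatherScripts copies under the directory placed on PYTHONPATH. The following minimal sketch recreates that layout in a temporary directory (the temporary root is an assumption standing in for the task's destination folder) and shows that folders without an __init__.py still resolve as namespace packages once their root is on the import path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the folder layout that the upstream modules are copied into.
files = {
    "foo/src/foo.py": "def multiply(a: int, b: int) -> int: return a * b\n",
    "foo/bar/src/bar.py": "def add(a: int, b: int) -> int: return a + b\n",
}
root = Path(tempfile.mkdtemp())
for relative_path, body in files.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body)

# Equivalent to putting the root on PYTHONPATH before starting Python.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The same imports that qux.py uses.
from foo.bar.src.bar import add
from foo.src.foo import multiply

print(add(10, 20), multiply(10, 20))  # prints: 30 200
```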
Task dependency graph, showing foo and bar tasks feeding into qux:
PIP dependencies and bundling
This implementation extends PythonModule with these key tasks:
- pythonDeps: allows the user to define Python dependencies that will be pip installed. These are aggregated into transitivePythonDeps.
- bundle: packages the module and its dependencies into a standalone bundle.pex file, making deployment easier.
package build
import mill._
trait PythonModule extends Module {
  def moduleDeps: Seq[PythonModule] = Nil
  def mainFileName: T[String] = Task { "main.py" }
  def sources: T[PathRef] = Task.Source(millSourcePath / "src")
  def pythonDeps: T[Seq[String]] = Task { Seq.empty[String] }
  def transitivePythonDeps: T[Seq[String]] = Task {
    val upstreamDependencies = Task.traverse(moduleDeps)(_.transitivePythonDeps)().flatten
    pythonDeps() ++ upstreamDependencies
  }
  def pythonExe: T[PathRef] = Task {
    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0", "pex==2.24.1", transitivePythonDeps()))
    PathRef(python)
  }
  def typeCheck: T[Unit] = Task {
    val upstreamTypeCheck = Task.traverse(moduleDeps)(_.typeCheck)()
    os.call(
      (pythonExe().path, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit,
      cwd = T.workspace
    )
  }
  def gatherScripts(upstream: Seq[(PathRef, PythonModule)]) = {
    for ((sourcesFolder, mod) <- upstream) {
      val destinationPath = os.pwd / mod.millSourcePath.subRelativeTo(build.millSourcePath)
      os.copy.over(sourcesFolder.path / os.up, destinationPath)
    }
  }
  def run(args: mill.define.Args) = Task.Command {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))
    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )
  }
  /** Bundles the project into a single PEX executable (bundle.pex). */
  def bundle = Task {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))
    val pexFile = Task.dest / "bundle.pex"
    os.call(
      (
        pythonExe().path,
        "-m",
        "pex",
        transitivePythonDeps(),
        "-D",
        Task.dest,
        "-c",
        sources().path / mainFileName(),
        "-o",
        pexFile,
        "--scie",
        "eager"
      ),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )
    PathRef(pexFile)
  }
}
Note the use of Task.traverse(moduleDeps) to aggregate the upstream modules'
library dependencies and typeCheck outputs.
Now, our three modules can define pythonDeps to be used at runtime:
object foo extends PythonModule {
  object bar extends PythonModule {
    def pythonDeps = Seq("pandas==2.2.3", "numpy==2.1.3")
  }
  def pythonDeps = Seq("numpy==2.1.3")
}
object qux extends PythonModule {
  def moduleDeps = Seq(foo, foo.bar)
}
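To see what transitivePythonDeps evaluates to for these modules, the recursion can be modelled in a few lines of plain Python. This is a sketch of the aggregation logic only, not Mill code; the Module dataclass and its field names are hypothetical stand-ins for the trait above.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    python_deps: list[str] = field(default_factory=list)
    module_deps: list["Module"] = field(default_factory=list)

    def transitive_python_deps(self) -> list[str]:
        # Own dependencies first, then those of each upstream module, flattened.
        upstream = [
            dep
            for module in self.module_deps
            for dep in module.transitive_python_deps()
        ]
        return self.python_deps + upstream


bar = Module("foo.bar", python_deps=["pandas==2.2.3", "numpy==2.1.3"])
foo = Module("foo", python_deps=["numpy==2.1.3"])
qux = Module("qux", module_deps=[foo, bar])

deps = qux.transitive_python_deps()
print(deps)  # ['numpy==2.1.3', 'pandas==2.2.3', 'numpy==2.1.3']

# Order-preserving de-duplication, if a unique list is preferred.
print(list(dict.fromkeys(deps)))  # ['numpy==2.1.3', 'pandas==2.2.3']
```

In this model the plain concatenation leaves numpy==2.1.3 in the list twice, because two modules declare it; a more polished version might de-duplicate the list before using it.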
To run the project and create a .pex executable, use the following commands:
> ./mill qux.run
Numpy : Sum: 150 | Pandas: Mean: 30.0, Max: 50
> ./mill show qux.bundle
".../out/qux/bundle.dest/bundle.pex"
> out/qux/bundle.dest/bundle.pex # running the PEX binary outside of Mill
Numpy : Sum: 150 | Pandas: Mean: 30.0, Max: 50
This generates the bundle.pex file, which packages all dependencies
and can be executed as a standalone application.
To run the bundle.pex file, first give it executable permission (+x),
then run it with the ./bundle.pex command.
The final module tree and task graph are now as follows, with the additional dependency tasks upstream and the bundle tasks downstream:
As mentioned, the PythonModule examples here demonstrate
how to add support for a new language toolchain in Mill.
A production-ready version would require more work to enhance features and performance.