Example: Python Support

This section demonstrates how to integrate Python support into Mill. We will define a simple PythonModule trait that can resolve dependencies, perform type checking on local code, and bundle an executable.

This integration is for educational purposes only: it showcases common techniques used in building language toolchains and is not intended for production use.

Basic Python Build Pipeline

First, we define a pythonExe task that creates a Python virtual environment and installs mypy for type checking. mypy verifies type correctness using Python’s type hints, helping to catch errors during development.

build.mill
package build
import mill._

def pythonExe: T[PathRef] = Task {

  os.call(("python3", "-m", "venv", Task.dest / "venv"))
  val python = Task.dest / "venv" / "bin" / "python3"
  os.call((python, "-m", "pip", "install", "mypy==1.13.0"))

  PathRef(python)
}
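
For readers less familiar with os.call, the two subprocess invocations in pythonExe correspond roughly to the command lines built below. This is a minimal Python sketch, not part of the Mill build: the helper names venv_python and setup_commands are hypothetical, the out/pythonExe.dest path is only an illustrative destination, and the Windows branch is an assumption added here (the task above hardcodes the POSIX bin/python3 layout).

```python
import os
from pathlib import Path
from typing import List


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    # The Mill task hardcodes bin/python3; the Windows layout below is an
    # assumption added for illustration only.
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python3"


def setup_commands(dest: Path, packages: List[str]) -> List[List[str]]:
    """Build the two commands that pythonExe runs, in order."""
    venv_dir = dest / "venv"
    python = venv_python(venv_dir)
    return [
        # Step 1: create the virtual environment with the system python3.
        ["python3", "-m", "venv", str(venv_dir)],
        # Step 2: install pinned tools using the venv's own interpreter.
        [str(python), "-m", "pip", "install", *packages],
    ]


if __name__ == "__main__":
    for cmd in setup_commands(Path("out/pythonExe.dest"), ["mypy==1.13.0"]):
        print(" ".join(cmd))
```

Installing with the interpreter from inside the venv, rather than the system python3, is what keeps mypy isolated from the global site-packages.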

Defining our Sources

The sources task specifies the directory for Python source files (src folder).

def sources: T[PathRef] = Task.Source(millSourcePath / "src")

Type Checking

The typeCheck task runs mypy in strict mode over the sources folder to verify that the code passes type checks.

def typeCheck: T[Unit] = Task {
  os.call(
    (pythonExe().path, "-m", "mypy", "--strict", sources().path),
    stdout = os.Inherit
  )
}

At this point, we have a minimal working build, with a build graph that looks like this:

[Build graph: pythonExe -> typeCheck; sources -> typeCheck]

Here is the main.py file:

src/main.py
import sys
def add(a: int, b: int) -> int: return a + b
def main() -> None: print("Hello, " + " ".join(sys.argv[1:]) + "!")
if __name__ == "__main__":
    main()
    print(add(5, 10)) # Error Example: add("5", 10) will cause a TypeError

Running

The mainFileName task defines the name of the main Python script (in this case main.py). The run task runs the main file with user-provided command-line arguments. It uses the virtual environment’s Python interpreter to execute the script, with output displayed in the console.

def mainFileName: T[String] = Task { "main.py" }
def run(args: mill.define.Args) = Task.Command {
  os.call(
    (pythonExe().path, sources().path / mainFileName(), args.value),
    stdout = os.Inherit
  )
}

Note that we use stdout = os.Inherit since we want to display any output to the user, rather than capturing it for use in our command.

[Build graph: pythonExe -> typeCheck; pythonExe -> run; sources -> typeCheck; sources -> run; mainFileName -> run]

Note that, as in many optionally-typed languages, the run and typeCheck tasks are independent: you can run a Python program without type-checking it first. This differs from compiled languages like Java, which require type checking before execution.
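
The independence of running and type checking follows from the fact that Python stores type hints as metadata and does not enforce them at runtime. The short sketch below illustrates this; the describe function is a hypothetical addition, while add matches the one in main.py.

```python
def add(a: int, b: int) -> int:
    return a + b


def describe(value: str) -> str:
    # str() accepts any object, so a mis-typed argument does not fail here.
    return "value=" + str(value)


# Annotations are kept as metadata but are not checked when the function runs.
print(add.__annotations__)

# A call that a strict type checker would reject still executes at runtime.
print(describe(42))  # type: ignore[arg-type]

# A mis-typed call fails only when the offending operation actually executes.
try:
    add("5", 10)  # type: ignore[arg-type]
except TypeError as exc:
    print("TypeError:", exc)
```

This is why a separate typeCheck task is useful: it reports such mistakes without having to execute the code path that contains them.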

Below are commands that demonstrate the typechecking and running functionality of our pipeline:

> ./mill typeCheck
Success: no issues found in 1 source file

> ./mill run Mill Python
Hello, Mill Python!
15

> sed -i.bak 's/print(add(5, 10))/print(addd(5, 10))/g' src/main.py

> ./mill typeCheck # if we make a typo in a method name, mypy flags it
error: ...Name "addd" is not defined...

We have now completed a basic Python integration in Mill, as a pipeline of inter-related tasks. The next step is to turn this one-off pipeline into a reusable PythonModule.

Re-usable PythonModule

This example shows how to define a PythonModule trait for managing Python scripts within multiple Mill objects. PythonModule takes the one-off pipeline we defined earlier with sources, pythonExe, typeCheck, etc. and wraps it in a trait that can be re-used.

build.mill
package build
import mill._

trait PythonModule extends Module {

  def sources: T[PathRef] = Task.Source(millSourcePath / "src")
  def mainFileName: T[String] = Task { "main.py" }

  def pythonExe: T[PathRef] = Task {

    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0"))

    PathRef(python)
  }

  def typeCheck: T[Unit] = Task {
    os.call(
      (pythonExe().path, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit
    )

  }

  def run(args: mill.define.Args) = Task.Command {
    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      stdout = os.Inherit
    )
  }

}

Once the PythonModule trait has been defined, we can re-use it in three separate objects:

object foo extends PythonModule {
  def mainFileName = "foo.py"
  object bar extends PythonModule {
    def mainFileName = "bar.py"
  }
}

object qux extends PythonModule {
  def mainFileName = "qux.py"
}

For this example, we have three different Python scripts, foo/src/foo.py, foo/bar/src/bar.py, and qux/src/qux.py, one in each PythonModule. The following commands run each module and display its output:

> ./mill foo.run Mill
Hello, Mill Foo!

> ./mill foo.bar.run Mill
Hello, Mill Foo Bar!

> ./mill qux.run Mill
Hello, Mill Qux!
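
The script locations listed above follow from the nesting of the module objects: each segment of the module path becomes a folder, and the sources task appends src. The following Python sketch shows that mapping under this assumption; source_dir and main_script are hypothetical helper names, not Mill APIs.

```python
from pathlib import PurePosixPath


def source_dir(module_path: str) -> PurePosixPath:
    """Map a dotted module path to its default source folder."""
    # Each segment of the module path becomes a directory, mirroring how
    # millSourcePath follows the object nesting; "src" comes from the
    # sources task defined in the trait.
    return PurePosixPath(*module_path.split(".")) / "src"


def main_script(module_path: str, main_file_name: str) -> PurePosixPath:
    """Locate the script that the run command would execute."""
    return source_dir(module_path) / main_file_name


for module, script in [("foo", "foo.py"), ("foo.bar", "bar.py"), ("qux", "qux.py")]:
    print(module, "->", main_script(module, script))
```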

After this step, we have a build graph that looks like this:

[Build graph: three independent clusters (foo, bar, qux). Within each cluster: pythonExe -> typeCheck; pythonExe -> run; sources -> typeCheck; sources -> run; mainFileName -> run]

Right now, the three PythonModules are independent. Next we will look into how to allow them to depend on each other using moduleDeps.

PythonModule moduleDeps

This example shows how to add module dependencies to PythonModule, allowing modules to depend on one another.

The main change is the addition of def moduleDeps to specify the inter-module dependencies. We then use Task.traverse to aggregate the sources of the upstream modules and make them available during typeCheck and run:

build.mill
package build
import mill._

trait PythonModule extends Module {

  // List of module dependencies required by this module.
  def moduleDeps: Seq[PythonModule] = Nil

  def sources: T[PathRef] = Task.Source(millSourcePath / "src")
  def mainFileName: T[String] = Task { "main.py" }

  def pythonExe: T[PathRef] = Task {

    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0"))

    PathRef(python)
  }

  def typeCheck: T[Unit] = Task {
    val upstreamTypeCheck = Task.traverse(moduleDeps)(_.typeCheck)()
    val pythonVenv = pythonExe().path

    os.call(
      (pythonVenv, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit
    )

  }

  def gatherScripts(upstream: Seq[(PathRef, PythonModule)]) = {
    for ((sourcesFolder, mod) <- upstream) {
      val destinationPath = os.pwd / mod.millSourcePath.subRelativeTo(build.millSourcePath)
      os.copy.over(sourcesFolder.path / os.up, destinationPath)
    }
  }

  def run(args: mill.define.Args) = Task.Command {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))

    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )
  }

}

Now we can take the three modules defined earlier and wire them up: qux depends on foo and foo.bar, which export their APIs for use in qux.

object foo extends PythonModule {
  def mainFileName = "foo.py"
  object bar extends PythonModule {
    def mainFileName = "bar.py"
  }
}

object qux extends PythonModule {
  def mainFileName = "qux.py"
  def moduleDeps = Seq(foo, foo.bar)
}

For this example, we define the following three files, one in each module, that depend on one another:

foo/src/foo.py
import sys
def multiply(a: int, b: int) -> int: return a * b

foo/bar/src/bar.py
import sys
def add(a: int, b: int) -> int: return a + b

qux/src/qux.py
from foo.bar.src.bar import add # type: ignore
from foo.src.foo import multiply # type: ignore
import sys
def divide(a: int, b: int) -> float: return a/b
def main() -> None:
    x = int(sys.argv[1])
    y = int(sys.argv[2])
    print(f"Add: {x} + {y} = {add(x, y)} | Multiply: {x} * {y} = {multiply(x, y)} | Divide: {x} / {y} = {divide(x, y)}")
if __name__ == "__main__":
    main()

> ./mill qux.run 10 20
Add: 10 + 20 = 30 | Multiply: 10 * 20 = 200 | Divide: 10 / 20 = 0.5
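
The imports in qux.py (from foo.bar.src.bar import add) resolve because the upstream module folders are copied under a common root that is then placed on PYTHONPATH. The Python sketch below reproduces that idea in isolation; it is a simplified illustration, not the Mill implementation, and gather_scripts, run_with_pythonpath, and demo are hypothetical names. It assumes Python 3.8+ and relies on namespace packages, so no __init__.py files are needed.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def gather_scripts(build_root: Path, module_dirs: list, dest: Path) -> None:
    """Copy each upstream module folder into dest, keeping its relative path."""
    for module_dir in module_dirs:
        target = dest / module_dir.relative_to(build_root)
        shutil.copytree(module_dir, target, dirs_exist_ok=True)


def run_with_pythonpath(code: str, root: Path) -> str:
    """Run a snippet in a fresh interpreter with root on PYTHONPATH."""
    env = dict(os.environ, PYTHONPATH=str(root))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        build_root = Path(tmp) / "workspace"
        dest = Path(tmp) / "dest"
        # Stand-in for foo/bar/src/bar.py from the example above.
        src = build_root / "foo" / "bar" / "src"
        src.mkdir(parents=True)
        (src / "bar.py").write_text(
            "def add(a: int, b: int) -> int: return a + b\n"
        )
        gather_scripts(build_root, [build_root / "foo" / "bar"], dest)
        # The directory layout under dest doubles as the package path.
        return run_with_pythonpath(
            "from foo.bar.src.bar import add; print(add(10, 20))", dest
        )


if __name__ == "__main__":
    print(demo())
```

Because the import path is derived from the folder layout, moving a module in the build tree changes the import statement downstream code must use; a production integration would likely choose a more stable package layout.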

Task dependency graph, showing foo and bar tasks feeding into qux:

[Build graph: clusters foo, bar, qux. Within each cluster: pythonExe -> typeCheck; pythonExe -> run; sources -> typeCheck; sources -> run; mainFileName -> run. Across clusters: foo.typeCheck -> qux.typeCheck; bar.typeCheck -> qux.typeCheck; foo.sources -> qux.run; bar.sources -> qux.run]

Next, we will add support for depending on external Python libraries from PyPI and for bundling via PEX.

PIP dependencies and bundling

This implementation extends PythonModule with these key tasks:

  • pythonDeps: lets the user define Python dependencies to be installed with pip. These are aggregated into transitivePythonDeps.

  • bundle: packages the module and its dependencies into a standalone bundle.pex file, making deployment easier.

build.mill
package build
import mill._

trait PythonModule extends Module {
  def moduleDeps: Seq[PythonModule] = Nil
  def mainFileName: T[String] = Task { "main.py" }
  def sources: T[PathRef] = Task.Source(millSourcePath / "src")

  def pythonDeps: T[Seq[String]] = Task { Seq.empty[String] }

  def transitivePythonDeps: T[Seq[String]] = Task {
    val upstreamDependencies = Task.traverse(moduleDeps)(_.transitivePythonDeps)().flatten
    pythonDeps() ++ upstreamDependencies
  }

  def pythonExe: T[PathRef] = Task {
    os.call(("python3", "-m", "venv", Task.dest / "venv"))
    val python = Task.dest / "venv" / "bin" / "python3"
    os.call((python, "-m", "pip", "install", "mypy==1.13.0", "pex==2.24.1", transitivePythonDeps()))

    PathRef(python)
  }

  def typeCheck: T[Unit] = Task {
    val upstreamTypeCheck = Task.traverse(moduleDeps)(_.typeCheck)()

    os.call(
      (pythonExe().path, "-m", "mypy", "--strict", sources().path),
      stdout = os.Inherit,
      cwd = T.workspace
    )
  }

  def gatherScripts(upstream: Seq[(PathRef, PythonModule)]) = {
    for ((sourcesFolder, mod) <- upstream) {
      val destinationPath = os.pwd / mod.millSourcePath.subRelativeTo(build.millSourcePath)
      os.copy.over(sourcesFolder.path / os.up, destinationPath)
    }
  }

  def run(args: mill.define.Args) = Task.Command {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))

    os.call(
      (pythonExe().path, sources().path / mainFileName(), args.value),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )
  }

  /** Bundles the project into a single PEX executable (bundle.pex). */
  def bundle = Task {
    gatherScripts(Task.traverse(moduleDeps)(_.sources)().zip(moduleDeps))

    val pexFile = Task.dest / "bundle.pex"
    os.call(
      (
        pythonExe().path,
        "-m",
        "pex",
        transitivePythonDeps(),
        "-D",
        Task.dest,
        "-c",
        sources().path / mainFileName(),
        "-o",
        pexFile,
        "--scie",
        "eager"
      ),
      env = Map("PYTHONPATH" -> Task.dest.toString),
      stdout = os.Inherit
    )

    PathRef(pexFile)
  }

}

Note the use of Task.traverse(moduleDeps) to aggregate the upstream modules' library dependencies and typeCheck outputs.
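
The aggregation performed by transitivePythonDeps is a plain recursive walk over moduleDeps: a module's own dependencies come first, followed by the transitive dependencies of each upstream module. The Python sketch below models that logic outside of Mill; the Module dataclass is a hypothetical stand-in for PythonModule, and the dependency values are taken from the foo, foo.bar, and qux example in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Module:
    """Hypothetical stand-in for a PythonModule and its upstream modules."""
    name: str
    python_deps: List[str] = field(default_factory=list)
    module_deps: List["Module"] = field(default_factory=list)


def transitive_python_deps(module: Module) -> List[str]:
    """Own dependencies first, then each upstream module's transitive set."""
    upstream = [
        dep
        for upstream_module in module.module_deps
        for dep in transitive_python_deps(upstream_module)
    ]
    return module.python_deps + upstream


bar = Module("foo.bar", ["pandas==2.2.3", "numpy==2.1.3"])
foo = Module("foo", ["numpy==2.1.3"])
qux = Module("qux", module_deps=[foo, bar])

print(transitive_python_deps(qux))
```

As in the Scala version, the result is not deduplicated, so a requirement declared by several modules appears more than once. An order-preserving deduplication such as list(dict.fromkeys(deps)) is one option a more complete implementation might apply before invoking pip.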

Now, our three modules can define pythonDeps to be used at runtime:

object foo extends PythonModule {
  object bar extends PythonModule {
    def pythonDeps = Seq("pandas==2.2.3", "numpy==2.1.3")
  }
  def pythonDeps = Seq("numpy==2.1.3")
}

object qux extends PythonModule {
  def moduleDeps = Seq(foo, foo.bar)
}

To run the project and create a .pex executable, use the following commands:

> ./mill qux.run
Numpy : Sum: 150 | Pandas: Mean: 30.0, Max: 50

> ./mill show qux.bundle
".../out/qux/bundle.dest/bundle.pex"

> out/qux/bundle.dest/bundle.pex # running the PEX binary outside of Mill
Numpy : Sum: 150 | Pandas: Mean: 30.0, Max: 50

This generates the bundle.pex file, which packages all dependencies and can be executed as a standalone application.

To run bundle.pex, first make sure the file has executable permission (+x), then invoke it with ./bundle.pex.

The final module tree and task graph are now as follows, with the additional dependency tasks upstream and the bundle tasks downstream:

[Build graph: clusters foo, bar, qux. Within each cluster: pythonDeps -> pythonExe; pythonExe -> typeCheck; pythonExe -> run; sources -> typeCheck; sources -> run; sources -> bundle; mainFileName -> run. Across clusters: foo.pythonDeps -> qux.pythonDeps; bar.pythonDeps -> qux.pythonDeps; foo.typeCheck -> qux.typeCheck; bar.typeCheck -> qux.typeCheck; foo.sources -> qux.run; bar.sources -> qux.run; foo.sources -> qux.bundle; bar.sources -> qux.bundle]

As mentioned, the PythonModule examples here demonstrate how to add support for a new language toolchain in Mill. A production-ready version would require more work on features and performance.