Python Module Configuration
This page goes into more detail about the various configuration options for PythonModule.
Common Configuration Overrides
This example shows some of the common tasks you may want to override on a PythonModule: specifying the mainScript, adding additional sources/resources, generating sources, and setting typecheck/run options. PythonModule also supports forkEnv and pythonOptions, which let you set environment variables and pass options to the Python interpreter, respectively.
package build
import mill._, pythonlib._

object foo extends PythonModule {
  // You can have arbitrary numbers of third-party libraries
  def pythonDeps = Seq("MarkupSafe==3.0.2", "Jinja2==3.1.4")

  // Choose a main script to run if there are multiple present
  def mainScript = Task.Source { millSourcePath / "custom-src" / "foo2.py" }

  // Add (or replace) source folders for the module to use
  def sources = Task.Sources {
    super.sources() ++ Seq(PathRef(millSourcePath / "custom-src"))
  }

  // Add (or replace) resource folders for the module to use
  def resources = Task.Sources {
    super.resources() ++ Seq(PathRef(millSourcePath / "custom-resources"))
  }

  // Generate sources at build time
  def generatedSources: T[Seq[PathRef]] = Task {
    val destPath = Task.dest / "generatedSources"
    os.makeDir.all(destPath)
    for (name <- Seq("A", "B", "C")) os.write(
      destPath / s"foo$name.py",
      // The `|` margins keep the Python indentation intact after stripMargin
      s"""
         |class Foo$name:
         |    value = "hello $name"
         |""".stripMargin
    )
    Seq(PathRef(destPath))
  }

  // Pass additional environment variables when `.run` is called.
  def forkEnv: T[Map[String, String]] = Map("MY_CUSTOM_ENV" -> "my-env-value")

  // Additional Python options, e.g. to turn on warnings and ignore import and resource warnings.
  // We can use -Werror to treat warnings as errors.
  def pythonOptions: T[Seq[String]] =
    Seq("-Wall", "-Wignore::ImportWarning", "-Wignore::ResourceWarning")
}
Note the use of millSourcePath, Task.dest, and PathRef when performing various filesystem operations:
- millSourcePath: Base path of the module. For the root module, it’s the repo root. For inner modules, it’s the module path (e.g., foo/bar/qux for foo.bar.qux). Can be overridden if needed.
- Task.dest: Destination folder in the out/ folder for task output. Prevents filesystem conflicts and serves as temporary storage or output for tasks.
- PathRef: Represents the contents of a file or folder, not just its path, ensuring downstream tasks properly invalidate when contents change.
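To illustrate why a PathRef tracks contents rather than just a path, here is a minimal Python sketch of a content signature. The helper name content_signature is hypothetical and the hashing scheme is an assumption made for illustration; it is not how Mill computes its own signatures.

```python
import hashlib
import tempfile
from pathlib import Path


def content_signature(path: Path) -> str:
    """Hash the relative names and bytes of every file under `path`.

    Hypothetical helper: it only illustrates the idea behind PathRef and
    is not Mill's actual implementation.
    """
    digest = hashlib.sha256()
    if path.is_file():
        files = [path]
    else:
        files = sorted(p for p in path.rglob("*") if p.is_file())
    for file in files:
        digest.update(str(file.relative_to(path.parent)).encode("utf-8"))
        digest.update(file.read_bytes())
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp) / "custom-src"
        folder.mkdir()
        (folder / "foo2.py").write_text("print('v1')\n", encoding="utf-8")
        before = content_signature(folder)
        (folder / "foo2.py").write_text("print('v2')\n", encoding="utf-8")
        after = content_signature(folder)
        # Same path, different contents: the signature changes.
        print(before != after)
```

A downstream step that compares signatures instead of paths notices an edited file even though the folder path is unchanged.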
Typical usage is shown below:
> ./mill foo.run
...
Foo2.value: <h1>hello2</h1>
Foo.value: <h1>hello</h1>
FooA.value: hello A
FooB.value: hello B
FooC.value: hello C
MyResource: My Resource Contents
MyOtherResource: My Other Resource Contents
MY_CUSTOM_ENV: my-env-value
...
> ./mill show foo.bundle
".../out/foo/bundle.dest/bundle.pex"
> out/foo/bundle.dest/bundle.pex
...
Foo2.value: <h1>hello2</h1>
Foo.value: <h1>hello</h1>
FooA.value: hello A
FooB.value: hello B
FooC.value: hello C
MyResource: My Resource Contents
MyOtherResource: My Other Resource Contents
...
> sed -i.bak 's/import os/import os, warnings; warnings.warn("This is a test warning!")/g' foo/custom-src/foo2.py
> ./mill foo.run
...UserWarning: This is a test warning!...
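The effect of the pythonOptions used above can be approximated from inside Python with the standard warnings module. The sketch below is only an illustration of the filter semantics; the function names are hypothetical, and it does not show how Mill passes the flags to the interpreter.

```python
import warnings


def apply_filters() -> None:
    """Approximate -Wall -Wignore::ImportWarning -Wignore::ResourceWarning."""
    # Later filters take precedence, matching the order of the -W options.
    warnings.simplefilter("always")                   # -Wall
    warnings.simplefilter("ignore", ImportWarning)    # -Wignore::ImportWarning
    warnings.simplefilter("ignore", ResourceWarning)  # -Wignore::ResourceWarning


def collect_warnings() -> list[str]:
    """Emit three warnings and return the ones that pass the filters."""
    with warnings.catch_warnings(record=True) as caught:
        apply_filters()
        warnings.warn("This is a test warning!")  # UserWarning by default
        warnings.warn("hidden", ImportWarning)
        warnings.warn("also hidden", ResourceWarning)
    return [f"{w.category.__name__}: {w.message}" for w in caught]


if __name__ == "__main__":
    for line in collect_warnings():
        print(line)
```

Only the UserWarning survives the filters, which matches the warning printed by the last ./mill foo.run above.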
Custom Tasks
This example shows how to define tasks that depend on other tasks:
- For generatedSources, we override the task and make it depend directly on pythonDeps to generate its source files. In this example, the generated file includes the list of dependencies as tuples in the value variable.
- For lineCount, we define a brand new task that depends on sources. That lets us access the line count at runtime using the MY_LINE_COUNT env variable defined in forkEnv and print it when the program runs.
package build
import mill._, pythonlib._

object foo extends PythonModule {
  def pythonDeps = Seq("argparse==1.4.0", "jinja2==3.1.4")

  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }

  // Depends on pythonDeps: each "name==version" string becomes a ("name", "version") tuple
  def generatedSources: T[Seq[PathRef]] = Task {
    val destPath = Task.dest / "generatedSources"
    os.makeDir.all(destPath)
    val prettyPythonDeps = pythonDeps().map { dep =>
      val parts = dep.split("==")
      s"""("${parts(0)}", "${parts(1)}")"""
    }.mkString(", ")
    os.write(
      destPath / "myDeps.py",
      // The `|` margins keep the Python indentation intact after stripMargin
      s"""
         |class MyDeps:
         |    value = [${prettyPythonDeps}]
         |""".stripMargin
    )
    Seq(PathRef(destPath))
  }

  // Brand new task: total number of lines across all .py files in the source folders
  def lineCount: T[Int] = Task {
    sources()
      .flatMap(pathRef => os.walk(pathRef.path))
      .filter(_.ext == "py")
      .map(os.read.lines(_).size)
      .sum
  }

  def forkEnv: T[Map[String, String]] = Map("MY_LINE_COUNT" -> s"${lineCount()}")

  def printLineCount() = Task.Command { println(lineCount()) }
}
The above build defines the customizations to the Mill task graph shown below, with the boxes representing tasks defined or overridden above and the un-boxed labels representing existing Mill tasks:
Mill lets you define new cached Tasks using the Task {…} syntax, depending on existing Tasks e.g. foo.sources via the foo.sources() syntax to extract their current value, as shown in lineCount above. The return-type of a Task has to be JSON-serializable (using uPickle, one of Mill’s Bundled Libraries) and the Task is cached when first run until its inputs change (in this case, if someone edits the foo.sources files which live in foo/src). Cached Tasks cannot take parameters.
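The caching behaviour described here can be pictured with a small Python sketch. It is a conceptual model only: CachedTask, signature, and line_count are hypothetical names, and Mill's real cache lives on disk and tracks inputs differently.

```python
import tempfile
from pathlib import Path

_UNSET = object()  # sentinel meaning "never evaluated"


def py_files(folder: Path) -> list[Path]:
    return sorted(p for p in folder.rglob("*.py") if p.is_file())


def line_count(folder: Path) -> int:
    # Mirrors the lineCount task: total lines across all .py files.
    return sum(
        len(p.read_text(encoding="utf-8").splitlines()) for p in py_files(folder)
    )


def signature(folder: Path) -> tuple:
    # Stand-in for input tracking: the names and bytes of every .py file.
    return tuple((str(p), p.read_bytes()) for p in py_files(folder))


class CachedTask:
    """Re-run `compute` only when the inputs' signature changes (hypothetical)."""

    def __init__(self, compute, input_signature):
        self.compute = compute
        self.input_signature = input_signature
        self.evaluations = 0
        self._last_signature = _UNSET
        self._value = None

    def __call__(self):
        current = self.input_signature()
        if current != self._last_signature:
            self._value = self.compute()
            self._last_signature = current
            self.evaluations += 1
        return self._value


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp)
        (src / "foo.py").write_text("a = 1\nb = 2\n", encoding="utf-8")
        task = CachedTask(lambda: line_count(src), lambda: signature(src))
        print(task(), task(), task.evaluations)
```

Calling the task twice without touching the files evaluates it once; editing a source file triggers a second evaluation.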
Note that depending on a task requires use of parentheses after the task name, e.g. pythonDeps(), sources() and lineCount(). This converts the task of type T[V] into a value of type V that you can use in your task implementation.
This example can be run as follows:
> ./mill foo.run --text hello
text: hello
MyDeps.value: [('argparse', '1.4.0'), ('jinja2', '3.1.4')]
My_Line_Count: 23
> ./mill show foo.lineCount
23
> ./mill foo.printLineCount
23
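To make the generated myDeps.py concrete, the Python sketch below mirrors the string templating done by generatedSources and then loads the result. The function names render_my_deps and load_value are hypothetical and exist only for this illustration.

```python
def render_my_deps(python_deps: list[str]) -> str:
    """Build the source of myDeps.py from "name==version" strings."""
    pretty = ", ".join(
        '("{}", "{}")'.format(*dep.split("==")) for dep in python_deps
    )
    return f"class MyDeps:\n    value = [{pretty}]\n"


def load_value(source: str) -> list:
    """Execute the generated source and return MyDeps.value."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["MyDeps"].value


if __name__ == "__main__":
    generated = render_my_deps(["argparse==1.4.0", "jinja2==3.1.4"])
    print(generated)
    print(load_value(generated))
```

The loaded value is the same list of tuples that foo.run prints as MyDeps.value.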
Custom tasks can contain arbitrary code. Whether you want to download files using requests.get, generate sources to feed into a Python interpreter, or create some custom pex bundle with the files you want, all of these can simply be custom tasks with your code running in the Task {…} block.
You can create arbitrarily long chains of dependent tasks, and Mill will
handle the re-evaluation and caching of the tasks' output for you.
Mill also provides a Task.dest folder for you to use as scratch space or to store files you want to return:
- Any files a task creates should live within Task.dest.
- Any files a task modifies should be copied into Task.dest before being modified.
- Any files that a task returns should be returned as a PathRef to a path within Task.dest.
That ensures that the files belonging to a particular task all live in one place, avoiding file-name conflicts, preventing race conditions when tasks evaluate in parallel, and letting Mill automatically invalidate the files when the task’s inputs change.
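The copy-before-modify rule can be sketched in Python as follows. The function append_banner and the banner text are hypothetical; the point is only that the input file stays untouched and the modified copy lives in the task's own destination folder.

```python
import shutil
import tempfile
from pathlib import Path


def append_banner(original: Path, dest: Path) -> Path:
    """Copy `original` into `dest`, modify the copy, and return its path."""
    dest.mkdir(parents=True, exist_ok=True)
    copy = dest / original.name
    shutil.copy2(original, copy)  # never edit the input file in place
    with copy.open("a", encoding="utf-8") as handle:
        handle.write("# generated banner\n")
    return copy


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "src" / "foo.py"
        source.parent.mkdir()
        source.write_text("print('hi')\n", encoding="utf-8")
        result = append_banner(source, Path(tmp) / "out" / "task.dest")
        print(result.read_text(encoding="utf-8"))
```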
Overriding Tasks
package build
import mill._, pythonlib._

object foo extends PythonModule {
  // Replace the default src/ folder with sources generated on the fly
  def sources = Task {
    val destPath = Task.dest / "src"
    os.makeDir.all(destPath)
    os.write(
      destPath / "foo.py",
      // The `|` margins keep the Python indentation intact after stripMargin
      s"""
         |class Foo:
         |    def main(self) -> None:
         |        print("Hello World")
         |
         |if __name__ == '__main__':
         |    Foo().main()
         |""".stripMargin
    )
    Seq(PathRef(destPath))
  }

  def mainScript = Task { PathRef(sources().head.path / "foo.py") }

  // Add a log message, then delegate to the original task via super
  def typeCheck = Task {
    println("Type Checking...")
    super.typeCheck()
  }

  def run(args: mill.define.Args) = Task.Command {
    typeCheck()
    println("Running..." + args.value.mkString(" "))
    super.run(args)()
  }
}
You can re-define tasks to override them, and use super if you want to refer to the originally defined task. The above example shows how to override typeCheck and run to add additional logging messages, and we override sources, which was a Task.Sources for the src/ folder, with a plain Task {…} task that generates the necessary source files on-the-fly.
Note that this example replaces your src/ folder with the generated sources, as we are overriding the def sources task. If you want to add generated sources, you can either override generatedSources, or you can override sources and use super to include the original source folder:
object foo2 extends PythonModule {
  def generatedSources = Task {
    val destPath = Task.dest / "src"
    os.makeDir.all(destPath)
    os.write(destPath / "foo.py", """...""")
    Seq(PathRef(destPath))
  }

  def mainScript = Task { PathRef(generatedSources().head.path / "foo.py") }
}

object foo3 extends PythonModule {
  def sources = Task {
    val destPath = Task.dest / "src"
    os.makeDir.all(destPath)
    os.write(destPath / "foo.py", """...""")
    super.sources() ++ Seq(PathRef(destPath))
  }

  def mainScript = Task { PathRef(sources().head.path / "foo.py") }
}
In Mill builds the override keyword is optional.
> ./mill foo.run
Type Checking...
Success: no issues found in 1 source file
Running...
Hello World
Compilation & Execution Flags
package build
import mill._, pythonlib._

object `package` extends RootModule with PythonModule {
  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
  def pythonOptions = Seq("-Wall", "-Xdev")
  def forkEnv = Map("MY_ENV_VAR" -> "HELLO MILL!")
}
You can pass flags to the Python interpreter via pythonOptions.
> ./mill run
HELLO MILL!
By default, run runs the code in a subprocess, and you can pass environment variables via forkEnv.
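As a rough picture of what running in a subprocess with pythonOptions and forkEnv amounts to, the Python sketch below launches a child interpreter with the same flags and variable. It is an approximation under stated assumptions: the inline SCRIPT stands in for src/foo.py, which is not shown here, and merging the extra variables over the parent environment is an assumption, not a description of Mill's internals.

```python
import os
import subprocess
import sys

PYTHON_OPTIONS = ["-Wall", "-Xdev"]       # same flags as pythonOptions above
FORK_ENV = {"MY_ENV_VAR": "HELLO MILL!"}  # same variable as forkEnv above

# Hypothetical stand-in for src/foo.py.
SCRIPT = "import os, sys; print(os.environ['MY_ENV_VAR']); print(sys.flags.dev_mode)"


def run_forked() -> str:
    """Run SCRIPT in a child interpreter and return its standard output."""
    result = subprocess.run(
        [sys.executable, *PYTHON_OPTIONS, "-c", SCRIPT],
        env={**os.environ, **FORK_ENV},  # assumption: extras override the parent env
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_forked(), end="")
```

The child sees MY_ENV_VAR, and sys.flags.dev_mode confirms that -Xdev reached the interpreter.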
PythonPath and Filesystem Resources
package build
import mill._, pythonlib._

object foo extends PythonModule {
  def mainScript = Task.Source { millSourcePath / "src" / "foo.py" }
  def resources = Task.Sources { super.resources() ++ Seq(PathRef(millSourcePath / "custom")) }

  object test extends PythonTests with TestModule.Unittest {
    def otherFiles = Task.Source(millSourcePath / "other-files")
    def forkEnv: T[Map[String, String]] =
      super.forkEnv() ++ Map("OTHER_FILES_DIR" -> otherFiles().path.toString)
  }
}
> ./mill foo.run
Hello World Resource File
> ./mill foo.test
...
test_all (test.TestScript...) ... ok
...Ran 1 test...
OK
...
This section discusses how tests can depend on resources locally on disk. Mill provides two ways to do this: via the Python PYTHONPATH resources, and via the resource folder which is made available as the environment variable MILL_TEST_RESOURCE_DIR:
- The PythonPath resources are useful when you want to fetch individual files, and are bundled with the application by the .bundle step when constructing a bundle pex for deployment. But they do not allow you to list folders or perform other filesystem operations.
- The resource folder, available via MILL_TEST_RESOURCE_DIR, gives you access to the folder path of the resources on disk. This is useful in allowing you to list and otherwise manipulate the filesystem, which you cannot do with PythonPath resources. However, MILL_TEST_RESOURCE_DIR only exists when running tests using Mill, and is not available when executing applications packaged for deployment via .bundle.
- Apart from resources/, you can provide additional folders to your test suite by defining a Task.Source (otherFiles above) and passing it to forkEnv. This provides the folder path as an environment variable that the test can make use of.
Example application code demonstrating the techniques above can be seen below:
Hello World Resource File
Test Hello World Resource File A
Test Hello World Resource File B
Other Hello World File
import importlib.resources


class Foo:
    # Reads a resource from a package on the PYTHONPATH and returns its text
    def PythonPathResourceText(self, package, resourceName: str) -> str:
        resource_content = (
            importlib.resources.files(package).joinpath(resourceName).read_text()
        )
        return resource_content.strip()


if __name__ == "__main__":
    print(Foo().PythonPathResourceText("res", "file.txt"))
import os
from pathlib import Path
import importlib.resources
import unittest
from foo import Foo  # type: ignore


class TestScript(unittest.TestCase):
    def test_all(self) -> None:
        appPythonPathResourceText = Foo().PythonPathResourceText("res", "file.txt")
        self.assertEqual(appPythonPathResourceText, "Hello World Resource File")

        testPythonPathResourceText = (
            importlib.resources.files("res")
            .joinpath("test-file-a.txt")
            .read_text()
            .strip()
        )
        self.assertEqual(testPythonPathResourceText, "Test Hello World Resource File A")

        testFileResourceFile = next(
            Path(os.getenv("MILL_TEST_RESOURCE_DIR")).rglob("test-file-b.txt"), None
        )
        with open(testFileResourceFile, "r", encoding="utf-8") as file:
            testFileResourceText = file.readline()
        self.assertEqual(testFileResourceText, "Test Hello World Resource File B")

        with open(
            Path(os.getenv("OTHER_FILES_DIR"), "other-file.txt"), "r", encoding="utf-8"
        ) as file:
            otherFileText = file.readline()
        self.assertEqual(otherFileText, "Other Hello World File")


if __name__ == "__main__":
    unittest.main()
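Because MILL_TEST_RESOURCE_DIR only exists when tests run through Mill, a test may want to fail with a clear message when it is missing. The helper below is a hypothetical sketch of that idea (find_test_resource is not part of Mill or of the example project):

```python
import os
from pathlib import Path


def find_test_resource(name: str, env_var: str = "MILL_TEST_RESOURCE_DIR") -> Path:
    """Locate `name` anywhere under the folder named by `env_var`."""
    root = os.getenv(env_var)
    if root is None:
        # The variable is only set when the test suite runs through Mill.
        raise RuntimeError(f"{env_var} is not set; run the test through Mill")
    match = next(Path(root).rglob(name), None)
    if match is None:
        raise FileNotFoundError(f"{name} not found under {root}")
    return match
```

A test could then call find_test_resource("test-file-b.txt") instead of passing the result of os.getenv straight to Path, which raises a less helpful TypeError when the variable is unset.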
Note that tests require that you pass in any files that they depend on explicitly. This is necessary so that Mill knows when a test needs to be re-run and when a previous result can be cached. This also ensures that tests reading and writing to the current working directory do not accidentally interfere with each other's files, especially when running in parallel.
Mill runs test processes in a sandbox/ folder, not in your project root folder, to prevent you from accidentally accessing files without explicitly passing them. Thus you cannot just read resources off disk via with open("foo/resources/test-file-a.txt") as file.
If you have legacy tests that need to run in the project root folder to work, you can configure your test suite with def testSandboxWorkingDir = false to disable the sandbox and make the tests run in the project root.
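The effect of the sandbox on relative paths can be reproduced in plain Python by changing the working directory. This is a simplified sketch: readable_from is a hypothetical helper and the temporary folders stand in for the project root and the sandbox/ folder.

```python
import os
import tempfile
from pathlib import Path


def readable_from(cwd: Path, path: Path) -> bool:
    """Return whether `path` resolves to a file when the process runs in `cwd`."""
    previous = Path.cwd()
    os.chdir(cwd)
    try:
        return path.is_file()
    finally:
        os.chdir(previous)  # always restore the original working directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as project, tempfile.TemporaryDirectory() as sandbox:
        relative = Path("foo") / "resources" / "test-file-a.txt"
        absolute = Path(project) / relative
        absolute.parent.mkdir(parents=True)
        absolute.write_text("Test Hello World Resource File A", encoding="utf-8")
        # The relative path only resolves from the project root ...
        print(readable_from(Path(project), relative))
        print(readable_from(Path(sandbox), relative))
        # ... while an explicitly passed absolute path works from the sandbox.
        print(readable_from(Path(sandbox), absolute))
```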