Java Single-File Scripts

Script Use Cases

Mill single-file Java programs make it more convenient to script simple command-line workflows that interact with files, subprocesses, and HTTP endpoints.

For example, below is a simple script that uses Jackson, Unirest-Java, and PicoCLI to crawl Wikipedia and save the crawl results to a file:

Crawler.java
//| mvnDeps:
//| - info.picocli:picocli:4.7.6
//| - com.konghq:unirest-java:3.14.5
//| - com.fasterxml.jackson.core:jackson-databind:2.17.2

import com.fasterxml.jackson.databind.*;
import kong.unirest.Unirest;
import picocli.CommandLine;
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.Callable;

@CommandLine.Command(name = "Crawler", mixinStandardHelpOptions = true)
public class Crawler implements Callable<Integer> {

  @CommandLine.Option(names = {"--start-article"}, required = true, description = "Starting title")
  private String startArticle;

  @CommandLine.Option(names = {"--depth"}, required = true, description = "Depth of crawl")
  private int depth;

  private static final ObjectMapper mapper = new ObjectMapper();

  public static List<String> fetchLinks(String title) throws Exception {
    var response = Unirest.get("https://en.wikipedia.org/w/api.php")
      .queryString("action", "query")
      .queryString("titles", title)
      .queryString("prop", "links")
      .queryString("format", "json")
      .header("User-Agent", "WikiFetcherBot/1.0 (https://example.com; contact@example.com)")
      .asString();

    if (!response.isSuccess())
      throw new IOException("Unexpected code " + response.getStatus());

    var root = mapper.readTree(response.getBody());
    var pages = root.path("query").path("pages");
    var links = new ArrayList<String>();

    for (var it = pages.elements(); it.hasNext();) {
      var linkArr = it.next().get("links");
      if (linkArr != null && linkArr.isArray()) {
        for (var link : linkArr) {
          var titleNode = link.get("title");
          if (titleNode != null) links.add(titleNode.asText());
        }
      }
    }
    return links;
  }

  public Integer call() throws Exception {
    var seen = new HashSet<>(Set.of(startArticle));
    var current = new HashSet<>(Set.of(startArticle));

    for (int i = 0; i < depth; i++) {
      var next = new HashSet<String>();
      for (var article : current) {
        for (var link : fetchLinks(article)) {
          if (!seen.contains(link)) next.add(link);
        }
      }
      seen.addAll(next);
      current = next;
    }

    try (var w = Files.newBufferedWriter(Paths.get("fetched.json"))) {
      mapper.writerWithDefaultPrettyPrinter().writeValue(w, seen);
    }
    return 0;
  }

  public static void main(String[] args) {
    System.exit(new CommandLine(new Crawler()).execute(args));
  }
}

> ./mill Crawler.java --start-article=singapore --depth=2

> cat fetched.json
..."Calling code",...
..."+65",...
..."British Empire",...
..."1st Parliament of Singapore",...

Single-file Java programs may initially be a bit more verbose than the equivalent Bash script containing cp or curl commands, but as the script grows in complexity, the value of IDE support, typechecking, and JVM libraries makes writing it in Java an attractive proposition. This is especially true if you already have developers fluent in Java who may not be as familiar with the intricacies of writing robust and maintainable Bash code.
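
The level-by-level traversal in Crawler.call() above can also be studied in isolation. The following is a minimal sketch, not part of the Mill example: the class name CrawlSketch and the in-memory LINKS map are hypothetical stand-ins for the Wikipedia API, chosen so the loop can run without network access.

```java
import java.util.*;

public class CrawlSketch {
  // Hypothetical link graph standing in for the responses of the Wikipedia API.
  static final Map<String, List<String>> LINKS = Map.of(
      "A", List.of("B", "C"),
      "B", List.of("A", "D"),
      "C", List.of("D"),
      "D", List.of("E"));

  // Same loop shape as Crawler.call(): expand the current frontier once per
  // depth level, and only carry forward articles that have not been seen yet.
  public static Set<String> crawl(String start, int depth) {
    var seen = new TreeSet<String>(Set.of(start)); // sorted for stable output
    var current = new HashSet<String>(Set.of(start));
    for (int i = 0; i < depth; i++) {
      var next = new HashSet<String>();
      for (var article : current) {
        for (var link : LINKS.getOrDefault(article, List.of())) {
          if (!seen.contains(link)) next.add(link);
        }
      }
      seen.addAll(next);
      current = next;
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(crawl("A", 1)); // prints [A, B, C]
    System.out.println(crawl("A", 2)); // prints [A, B, C, D]
  }
}
```

Because the seen set filters out already-visited articles, the back-link from B to A does not cause A to be expanded a second time, and each extra depth level adds only newly discovered titles.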

Relative and Absolute Script moduleDeps

Mill single-file scripts can import each other via either relative or absolute imports. For example, given a bar/Bar.java file such as below:

bar/Bar.java
package bar;

public class Bar {
  public static String generateHtml(String text) {
    return "<h1>" + text + "</h1>";
  }
}

It can be imported via a ./Bar.java import relative to its own enclosing folder (in this case bar/), as shown below where it is imported from the bar/BarTests.java test suite in the same folder:

bar/BarTests.java
//| extends: [mill.script.JavaModule.Junit4]
//| moduleDeps: [./Bar.java]
package bar;

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class BarTests {
  @Test
  public void simple() {
    assertEquals("<h1>hello</h1>", Bar.generateHtml("hello"));
  }
}

Or it can be imported via a bar/Bar.java absolute import, as shown below where it is imported from foo/Foo.java:

foo/Foo.java
//| moduleDeps: [bar/Bar.java]
package foo;

public class Foo {
  public static void main(String[] args) throws Exception {
    System.out.println(bar.Bar.generateHtml(args[0]));
  }
}

These examples can be exercised as follows:

> ./mill bar/Bar.java:compile
> ./mill bar/BarTests.java
> ./mill foo/Foo.java hello
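
To make the difference between the two import styles concrete, the sketch below models how each moduleDeps entry maps to a file path. It is an illustration of the semantics described above, not Mill's actual implementation: the class ModuleDepPaths and its resolve method are hypothetical, and it assumes that only entries starting with ./ are treated as relative to the importing script's folder, while everything else is relative to the project root.

```java
import java.nio.file.Path;

public class ModuleDepPaths {
  // Hypothetical helper: returns the dependency's path relative to the project root.
  // importingScript is the path of the script that declares the moduleDeps entry,
  // also relative to the project root.
  public static Path resolve(Path importingScript, String dep) {
    Path depPath = Path.of(dep);
    if (dep.startsWith("./")) {
      // Relative import: resolve against the importing script's enclosing folder.
      Path folder = importingScript.getParent(); // null when the script sits at the root
      return (folder == null ? depPath : folder.resolve(depPath)).normalize();
    }
    // Absolute import: already expressed relative to the project root.
    return depPath.normalize();
  }

  public static void main(String[] args) {
    // Both entries from the examples above end up pointing at the same file.
    System.out.println(resolve(Path.of("bar/BarTests.java"), "./Bar.java"));
    System.out.println(resolve(Path.of("foo/Foo.java"), "bar/Bar.java"));
  }
}
```

Under these assumptions, ./Bar.java declared in bar/BarTests.java and bar/Bar.java declared in foo/Foo.java resolve to the same script, which is why both styles can be used interchangeably for scripts in the same folder.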

Custom Script Module Classes

By default, single-file Mill script modules inherit their behavior from the builtin mill.script.JavaModule. However, you can also customize them to inherit from a custom Module class that you define as part of your meta-build in mill-build/src/. For example, if we want to add a resource file generated by processing the source file of the script, this can be done in a custom LineCountJavaModule as shown below:

Qux.java
//| extends: [millbuild.LineCountJavaModule]
package qux;

public class Qux {
  public static String getLineCount() throws Exception {
    return new String(
        Qux.class.getClassLoader().getResourceAsStream("line-count.txt").readAllBytes());
  }

  public static void main(String[] args) throws Exception {
    System.out.println("Line Count: " + getLineCount());
  }
}

mill-build/src/LineCountJavaModule.scala
package millbuild
import mill.*, javalib.*, script.*

class LineCountJavaModule(scriptConfig: ScriptModule.Config)
    extends mill.script.JavaModule(scriptConfig) {

  /** Total number of lines in module source files */
  def lineCount = Task {
    allSourceFiles().map(f => os.read.lines(f.path).size).sum
  }

  /** Generate resources using lineCount of sources */
  override def resources = Task {
    os.write(Task.dest / "line-count.txt", "" + lineCount())
    super.resources() ++ Seq(PathRef(Task.dest))
  }
}

> ./mill Qux.java
...
Line Count: 13

> ./mill show Qux.java:lineCount
13

Your custom LineCountJavaModule must be a class that takes a mill.script.ScriptModule.Config as a parameter and passes it to mill.script.JavaModule. Custom script module classes allow you to customize the semantics of your Java, Scala, or Kotlin single-file script modules. If you have a large number of scripts with a similar configuration, or you need customizations that cannot be done in the YAML build header, placing these customizations in a custom script module class lets you define the behavior centrally and standardize it across all scripts that inherit it via extends.
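
For readers less familiar with Scala, the lineCount task above amounts roughly to the following plain-Java computation. This is a sketch for illustration only: the class LineCountSketch and its methods are hypothetical, and it uses a temporary file rather than Mill's allSourceFiles.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class LineCountSketch {
  // Sum the number of lines across all given source files,
  // mirroring allSourceFiles().map(...).sum in the Scala module.
  public static int lineCount(List<Path> sources) {
    int total = 0;
    for (Path source : sources) {
      try {
        total += Files.readAllLines(source).size();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    return total;
  }

  // Writes a two-line sample file to a temporary location and counts its lines.
  public static int sampleCount() {
    try {
      Path tmp = Files.createTempFile("Sample", ".java");
      try {
        Files.writeString(tmp, "class Sample {\n}\n");
        return lineCount(List.of(tmp));
      } finally {
        Files.delete(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(sampleCount()); // prints 2
  }
}
```

In the Mill module this number is written to line-count.txt and exposed as a resource, which is how Qux.java reads it back at runtime.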