Example: hello.awk

Smallest possible dot program. Running mean of column 1, using only dot's runtime — no Num, no Sym, no library types beyond what dot itself ships.

The program

Five lines, two of them comments. new("plain") allocates an object id; the per-row block updates the running count and the Welford-style mean using .field sugar; END prints both.

# hello.awk -- smallest dot example. Running mean of column 1.
# Uses only dot's runtime: new() + .field sugar.

BEGIN { N = new("plain") }
      { .N.n++; .N.mu += ($1 - .N.mu) / .N.n }
END   { printf "n=%d mean=%.3f\n", .N.n, .N.mu }

What the preprocessor sees vs. what gawk sees:

# source                                       ->  after prep.awk
.N.n++                                          ->  HEAP[N]["n"]++
.N.mu += ($1 - .N.mu) / .N.n                    ->  HEAP[N]["mu"] += ($1 - HEAP[N]["mu"]) / HEAP[N]["n"]
new("plain")                                    ->  new("plain")    (no dots, unchanged)
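
To make the mapping concrete, here is a rough sketch of what gawk might end up running once those rewrites are applied to hello.awk. The new() function below is a hypothetical stand-in written for this example, not dot's actual runtime: it only bumps NID and tags the object, and it skips any init hook. The shell wrapper name run_expanded is also just chosen here. It needs gawk, because HEAP[N]["n"] relies on gawk's arrays of arrays.

```shell
# Hand-expanded hello.awk, wrapped in a shell function so it can be piped into.
# Assumption: new() is a minimal stand-in for dot's runtime, not the real code.
run_expanded() {
  gawk '
    # Stand-in allocator: bump the global counter, tag the object, return its id.
    function new(kind,    id) { id = ++NID; HEAP[id]["is"] = kind; return id }

    BEGIN { N = new("plain") }
          { HEAP[N]["n"]++; HEAP[N]["mu"] += ($1 - HEAP[N]["mu"]) / HEAP[N]["n"] }
    END   { printf "n=%d mean=%.3f\n", HEAP[N]["n"], HEAP[N]["mu"] }
  '
}

# Same values as the bundled sample, one per line.
printf '10\n20\n30\n40\n50\n' | run_expanded
```

Fed the five sample values, this prints n=5 mean=30.000.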

Run

Bundled in the dot binary, so:

dot --demo hello             # uses bundled sample.txt (10 20 30 40 50)
dot --demo hello -           # use stdin instead
printf '1\n2\n3\n' | dot --demo hello -

Output (bundled sample):

n=5 mean=30.000

How it works

  1. new("plain") bumps the global NID counter, sets HEAP[id]["is"] = "plain" for the new id, and (since no plain_init is defined) returns that id, which BEGIN stores in N.
  2. Each input line: .N.n++ rewrites to HEAP[N]["n"]++. So count lives in HEAP[N]["n"], mean in HEAP[N]["mu"]. Pure struct sugar.
  3. Welford's online mean: mu += (x - mu) / n. Numerically stable, one pass, O(1) per row.
  4. END prints with %.3f. The . is fine here because % is not a value-char: the prep regex requires a letter, digit, _, ], or ) before the dot.
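
The update in step 3 follows from rewriting the mean of n values in terms of the mean of the first n-1: mu_n = ((n-1)*mu_(n-1) + x_n) / n = mu_(n-1) + (x_n - mu_(n-1)) / n. As a quick sanity check, the sketch below runs that incremental update next to the naive sum/n in plain awk, with no dot and no HEAP involved. The function name welford_vs_naive is made up for this example.

```shell
# Compare the incremental (Welford-style) mean with the naive sum/n.
# Plain awk: uninitialized n, mu and sum start out as 0.
welford_vs_naive() {
  awk '
    { n++; mu += ($1 - mu) / n; sum += $1 }
    END { printf "welford=%.3f naive=%.3f\n", mu, sum / n }
  '
}

printf '10\n20\n30\n40\n50\n' | welford_vs_naive
```

Both columns come out as 30.000 on this input. The incremental form never holds the full sum, which is where its stability on long or large-valued columns comes from.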

Next

For the same idea but with bundled Num (mean+stdev) and Sym (mode+entropy) types, plus a polymorphic add() that dispatches on column type, see dotcols's stats walkthrough — per-column running stats over any CSV in ~25 lines.