Zero to running a polymorphic gawk program. Real prompts, real output, copy-pastable.

Object sugar for gawk.

Once upon a time (I think it was Thursday) I lamented to Claude Code that gawk can't return arrays. It suggested a global-heap trick: functions return the integer id of an instance stored on an external heap. In theory, horrible. In practice, surprisingly useful.

So we tried it. After a few hours porting some old code to this objects-on-an-external-heap idea, the result was a tiny gawk system (the runtime started at a dozen lines; the preprocessor was three) that, as promised, was surprisingly useful.

Here, let me show you…

dot is one preprocessor and a 9-line runtime. Pipe any source through dot and gawk sees normal awk. Build polymorphic types in awk without writing HEAP[it]["field"] a hundred times.

Smallest dot program I know

Per-column running mean of any whitespace-separated table. One object, three lines of dot syntax:

BEGIN { x = new("plain") }
      { for(c=1;c<=NF;c++) { .x.n[c]++; .x.mu[c] += ($c-.x.mu[c])/.x.n[c] } }
END   { for(c=1;c<=NF;c++) print "col", c, .x.mu[c] }

$ printf "10 1 100\n20 2 200\n30 3 300\n40 4 400\n50 5 500\n" | dot tiny.awk
col 1 30
col 2 3
col 3 300

Why new("plain")? It allocates one object id, so .x.n[c] and .x.mu[c] preprocess into HEAP[x]["n"][c] and HEAP[x]["mu"][c] under a slot of their own. Without it, x would be unset, HEAP[x] would mean HEAP[""], and every other uninitialised object variable would collide on that same HEAP slot.

Syntax sample

A leading dot means object access. .it.n → HEAP[it]["n"]; bare .x → HEAP[x].

Plain gawk

# Welford's online update: n = count, mu = running mean, m2 = sum of squared deviations
function num_add(it, x,    d) {
  HEAP[it]["n"]++
  d               = x - HEAP[it]["mu"]
  HEAP[it]["mu"] += d / HEAP[it]["n"]
  HEAP[it]["m2"] += d * (x - HEAP[it]["mu"]) }

dot

# Welford's online update: n = count, mu = running mean, m2 = sum of squared deviations
function num_add(it, x,    d) {
  .it.n++
  d       = x - .it.mu
  .it.mu += d / .it.n
  .it.m2 += d * (x - .it.mu) }

Install · Tutorial · Hello example · Manual · Tests

Need column types? dotcols. ML? dotlearn.