Once upon a time (I think it was Thursday) I lamented to Claude Code that gawk can't return arrays. It suggested a global-heap trick: functions return the integer id of an instance stored on an external heap. In theory, horrible. In practice, surprisingly useful.
So we tried it. After a few hours porting some old code to this objects-on-an-external-heap idea, the result was a tiny gawk system (the runtime started at a dozen lines; the preprocessor was three) that, as promised, was surprisingly useful.
Here, let me show you…
dot is one preprocessor, a 9-line runtime (new, arr, zap), and a small helper lib (rogues, o, _oo).
Pipe any source through dot and gawk sees normal awk.
Build polymorphic types in awk without writing HEAP[it]["field"] a hundred times.
In short: .it.n → HEAP[it]["n"].

Per-column running mean of any whitespace-separated table. One object, three lines of dot syntax:
BEGIN { x = new("plain") }
{ for(c=1;c<=NF;c++) { .x.n[c]++; .x.mu[c] += ($c-.x.mu[c])/.x.n[c] } }
END { for(c=1;c<=NF;c++) print "col", c, .x.mu[c] }
$ printf "10 1 100\n20 2 200\n30 3 300\n40 4 400\n50 5 500\n" | dot tiny.awk
col 1 30
col 2 3
col 3 300
Why new("plain")? It allocates one object id, so .x.n[c] and .x.mu[c]
preprocess into HEAP[x]["n"][c] and HEAP[x]["mu"][c]. Without it, x would be
unset (an empty subscript), and every object held in an unset variable would collide on the same HEAP slot.
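To make the expansion concrete, here is roughly what gawk might see once tiny.awk has been preprocessed. The allocator is a hypothetical stand-in: the names heap_new and HEAP_N and the "is" field are assumptions for illustration, not dot's actual runtime. Treat this as a sketch, not the real output of dot.

```awk
# Hypothetical stand-in for the runtime's new(); names are illustrative only.
function heap_new(kind,    id) {
  id = ++HEAP_N            # hand out the next unused integer id
  HEAP[id]["is"] = kind    # storing a field creates the HEAP[id] subarray
  return id                # callers keep the id, not the array
}

BEGIN { x = heap_new("plain") }

# .x.n[c] and .x.mu[c] written out by hand
{ for (c = 1; c <= NF; c++) {
    HEAP[x]["n"][c]++
    HEAP[x]["mu"][c] += ($c - HEAP[x]["mu"][c]) / HEAP[x]["n"][c] } }

END { for (c = 1; c <= NF; c++) print "col", c, HEAP[x]["mu"][c] }
```

Saved under a name of your choosing, this runs with plain gawk -f on the same five-row input and should print the same per-column means. Because each call to heap_new returns a fresh id, a second object gets its own HEAP slot and its fields cannot overwrite the first one's.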
A leading dot means object access: .it.n → HEAP[it]["n"], and bare .x → HEAP[x]. Written out by hand, a running-statistics update looks like this:
function num_add(it, x, d) {
HEAP[it]["n"]++
d = x - HEAP[it]["mu"]
HEAP[it]["mu"] += d / HEAP[it]["n"]
HEAP[it]["m2"] += d * (x - HEAP[it]["mu"]) }
With dot syntax, the same function becomes:
function num_add(it, x, d) {
.it.n++
d = x - .it.mu
.it.mu += d / .it.n
.it.m2 += d * (x - .it.mu) }
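The rewrite between those two versions is mechanical enough to sketch with two gensub calls. This is not dot's actual preprocessor. The function name dot_expand is hypothetical, and the sketch assumes dotted names never appear inside string literals, regexes, or comments, which a real tool would have to skip.

```awk
# Sketch only: not dot's actual preprocessor. dot_expand is a hypothetical name.
# It does not skip string literals, regexes, or comments.
function dot_expand(line) {
  # .obj.field -> HEAP[obj]["field"]; the dot must not follow an identifier char
  line = gensub(/(^|[^A-Za-z0-9_])\.([A-Za-z_][A-Za-z0-9_]*)\.([A-Za-z_][A-Za-z0-9_]*)/,
                "\\1HEAP[\\2][\"\\3\"]", "g", line)
  # bare .obj -> HEAP[obj]
  line = gensub(/(^|[^A-Za-z0-9_])\.([A-Za-z_][A-Za-z0-9_]*)/,
                "\\1HEAP[\\2]", "g", line)
  return line
}

# Rewrite every input line and pass it on.
{ print dot_expand($0) }
```

Fed the line .it.mu += d / .it.n, this prints HEAP[it]["mu"] += d / HEAP[it]["n"]. Requiring a letter or underscore after the dot, and no identifier character before it, leaves numeric literals such as 3.14 untouched.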