17 smoke tests for dotcols. They verify that dot's runtime is bundled correctly and exercise Num, Sym, Data, dispatch, and the bundled stats demo.
cd dotcols
./tests/test.sh
# All 17 tests passed.
contains "build.sh runs" "built dotcols" "$out"
contains "no args -> usage" "Usage:" "$($DC 2>&1)"
contains "--help has --demo" "--demo" "$($DC --help)"
contains "--help has --get-data" "--get-data" "$($DC --help)"
contains "--demos lists stats" "stats" "$($DC --demos)"
Asserts: binary builds; CLI dispatcher is wired (every -- branch reachable); --demos auto-discovers demos/stats/.
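The contains and eq helpers used throughout are not shown in this section. A minimal sketch of how such helpers could look, assuming the argument order (label, expected or needle, actual or haystack) seen in the calls above; the counters, the output format, and the sample strings are hypothetical, and the real definitions in tests/test.sh may differ:

```shell
#!/bin/sh
# Hypothetical helpers: the real ones live in tests/test.sh and may differ.
pass=0; fail=0

# eq LABEL EXPECTED ACTUAL: pass when the two strings are identical.
eq() {
  if [ "$2" = "$3" ]; then pass=$((pass+1)); echo "ok   $1"
  else fail=$((fail+1)); echo "FAIL $1: expected '$2', got '$3'"; fi
}

# contains LABEL NEEDLE HAYSTACK: pass when NEEDLE occurs anywhere in HAYSTACK.
contains() {
  case "$3" in
    *"$2"*) pass=$((pass+1)); echo "ok   $1" ;;
    *)      fail=$((fail+1)); echo "FAIL $1: '$2' not found" ;;
  esac
}

# Sample calls with made-up strings (not real dotcols output).
contains "usage shown" "Usage:" "Usage: dotcols [options]"
eq "exact match" "7" "7"
contains "missing flag" "--demo" "no flags here"
echo "$pass passed, $fail failed"
```

Substring matching keeps the CLI checks loose on purpose: they only care that a flag or keyword appears, not about the exact layout of the help text.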
dotcols's binary embeds dot's lib/*.awk. These tests prove it.
eq "new + .field still works" "7" "$(run_inline 'BEGIN{N=new("plain"); .N.n=7; printf "%d\n", .N.n}')"
eq "o(5.0) still collapses" "5" "$(run_inline 'BEGIN{o(5.0); print ""}')"
Asserts: new(), .field sugar, and the printer o all work inside dotcols. If the build forgot to include dot's libs, these fail with "function not defined".
eq "num via add()" "30 15.811" "$(run_inline '
BEGIN { N = new("num")
add(N,10,1); add(N,20,1); add(N,30,1); add(N,40,1); add(N,50,1)
printf "%d %.3f\n", mid(N), var(N) }')"
eq "num mid alias" "30" "$(run_inline '
BEGIN { N = new("num")
for (i=1;i<=5;i++) add(N,i*10,1)
printf "%d\n", mid(N) }')"
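The expected values in the two Num tests above can be checked by hand. The sketch below uses plain awk with no dotcols code. It assumes, from the expected string alone and not from lib/numsym.awk, that var(N) reports the sample standard deviation (n-1 denominator); the population form would give 14.142 instead.

```shell
# Plain awk, no dotcols involved: mean and sample standard deviation of 10..50.
stats=$(awk 'BEGIN {
  n = 5
  for (i = 1; i <= n; i++) { x = i * 10; sum += x; sumsq += x * x }
  mean = sum / n
  sd = sqrt((sumsq - n * mean * mean) / (n - 1))   # n-1: sample, not population
  printf "%d %.3f\n", mean, sd
}')
echo "$stats"   # 30 15.811
```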
Asserts: adding 10, 20, 30, 40, 50 gives mid(N) = 30 and var(N) = 15.811; mid(N) returns the same mean in both tests, through polymorphic dispatch from mid → num_mid via .N.is.
eq "sym mode" "b" "$(run_inline 'BEGIN{S=new("sym"); add(S,"a",1); add(S,"b",1); add(S,"b",1); printf "%s\n", mid(S)}')"
eq "sym entropy>0" "yes" "$(run_inline 'BEGIN{S=new("sym"); add(S,"a",1); add(S,"b",1); printf "%s\n", (var(S)>0?"yes":"no")}')"
Asserts: Sym's mode picks the most-frequent key (b appeared twice); entropy is positive when the distribution isn't degenerate. Same dispatch shape as Num — same add / mid / var calls, different per-type implementations.
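To make the two Sym quantities concrete, here is a sketch in plain awk that computes the mode and the entropy of a, b, b. It assumes Shannon entropy in bits; the source only says var(S) is an entropy that is positive for a non-degenerate distribution, so the log base used by dotcols is a guess.

```shell
# Plain awk, no dotcols involved: mode and Shannon entropy (bits) of a, b, b.
sym=$(printf 'a\nb\nb\n' | awk '
  { count[$1]++; n++ }
  END {
    for (k in count) {
      if (count[k] > best) { best = count[k]; mode = k }   # most frequent key
      p = count[k] / n
      ent -= p * log(p) / log(2)                           # -sum p*log2(p)
    }
    printf "%s %.3f\n", mode, ent
  }')
echo "$sym"   # b 0.918
```

With a single distinct symbol the sum collapses to 0, which is why the entropy test needs at least two different keys.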
eq "data ingests rows" "303" "$(run_inline '
BEGIN { d = new("data"); FS = " *, *" }
NR==1 { data_head(d, $0); next }
{ data_read(d, 1) }
END { printf "%d\n", .d.nrows }
' demos/stats/sample.csv)"
Asserts: Data parses the heart-disease CSV header (14 columns, with num! as the classification target) and ingests all 303 data rows. .d.nrows is updated in place by data_read. A cell routed to a wrong-typed column would crash the run before the count is printed.
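The row count excludes the header because the NR==1 rule consumes it and calls next. A sketch of the same header/row split in plain awk, using the field separator from the test on a tiny inline CSV; the column name sex and all cell values are made up for illustration, and the internals of data_head and data_read are not modeled:

```shell
# Plain awk, no dotcols involved. FS " *, *" splits on commas and
# swallows the spaces around them, so "67 , 1" yields fields 67 and 1.
rows=$(printf 'AGE, sex, num!\n63, 1, 0\n67 , 1, 1\n41,0,0\n' | awk '
  BEGIN { FS = " *, *" }
  NR == 1 { ncols = NF; next }   # header: count columns, do not count as a row
  { nrows++ }                    # every later line is a data row
  END { printf "%d %d\n", ncols, nrows }')
echo "$rows"   # 3 3
```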
out=$($DC --demo stats)
contains "stats demo header" "column" "$out"
contains "stats demo body" "AGE" "$out"
contains "stats demo bottom" "num!" "$out"
Asserts: the bundled demo produces the expected per-column summary. First line has "column" header, body includes "AGE" (an UPPER numeric column → mean+stdev), bottom has "num!" (the classification target). End-to-end smoke for: dispatcher → build_gawk_args → preprocess + load → gawk runs over sample.csv → output.
out=$($DC --demo nonesuch 2>&1 || true)
contains "missing demo errors" "no such demo" "$out"
Asserts: bad demo name fails with a readable error, not a gawk crash.
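The 2>&1 || true in that test matters for two reasons: the error message presumably goes to stderr, and a non-zero exit inside a command substitution would abort a script running under set -e. Whether tests/test.sh actually uses set -e is an assumption. In the sketch below, fake_dc is a hypothetical stand-in for $DC, and everything in its message beyond "no such demo" is invented:

```shell
# fake_dc is a hypothetical stand-in for $DC; its message text is invented.
fake_dc() {
  echo "no such demo: $2" >&2   # error goes to stderr
  return 1                      # and the command fails
}

# 2>&1 folds stderr into the captured text; || true forces exit status 0,
# so the assignment cannot stop a script that runs under set -e.
out=$(fake_dc --demo nonesuch 2>&1 || true)
status=$?
echo "captured: $out"
echo "status: $status"
```

The captured text can then be checked with contains, which is what separates a readable error from a raw gawk failure.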
If you change anything in lib/numsym.awk or lib/data.awk, run ./tests/test.sh. Two minutes round-trip.