One self-contained bash binary. Bundles dot's runtime + dotcols's column types + Num/Sym/Data + the stats demo.
| Tool | Version | Why |
|---|---|---|
| gawk | ≥ 4.x | indirect calls (@fn), FUNCTAB, asorti |
| bash | ≥ 4.x | arrays, process substitution <(...) |
| curl | any | --get-data fetcher |
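
The requirements in the table can be checked before installing. The following is a minimal sketch, not part of dotcols: `major_of` and `check_tool` are hypothetical helper names, and the check assumes each tool prints its version on the first line of `--version` output. It compares major versions only.

```shell
#!/usr/bin/env bash
# Sketch only: major_of and check_tool are hypothetical helpers, not dotcols features.

# Print the first integer found in a version banner, e.g. "GNU Awk 5.1.0" -> 5.
major_of() {
  printf '%s\n' "$1" | grep -oE '[0-9]+' | head -n 1
}

# check_tool NAME MIN_MAJOR: report ok / too old / missing for one tool.
check_tool() {
  local name=$1 min=$2 banner major
  if ! command -v "$name" >/dev/null 2>&1; then
    echo "$name: missing"
    return 1
  fi
  # Assumption: the first line of --version carries the version number.
  banner=$("$name" --version 2>/dev/null | head -n 1)
  major=$(major_of "$banner")
  if [ "${major:-0}" -ge "$min" ]; then
    echo "$name: ok ($banner)"
  else
    echo "$name: too old, need major version >= $min ($banner)"
    return 1
  fi
}

# Thresholds mirror the table; curl accepts any version, so its minimum is 0.
check_tool gawk 4 || true
check_tool bash 4 || true
check_tool curl 0 || true
```

A failed check prints a message instead of aborting, so all three tools are always reported.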
curl -sL https://raw.githubusercontent.com/timm/awk/master/dotcols/dotcols -o dotcols
chmod +x dotcols
Or install on PATH:
curl -sL https://raw.githubusercontent.com/timm/awk/master/dotcols/dotcols -o ~/.local/bin/dotcols
chmod +x ~/.local/bin/dotcols
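
The PATH install only works if `~/.local/bin` exists and is listed in `PATH`; `curl -o` does not create missing directories. A small sketch of that pre-check follows. `path_has` is a hypothetical helper name, and the suggested `export` line is one common way to extend `PATH`, not something dotcols requires.

```shell
#!/usr/bin/env bash
# Sketch only: path_has is a hypothetical helper, not part of dotcols.

# path_has PATHSTRING DIR: succeed if DIR is one of the colon-separated entries.
path_has() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

bin_dir="$HOME/.local/bin"

# Create the target directory first, since curl -o will not.
mkdir -p "$bin_dir" 2>/dev/null || echo "could not create $bin_dir"

if path_has "$PATH" "$bin_dir"; then
  echo "$bin_dir is already on PATH"
else
  echo "add this to your shell startup file: export PATH=\"$bin_dir:\$PATH\""
fi
```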
One command. The bundled stats demo runs against its sample CSV (heart-disease, 14 columns):
./dotcols --demo stats | head -3
Expect:
column n mid spread
AGE 303 54.366 9.082
sex 303 male 0.624
Pulls 30 curated CSVs (10 classification, 10 regression, 10 optimization) from the moot data repo into ./data/:
./dotcols --get-data
Layout:
data/classify/ iris wine glass heart.c diabetes breast.w ionosphere soybean zoo sonar
data/regression/ housing autompg auto93 abalone cpu.act bodyfat machine.cpu cal.housing fishcatch servo
data/optimize/ Apache_AllMeasurements HSMGP_num SQL_AllMeasurements rs-6d-c3_obj1 rs-6d-c3_obj2 sol-6d-c2-obj1 SS-A SS-B SS-C SS-D
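
To confirm the fetch produced the layout above, the files in each directory can be counted. This is a sketch under one assumption: the fetched files carry a `.csv` extension, as the example paths in this README suggest. `count_csvs` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: count_csvs is a hypothetical helper, not part of dotcols.

# count_csvs DIR: print how many *.csv files sit directly inside DIR
# (prints 0 when DIR does not exist).
count_csvs() {
  local dir=$1
  find "$dir" -maxdepth 1 -type f -name '*.csv' 2>/dev/null | wc -l | tr -d ' '
}

# Each group should hold 10 files if --get-data completed.
for group in classify regression optimize; do
  echo "$group: $(count_csvs "data/$group") CSV files (expected 10)"
done
```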
dotcols FILE.awk [DATA...] # run rewritten FILE.awk on DATA (or stdin)
dotcols a.awk b.awk DATA # multi-file run; .awk args go through prep
cat DATA | dotcols FILE.awk # stdin works too
dotcols -c FILE.awk # print rewritten source only
dotcols --demo NAME [DATA] # run demos/NAME/*.awk on DATA (or sample.*)
dotcols --demos # list available demos under ./demos/
dotcols --show # dump bundled lib/*.awk (post-prep)
dotcols --get-data # fetch 30 curated CSVs into ./data/
dotcols --help # full help
One command each. All three call --demo stats on a different CSV:
# classification: 4 numeric features + species class
dotcols --demo stats data/classify/iris.csv
# regression: Boston housing prices
dotcols --demo stats data/regression/housing.csv
# optimization: Apache web-server config tuning
dotcols --demo stats data/optimize/Apache_AllMeasurements.csv
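
The same command extends to a whole directory with a loop. The sketch below assumes `./dotcols --get-data` has already populated `./data/` and that `dotcols` is on `PATH`. `stats_all` and the `DOTCOLS` override variable are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: stats_all and DOTCOLS are hypothetical, not dotcols features.

# stats_all DIR: run the stats demo on every CSV directly inside DIR.
# DOTCOLS names the command to invoke and defaults to dotcols on PATH.
stats_all() {
  local dir=$1 csv
  for csv in "$dir"/*.csv; do
    [ -e "$csv" ] || continue   # skip the literal pattern when nothing matches
    echo "== $csv"
    "${DOTCOLS:-dotcols}" --demo stats "$csv"
  done
}

# Run over the classification set only if it has been fetched.
if [ -d data/classify ]; then
  stats_all data/classify
fi
```

Setting `DOTCOLS=./dotcols` points the loop at a local copy that is not on `PATH`.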
git clone https://github.com/timm/awk
cd awk/dotcols
./build.sh # rebuild the dotcols binary from sources
The repo holds the modular sources (lib/numsym.awk, lib/data.awk) plus build.sh, which bundles them with ../dot/lib/* into the single-file dotcols binary.