Streaming machine learning in plain gawk.

dotlearn sits on top of dot. Three apps, one shape: read CSV, build column stats, train, evaluate. No Python, no JVM, no dependencies.

tree

Decision / regression tree. Binary cuts, balance filter, info-gain splits. The same code does classification (symbolic y) and regression (numeric y).

dotlearn --demo tree
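To make "binary cuts, balance filter, info-gain splits" concrete, here is a minimal sketch in portable awk (it also runs under gawk, and uses none of the dot sugar). It is not dotlearn's code. The function name best_cut, the leaf parameter, and the toy data are hypothetical, and the balance filter is read here as "each side of a cut keeps at least leaf rows", which is an assumption. It handles one numeric x column and a symbolic y; for a numeric y one would swap entropy for variance.

```sh
# best_cut: read "x,y" rows on stdin, print the binary cut on x with most info gain.
# Hypothetical helper for illustration; not part of dotlearn.
best_cut() {
  sort -t, -k1,1n | awk -F, -v leaf=2 '
    function ent(cnt, n,    k, p, e) {          # entropy (bits) of a count table
      for (k in cnt) if (cnt[k] > 0) { p = cnt[k] / n; e -= p * log(p) / log(2) }
      return e
    }
    { x[NR] = $1; y[NR] = $2; right[$2]++ }     # everything starts on the right
    END {
      n = NR; all = ent(right, n); lo = all
      for (i = 1; i < n; i++) {
        left[y[i]]++; right[y[i]]--             # move row i to the left side
        if (i < leaf || n - i < leaf) continue  # balance filter (assumed meaning)
        if (x[i] == x[i+1]) continue            # cannot cut between equal values
        e = (i * ent(left, i) + (n - i) * ent(right, n - i)) / n
        if (e < lo) { lo = e; cut = x[i] }      # keep the lowest weighted entropy
      }
      if (cut != "") printf "x <= %s gain %.2f\n", cut, all - lo
    }'
}

best_cut <<'EOF'
4,b
1,a
7,a
2,a
6,b
3,a
8,b
5,b
EOF
# prints: x <= 3 gain 0.55
```

A full tree would apply the same search to every column, keep the best cut, and recurse on each side until no cut passes the filter.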

nb

Naive Bayes classifier. Per-class column stats, m-estimate / k-estimate smoothing, log-likelihood prediction. Streaming: train then test in one pass.

dotlearn --demo nb
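The smoothing and the one-pass loop can be sketched in portable awk as follows. This is an illustration under stated assumptions, not dotlearn's implementation: the name nb_stream, the parameters k, m and wait, and the toy data are all hypothetical. It handles symbolic columns only, takes the last column as the class, and reads "one pass" as: after wait warm-up rows, predict each row from the counts so far, then add it to the counts. The k-estimate smooths the class prior and the m-estimate smooths each per-column likelihood.

```sh
# nb_stream: read a CSV with a header on stdin, report streaming accuracy.
# Hypothetical helper for illustration; not part of dotlearn.
nb_stream() {
  awk -F, -v k=1 -v m=2 -v wait=4 '
    function like(c,    i, prior, out) {        # log-likelihood of this row under c
      prior = (n[c] + k) / (rows + k * nclasses)               # k-estimate
      out = log(prior)
      for (i = 1; i < NF; i++)                                 # m-estimate
        out += log((seen[c, i, $i] + m * prior) / (n[c] + m))
      return out
    }
    NR == 1 { next }                            # skip the header row
    rows >= wait {                              # test first, using counts so far
      best = ""; most = -1e30
      for (c in n) { l = like(c); if (l > most) { most = l; best = c } }
      tried++; if (best == $NF) ok++
    }
    {                                           # then train on the same row
      if (!($NF in n)) nclasses++
      n[$NF]++; rows++
      for (i = 1; i < NF; i++) seen[$NF, i, $i]++
    }
    END { printf "%d/%d correct\n", ok, tried }'
}

nb_stream <<'EOF'
outlook,windy,play
sunny,no,yes
sunny,no,yes
rainy,yes,no
rainy,yes,no
sunny,no,yes
rainy,yes,no
sunny,no,yes
rainy,no,no
EOF
# prints: 3/4 correct
```

The last row is misclassified because it mixes one value typical of each class; summing logs rather than multiplying probabilities keeps the comparison stable when there are many columns.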

acquire

Active learning. Shuffle the rows and split them half/half into train and test. Warm-start with 4 labels, then acquire 50 more by centroid distance (best vs rest), for 54 labels in total. Train a tree on the labelled rows. Predict the test set. The top-5 picks are scored against full-data percentiles. On auto93: ~95 wins with 14% of data labelled.

dotlearn --demo acquire
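One plausible reading of "acquire by centroid distance (best vs rest)" is sketched below in portable awk. It is a guess at the idea, not dotlearn's code: the name pick_next, the row layout (a best / rest / ? tag followed by numeric features), and the scoring rule are assumptions. Each unlabelled row is scored by its distance to the rest centroid minus its distance to the best centroid, and the highest-scoring row is the next one to label. Real data would need its columns normalized first, and both centroids must already hold at least one row (which the warm start provides).

```sh
# pick_next: read "tag,x1,x2,..." rows on stdin, print the "?" row to label next.
# Hypothetical helper for illustration; not part of dotlearn.
pick_next() {
  awk -F, '
    function dist(c, row,    i, f, d, nf) {     # Euclidean distance to centroid c
      nf = split(row, f, ",")
      for (i = 2; i <= nf; i++) d += (f[i] - sum[c, i] / n[c]) ^ 2
      return sqrt(d)
    }
    $1 == "?" { todo[++m] = $0; next }          # unlabelled candidate: keep for later
    { n[$1]++; for (i = 2; i <= NF; i++) sum[$1, i] += $i }   # labelled: grow centroid
    END {
      most = -1e30
      for (j = 1; j <= m; j++) {
        s = dist("rest", todo[j]) - dist("best", todo[j])     # far from rest, near best
        if (s > most) { most = s; pick = todo[j] }
      }
      print pick
    }'
}

pick_next <<'EOF'
best,1,1
best,2,1
rest,8,9
rest,9,8
rest,7,9
?,2,2
?,8,8
?,5,5
EOF
# prints: ?,2,2
```

Repeating this step 50 times, re-sorting the labelled rows into best and rest after each new label, would give an acquisition loop of the shape described above.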

First time? Start with the dot tour — dotlearn assumes you know new(), add(), and the .it.field sugar.