Quick overview (expanded): data.lua — full theory mapped to code

1) Lua language essentials: syntax, locals vs globals, functions as first-class values, tables as arrays/records, pairs/ipairs iteration, and module return patterns.

2) Table manipulation and copying: shallow vs deep clone semantics and iterative construction of table-based records.

3) Random number generation in Lua: math.random, seeding (math.randomseed), and implications for reproducibility and RNG quality.

4) Triangular distribution: parameterization by (lo, mid, hi) and the sampling technique using two uniform samples to bias toward the mode.

5) Gaussian (normal) distribution generation: the Box–Muller transform (using log, cos, uniform RNG) to produce normally distributed samples.

6) Categorical (symbolic) sampling: drawing from a discrete distribution by walking the cumulative counts until a uniform draw is used up, with the mode as a fallback.

7) Online statistics (one-pass algorithms): Welford’s algorithm, which incrementally updates the mean (mu) and the running sum of squared deviations from the mean (m2), from which variance and standard deviation are derived.

8) Handling missing data: sentinel values (e.g., "?") that mark unknown cells, and whether aggregations propagate them or skip them.

9) Symbolic summaries: frequency tables, counting occurrences, computing the mode and total counts.

10) Composite data structures and recursive operations: DATA objects that contain columns, and recursive add/sample operations over nested columns.

11) Sampling and Monte Carlo methods: drawing many random samples, aggregating results, and using sampling to approximate distributional properties.

12) Sorting and quantile estimation: sorting samples and selecting indices to compute quintiles/five-number summaries.

13) Statistical summary functions: constructing and interpreting five-number/quintile summaries and simple tabular summaries.

14) Numerical stability and edge cases: handling small sample sizes (n < 2), zero variance cases, and guarding against negative m2 due to floating-point error.

15) Performance considerations: time complexity of repeated sampling and sorting, and memory allocation for large sample sets.

16) Common Lua idioms used here: constructor functions returning typed tables, lightweight type tagging via an "it" field, and a table of functions (the eg table) that holds small test examples.
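
Item 2 contrasts shallow and deep copies. The sketch below shows the difference on a nested table. The names shallowCopy and deepCopy are hypothetical and not taken from data.lua, and this version assumes tables without cycles or metatables.

```lua
-- Hypothetical helpers for illustration; data.lua may name or write these differently.

-- Copies only the top level: nested tables are shared with the original.
local function shallowCopy(t)
  local out = {}
  for k, v in pairs(t) do out[k] = v end
  return out
end

-- Copies every level recursively (assumes no cycles and ignores metatables).
local function deepCopy(t)
  if type(t) ~= "table" then return t end
  local out = {}
  for k, v in pairs(t) do out[k] = deepCopy(v) end
  return out
end

local original = {name = "age", stats = {n = 3}}
local shallow, deep = shallowCopy(original), deepCopy(original)
original.stats.n = 99
print(shallow.stats.n, deep.stats.n)  -- the shallow copy sees the change, the deep copy does not
```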
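
Item 4 mentions sampling a triangular distribution from two uniform draws. One known construction that fits that description blends the smaller and the larger of the two draws, weighted by where the mode sits in the range. This is a sketch under that assumption; the function name triangular is hypothetical and data.lua may use a different formula.

```lua
-- Hypothetical helper, not taken from data.lua.
-- Draws one sample from a triangular distribution on [lo, hi] with mode mid.
local function triangular(lo, mid, hi)
  if hi == lo then return lo end            -- degenerate range: nothing to sample
  local c = (mid - lo) / (hi - lo)          -- mode rescaled to [0, 1]
  local u, v = math.random(), math.random() -- two independent uniform draws
  local small, large = math.min(u, v), math.max(u, v)
  -- Blend the smaller and larger draw; c = 0.5 reduces to (u + v) / 2.
  return lo + (hi - lo) * ((1 - c) * small + c * large)
end

math.randomseed(1)
local count, sum = 20000, 0
for _ = 1, count do sum = sum + triangular(0, 2, 10) end
print(sum / count)  -- should land near the theoretical mean (0 + 2 + 10) / 3
```

With c = 0 the result is the minimum of the two draws (mass piled at lo) and with c = 1 it is the maximum (mass piled at hi), which is how the two samples bias the result toward the mode.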
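
Item 5 names the Box-Muller transform. A minimal sketch of the cosine form follows; the function name gaussian and its (mu, sd) parameters are assumptions for illustration. Because math.random() can return 0, the first draw is flipped to 1 - math.random() so that math.log never receives 0.

```lua
-- Hypothetical helper, not taken from data.lua.
-- Returns one normally distributed sample with mean mu and standard deviation sd.
local function gaussian(mu, sd)
  local u1 = 1 - math.random()  -- in (0, 1], keeps math.log finite
  local u2 = math.random()      -- in [0, 1)
  local z = math.sqrt(-2 * math.log(u1)) * math.cos(2 * math.pi * u2)
  return mu + sd * z            -- shift and scale the standard normal draw
end

math.randomseed(1)
for _ = 1, 3 do print(gaussian(10, 2)) end
```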
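
Items 6 and 9 describe symbolic summaries and sampling from them. The sketch below keeps a frequency table, tracks the mode, and samples by subtracting counts from a scaled uniform draw. The names SYM, symAdd, and symSample, the field names, and the use of a string for the "it" tag are all assumptions, not the actual data.lua definitions.

```lua
-- Hypothetical symbolic summary; field names are illustrative only.
local function SYM()
  return {it = "SYM", n = 0, has = {}, mode = nil, most = 0}
end

-- Counts one value, skipping the "?" missing-value sentinel.
local function symAdd(sym, x)
  if x == "?" then return sym end
  sym.n = sym.n + 1
  sym.has[x] = (sym.has[x] or 0) + 1
  if sym.has[x] > sym.most then sym.most, sym.mode = sym.has[x], x end
  return sym
end

-- Samples a value with probability proportional to its count.
local function symSample(sym)
  local r = math.random() * sym.n     -- uniform in [0, n)
  for value, count in pairs(sym.has) do
    r = r - count                     -- walk the cumulative counts
    if r <= 0 then return value end
  end
  return sym.mode                     -- fallback if the counts and n disagree
end

local sym = SYM()
for _, x in ipairs({"a", "a", "b", "?", "a", "c"}) do symAdd(sym, x) end
print(sym.n, sym.mode, symSample(sym))
```

The iteration order of pairs is unspecified, but that does not change the sampling probabilities, since each value still owns a slice of [0, n) equal to its count.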
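
Items 7, 8, and 14 fit together in a single update function: Welford's one-pass update, skipping the missing-value sentinel, and guarding the small-n and negative-m2 cases. The following is a sketch assuming a record with fields n, mu, m2, and sd; the names NUM and numAdd are hypothetical.

```lua
-- Hypothetical numeric summary; field names follow the outline (n, mu, m2, sd).
local function NUM()
  return {it = "NUM", n = 0, mu = 0, m2 = 0, sd = 0}
end

-- Welford's update: one pass, no stored samples.
local function numAdd(num, x)
  if x == "?" then return num end             -- skip missing values
  num.n = num.n + 1
  local delta = x - num.mu                    -- distance from the old mean
  num.mu = num.mu + delta / num.n             -- new mean
  num.m2 = num.m2 + delta * (x - num.mu)      -- uses both old and new mean
  -- Fewer than two samples has no spread; clamp m2 against float round-off.
  num.sd = num.n < 2 and 0 or math.sqrt(math.max(0, num.m2) / (num.n - 1))
  return num
end

local num = NUM()
for _, x in ipairs({2, 4, 4, "?", 4, 5, 5, 7, 9}) do numAdd(num, x) end
print(num.n, num.mu, num.sd)
```

For the eight numeric values above, the mean is 5 and the sample standard deviation is sqrt(32 / 7), roughly 2.138.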
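
Item 12 describes estimating quantiles by sorting samples and reading off selected positions. A minimal sketch follows, assuming the five positions are the 10th, 30th, 50th, 70th, and 90th percentiles (the midpoints of the five quintile bins); data.lua may choose different fractions or a different rounding rule. The name quintiles is hypothetical.

```lua
-- Hypothetical helper, not taken from data.lua.
-- Returns five values read from a sorted copy of the input list.
local function quintiles(values)
  local sorted = {}
  for i, v in ipairs(values) do sorted[i] = v end  -- copy so the caller's list is untouched
  table.sort(sorted)
  local n, out = #sorted, {}
  if n == 0 then return out end                    -- nothing to summarize
  for i, p in ipairs({0.1, 0.3, 0.5, 0.7, 0.9}) do
    -- Round p * n to the nearest position and clamp it into 1..n.
    local index = math.max(1, math.min(n, math.floor(p * n + 0.5)))
    out[i] = sorted[index]
  end
  return out
end

local samples = {}
for i = 1, 100 do samples[i] = 101 - i end         -- 100 down to 1
print(table.concat(quintiles(samples), " "))
```

Sorting costs O(n log n) per summary, which is the dominant cost item 15 refers to when many sample sets are summarized.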