
Hill-climbing and Random Search — an intro for grad students

Learning goals

By the end you should be able to: state the idea behind random search and its memory-of-the-best loop; describe hill-climbing as a CEM-style loop (sample a batch, keep elites, refit, repeat); and map those abstract steps onto the concrete code in hillc.lua.

1) High-level idea (one paragraph)

We want to find inputs x that make an objective f(x) small (or large).

Random search draws many candidates at random and remembers the best.

Hill-climbing here uses a short loop: sample a batch of candidates, keep the best few, refit the sampler to them, and repeat.

(Aside: that simple loop is the Cross-Entropy Method (CEM) / estimation-of-distribution approach.)

2) Minimal CEM sampler (put this at the top of any lecture)

This tiny pseudocode shows the whole algorithm in plain steps. It covers only the simplest case: a model with a single variable.

μ := initial mean
σ2 := initial variance
repeat until convergence or max iterations:
  X := draw N samples from Normal(μ, σ2)
  score each x in X with f(x) = objective(x)
  elites := the Ne best-scoring (lowest f) points from X
  μ := mean(elites)
  σ2 := variance(elites)  (use small floor if zero)
end
return μ  -- mean of final distribution (best guess)

One-line intuition: “sample many → keep best → refit → repeat.”
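The pseudocode above can be sketched as runnable code. Here is a minimal Python version; the name `cem_minimize` and all parameter defaults are illustrative choices, not anything from hillc.lua:

```python
import math
import random
import statistics

def cem_minimize(f, mu=0.0, sigma2=1.0, n=50, n_elite=10, iters=30, floor=1e-9):
    """One-variable CEM: sample many, keep best, refit, repeat (minimizing f)."""
    for _ in range(iters):
        xs = [random.gauss(mu, math.sqrt(sigma2)) for _ in range(n)]  # draw N samples
        xs.sort(key=f)                                    # best (lowest f) first
        elites = xs[:n_elite]                             # keep the Ne best
        mu = statistics.mean(elites)                      # refit the mean
        sigma2 = max(statistics.variance(elites), floor)  # refit variance, with floor
    return mu                                             # mean of final distribution

random.seed(0)
print(round(cem_minimize(lambda x: (x - 3.0) ** 2), 2))   # converges near 3.0
```

On this toy quadratic the sampling distribution collapses onto the minimizer within a few dozen iterations; the variance floor keeps the refit step well-defined even once the elites nearly coincide.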

Small annotated example: one iteration (concrete numbers)
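One way to get concrete numbers is to run a single iteration with a fixed seed. The sketch below assumes a toy objective f(x) = x², a Normal(0, 1) start, N = 5 samples, and Ne = 2 elites; all of those choices are illustrative:

```python
import math
import random
import statistics

random.seed(1)
mu, sigma2 = 0.0, 1.0                   # start at Normal(0, 1)
f = lambda x: x * x                     # toy objective, minimum at x = 0

xs = [random.gauss(mu, math.sqrt(sigma2)) for _ in range(5)]  # N = 5 draws
xs.sort(key=f)                          # lowest f(x) first
elites = xs[:2]                         # keep the Ne = 2 best
mu, sigma2 = statistics.mean(elites), statistics.variance(elites)

print([round(x, 2) for x in xs])                              # the scored batch
print([round(x, 2) for x in elites], round(mu, 2), round(sigma2, 4))
```

Because the elites are the draws closest to 0, the refit mean moves toward the minimum and the refit variance shrinks below the starting variance of 1.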

3) hillc.lua: objective and model (short fragments)

Objective f(x) (toy polynomial). We minimize this.

local function f(x)
  return 1.61 + 2.1*x[1] - 3.5*x[2]^3 + 4*x[3]^3 - 5*x[4]^4
end

MODEL describes how to draw candidates. In data.lua a DATA is a vector of columns; NUM columns define Gaussians (mu, sd).

local function MODEL()
  return d.DATA({ d.NUM(100,1), d.NUM(20,5), d.NUM(10,4), d.NUM(3,2) })
end

Mapping: one-dimensional CEM = one d.NUM(mu, sd).

4) Random search (short explanation + code snippet)

What it does: sample repeatedly from the model and keep the best sample seen.

Code fragment:

local seen, best, out = d.NUM(), math.huge, nil  -- score stats, best score, best sample
for j = 1, 1000 do
  xs = d.sample(model0)                 -- draw one candidate from the fixed model
  y  = add(seen, f(xs))                 -- score it (add also records the score in seen)
  if y < best then best, out = y, xs end  -- remember the best sample so far
end

Plain English: keep drawing candidates from the same fixed prior; whenever a draw scores better than anything seen so far, remember it. The model itself never changes.

When to use: simple baselines, highly parallelizable, works okay for low dimension or wide priors.
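For comparison, here is a self-contained Python sketch of the same idea; the per-coordinate Gaussian priors and the toy objective are assumptions (hillc.lua instead calls d.sample on its own model):

```python
import random

random.seed(0)
f = lambda x: sum(xi * xi for xi in x)      # toy objective to minimize
priors = [(0.0, 2.0)] * 4                   # fixed (mu, sd) per coordinate

best, out = float("inf"), None
for _ in range(1000):
    xs = [random.gauss(mu, sd) for mu, sd in priors]  # sample from the fixed prior
    y = f(xs)
    if y < best:                            # keep only a memory of the best sample
        best, out = y, xs
print(round(best, 3))
```

Note that nothing feeds back into `priors`: every draw is independent, which is why random search parallelizes so well and why it needs many samples in high dimension.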

5) Hill-climbing implemented in hillc.lua (climb)

What it does: sample a batch, keep the promising ones (via a z-score rule), add those to a fresh model, return the new model.

Code fragment:

local seen, model1 = d.NUM(), d.clone(model0)  -- score stats; fresh copy of the model
for j = 1, 100 do
  xs = d.sample(model0)                 -- draw a candidate from the current model
  y  = add(seen, f(xs))                 -- score it, updating seen's mean and sd
  if (y - seen.mu) / seen.sd < -1 then  -- z-score rule: clearly better than average
    d.add(model1, xs)                   -- promising, so it shapes the next model
  end
end
return model1, seen

Step-by-step (student-friendly): clone the current model; draw 100 candidates from it; keep a running mean and standard deviation of their scores; any candidate whose score sits more than one standard deviation below the running mean (remember, lower is better) is added to the clone; return the clone as the next model.

How this maps to CEM: d.sample plays the role of drawing from Normal(μ, σ2); the z-score rule is the elite selection (a soft version of "keep the top Ne"); and d.add-ing those elites to model1 is the refit step. Calling climb repeatedly on its own output gives the full sample → keep → refit → repeat cycle.
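To see the correspondence concretely, here is a Python sketch of one climb step; the `Running` class is a stand-in for d.NUM's running mean and standard deviation, and all names here are illustrative:

```python
import math
import random

class Running:
    """Welford-style running mean / sd, standing in for d.NUM()."""
    def __init__(self):
        self.n, self.mu, self.m2 = 0, 0.0, 0.0
    def add(self, y):
        self.n += 1
        d = y - self.mu
        self.mu += d / self.n
        self.m2 += d * (y - self.mu)
        return y
    @property
    def sd(self):
        return max(math.sqrt(self.m2 / (self.n - 1)), 1e-12) if self.n > 1 else 1.0

def climb(sample, f, n=100):
    """One climb step: score a batch, keep samples clearly better than average."""
    seen, keep = Running(), []
    for _ in range(n):
        xs = sample()
        y = seen.add(f(xs))
        if (y - seen.mu) / seen.sd < -1:   # z-score rule, as in hillc.lua
            keep.append(xs)                # these would refit the next model
    return keep, seen

random.seed(0)
keep, seen = climb(lambda: [random.gauss(0, 2) for _ in range(4)],
                   lambda x: sum(xi * xi for xi in x))
print(len(keep), round(seen.mu, 1))
```

The kept samples correspond to what d.add pushes into model1; refitting a fresh sampler to them would complete one CEM iteration.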

6) Simple changes to make this robust (practical tips)

Floor the variance so the sampler never collapses to a point (the pseudocode above already does this). Smooth the refit by blending the old (μ, σ2) with the new elite estimates instead of replacing them outright. Run a few independent restarts from different initial means to hedge against bad starting points, and treat the batch size N and elite count Ne as knobs worth tuning.
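One common stabilizer is to smooth the refit and floor the variance so the sampler never fully collapses. A Python sketch, where the `refit` name and the `alpha` default are illustrative:

```python
import statistics

def refit(mu, sigma2, elites, alpha=0.7, floor=1e-6):
    """Smoothed refit: blend old and new parameters; floor the variance."""
    new_mu = statistics.mean(elites)
    new_s2 = statistics.variance(elites) if len(elites) > 1 else 0.0
    mu = (1 - alpha) * mu + alpha * new_mu                      # keep some memory
    sigma2 = max((1 - alpha) * sigma2 + alpha * new_s2, floor)  # never collapse
    return mu, sigma2

mu, s2 = refit(0.0, 1.0, [2.0, 2.2, 1.8])
print(round(mu, 2), round(s2, 4))   # prints 1.4 0.328
```

With alpha = 1 this reduces to the plain refit in the pseudocode; smaller alpha moves the sampler more cautiously, which helps when the elites are noisy.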

10) Summary

Random search is a one-shot global sampler with memory of the best sample. Hill-climbing here is a simple CEM-style loop: sample a batch, keep elites, refit the sampler, repeat — a practical middle ground between blind search and heavy-weight global optimizers.