Response surfaces and steepest ascent in two factors#

Source worksheets: yint.org/w10 and yint.org/w11 - weeks 10 and 11 of the applied DoE short course.

Module 6 finished the 1-D version of optimization: pick a factor, take small steps, fit a linear model, swap to a quadratic when the surface curves, predict the peak, confirm. Module 7 generalizes that loop to two factors, using the classical response-surface trio of factorial points, a center point to detect curvature, and axial points to fit a quadratic.

This module walks the loop end-to-end on a 2-factor process whose response surface is quadratic in both factors plus an interaction.

Tip

Two factors is the largest case you can see on a flat page. Use the contour plot as the main output of every step: it shows the direction of steepest ascent at a glance, and where the next experiment should go.

Q1 from w11 - picking sensible coded ranges#

The ecommerce team has narrowed eight factors down to two:

  • A = items in the menu/navigation bar (1 to 10)

  • B = products shown per web page (1 to 50)

The worksheet asks: what are reasonable starting values for the -1 and +1 levels?

The principle: pick a narrow initial range that you can change safely on the live site, centered near your current operating point. You can always widen later. Over a wide range, a straight-line model can badly misrepresent a curving surface.

Solution

Sensible starting choices:

  • For A (menu items): center at the current site’s value, say 5, and use -1 = 4 and +1 = 6. A 20% perturbation is enough to learn the slope without surprising users.

  • For B (products per page): center at 12 (a common pagination length), with -1 = 8 and +1 = 16.

Run the 2x2 plus a center point. If the center is consistent with the linear fit, take a step in the direction of steepest ascent. If the center disagrees with the fit, the surface is already curving and you need axial points (CCD) right away.

Q2 - a 2-D optimization, end to end#

Below is a concrete 2-factor system. Think of two coded process inputs x1 and x2 (temperature and reaction time, for example) driving some yield-like response y. The hidden surface is

y(x1, x2) = 70 + 8*x1 + 4*x2
            - 3*x1**2 - 2*x2**2
            - 2*x1*x2
            + noise(0, 0.5)

The true peak sits at (x1, x2) = (1.20, 0.40) with response y = 75.6 - but the optimizer does not know that. Only the function call observe(x1, x2) is allowed, and each call returns a noisy measurement.
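Since the hidden coefficients are written out above, the location of the noiseless peak can be verified directly: set the gradient to zero and solve the resulting 2x2 linear system (the same algebra Step 3 below applies to the fitted coefficients).

```python
import numpy as np

# Noiseless surface: y = 70 + 8*x1 + 4*x2 - 3*x1**2 - 2*x2**2 - 2*x1*x2
# Setting the gradient to zero gives the linear system  H @ x = -g,
# where H is the Hessian [[2*b11, b12], [b12, 2*b22]] and g the linear terms.
H = np.array([[-6.0, -2.0],
              [-2.0, -4.0]])
g = np.array([8.0, 4.0])
x_true = np.linalg.solve(H, -g)   # ~ (1.2, 0.4)
y_true = (70 + 8 * x_true[0] + 4 * x_true[1]
          - 3 * x_true[0] ** 2 - 2 * x_true[1] ** 2
          - 2 * x_true[0] * x_true[1])  # ~ 75.6
print(x_true, y_true)
```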

[1]:
import numpy as np
import plotly.graph_objects as go
import plotly.io as pio

from process_improve.experiments import c, gather, lm, predict

pio.renderers.default = "notebook_connected"

rng = np.random.default_rng(seed=7)


def observe(x1: float, x2: float) -> float:
    """Return a noisy measurement of the hidden 2-D surface."""
    truth = (70 + 8 * x1 + 4 * x2
             - 3 * x1 ** 2 - 2 * x2 ** 2
             - 2 * x1 * x2)
    return float(truth + rng.normal(scale=0.5))

Step 1: 2x2 factorial plus one center run#

Spend five runs first: the four corners of a 2x2 plus the center. Fit a linear-plus-interaction model and check whether the center’s measured response matches the model’s prediction at (0, 0). A big mismatch is the surface’s way of telling you it is curving.

[2]:
x1 = c(-1, +1, -1, +1, 0, name="x1")
x2 = c(-1, -1, +1, +1, 0, name="x2")
y_obs = [observe(a, b) for a, b in zip([-1, 1, -1, 1, 0], [-1, -1, 1, 1, 0], strict=True)]
y = c(*y_obs, name="y")
step1 = gather(x1=x1, x2=x2, y=y)
step1
[2]:
x1 x2 y
1 -1.0 -1.0 51.000615
2 1.0 -1.0 71.149373
3 -1.0 1.0 62.862931
4 1.0 1.0 74.554704
5 0.0 0.0 69.772665
[3]:
m_lin = lm("y ~ x1 + x2 + x1:x2", step1)
print("Linear-with-interaction fit on 2x2+center:")
print(m_lin.get_parameters(drop_intercept=False).to_string())
centre_pred = float(predict(m_lin, x1=0, x2=0).iloc[0])
print()
print(f"Predicted center y at (0, 0): {centre_pred:.2f}")
print(f"Observed center y at (0, 0):  {y_obs[4]:.2f}")
print(f"Difference: {y_obs[4] - centre_pred:+.2f}")
Linear-with-interaction fit on 2x2+center:
Intercept    65.868058
x1            7.960133
x2            3.816912
x1:x2        -2.114246

Predicted center y at (0, 0): 65.87
Observed center y at (0, 0):  69.77
Difference: +3.90

Solution

The linear model predicts ~65.9 at the center (its fitted intercept), but the center run measured ~69.8 - about four units above the prediction. Four units is much larger than the measurement noise (sigma about 0.5), so the surface is curving inside the design region. The linear model is no longer enough; we need at least a quadratic, which means adding axial points.
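The same conclusion follows from the textbook curvature check, which compares the center run to the plain average of the factorial corners (slightly different from the intercept comparison above, because the fitted intercept also absorbs the center run). A minimal sketch with the numbers from this design, assuming the noise sigma of 0.5 is known (normally it would be estimated from replicated center runs):

```python
import math

corners = [51.000615, 71.149373, 62.862931, 74.554704]  # y at the 2x2 corners
y_center = 69.772665                                    # y at (0, 0)
sigma = 0.5   # assumed known; usually estimated from replicates

# The curvature effect (center minus corner average) estimates -(b11 + b22).
curvature = y_center - sum(corners) / len(corners)
# Var(curvature) = sigma^2 * (1/n_center + 1/n_corners)
se = sigma * math.sqrt(1 / 1 + 1 / 4)
print(f"curvature effect = {curvature:.2f}, standard error = {se:.2f}")
# |curvature| is many standard errors from zero: the quadratic terms are real.
```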

Step 2: axial points -> central composite design (CCD)#

A central composite design adds four axial runs at (+/-alpha, 0) and (0, +/-alpha). The rotatable choice for two factors is alpha = sqrt(2) ~= 1.414.

Combining the original five runs with the four new ones gives a 9-run CCD that supports a full quadratic model y = b_0 + b_1 x_1 + b_2 x_2 + b_11 x_1^2 + b_22 x_2^2 + b_12 x_1 x_2.
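The rotatable alpha generalizes beyond two factors: alpha = F^(1/4), where F is the number of points in the factorial core. A quick sketch, assuming a full 2^k factorial core:

```python
def rotatable_alpha(k: int) -> float:
    """Axial distance that makes a CCD with a full 2**k factorial core rotatable."""
    return (2 ** k) ** 0.25

for k in (2, 3, 4):
    print(k, round(rotatable_alpha(k), 3))
# k=2 -> 1.414 (sqrt(2)), k=3 -> 1.682, k=4 -> 2.0
```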

[4]:
alpha = np.sqrt(2)
x1_axial = [-alpha, alpha, 0, 0]
x2_axial = [0, 0, -alpha, alpha]
y_axial = [observe(a, b) for a, b in zip(x1_axial, x2_axial, strict=True)]

x1_ccd = c(-1, +1, -1, +1, 0, *x1_axial, name="x1")
x2_ccd = c(-1, -1, +1, +1, 0, *x2_axial, name="x2")
y_ccd  = c(*y_obs, *y_axial, name="y")
ccd = gather(x1=x1_ccd, x2=x2_ccd, y=y_ccd)
ccd
[4]:
x1 x2 y
1 -1.000000 -1.000000 51.000615
2 1.000000 -1.000000 71.149373
3 -1.000000 1.000000 62.862931
4 1.000000 1.000000 74.554704
5 0.000000 0.000000 69.772665
6 -1.414214 0.000000 52.190468
7 1.414214 0.000000 75.343780
8 0.000000 -1.414214 61.013253
9 0.000000 1.414214 71.410751
[5]:
m_quad = lm("y ~ x1 + x2 + I(x1**2) + I(x2**2) + x1:x2", ccd)
print("Quadratic fit on the 9-run CCD:")
print(m_quad.get_parameters(drop_intercept=False).to_string())
Quadratic fit on the 9-run CCD:
Intercept     69.772665
x1             8.073032
x2             3.746491
I(x1 ** 2)    -3.027185
I(x2 ** 2)    -1.804746
x1:x2         -2.114246

Guidance

The fitted intercept (~69.8) now sits at the measured center value, not the linear-model extrapolation. The quadratic coefficients b_11 ~= -3.0 and b_22 ~= -1.8 confirm the surface curves downward away from the center in both directions - the classic peak shape. b_12 ~= -2.1 says the two factors interact: pushing both high together gives diminishing returns.
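Before solving for the stationary point, it is worth confirming the fitted surface really is a peak: the stationary point is a maximum only when the Hessian [[2 b11, b12], [b12, 2 b22]] is negative definite, i.e. both eigenvalues are negative. A quick check using the coefficients printed above:

```python
import numpy as np

b11, b22, b12 = -3.027185, -1.804746, -2.114246  # fitted quadratic coefficients
H = np.array([[2 * b11, b12],
              [b12, 2 * b22]])
eigvals = np.linalg.eigvalsh(H)
print(eigvals)
# Both eigenvalues negative -> the stationary point is a maximum, not a saddle.
```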

Step 3: find the predicted optimum#

For the quadratic y = b_0 + b_1 x_1 + b_2 x_2 + b_11 x_1^2 + b_22 x_2^2 + b_12 x_1 x_2, the stationary point satisfies the gradient equation [2 b_11, b_12; b_12, 2 b_22] * x = -[b_1; b_2]. A two-by-two solve gives the predicted peak in closed form.

[6]:
p = m_quad.get_parameters(drop_intercept=False)
A = np.array([[2 * p["I(x1 ** 2)"], p["x1:x2"]],
              [p["x1:x2"], 2 * p["I(x2 ** 2)"]]])
b = np.array([-p["x1"], -p["x2"]])
x_star = np.linalg.solve(A, b)
y_star = float(predict(m_quad, x1=x_star[0], x2=x_star[1]).iloc[0])
print(f"Predicted optimum: x1 = {x_star[0]:.3f}, x2 = {x_star[1]:.3f}")
print(f"Predicted response at the optimum: y = {y_star:.2f}")
Predicted optimum: x1 = 1.221, x2 = 0.323
Predicted response at the optimum: y = 75.30
[7]:
# Confirm the optimum with one more run.

y_confirm = observe(x_star[0], x_star[1])
print(f"Confirmation run at the predicted optimum: y = {y_confirm:.2f}")
print(f"Predicted: {y_star:.2f} | Observed: {y_confirm:.2f} | Diff: {y_confirm - y_star:+.2f}")
Confirmation run at the predicted optimum: y = 75.28
Predicted: 75.30 | Observed: 75.28 | Diff: -0.02
[8]:
# Contour plot of the fitted surface, overlaid with the runs.

grid = np.linspace(-1.8, 1.8, 50)
G1, G2 = np.meshgrid(grid, grid)
Z = (p["Intercept"]
     + p["x1"] * G1 + p["x2"] * G2
     + p["I(x1 ** 2)"] * G1 ** 2
     + p["I(x2 ** 2)"] * G2 ** 2
     + p["x1:x2"] * G1 * G2)

fig = go.Figure()
fig.add_trace(go.Contour(x=grid, y=grid, z=Z, contours_coloring="lines",
                         line_width=1, showscale=False))
fig.add_trace(go.Scatter(x=ccd["x1"], y=ccd["x2"], mode="markers",
                         marker={"size": 9, "color": "black"},
                         name="CCD runs"))
fig.add_trace(go.Scatter(x=[x_star[0]], y=[x_star[1]], mode="markers",
                         marker={"size": 14, "symbol": "star", "color": "red"},
                         name="Predicted optimum"))
fig.update_layout(width=560, height=480,
                  xaxis_title="x1 (coded)", yaxis_title="x2 (coded)",
                  title="Fitted response surface with predicted optimum")
fig

Solution

The fitted quadratic places the optimum near (x1, x2) ~= (1.22, 0.33) with predicted response ~75.3, close to the true maximum of 75.6. The confirmation run at that point should land within a noise width of the prediction.

When to stop. The quadratic surface has a single stationary point; the contour plot is concentric ellipses around it; and the confirmation run matches the prediction. All three signals agree - you are at the optimum of this model. Whether the optimum of the model is also the optimum of the true system is a separate question; a sensitivity analysis around the predicted point (run a couple of nearby points and confirm none beat it) is the final check.
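That final sensitivity check is easy to script: probe a small ring of points around the predicted optimum and see whether any of them measurably beats it. A minimal sketch - observe() and the seed are repeated so the snippet stands alone, and the 0.2 probe radius is a judgment call:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

def observe(x1: float, x2: float) -> float:
    """Noisy measurement of the hidden 2-D surface."""
    truth = (70 + 8 * x1 + 4 * x2
             - 3 * x1 ** 2 - 2 * x2 ** 2 - 2 * x1 * x2)
    return float(truth + rng.normal(scale=0.5))

x_star = np.array([1.221, 0.323])  # predicted optimum from the fitted quadratic
radius = 0.2                       # small probe radius around the optimum
angles = np.linspace(0, 2 * np.pi, 6, endpoint=False)
ring = [x_star + radius * np.array([np.cos(t), np.sin(t)]) for t in angles]

y_star = observe(*x_star)
y_ring = [observe(*pt) for pt in ring]
print(f"optimum: {y_star:.2f}, best neighbour: {max(y_ring):.2f}")
# If a neighbour beats the optimum by more than ~2*sigma, keep moving; otherwise stop.
```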

Check yourself

In coded units the predicted optimum sits at about (1.22, 0.33). How would you convert that back to real-world ecommerce parameters if the design used A: 4 to 6 for menu items and B: 8 to 16 for products per page?

Solution

The real-world value for any factor is real = center + coded * half_range.

  • For A (menu items), center = 5, half-range = 1, so A* = 5 + 1.22 * 1 = 6.22 items. In practice you would round to 6 (since you cannot have fractional menu items) and run a final confirmation at A = 6.

  • For B (products per page), center = 12, half-range = 4, so B* = 12 + 0.33 * 4 = 13.3 products - round to 13.
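A small helper makes the decoding explicit (the function name is just for illustration):

```python
def decode(coded: float, low: float, high: float) -> float:
    """Convert a coded value back to real units: real = center + coded * half_range."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

print(decode(1.22, low=4, high=6))    # A*: menu items -> 6.22, round to 6
print(decode(0.33, low=8, high=16))   # B*: products per page -> 13.32, round to 13
```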

Wrap-up#

The 2-D response-surface workflow in one paragraph:

  1. Factorial + center point (5 runs): if the center disagrees with the linear fit, you have curvature.

  2. Add axial points to make a CCD (9 runs total). Fit a full quadratic.

  3. Solve the gradient = 0 equation for the predicted peak.

  4. Confirm with one or two runs at and near the predicted peak.

  5. Sensitivity check: explore a small region around the optimum to confirm no nearby point beats it.

Next: Module 8 wraps up the whole course - we collect the vocabulary in one place, map it back to the modules, and point to the parts of process_improve that handle each step.