Response surfaces and steepest ascent in two factors#
Source worksheets: yint.org/w10 and yint.org/w11 - weeks 10 and 11 of the applied DoE short course.
Module 6 finished the 1-D version of optimization: pick a factor, take small steps, fit a linear model, swap to a quadratic when the surface curves, predict the peak, confirm. Module 7 generalizes that loop to two factors, using the classical response-surface trio of factorial points, a center point to detect curvature, and axial points to fit a quadratic.
This module walks the loop end-to-end on a 2-factor process whose response surface is quadratic in both factors plus an interaction.
Tip
Two factors is the largest case you can see on a flat page. Use the contour plot as the main output of every step: it shows the direction of steepest ascent at a glance, and where the next experiment should go.
Q1 from w11 - picking sensible coded ranges#
The ecommerce team has narrowed eight factors down to two:
A = items in the menu/navigation bar (1 to 10)
B = products shown per web page (1 to 50)
The worksheet asks: what are reasonable starting values for the -1 and +1 levels?
The principle: pick a narrow initial range that you can change safely on the live site, centered near your current operating point. You can always widen later. A wide range invites curvature into the design region, and a linear model approximates a curving surface badly.
Solution
Sensible starting choices:
For A (menu items): center at the current site’s value, say 5, and use -1 = 4 and +1 = 6. A 20% perturbation is enough to learn the slope without surprising users.
For B (products per page): center at 12 (a common pagination length), with -1 = 8 and +1 = 16.
Run the 2x2 plus a center point. If the center is consistent with the linear fit, take a step in the direction of steepest ascent. If the center disagrees with the fit, the surface is already curving and you need axial points (CCD) right away.
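The coded-to-real mapping is worth writing down once. A minimal sketch (the `to_real` helper is illustrative, not part of `process_improve`) that lays out the five-run design in both unit systems, assuming the ranges chosen above:

```python
# Hypothetical helper: map a coded level back to real units for a factor
# with a given center and half-range.
def to_real(coded: float, center: float, half_range: float) -> float:
    """real = center + coded * half_range."""
    return center + coded * half_range

# The 2x2-plus-center design, in coded units.
coded_runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1), (0, 0)]

for a_c, b_c in coded_runs:
    a = to_real(a_c, center=5, half_range=1)    # A: menu items
    b = to_real(b_c, center=12, half_range=4)   # B: products per page
    print(f"coded ({a_c:+d}, {b_c:+d}) -> A = {a:g} items, B = {b:g} per page")
```

The same helper, run in reverse at the end of the study, converts a coded optimum back to settings you can actually deploy.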
Q2 - a 2-D optimization, end to end#
Below is a concrete 2-factor system. Think of two coded process inputs x1 and x2 (temperature and reaction time, for example) driving some yield-like response y. The hidden surface is
y(x1, x2) = 70 + 8*x1 + 4*x2
- 3*x1**2 - 2*x2**2
- 2*x1*x2
+ noise(0, 0.5)
The true peak sits at (x1, x2) = (1.2, 0.4) with response y = 75.6 - but the optimizer does not know that. Only the function call observe(x1, x2) is allowed, and each call returns a noisy measurement.
[1]:
import numpy as np
import plotly.graph_objects as go
import plotly.io as pio
from process_improve.experiments import c, gather, lm, predict
pio.renderers.default = "notebook_connected"
rng = np.random.default_rng(seed=7)
def observe(x1: float, x2: float) -> float:
"""Return a noisy measurement of the hidden 2-D surface."""
truth = (70 + 8 * x1 + 4 * x2
- 3 * x1 ** 2 - 2 * x2 ** 2
- 2 * x1 * x2)
return float(truth + rng.normal(scale=0.5))
Step 1: 2x2 factorial plus one center run#
Spend five runs first: the four corners of a 2x2 plus the center. Fit a linear-plus-interaction model and check whether the center’s measured response matches the model’s prediction at (0, 0). A big mismatch is the surface’s way of telling you it is curving.
[2]:
x1 = c(-1, +1, -1, +1, 0, name="x1")
x2 = c(-1, -1, +1, +1, 0, name="x2")
y_obs = [observe(a, b) for a, b in zip([-1, 1, -1, 1, 0], [-1, -1, 1, 1, 0], strict=True)]
y = c(*y_obs, name="y")
step1 = gather(x1=x1, x2=x2, y=y)
step1
[2]:
| | x1 | x2 | y |
|---|---|---|---|
| 1 | -1.0 | -1.0 | 51.000615 |
| 2 | 1.0 | -1.0 | 71.149373 |
| 3 | -1.0 | 1.0 | 62.862931 |
| 4 | 1.0 | 1.0 | 74.554704 |
| 5 | 0.0 | 0.0 | 69.772665 |
[3]:
m_lin = lm("y ~ x1 + x2 + x1:x2", step1)
print("Linear-with-interaction fit on 2x2+center:")
print(m_lin.get_parameters(drop_intercept=False).to_string())
centre_pred = float(predict(m_lin, x1=0, x2=0).iloc[0])
print()
print(f"Predicted center y at (0, 0): {centre_pred:.2f}")
print(f"Observed center y at (0, 0): {y_obs[4]:.2f}")
print(f"Difference: {y_obs[4] - centre_pred:+.2f}")
Linear-with-interaction fit on 2x2+center:
Intercept 65.868058
x1 7.960133
x2 3.816912
x1:x2 -2.114246
Predicted center y at (0, 0): 65.87
Observed center y at (0, 0): 69.77
Difference: +3.90
Solution
The model predicts ~65.9 at the center (the intercept), but the
center run measured ~69.8 - about four units above the linear
prediction. Four units is much larger than the measurement noise
(sigma about 0.5), so the surface is curving inside the design
region. The linear model is no longer enough; we need at least
a quadratic, which means adding axial points.
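The same conclusion falls out of the classical curvature check, which needs no model fit at all: at the corners x1^2 = x2^2 = 1, so the corner average picks up the pure quadratic terms b_11 + b_22, while the center run does not. The center-minus-corner-average gap therefore estimates -(b_11 + b_22). A sketch using the responses from the table above:

```python
import numpy as np

# Responses from the table above: the four corners, then the center run.
y_corners = np.array([51.000615, 71.149373, 62.862931, 74.554704])
y_center = 69.772665

# The gap estimates -(b11 + b22), which is 5 for the hidden surface.
gap = y_center - y_corners.mean()
print(f"Corner average: {y_corners.mean():.2f}")   # ~64.89
print(f"Center run:     {y_center:.2f}")           # 69.77
print(f"Curvature gap:  {gap:+.2f}")               # ~+4.88, >> sigma = 0.5
```

A gap roughly ten times the noise scale is unambiguous: curvature is real, and axial points are the next spend.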
Step 2: axial points -> central composite design (CCD)#
A central composite design adds four axial runs at (+/-alpha, 0) and (0, +/-alpha). The rotatable choice for two factors is alpha = sqrt(2) ~= 1.414.
Combining the original five runs with the four new ones gives a 9-run CCD that supports a full quadratic model y = b_0 + b_1 x_1 + b_2 x_2 + b_11 x_1^2 + b_22 x_2^2 + b_12 x_1 x_2.
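As a side note on where sqrt(2) comes from: a CCD is rotatable (the prediction variance depends only on distance from the center, not direction) when alpha equals the fourth root of the number of factorial corner runs, i.e. alpha = 2^(k/4) for k factors. A quick check:

```python
def rotatable_alpha(k: int) -> float:
    """Axial distance making a CCD rotatable: the fourth root of the
    number of factorial corner runs, (2**k) ** 0.25 for k factors."""
    return (2 ** k) ** 0.25

for k in (2, 3, 4):
    print(f"k = {k} factors: alpha = {rotatable_alpha(k):.3f}")
# k = 2 recovers sqrt(2) ~= 1.414, the value used in this module.
```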
[4]:
alpha = np.sqrt(2)
x1_axial = [-alpha, alpha, 0, 0]
x2_axial = [0, 0, -alpha, alpha]
y_axial = [observe(a, b) for a, b in zip(x1_axial, x2_axial, strict=True)]
x1_ccd = c(-1, +1, -1, +1, 0, *x1_axial, name="x1")
x2_ccd = c(-1, -1, +1, +1, 0, *x2_axial, name="x2")
y_ccd = c(*y_obs, *y_axial, name="y")
ccd = gather(x1=x1_ccd, x2=x2_ccd, y=y_ccd)
ccd
[4]:
| | x1 | x2 | y |
|---|---|---|---|
| 1 | -1.000000 | -1.000000 | 51.000615 |
| 2 | 1.000000 | -1.000000 | 71.149373 |
| 3 | -1.000000 | 1.000000 | 62.862931 |
| 4 | 1.000000 | 1.000000 | 74.554704 |
| 5 | 0.000000 | 0.000000 | 69.772665 |
| 6 | -1.414214 | 0.000000 | 52.190468 |
| 7 | 1.414214 | 0.000000 | 75.343780 |
| 8 | 0.000000 | -1.414214 | 61.013253 |
| 9 | 0.000000 | 1.414214 | 71.410751 |
[5]:
m_quad = lm("y ~ x1 + x2 + I(x1**2) + I(x2**2) + x1:x2", ccd)
print("Quadratic fit on the 9-run CCD:")
print(m_quad.get_parameters(drop_intercept=False).to_string())
Quadratic fit on the 9-run CCD:
Intercept 69.772665
x1 8.073032
x2 3.746491
I(x1 ** 2) -3.027185
I(x2 ** 2) -1.804746
x1:x2 -2.114246
Guidance
The fitted intercept (~69.8) now sits at the measured center
value, not the linear-model extrapolation. The quadratic
coefficients b_11 ~= -3.0 and b_22 ~= -1.8 confirm the
surface curves downward away from the center in both directions
- the classic peak shape. b_12 ~= -2.1 says the two factors
interact: pushing both high together gives diminishing returns.
Step 3: find the predicted optimum#
For the quadratic y = b_0 + b_1 x_1 + b_2 x_2 + b_11 x_1^2 + b_22 x_2^2 + b_12 x_1 x_2, the stationary point satisfies the gradient equation [2 b_11, b_12; b_12, 2 b_22] * x = -[b_1; b_2]. A two-by-two solve gives the predicted peak in closed form.
[6]:
p = m_quad.get_parameters(drop_intercept=False)
A = np.array([[2 * p["I(x1 ** 2)"], p["x1:x2"]],
[p["x1:x2"], 2 * p["I(x2 ** 2)"]]])
b = np.array([-p["x1"], -p["x2"]])
x_star = np.linalg.solve(A, b)
y_star = float(predict(m_quad, x1=x_star[0], x2=x_star[1]).iloc[0])
print(f"Predicted optimum: x1 = {x_star[0]:.3f}, x2 = {x_star[1]:.3f}")
print(f"Predicted response at the optimum: y = {y_star:.2f}")
Predicted optimum: x1 = 1.221, x2 = 0.323
Predicted response at the optimum: y = 75.30
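Solving gradient = 0 only locates a stationary point, which could be a maximum, a minimum, or a saddle. The eigenvalues of the Hessian [2 b_11, b_12; b_12, 2 b_22] settle it: all negative means a maximum. A check built from the fitted coefficients printed above:

```python
import numpy as np

# Hessian of the fitted quadratic, from the printed CCD coefficients.
H = np.array([[2 * -3.027185, -2.114246],
              [-2.114246, 2 * -1.804746]])
eigenvalues = np.linalg.eigvalsh(H)
print("Hessian eigenvalues:", np.round(eigenvalues, 3))
# Both eigenvalues are negative, so the stationary point is a maximum
# rather than a saddle or a minimum.
```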
[7]:
# Confirm the optimum with one more run.
y_confirm = observe(x_star[0], x_star[1])
print(f"Confirmation run at the predicted optimum: y = {y_confirm:.2f}")
print(f"Predicted: {y_star:.2f} | Observed: {y_confirm:.2f} | Diff: {y_confirm - y_star:+.2f}")
Confirmation run at the predicted optimum: y = 75.28
Predicted: 75.30 | Observed: 75.28 | Diff: -0.02
[8]:
# Contour plot of the fitted surface, overlaid with the runs.
grid = np.linspace(-1.8, 1.8, 50)
G1, G2 = np.meshgrid(grid, grid)
Z = (p["Intercept"]
+ p["x1"] * G1 + p["x2"] * G2
+ p["I(x1 ** 2)"] * G1 ** 2
+ p["I(x2 ** 2)"] * G2 ** 2
+ p["x1:x2"] * G1 * G2)
fig = go.Figure()
fig.add_trace(go.Contour(x=grid, y=grid, z=Z, contours_coloring="lines",
line_width=1, showscale=False))
fig.add_trace(go.Scatter(x=ccd["x1"], y=ccd["x2"], mode="markers",
marker={"size": 9, "color": "black"},
name="CCD runs"))
fig.add_trace(go.Scatter(x=[x_star[0]], y=[x_star[1]], mode="markers",
marker={"size": 14, "symbol": "star", "color": "red"},
name="Predicted optimum"))
fig.update_layout(width=560, height=480,
xaxis_title="x1 (coded)", yaxis_title="x2 (coded)",
title="Fitted response surface with predicted optimum")
fig
Solution
The fitted quadratic places the optimum near
(x1, x2) ~= (1.22, 0.33) with predicted response ~75.3.
The confirmation run at that point should land within a noise
width of the prediction.
When to stop. The quadratic surface has a single stationary point; the contour plot is concentric ellipses around it; and the confirmation run matches the prediction. All three signals agree - you are at the optimum of this model. Whether the optimum of the model is also the optimum of the true system is a separate question; a sensitivity analysis around the predicted point (run a couple of nearby points and confirm none beat it) is the final check.
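That final check can be sketched directly with observe(). The block below repeats the hidden surface so it runs standalone (in the notebook you would reuse observe and x_star from the cells above); the ring of four offsets at distance 0.2 is an arbitrary but reasonable choice:

```python
import numpy as np

rng = np.random.default_rng(seed=11)

def truth(x1: float, x2: float) -> float:
    """Noise-free hidden surface, repeated here so the sketch is standalone."""
    return 70 + 8 * x1 + 4 * x2 - 3 * x1 ** 2 - 2 * x2 ** 2 - 2 * x1 * x2

def observe(x1: float, x2: float) -> float:
    return float(truth(x1, x2) + rng.normal(scale=0.5))

# Probe a small ring around the predicted optimum (1.221, 0.323).
x_star = (1.221, 0.323)
y_star = observe(*x_star)
for dx1, dx2 in [(0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2)]:
    y_near = observe(x_star[0] + dx1, x_star[1] + dx2)
    print(f"offset ({dx1:+.1f}, {dx2:+.1f}): y = {y_near:.2f} "
          f"(optimum run: y = {y_star:.2f})")
# If no neighbour beats the optimum by more than the noise scale
# (sigma = 0.5), stop: the model optimum holds up on the real system.
```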
Check yourself
In coded units the predicted optimum sits at about
(1.22, 0.33). How would you convert that back to real-world
ecommerce parameters if the design used A: 4 to 6 for menu
items and B: 8 to 16 for products per page?
Solution
The real-world value for any factor is real = center + coded * half_range.
For A (menu items), center = 5, half-range = 1, so A* = 5 + 1.22 * 1 = 6.22 items. In practice you would round to 6 (since you cannot have fractional menu items) and run a final confirmation at A = 6.
For B (products per page), center = 12, half-range = 4, so B* = 12 + 0.33 * 4 = 13.3 products - round to 13.
Wrap-up#
The 2-D response-surface workflow in five steps:
Factorial + center point (5 runs): if the center disagrees with the linear fit, you have curvature.
Add axial points to make a CCD (9 runs total). Fit a full quadratic.
Solve the gradient = 0 equation for the predicted peak.
Confirm with one or two runs at and near the predicted peak.
Sensitivity check: explore a small region around the optimum to confirm no nearby point beats it.
Next: Module 8 wraps up the whole course - we collect the vocabulary in one place, map it back to the modules, and point to the parts of process_improve that handle each step.