Experimental Strategy Recommendation#

Before running any experiments, the most important question is: “How should I plan my experimental program?” The recommend_strategy function answers this by generating a complete multi-stage experimental plan - screening, optimization, and confirmation - using deterministic decision rules from Montgomery, NIST, and the Stat-Ease SCOR framework.

The recommender is fully deterministic: identical inputs always produce identical outputs. There is no randomness and no LLM - just ~50 codified rules that encode best practices from the DOE literature.

When to Use This Tool#

Before your first experiment - plan the entire workflow upfront so that budget and time are spent efficiently.
When you have many candidate factors - the tool decides whether a screening stage is needed and which design to use.
When budget is limited - it allocates runs across stages to maximize information per experiment.
When working in a specialized domain - domain-specific templates (fermentation, cell culture, pharma, etc.) adjust design choices and center-point requirements automatically.

Concepts#

Multi-stage workflows#

Most experimental programs follow a three-stage sequence:

Screening - Identify the vital few factors from many candidates. Typical designs: Plackett-Burman, Definitive Screening Design (DSD), or fractional factorial.
Optimization - Fit a response surface model for the significant factors. Typical designs: Central Composite Design (CCD), Box-Behnken, or D-optimal.
Confirmation - Run replicates at the predicted optimum to verify that the model predictions hold.

Each stage has transition rules that tell you what to do next based on the results. For example, after screening:

0–1 significant factors found: broaden factor ranges or check the measurement system.
2–5 significant factors: proceed to optimization.
6+ significant factors: sub-group factors or run additional screening.
Curvature detected at center points: augment the factorial to a CCD.

Budget allocation#

When a budget is specified, runs are allocated across stages using the 25-40-55-15 framework (Montgomery / Stat-Ease):

Screening: 25–40% of the budget
Optimization: 40–55% of the budget
Confirmation: 5–15% of the budget

Domain templates can shift these weights. For example, fermentation allocates more to optimization (50%) because biological variability demands extra center points for reliable error estimation.

Quick Start#

A 7-factor fermentation problem with a budget of 40 runs:

from process_improve.experiments.factor import Factor, Response
from process_improve.experiments.strategy import recommend_strategy

factors = [
    Factor(name="Temperature", low=25, high=40, units="degC"),
    Factor(name="pH", low=5.0, high=7.5),
    Factor(name="Glucose", low=10, high=50, units="g/L"),
    Factor(name="Yeast extract", low=1, high=10, units="g/L"),
    Factor(name="Agitation", low=100, high=400, units="rpm"),
    Factor(name="Aeration", low=0.5, high=2.0, units="vvm"),
    Factor(name="Inoculum", low=2, high=10, units="%v/v"),
]
responses = [Response(name="Yield", goal="maximize", units="g/L")]

strategy = recommend_strategy(
    factors=factors,
    responses=responses,
    budget=40,
    domain="fermentation",
)

for stage in strategy["stages"]:
    print(f"Stage {stage['stage_number']}: {stage['stage_name']}")
    print(f"  Design: {stage['design_type']}, Runs: {stage['estimated_runs']}")
    print(f"  Purpose: {stage['purpose']}")

This outputs:

Stage 1: Screening
  Design: plackett_burman, Runs: 8
  Purpose: Screen 7 candidate factors to identify the vital few.
Stage 2: Optimization
  Design: ccd, Runs: 19
  Purpose: Fit quadratic response surface model for the 3 significant factors. ...
Stage 3: Confirmation
  Design: replicates_at_optimum, Runs: 3
  Purpose: Run replicates at the predicted optimum to verify the model predictions. ...

The engine selected Plackett-Burman screening (the fermentation domain default), a CCD for response surface optimization, and 3 confirmation replicates - all within the 40-run budget.

Interpreting the Output#

recommend_strategy returns a dictionary with these keys:

Key	Description
`stages`	Ordered list of experimental stages. Each stage contains `stage_number`, `stage_name`, `design_type`, `design_params`, `factors`, `estimated_runs`, `purpose`, `success_criteria`, and `transition_rules`.
`total_estimated_runs`	Sum of estimated runs across all stages.
`budget_allocation`	Dictionary mapping stage names to allocated run counts.
`reasoning`	Step-by-step explanation of the decision logic.
`assumptions`	Key assumptions underlying the recommendation (e.g. factor ranges are wide enough, measurement system is adequate).
`risks`	Potential issues and warnings (e.g. tight budget, split-plot requirements).
`alternative_strategies`	Brief descriptions of other approaches worth considering.
`strategy_id`	Deterministic hash of the input - same inputs always produce the same ID.
`domain`	The application domain used.
`detail_level`	`"novice"` or `"intermediate"`.

To inspect transition rules after screening:

for rule in strategy["stages"][0]["transition_rules"]:
    print(f"If {rule['condition']}:")
    print(f"  -> {rule['action']}")
    print(f"  Otherwise -> {rule['fallback']}")

Working with Budget Constraints#

The budget parameter controls how many total runs are available. The engine adjusts stage complexity accordingly:

for b in [60, 40, 20, None]:
    result = recommend_strategy(factors=factors, budget=b, domain="fermentation")
    print(f"Budget={str(b):>4s}: {result['total_estimated_runs']:>2d} runs, "
          f"{len(result['stages'])} stages")

Budget=  60: 30 runs, 3 stages
Budget=  40: 30 runs, 3 stages
Budget=  20: 18 runs, 3 stages
Budget=None: 30 runs, 3 stages

With a tight budget, the engine reduces center points, chooses more economical designs, and may issue warnings in result["risks"] about underpowered designs. When budget=None, the ideal allocation is used without constraint.

Using Prior Knowledge#

If you already know something about which factors matter, pass a free-text description via the prior_knowledge parameter. The engine parses keywords to set a confidence level:

High confidence (0.9): “confirmed”, “validated”, “published”, “well-established”
Medium confidence (0.7): “literature suggests”, “preliminary data”, “pilot study”
Low confidence (0.4): “suspect”, “expected”, “based on theory”
No knowledge (0.1): “no prior data”, “first time”, “exploratory”

High confidence (>= 0.8 with supporting data) skips the screening stage entirely:

# No prior knowledge - full screening
s1 = recommend_strategy(factors=factors, budget=40, domain="fermentation")
print(f"No prior: {len(s1['stages'])} stages")

# Low confidence - still screens
s2 = recommend_strategy(
    factors=factors, budget=40, domain="fermentation",
    prior_knowledge="We suspect Temperature and pH are important.",
)
print(f"Low confidence: {len(s2['stages'])} stages")

# High confidence - screening skipped
s3 = recommend_strategy(
    factors=factors, budget=40, domain="fermentation",
    prior_knowledge=(
        "Published and validated results confirm Temperature "
        "and pH are significant."
    ),
)
print(f"High confidence: {len(s3['stages'])} stages")

No prior: 3 stages
Low confidence: 3 stages
High confidence: 1 stages

Domain-Specific Strategies#

The domain parameter selects a domain template that adjusts screening design preferences, RSM design choices, center-point counts, and budget weights. Eight domains are available:

Domain	Screening / RSM preference	Notes
`"fermentation"`	Plackett-Burman / CCD	Extra center points (5+) for biological variability.
`"cell_culture"`	DSD / Box-Behnken	Minimizes runs for expensive, slow experiments (14–21 days).
`"pharma_formulation"`	DSD / Face-centered CCD	ICH QbD framework; design space definition for regulatory submissions.
`"food_science"`	Fractional factorial / BBD	Mixture handling; avoids extreme factor combinations.
`"extraction"`	Fractional factorial / CCD	Rotatable CCD for good boundary prediction.
`"analytical_method"`	Fractional factorial / CCD	AQbD / ICH Q2/Q14; includes robustness study stage.
`"bioprocess"`	Plackett-Burman / CCD	Scale-up considerations for bench-to-production transfer.
`"general"`	Rule-engine defaults	No domain-specific adjustments.

Comparing two domains on the same factors shows how design choices differ:

for domain in ["fermentation", "cell_culture"]:
    result = recommend_strategy(factors=factors, budget=40, domain=domain)
    screening = result["stages"][0]
    print(f"{domain:>15s}: {screening['design_type']}, "
          f"{screening['estimated_runs']} screening runs")

fermentation: plackett_burman, 8 screening runs
cell_culture: definitive_screening, 15 screening runs

Fermentation uses Plackett-Burman (efficient, many-factor screening), while cell culture uses a Definitive Screening Design because it combines screening and curvature detection in a single stage - saving an entire experimental cycle when each run takes 2–3 weeks.

Hard-to-Change Factors#

When some factors are expensive or time-consuming to reset between runs (e.g. reactor temperature, equipment configuration), flag them with hard_to_change_factors. The engine wraps affected stages in a split-plot structure:

result = recommend_strategy(
    factors=factors,
    budget=40,
    domain="fermentation",
    hard_to_change_factors=["Temperature"],
)

for stage in result["stages"]:
    params = stage["design_params"]
    if params.get("split_plot"):
        print(f"{stage['stage_name']}: split-plot design")
        print(f"  Whole-plot (hard to change): {params['whole_plot_factors']}")
        print(f"  Sub-plot (easy to change):   {params['subplot_factors']}")

Screening: split-plot design
  Whole-plot (hard to change): ['Temperature']
  Sub-plot (easy to change):   ['pH', 'Glucose', 'Yeast extract', 'Agitation', ...]

With split-plot designs, runs are grouped within whole-plot factor levels to minimize the number of hard-to-change factor resets. The output risks will include a reminder that standard ANOVA gives incorrect p-values for split-plot experiments - a restricted maximum likelihood (REML) analysis is needed instead.

Multiple Responses#

When optimizing for more than one response, define each with its own goal:

responses = [
    Response(name="Yield", goal="maximize", units="g/L"),
    Response(name="Purity", goal="maximize", units="%"),
    Response(name="Cost", goal="minimize", units="USD/kg"),
]

result = recommend_strategy(
    factors=factors,
    responses=responses,
    budget=40,
    domain="fermentation",
)

The strategy structure is the same - the engine plans the experimental stages needed to build models for all responses simultaneously. After running the experiments, use optimize_responses() with desirability functions to find the best trade-off across responses.