Authoring an MCP tool
Note
This is the step-by-step guide for adding a new agent-callable tool. It tracks ENG-26. See Architecture overview for how the tool layer fits together.
Anatomy of a tool
Every tool is three things in one module:
- a pydantic input contract: a BaseModel with model_config = ConfigDict(extra="forbid") so unknown keys are rejected;
- a wrapper function decorated with @tool_spec(...) that takes the parsed model as its single positional argument and returns a JSON-serialisable dict;
- a registration so discovery can find it.
The @tool_spec decorator (process_improve/tool_spec.py) attaches the
JSON-schema spec (derived from the input_model) and registers the function
in the global _TOOL_REGISTRY. get_tool_specs() returns specs in
registry order; discover_tools() imports each subpackage’s tools module
so the decorators run.
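The registration mechanism described above can be illustrated with a toy, stdlib-only sketch. This is not the code in process_improve/tool_spec.py: toy_tool_spec, _TOY_REGISTRY, and get_toy_specs are hypothetical names, and the schema is passed in by hand rather than derived from a pydantic model. It only shows why registry order follows decoration (and therefore import) order.

```python
from typing import Any, Callable

# Hypothetical stand-in for the real registry; dicts preserve insertion order.
_TOY_REGISTRY: dict[str, dict[str, Any]] = {}


def toy_tool_spec(name: str, description: str, input_schema: dict[str, Any]) -> Callable:
    """Attach a spec to the function and register it under ``name``."""

    def decorator(func: Callable) -> Callable:
        if name in _TOY_REGISTRY:
            raise ValueError(f"duplicate tool name: {name}")
        spec = {"name": name, "description": description, "input_schema": input_schema}
        func.tool_spec = spec  # assumption: the real decorator may store this differently
        _TOY_REGISTRY[name] = {"spec": spec, "func": func}
        return func

    return decorator


def get_toy_specs() -> list[dict[str, Any]]:
    """Return specs in registration order."""
    return [entry["spec"] for entry in _TOY_REGISTRY.values()]


@toy_tool_spec("first_tool", "Registered first.", {"type": "object"})
def first_tool(spec: dict) -> dict:
    return {"ok": True}


@toy_tool_spec("second_tool", "Registered second.", {"type": "object"})
def second_tool(spec: dict) -> dict:
    return {"ok": True}


print([s["name"] for s in get_toy_specs()])
```

Because the registry is filled as a side effect of running the decorator, a tool whose module is never imported never appears in the spec list. That is the reason discovery has to import every tools module.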
Step by step
1. Pick the home. Domain tools live in <subpackage>/tools.py (small subpackages) or, where the surface is large, one module per tool under <subpackage>/_tools/<tool_name>.py with tools.py as the aggregator. This is the pattern in experiments/ (ENG-02).

2. Define the input model. Use Field(...) with descriptions and validation (ge/le, min_length, Literal[...]). The descriptions end up in the tool’s JSON schema, which is what the LLM reads, so write them for a caller.

       from pydantic import BaseModel, ConfigDict, Field


       class SummariseInput(BaseModel):
           model_config = ConfigDict(extra="forbid")

           data: list[float] = Field(..., min_length=1, description="The values to summarise.")

3. Write the wrapper and decorate it. Narrow the except to the canonical expected set (see Error-Handling Style Guide) and pass the result through clean(...) so numpy / pandas types serialise:

       import logging

       from process_improve.tool_spec import clean, tool_spec

       logger = logging.getLogger(__name__)


       @tool_spec(
           name="summarise_values",
           description="Return the mean and standard deviation of a list of numbers.",
           input_model=SummariseInput,
           examples='# "summarise [1, 2, 3]" -> ``summarise_values(data=[1, 2, 3])``',
           category="univariate",
       )
       def summarise_values(spec: SummariseInput) -> dict:
           try:
               import numpy as np  # noqa: PLC0415 - keep heavy imports lazy

               arr = np.asarray(spec.data, dtype=float)
               return clean({"mean": arr.mean(), "std": arr.std(ddof=1)})
           except (ValueError, TypeError) as exc:
               logger.exception("Tool summarise_values failed")
               return {"error": str(exc)}

4. Register it. Importing the module must run the decorator. If you use the per-tool layout, the subpackage’s tools.py imports each tool module in a fixed order (the order fixes the spec-emission order) and tracks the names; if you add tools inline in tools.py, they register in source order. Do not reorder existing imports: the tool-spec output is asserted stable.

5. Confirm discovery. tool_spec.discover_tools() imports your subpackage’s tools module. If the subpackage is new, add its dotted ...tools path to the discovery list in tool_spec.py.
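Discovery, as described in the last step, amounts to importing a fixed list of modules so that their decorators run. The following is a minimal sketch, assuming a hand-maintained list of dotted paths. toy_discover and the example paths are hypothetical (stdlib modules stand in for real ...tools modules), and the real discover_tools() may treat missing modules differently.

```python
import importlib


def toy_discover(module_paths: list[str]) -> list[str]:
    """Import each module in order; return the paths that imported successfully."""
    imported = []
    for path in module_paths:
        try:
            importlib.import_module(path)  # module-level decorators run here
        except ModuleNotFoundError:
            # Assumption for this sketch: skip modules that are not installed.
            continue
        imported.append(path)
    return imported


found = toy_discover(["json", "no_such_package_xyz.tools", "csv"])
print(found)
```

The list order is the import order, which is why a new subpackage is usually appended to the discovery list rather than inserted in the middle.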
Conventions
- Keep heavy imports (numpy, pandas, statsmodels, the domain algorithm) inside the wrapper function (# noqa: PLC0415) so importing the tools module stays cheap.
- Return {"error": "..."} for expected failures; let unexpected exceptions propagate (the server redacts them).
- Always wrap the payload in clean(...).
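The error-handling convention can be sketched without any project code. This stdlib-only example uses a hypothetical toy_summarise function and the statistics module in place of numpy. It illustrates the return shape only, not the real clean(...) helper or the server's redaction.

```python
import statistics


def toy_summarise(data: list) -> dict:
    """Return summary statistics, or an error dict for expected failures."""
    try:
        values = [float(x) for x in data]  # ValueError/TypeError on bad items
        # statistics.StatisticsError subclasses ValueError (fewer than two values).
        return {"mean": statistics.fmean(values), "std": statistics.stdev(values)}
    except (ValueError, TypeError) as exc:
        # Expected failure: report it in the payload instead of raising.
        return {"error": str(exc)}


print(toy_summarise([1, 2, 3]))
print(toy_summarise(["not a number"]))
```

Any exception outside the narrowed tuple is deliberately not caught here, so it would propagate to the caller, which matches the convention of letting unexpected errors surface.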
Verifying
from process_improve.tool_spec import get_tool_specs
specs = {s["name"]: s for s in get_tool_specs()}
assert "summarise_values" in specs
assert specs["summarise_values"]["input_schema"]["additionalProperties"] is False
Add a test under tests/ that drives the tool through the same path the MCP
server uses, plus an assertion that the spec is present and well-formed (see the
existing tests/test_experiments_tools.py and tests/test_tool_spec.py).