Process Monitoring

Class for ControlChart: robust control charts with a balance between CUSUM and Shewhart properties.

process_improve.monitoring.control_charts.rho(x, k=2.52)

Bi-weight rho function.

The fixed constant k=2.52 is taken from p. 289 of the paper https://onlinelibrary.wiley.com/doi/abs/10.1002/for.1125

Parameters:

x (float)
k (float)

Return type:

float
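As a rough illustration, the textbook Tukey bi-weight rho function can be sketched as below. The name `biweight_rho` and the scaling constant `c` are assumptions made for this example; the library's own `rho` may scale or parameterize the function differently.

```python
def biweight_rho(x: float, k: float = 2.52, c: float = 1.0) -> float:
    """Tukey bi-weight rho: smooth near zero, capped at `c` once |x| exceeds k.

    `c` is a hypothetical scaling constant, not taken from the library.
    """
    if abs(x) <= k:
        return c * (1.0 - (1.0 - (x / k) ** 2) ** 3)
    return c


print(biweight_rho(0.0))   # no penalty at zero
print(biweight_rho(1.0))   # moderate penalty inside the [-k, k] band
print(biweight_rho(10.0))  # penalty is capped for large |x|
```

The cap is what makes the function robust: a single extreme value cannot contribute more than `c`, however far it lies from zero.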

process_improve.monitoring.control_charts.psi(x, k=2.0)

Pre-clean based on the Huber ψ-function.

This can be interpreted as replacing unexpectedly high or low values with a more likely value. From p. 288 of the paper https://onlinelibrary.wiley.com/doi/abs/10.1002/for.1125

Parameters:

x (float)
k (float)

Return type:

float
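A minimal sketch of the standard Huber ψ-function is shown below, assuming the usual definition in which values inside the band [-k, k] pass through unchanged and values outside it are clipped to the band edge. The name `huber_psi` is hypothetical, and the library's `psi` may differ in detail.

```python
def huber_psi(x: float, k: float = 2.0) -> float:
    """Huber psi: identity inside [-k, k], clipped to +/-k outside."""
    if abs(x) < k:
        return x
    return k if x > 0 else -k


# Standardized residuals: the outliers (5.0 and -7.5) are pulled back to +/-k.
residuals = [0.3, -1.2, 5.0, -7.5]
cleaned = [huber_psi(r) for r in residuals]
print(cleaned)  # [0.3, -1.2, 2.0, -2.0]
```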

class process_improve.monitoring.control_charts.ControlChart(style='robust', variant='HW')

Bases: object

Create control chart instance objects.

Parameters:

style (str)
variant (str)

__init__(style='robust', variant='HW')

Create/initialize a control chart.

Args:

style (str, optional) – Which style of control chart to calculate. Defaults to 'robust'. The other choice is 'regular' (i.e. non-robust) calculations; the user should then ensure that no outliers are present in the data.

variant (str, optional) – Which variant of control chart to use. The default is a Holt-Winters ('HW') chart, with automatic determination of the control chart parameters. This chart is a blend of an infinite-history (CUSUM) chart and an instantaneous Shewhart chart, which takes no history into account. The exact blend is specified by the parameters ld_1 (lambda 1) and ld_2 (lambda 2).

Other variants are:

'xbar.no.subgroup' – Shewhart chart with no subgroups. In other words, each observation is plotted independently on the control chart.

'CUSUM' – CUmulative SUM chart, which uses all the history of the chart.

Parameters:

style (str)
variant (str)

Return type:

None
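To give a feel for how two smoothing constants can blend long-memory and no-memory behaviour, here is a sketch of the textbook (non-robust) Holt recursions. It assumes that ld_1 and ld_2 act as the level and trend smoothing constants; the function name `holt_smooth`, the default values, and the initialisation are all assumptions for illustration, and the library determines its parameters automatically and may compute the chart differently.

```python
def holt_smooth(y, ld_1=0.3, ld_2=0.1):
    """Return the smoothed level for each observation in `y`.

    ld_1 close to 1 follows each new observation (Shewhart-like);
    ld_1 close to 0 leans on accumulated history (long memory).
    """
    level = y[0]   # assumed initialisation: start at the first observation
    trend = 0.0    # assumed initialisation: no trend
    levels = [level]
    for obs in y[1:]:
        prev_level = level
        # Blend the new observation with the one-step-ahead prediction.
        level = ld_1 * obs + (1.0 - ld_1) * (prev_level + trend)
        # Update the local trend from the change in level.
        trend = ld_2 * (level - prev_level) + (1.0 - ld_2) * trend
        levels.append(level)
    return levels


data = [10.0, 10.2, 9.9, 10.1, 14.0, 10.0]
print(holt_smooth(data, ld_1=1.0, ld_2=0.0))  # reproduces the raw data
print(holt_smooth(data, ld_1=0.2, ld_2=0.1))  # the spike at 14.0 is damped
```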

calculate_limits(y, target=None, s=None, **kwargs)

Find the control chart target and limits for a given vector y.

Applies only to the Holt-Winters method, and only when there are more than

min(20, max(10, np.ceil(0.10 * N)))

measurements, where N is the length of the input vector. In other words, the provided target and standard deviation are only used if there are more than 10 to 20 measurements. If target and s are numeric, then those values are used as the target and standard deviation; otherwise these values are estimated.

Parameters:

y (ndarray | Series)
target (float | None)
s (float | None)

Return type:

None
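The measurement-count threshold quoted above can be worked through with a small helper. The function name `minimum_points` is hypothetical and is not part of the library; the sketch uses `math.ceil` in place of `np.ceil` to stay within the standard library.

```python
import math


def minimum_points(n: int) -> int:
    """Evaluate min(20, max(10, ceil(0.10 * n))) for an input of length n."""
    return int(min(20, max(10, math.ceil(0.10 * n))))


for n in (30, 155, 1000):
    print(n, minimum_points(n))
# 30 10
# 155 16
# 1000 20
```

The result is 10% of the input length, clamped to the range 10 to 20.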

process_improve.monitoring.metrics.calculate_cpk(df, which_column, specifications=(nan, nan), trim_percentile=2.5)

Calculate the process capability, Cpk, near either the lower or the upper limit; which of the two applies is determined automatically.

Process capability, nearer the lower limit = (avg - lower_spec) / (3 x std deviation)

Process capability, nearer the upper limit = (upper_spec - avg) / (3 x std deviation)

Parameters:

df (pd.DataFrame) – Raw data, at least one column is numeric.
which_column (str) – Name of the column of data to use for the Cpk calculation.
specifications (tuple, optional) – Either a pair of values (lower, upper), if the specification is constant over time; or, if the specification changes over time, two column names, the first of which is the lower specification and the second the upper specification.
trim_percentile (float, optional) – If non-zero, then robust alternatives are used. The value specified is the percentile of data that is trimmed away on each side; by default 2.5% on the left and 2.5% on the right.

Returns:

The Cpk value.

Return type:

float
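The two formulas above can be sketched in plain Python as follows. This is an illustration only: the name `cpk_sketch` is hypothetical, constant specifications are assumed, the smaller of the two one-sided values is returned (the conventional Cpk definition), and the robust trimming controlled by trim_percentile is not reproduced.

```python
import math
import statistics


def cpk_sketch(values, lower_spec=math.nan, upper_spec=math.nan):
    """Return the smaller of the lower-side and upper-side capability values.

    A NaN specification is treated as "not specified" on that side.
    """
    avg = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation
    candidates = []
    if not math.isnan(lower_spec):
        candidates.append((avg - lower_spec) / (3 * std))
    if not math.isnan(upper_spec):
        candidates.append((upper_spec - avg) / (3 * std))
    return min(candidates) if candidates else math.nan


measurements = [9.0, 10.0, 11.0]  # avg = 10, std deviation = 1
print(cpk_sketch(measurements, lower_spec=7.0, upper_spec=16.0))  # 1.0
```

Here the process sits nearer the lower limit, so the lower-side value (10 - 7) / (3 x 1) = 1.0 is the one reported; the upper-side value would be 2.0.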