Bivariate Analysis
Backwards-compatible re-exporter for process_improve.bivariate.
The implementation now lives in
process_improve.bivariate._elbow_peak (ENG-23 / #305): the renamed
file makes filename-ranked tooling (Jump-to-File, fuzzy search, codecov
reports) less ambiguous about which methods.py is being shown.
Every public name remains importable as before:
from process_improve.bivariate.methods import find_elbow_point, find_line_intersection
- process_improve.bivariate.methods.find_elbow_point(x, y, max_iter=41)
Find the elbow point when plotting numeric entries in x vs numeric values in list y.
Returns the index into the vectors x and y (which must have the same length) at which the elbow point occurs. Returns -1 if every value in x or y is missing.
The samples are first sorted by x (the independent variable). A robust linear fit is made to the first 5 samples from the left and another to the last 5 samples from the right, and the intersection of the two fitted lines is computed. The window size is then grown over max_iter (default 41) evenly spaced steps, via numpy.linspace, up to roughly half the data, accumulating one intersection point per step.
The elbow is taken as the data point whose (x, y) location is closest to the median of the accumulated intersection points; the median location is where the intersections should stabilise.
This will probably not work well when there are only a few data points. In that case, try fitting a spline to the raw data and repeating the procedure on the interpolated data.
- process_improve.bivariate.methods.find_line_intersection(m1, b1, m2, b2)
Find the intersection point of two lines.
From Stack Overflow: stackoverflow.com/questions/20677795/how-do-i-compute-the-intersection-point-of-two-lines
Returns a tuple: (x, y) where the two lines intersect, given slopes m1 and m2, and intercepts b1 and b2.
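As an illustration of the underlying arithmetic, assuming each line has the form y = m*x + b, setting m1*x + b1 = m2*x + b2 gives x = (b2 - b1) / (m1 - m2). The function name below is hypothetical, and how the library itself treats parallel lines is not stated here, so raising an error is this sketch's own choice.

```python
def line_intersection_sketch(m1, b1, m2, b2):
    """Hypothetical sketch: intersection of y = m1*x + b1 and y = m2*x + b2."""
    if m1 == m2:
        # Equal slopes: the lines are parallel or identical (sketch's choice to raise).
        raise ValueError("lines with equal slopes have no unique intersection")
    x = (b2 - b1) / (m1 - m2)  # solve m1*x + b1 == m2*x + b2 for x
    y = m1 * x + b1            # substitute x back into the first line
    return x, y


# Usage: y = 2x + 1 and y = -x + 7 meet where 3x = 6.
point = line_intersection_sketch(2, 1, -1, 7)  # → (2.0, 5.0)
```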
- process_improve.bivariate.methods.fit_robust_lm(x, y)
Fits a robust linear model between NumPy vectors x and y, with an intercept. Returns a length-2 array [intercept, slope] (the params attribute returned by statsmodels.RLM); no extra checking on data consistency is done.
See also: regression.repeated_median_slope