modeva.TestSuite.diagnose_mitigate_unfair_thresholding

TestSuite.diagnose_mitigate_unfair_thresholding(group_config, favorable_label: int = 1, dataset: str = 'test', metric: str = None, performance_metric: str = None, proba_cutoff: int | Tuple = None)

Attempts to mitigate model unfairness by adjusting the classification threshold.

This method evaluates how the probability cutoff used to assign predicted labels affects both model fairness and performance. Scores are computed over a grid of candidate cutoffs, which can help locate thresholds that reduce unfair outcomes for the protected group while limiting the loss in predictive performance.

Parameters:
  • group_config (dict) –

    Configuration defining protected and reference groups. Supports three formats (concrete sketches follow this parameter list):

    1. For numerical features:

      {
          "feature": str,           # Feature name
          "protected": {            # Protected group bounds
              "lower": float,       # Lower bound
              "lower_inclusive": bool,
              "upper": float,       # Optional upper bound
              "upper_inclusive": bool
          },
          "reference": {            # Reference group bounds
              "lower": float,       # Optional lower bound
              "lower_inclusive": bool,
              "upper": float,       # Upper bound
              "upper_inclusive": bool
          }
      }
      
    2. For categorical features:

      {
          "feature": str,                  # Feature name
          "protected": str or int,         # Protected group category
          "reference": str or int          # Reference group category
      }
      
    3. For probabilistic group membership:

      {
          "by_weights": True,
          "protected": str,         # Column name with protected group probabilities
          "reference": str          # Column name with reference group probabilities
      }
      

  • favorable_label ({0, 1}, default=1) –

    • For classification: The preferred class label.

    • For regression: 1 means larger predictions are preferred, 0 means smaller predictions are preferred.

  • dataset ({"main", "train", "test"}, default="test") – Which dataset partition to analyze

  • metric (str, optional) –

    Fairness metric to use. Higher values indicate better fairness.

    For regression (default="SMD"):

    • "SMD": Standardized Mean Difference (%) between groups

    For classification (default="AIR"):

    • "AIR": Adverse Impact Ratio

    • "PR": Precision Ratio

    • "RR": Recall Ratio

  • performance_metric (str, default=None) –

    Model performance metric to use.

    • For classification (default="AUC"): "ACC", "AUC", "F1", "LogLoss", and "Brier"

    • For regression (default="MSE"): "MSE", "MAE", and "R2"

  • proba_cutoff (int or tuple, default=20) – If an int, the number of uniformly spaced cutoff grid points between 0 and 1. If a tuple of floats, the custom grid of cutoff values, each of which must lie between 0 and 1.
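
For illustration, here is one hypothetical group_config per supported format. The feature name, bounds, categories, and column names below are invented for the sketch and should be replaced with values from your own data:

    # 1. Numerical feature: protected group Age >= 62, reference group Age < 62
    #    (feature name and cutoff are hypothetical)
    numerical_config = {
        "feature": "Age",
        "protected": {"lower": 62, "lower_inclusive": True},
        "reference": {"upper": 62, "upper_inclusive": False},
    }

    # 2. Categorical feature with one protected and one reference category
    categorical_config = {
        "feature": "Gender",
        "protected": "Female",
        "reference": "Male",
    }

    # 3. Probabilistic membership: columns holding per-row group probabilities
    weighted_config = {
        "by_weights": True,
        "protected": "prob_protected",
        "reference": "prob_reference",
    }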

Returns:

Contains:

  • key: "diagnose_mitigate_unfair_thresholding"

  • data: Name of the dataset used

  • model: Name of the model used

  • inputs: Input parameters used for the test

  • value: Nested dictionary of ("<threshold>", item) pairs, one per threshold, where each item is a dictionary with:

    • "Performance": Predictive performance scores after adjusting the threshold

    • "Fairness": Fairness scores after adjusting the threshold

  • table: Dictionary of DataFrames with performance and fairness metric scores after adjusting the threshold:

    • "Fairness": Fairness scores table

    • "Performance": Predictive performance table

  • options: Dictionary of visualization configurations. Run results.plot() to show all plots, or results.plot(name=xxx) to display a specific plot; the following names are available:

    • "<group_name>": Line plots of the performance and fairness scores against each threshold.

Return type:

ValidationResult

Notes

The method compares fairness and performance metrics across a grid of probability cutoffs. This can help identify the thresholds that most effectively balance fairness and performance tradeoffs.
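
To make the Returns structure concrete, here is a sketch of how a result might be inspected. It assumes results came from a call like the one in the Examples below and uses only the fields documented above:

    # Per-threshold scores as DataFrames, keyed as documented under "table"
    fairness_table = results.table["Fairness"]
    performance_table = results.table["Performance"]

    # Raw nested scores follow the ("<threshold>", item) convention under "value"
    for threshold, scores in results.value.items():
        print(threshold, scores["Performance"], scores["Fairness"])

    # Line plots of performance and fairness against each threshold
    results.plot()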

Examples

Model Fairness Analysis (Classification)
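
A minimal end-to-end sketch, assuming a prepared modeva DataSet ds and a fitted binary classifier model (setup omitted); the group definition is the hypothetical categorical example shown earlier:

    from modeva import TestSuite

    # ds (DataSet) and model (fitted classifier) are assumed to exist
    ts = TestSuite(ds, model)

    # Hypothetical protected/reference groups on a categorical feature
    group_config = {
        "feature": "Gender",
        "protected": "Female",
        "reference": "Male",
    }

    results = ts.diagnose_mitigate_unfair_thresholding(
        group_config=group_config,
        favorable_label=1,          # class 1 is the preferred outcome
        dataset="test",
        metric="AIR",               # classification default fairness metric
        performance_metric="AUC",   # classification default performance metric
        proba_cutoff=20,            # 20 uniform cutoff grid points between 0 and 1
    )

    results.plot()  # performance and fairness vs. threshold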