modeva.TestSuite.compare_robustness

TestSuite.compare_robustness(dataset: str = 'test', metric: str = None, n_repeats: int = 10, perturb_features: str | Tuple = None, perturb_method: str = 'normal', noise_levels: float | int | Tuple = 0.1, random_state: int = 0)

Performs robustness testing by comparing model performance under different perturbation levels.

The test perturbs the selected features of the chosen dataset partition at one or more noise levels, repeats each perturbation n_repeats times, and evaluates every model in the suite with the given performance metric. It returns a result object that holds the robustness scores for each model and the configuration needed to plot them.

Parameters:

dataset ({"main", "train", "test"}, default="test") – Dataset partition to be used for testing.
metric (str, default=None) –
Model performance metric to use.
- For classification (default="AUC"): "ACC", "AUC", "F1", "LogLoss", and "Brier"
- For regression (default="MSE"): "MSE", "MAE", and "R2"
n_repeats (int, default=10) – Number of times to repeat the perturbation test.
perturb_features (str or tuple, default=None) – Specific features to perturb. If None, all features are perturbed.
perturb_method ({"normal", "quantile"}, default="normal") –
Method to generate perturbations:
- "normal": Gaussian noise
- "quantile": Quantile-based perturbation
noise_levels (float, int, or tuple, default=0.1) – Magnitude of perturbation to apply. Can be a single value or a tuple of several levels.
random_state (int, default=0) – Seed for random number generation to ensure reproducibility.
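To make the roles of these parameters concrete, here is a minimal NumPy sketch of a perturbation-based robustness loop. It is not Modeva's implementation. It assumes that "normal" noise is scaled by each feature's standard deviation and that "quantile" perturbation shifts values by uniform noise in quantile space, neither of which is stated above. The helper names (perturb_normal, perturb_quantile, robustness_scores) are hypothetical, features are selected by column index rather than by name, and the metric is fixed to MSE.

```python
import numpy as np


def perturb_normal(X, cols, noise_level, rng):
    """Add zero-mean Gaussian noise scaled by each column's standard deviation
    (the scaling rule is an assumption, not taken from the Modeva docs)."""
    X_new = X.copy()
    scale = noise_level * X[:, cols].std(axis=0)
    X_new[:, cols] += rng.normal(0.0, 1.0, size=(X.shape[0], len(cols))) * scale
    return X_new


def perturb_quantile(X, cols, noise_level, rng):
    """Shift each value by uniform noise in quantile space, then map it back
    to the feature scale (again an assumed reading of "quantile")."""
    X_new = X.copy()
    n = X.shape[0]
    for j in cols:
        sorted_col = np.sort(X[:, j])
        # Empirical quantile of every value, in (0, 1].
        q = np.searchsorted(sorted_col, X[:, j], side="right") / n
        shift = rng.uniform(-noise_level, noise_level, size=n)
        X_new[:, j] = np.quantile(sorted_col, np.clip(q + shift, 0.0, 1.0))
    return X_new


def robustness_scores(predict, X, y, noise_levels=(0.1,), n_repeats=10,
                      perturb_features=None, perturb_method="normal",
                      random_state=0):
    """Return {noise_level: [MSE for each repeat]} for one prediction function."""
    rng = np.random.default_rng(random_state)
    # None means "perturb every feature", mirroring the parameter description.
    cols = list(range(X.shape[1])) if perturb_features is None else list(perturb_features)
    perturb = {"normal": perturb_normal, "quantile": perturb_quantile}[perturb_method]
    scores = {}
    for level in noise_levels:
        scores[level] = [
            float(np.mean((y - predict(perturb(X, cols, level, rng))) ** 2))
            for _ in range(n_repeats)
        ]
    return scores


# Synthetic data and a stand-in "model" that knows the true linear relation.
weights = np.array([2.0, -1.0, 0.5])
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(500, 3))
y = X @ weights


def predict(X_in):
    return X_in @ weights


scores = robustness_scores(predict, X, y, noise_levels=(0.1, 0.2))
for level, mse_list in scores.items():
    print(level, float(np.mean(mse_list)))
```

Comparing several models then amounts to calling robustness_scores once per model with the same random_state, so that every model is evaluated under the same sequence of perturbations.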

Returns:

A container object with the following components:

key: “compare_robustness”
data: Name of the dataset used
model: List of model names being compared
inputs: Input parameters used for the test
value: Dictionary of ("<model_name>", item) pairs, where each item is itself a dictionary with
- noise_level (e.g., 0.1, 0.2): Performance metrics at that noise level;
- data_info: Sample indices for worst and remaining cases (only for the first noise level). These can be passed to a data drift test, for example:
  data_results = ds.data_drift_test(**results.value["MoLGBMRegressor"][0.2]["data_info"])
  data_results.plot("summary")
  data_results.plot(("density", "MedInc"))
table: DataFrame containing detailed performance metrics for each noise level
options: Dictionary of visualization settings for a bar plot of the performance scores at each noise level. Run results.plot() to show this plot.

Return type:

ValidationResult

Examples

Robustness Analysis (Classification)

Robustness Analysis (Regression)