modeva.testsuite.utils.slicing_utils.get_data_info
- modeva.testsuite.utils.slicing_utils.get_data_info(res_value)
Extract data information from the result values of a slicing-based test.
It is designed to extract the “good” and “bad” samples identified by slicing-based tests; the resulting sample IDs can then be used to test for data distribution drift.
- Parameters:
res_value (list of dict) – List containing result values with feature and sample information.
- Returns:
A dictionary containing data information for each feature. The structure is as follows:
{
    'feature_name': {
        'dataset1': str,      # Name of the dataset
        'dataset2': str,      # Name of the dataset
        'sample_idx1': list,  # List of sample IDs for "good" samples
        'sample_idx2': list,  # List of sample IDs for "bad" samples
        'name1': str,         # Label for "good" samples
        'name2': str          # Label for "bad" samples
    }
}
- Return type:
dict
Examples
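from modeva.testsuite.utils.slicing_utils import get_data_info

# Assumes `ts` is an already-constructed test suite object.
results = ts.diagnose_slicing_robustness(
    features="PAY_1",
    perturb_features=("PAY_1", "EDUCATION"),
    noise_levels=0.1,
    metric="AUC",
    method="auto-xgb1",
    threshold=0.7,
)

# Look up the "good" / "bad" sample information for the sliced feature.
data_info = get_data_info(res_value=results.value)["PAY_1"]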
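The following is a minimal sketch of how one per-feature entry of the returned dictionary might be consumed. It does not call modeva: the `data_info` dictionary is a hand-written stand-in that follows the documented keys, and its values, the toy `pay_1` column, and the `select` helper are all hypothetical. It also assumes that sample IDs can be used as row positions, which may not hold for every dataset.

```python
from statistics import mean

# Hypothetical stand-in for get_data_info(...)["PAY_1"]. The keys follow the
# documented structure; the values are invented for illustration only.
data_info = {
    "dataset1": "test",
    "dataset2": "test",
    "sample_idx1": [0, 2, 3],  # IDs of "good" samples
    "sample_idx2": [1, 4],     # IDs of "bad" samples
    "name1": "good",
    "name2": "bad",
}

# Toy feature column. Assumption: sample IDs are row positions in this list.
pay_1 = [0.0, 2.0, -1.0, 0.0, 3.0]


def select(values, sample_ids):
    """Return the entries of `values` at the given sample IDs."""
    return [values[i] for i in sample_ids]


good = select(pay_1, data_info["sample_idx1"])
bad = select(pay_1, data_info["sample_idx2"])

# A crude drift signal: difference in feature means between the two groups.
mean_gap = mean(bad) - mean(good)
print(f"{data_info['name1']}: n={len(good)}, {data_info['name2']}: n={len(bad)}")
print(f"mean gap for PAY_1: {mean_gap:.3f}")
```

The mean difference is used here only because it needs no extra dependencies; a proper distribution drift test would replace that last step.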