Test Suite#

Post-hoc Explanation#

TestSuite.explain_pfi

Calculate Permutation Feature Importance (PFI) for model features.

TestSuite.explain_hstatistic

Calculate H-statistics for all feature pairs to measure feature interactions.

TestSuite.explain_pdp

Calculate and visualize Partial Dependence Plot (PDP) for specified model features.

TestSuite.explain_ale

Calculate Accumulated Local Effects (ALE) plots for one or two features.
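
A minimal usage sketch for the global post-hoc explainers above, assuming `ts` is a TestSuite already constructed from a wrapped dataset and fitted model; the argument names (e.g. `features`) and the `plot()` call on the returned result are illustrative assumptions, not confirmed signatures.

```python
# Sketch only: method names come from this reference; argument names are assumed.
result = ts.explain_pfi()                          # permutation feature importance
result = ts.explain_hstatistic()                   # pairwise H-statistics
result = ts.explain_pdp(features="feature_a")      # PDP for one (hypothetical) feature
result = ts.explain_ale(features=("feature_a",
                                  "feature_b"))    # ALE for a (hypothetical) feature pair
result.plot()                                      # plotting the returned object is an assumption
```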

TestSuite.explain_lime

Generate a LIME (Local Interpretable Model-agnostic Explanations) explanation for a specific sample.

TestSuite.explain_shap

Generate SHAP (SHapley Additive exPlanations) values for local model explanation.
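
A sketch of the local explainers under the same assumptions; the `sample_index` argument is hypothetical and only illustrates selecting a single sample.

```python
# Sketch only: the sample selector argument is an assumption.
lime_result = ts.explain_lime(sample_index=0)   # LIME explanation of one sample
shap_result = ts.explain_shap(sample_index=0)   # SHAP values for the same sample
lime_result.plot()
shap_result.plot()
```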

Inherent Interpretation#

TestSuite.interpret_coef

Extract and visualize the coefficients of linear model features.

TestSuite.interpret_fi

Calculate and visualize global feature importance for the model.

TestSuite.interpret_effects

Analyze and visualize how one or two features influence model predictions through main effects or interaction effects.
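
A sketch of the global inherent-interpretation calls, assuming the fitted model is inherently interpretable (e.g. linear or GAM-style); feature names and argument names are illustrative assumptions.

```python
ts.interpret_coef()                           # coefficients (linear models)
ts.interpret_fi()                             # global feature importance
ts.interpret_effects(features="feature_a")    # main effect of one feature
ts.interpret_effects(features=("feature_a",
                               "feature_b"))  # pairwise interaction effect
```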

TestSuite.interpret_local_fi

Calculate and visualize feature importance scores for a single sample.

TestSuite.interpret_local_linear_fi

Calculate and visualize local feature importance for a specific data sample using linear approximation.
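
A sketch of the per-sample interpretation calls; the `sample_index` argument is a hypothetical selector for one data sample.

```python
ts.interpret_local_fi(sample_index=0)          # per-sample feature importance
ts.interpret_local_linear_fi(sample_index=0)   # per-sample importance via linear approximation
```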

TestSuite.interpret_llm_summary

Generate a table of summary statistics for the unwrapped local linear models.

TestSuite.interpret_llm_pc

Generate and visualize parallel coordinate plots for Local Linear Model (LLM) coefficients.

TestSuite.interpret_llm_profile

Calculate local feature importance for a specific feature using LLM profiles.

TestSuite.interpret_llm_violin

Generate violin plots of the LLM coefficients and their statistics.
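
A sketch of the Local Linear Model (LLM) diagnostics, assuming the fitted model can be decomposed into local linear regions; argument names are illustrative assumptions.

```python
ts.interpret_llm_summary()                       # summary table of the local linear models
ts.interpret_llm_pc()                            # parallel-coordinate plot of LLM coefficients
ts.interpret_llm_profile(features="feature_a")   # coefficient profile for one feature
ts.interpret_llm_violin()                        # violin plot of LLM coefficients
```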

TestSuite.interpret_global_tree

Generate a visualization of the complete decision tree model structure.

TestSuite.interpret_local_tree

Generate a visualization of the decision path for a specific sample through the decision tree.
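
A sketch of the tree-interpretation calls, applicable when the fitted model is a decision tree; the sample selector argument is an assumption.

```python
ts.interpret_global_tree()                 # full decision-tree structure
ts.interpret_local_tree(sample_index=0)    # decision path of one sample
```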

TestSuite.interpret_local_moe_weights

Calculate and visualize expert weights for a specific sample.

TestSuite.interpret_effects_moe_average

Analyze feature effects averaged across all mixture-of-experts clusters.

TestSuite.interpret_moe_cluster_analysis

Analyze and summarize characteristics of mixture-of-experts clusters.
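
A sketch of the mixture-of-experts (MoE) interpretation calls, applicable when the fitted model is an MoE-style model; argument names are illustrative assumptions.

```python
ts.interpret_local_moe_weights(sample_index=0)          # expert (gating) weights for one sample
ts.interpret_effects_moe_average(features="feature_a")  # effects averaged over all clusters
ts.interpret_moe_cluster_analysis()                     # per-cluster summary
```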

Diagnostics#

TestSuite.diagnose_accuracy_table

Evaluate model performance on training and test datasets.

TestSuite.diagnose_residual_analysis

Analyze the relationship between model residuals and a specified feature.

TestSuite.diagnose_residual_cluster

Analyze model residuals by clustering data points and evaluating performance within clusters.

TestSuite.diagnose_residual_interpret

Analyze feature importance by examining each feature's relationship with prediction residuals.
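
A sketch of the accuracy and residual diagnostics; the `features` argument is an illustrative assumption.

```python
ts.diagnose_accuracy_table()                         # train vs. test performance metrics
ts.diagnose_residual_analysis(features="feature_a")  # residuals against one feature
ts.diagnose_residual_cluster()                       # per-cluster residual performance
ts.diagnose_residual_interpret()                     # feature importance w.r.t. residuals
```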

TestSuite.diagnose_slicing_accuracy

Identify low-accuracy regions based on specified slicing features.

TestSuite.diagnose_slicing_overfit

Identify overfit regions based on one or two slicing features.

TestSuite.diagnose_slicing_reliability

Identify unreliable regions based on one or two slicing features.

TestSuite.diagnose_slicing_robustness

Identify non-robust regions based on one or two slicing features.

TestSuite.diagnose_slicing_fairness

Evaluate model fairness metrics across different protected and reference groups within data slices.
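
A sketch of the slicing diagnostics, which scan one or two features for weak regions; the feature and group names shown are hypothetical and the argument names are assumptions.

```python
ts.diagnose_slicing_accuracy(features="feature_a")         # low-accuracy slices
ts.diagnose_slicing_overfit(features=("feature_a",
                                      "feature_b"))        # large train/test gaps
ts.diagnose_slicing_reliability(features="feature_a")      # wide prediction intervals
ts.diagnose_slicing_robustness(features="feature_a")       # unstable under perturbation
ts.diagnose_slicing_fairness(features="feature_a",
                             protected="group_a",          # group arguments are hypothetical
                             reference="group_b")
```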

TestSuite.diagnose_robustness

Evaluate model robustness by measuring performance under feature perturbations.

TestSuite.diagnose_reliability

Evaluate model reliability using split conformal prediction.

TestSuite.diagnose_resilience

Evaluate model resilience by analyzing performance on challenging data subsets.
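
A sketch of the whole-dataset robustness, reliability, and resilience tests; any tuning arguments (perturbation size, coverage level) are omitted here because their names are not given in this reference.

```python
ts.diagnose_robustness()      # performance under feature perturbations
ts.diagnose_reliability()     # split conformal prediction intervals/sets
ts.diagnose_resilience()      # performance on challenging data subsets
```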

TestSuite.diagnose_fairness

Evaluate fairness metrics across different protected and reference groups.

TestSuite.diagnose_mitigate_unfair_thresholding

Mitigate model unfairness by adjusting the decision threshold.

TestSuite.diagnose_mitigate_unfair_binning

Mitigate model unfairness through feature value binning.
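
A sketch of the fairness evaluation and mitigation calls; the protected and reference group names and the argument names are illustrative assumptions.

```python
ts.diagnose_fairness(protected="group_a", reference="group_b")
ts.diagnose_mitigate_unfair_thresholding(protected="group_a", reference="group_b")
ts.diagnose_mitigate_unfair_binning(protected="group_a", reference="group_b")
```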

Model Comparison#

TestSuite.compare_accuracy_table

Compare predictive performance metrics across multiple models.

TestSuite.compare_slicing_accuracy

Compare model performance across different data slices based on a specified feature.

TestSuite.compare_slicing_overfit

Compare model performance across different data slices to identify potential overfit regions.

TestSuite.compare_slicing_reliability

Compare reliability metrics across different data slices for multiple models.

TestSuite.compare_slicing_robustness

Compare model robustness across different data slices by analyzing performance stability under perturbations.

TestSuite.compare_slicing_fairness

Compare fairness metrics across different protected and reference groups by slicing the data.

TestSuite.compare_robustness

Perform robustness testing by comparing model performance under different perturbation levels.

TestSuite.compare_reliability

Compare the reliability of multiple models under data shifts by evaluating prediction intervals/sets.

TestSuite.compare_resilience

Compare model resilience performance under data shifts across multiple models.

TestSuite.compare_residual_cluster

Compare model residuals by clustering data points and evaluating performance within clusters.

TestSuite.compare_fairness

Compare fairness metrics across multiple models.
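
A sketch of the model-comparison calls, assuming a comparison-mode TestSuite can be built from several fitted models; the constructor signature, argument names, and group/feature names shown here are assumptions.

```python
tsc = TestSuite(ds, models=[model_a, model_b])      # constructor signature assumed
tsc.compare_accuracy_table()                        # side-by-side performance metrics
tsc.compare_slicing_accuracy(features="feature_a")  # per-slice accuracy across models
tsc.compare_robustness()                            # robustness under perturbation levels
tsc.compare_fairness(protected="group_a", reference="group_b")
```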

Utilities#

TestSuite.get_dataset

Return the dataset object.

TestSuite.set_dataset

Set the dataset for the test suite.

TestSuite.get_model

Return the model object.

TestSuite.set_model

Set the model for the test suite.

TestSuite.get_main_effects

Return the list of main effects.

TestSuite.get_interactions

Return the list of pairwise interactions.
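
A sketch of the accessor methods for the suite's dataset, model, and effect lists.

```python
ds = ts.get_dataset()                 # current dataset object
model = ts.get_model()                # current model object
ts.set_dataset(ds)                    # replace the dataset
ts.set_model(model)                   # replace the model
main_effects = ts.get_main_effects()  # list of main effects
interactions = ts.get_interactions()  # list of pairwise interactions
```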

TestSuite.list

List all experiments saved in the database.

TestSuite.register

Register a test into MLFlow.

TestSuite.delete_registed_test

Delete a registered test.

TestSuite.load_registered_test

Load the config and result of a registered test.

TestSuite.list_registered_tests

Return the list of all registered tests.

TestSuite.display_test_results

Get the ValidationResult object of a registered test.

TestSuite.export_report

Export the report to HTML.
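
A sketch of the experiment-tracking and reporting utilities; the test name, the result object being registered, and the argument names are illustrative assumptions, and `delete_registed_test` is spelled here exactly as listed above.

```python
ts.list()                                        # experiments saved in the database
result = ts.diagnose_accuracy_table()
ts.register(result, name="accuracy-check")       # register a test in MLflow (signature assumed)
ts.list_registered_tests()                       # list registered tests
ts.load_registered_test(name="accuracy-check")   # load config and result
ts.display_test_results(name="accuracy-check")   # fetch the ValidationResult
ts.delete_registed_test(name="accuracy-check")   # delete the registered test
ts.export_report(file="report.html")             # export an HTML report (argument assumed)
```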