modeva.TestSuite.interpret_moe_cluster_analysis

TestSuite.interpret_moe_cluster_analysis(dataset: str = 'test', metric: str = None)

Analyze and summarize characteristics of mixture-of-experts clusters.

Parameters:

dataset ({"main", "train", "test"}, default="test") – The dataset whose cluster assignments are analyzed.
metric (str, default=None) –
Model performance metric to use. If None, the default for the task type is used.
- For classification (default="AUC"): "ACC", "AUC", "F1", "LogLoss", and "Brier"
- For regression (default="MSE"): "MSE", "MAE", and "R2"
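
The way the metric default depends on the task type can be sketched as follows. This is only an illustration of the rules listed above: resolve_metric and the two tuples are hypothetical names and are not part of modeva, whose internal handling may differ.

```python
# Hypothetical helper, NOT part of modeva: it only mirrors the documented defaults.
CLASSIFICATION_METRICS = ("ACC", "AUC", "F1", "LogLoss", "Brier")
REGRESSION_METRICS = ("MSE", "MAE", "R2")


def resolve_metric(task_type, metric=None):
    """Return the metric name to use for a given task type."""
    if task_type == "classification":
        allowed, default = CLASSIFICATION_METRICS, "AUC"
    elif task_type == "regression":
        allowed, default = REGRESSION_METRICS, "MSE"
    else:
        raise ValueError(f"unknown task type: {task_type!r}")

    # metric=None falls back to the task-specific default.
    if metric is None:
        return default
    if metric not in allowed:
        raise ValueError(
            f"{metric!r} is not valid for {task_type}; choose from {allowed}"
        )
    return metric


print(resolve_metric("classification"))     # task default
print(resolve_metric("regression", "MAE"))  # explicit choice
```

Under these assumptions, passing a classification-only metric such as "AUC" for a regression task raises a ValueError instead of silently falling back.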

Returns:

A result object containing the cluster analysis results:

key: "interpret_cluster_analysis"
data: Name of the dataset used
model: Name of the model used
inputs: Input parameters
value: Nested dictionary of ("<expert_id>", item) pairs, one for each cluster, where each item is itself a dictionary with:
- "size": Number of samples in the cluster
- "score": The performance metric for this cluster
- "center": Cluster centroid coordinates
- "data_info": Sample indices for in-cluster versus out-of-cluster comparison, which can be passed on to a data distribution test, e.g.,
  data_results = ds.data_drift_test(**results.value[2]["data_info"])
  data_results.plot("summary")
  data_results.plot(("density", "MedInc"))
table: DataFrame with performance metrics for each cluster
options: Dictionary of visualization configurations. Run results.plot() to show all plots, or results.plot(name=xxx) to display a single plot. The following names are available:
- "cluster_performance": Bar plot of the final MOE model's performance score on each cluster.
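
To make the shape of the per-cluster entries concrete, here is a minimal NumPy sketch that computes a size, a score (MSE), and a centroid for each cluster from given cluster assignments. It is not modeva's implementation: summarize_clusters, the "in_idx"/"out_idx" field names, and the toy arrays are all assumptions made for illustration.

```python
import numpy as np


def summarize_clusters(X, y_true, y_pred, cluster_ids):
    """Build a {cluster_id: {...}} summary loosely shaped like the documented value.

    Hypothetical helper for illustration only; field names such as
    "in_idx" and "out_idx" are assumptions, not modeva's actual keys.
    """
    summary = {}
    all_idx = np.arange(len(cluster_ids))
    for cid in np.unique(cluster_ids):
        mask = cluster_ids == cid
        summary[int(cid)] = {
            "size": int(mask.sum()),
            # Mean squared error restricted to the samples in this cluster.
            "score": float(np.mean((y_true[mask] - y_pred[mask]) ** 2)),
            # Centroid: feature-wise mean of the cluster members.
            "center": X[mask].mean(axis=0),
            # Indices inside and outside the cluster, for later comparison.
            "in_idx": all_idx[mask],
            "out_idx": all_idx[~mask],
        }
    return summary


# Toy data: four samples, two features, two clusters.
X = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [12.0, 14.0]])
y_true = np.array([1.0, 3.0, 5.0, 7.0])
y_pred = np.array([1.0, 1.0, 6.0, 7.0])
cluster_ids = np.array([0, 0, 1, 1])

summary = summarize_clusters(X, y_true, y_pred, cluster_ids)
for cid, item in summary.items():
    print(cid, item["size"], item["score"], item["center"])
```

The in/out index split is the part that makes a follow-up distribution comparison possible: the samples in a cluster can be tested against all remaining samples, feature by feature.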

Return type:

ValidationResult

Examples