modeva.TestSuite.diagnose_accuracy_table#

TestSuite.diagnose_accuracy_table(train_dataset: str = 'train', test_dataset: str = 'test', metric: str | Tuple = None)#

Evaluate model performance on training and test datasets.

Calculates the specified performance metrics for both the training and test sets, along with the performance gap between them. This can help identify potential overfitting or underfitting issues.

For binary classification, the following plots will be generated.

  • “roc_auc”: Generate the ROC curve.

  • “confusion_matrix”: Generate the confusion matrix.

  • “precision_recall”: Generate the precision-recall curve.

Parameters:
  • train_dataset ({"main", "train", "test"}, default="train") – Dataset to use for training metrics. Must be one of the predefined dataset splits.

  • test_dataset ({"main", "train", "test"}, default="test") – Dataset to use for testing metrics. Must be one of the predefined dataset splits.

  • metric (str or tuple of str, default=None) –

    Performance metric(s) to calculate (see the sketch below). If None:

    • For regression: Uses MSE, MAE, and R2

    • For classification: Uses ACC, AUC, F1, LogLoss, and Brier
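
For instance, a minimal sketch of both calling conventions. The TestSuite instance ts is assumed to already exist, built from a fitted model and a DataSet with predefined splits (its construction is not shown here):

# Default metrics: with a classifier, this computes ACC, AUC, F1, LogLoss, and Brier.
results = ts.diagnose_accuracy_table()

# Restrict the report to specific metrics by passing a tuple of names:
results = ts.diagnose_accuracy_table(
    train_dataset="train",
    test_dataset="test",
    metric=("ACC", "AUC"),
)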

Returns:

A result object containing:

  • key: “diagnose_accuracy_table”

  • data: Name of the dataset used

  • model: Name of the model used

  • inputs: Input parameters used for the test

  • value: Dictionary keyed by “<metric_name>”; each item is itself a dictionary with:

    • “<train_dataset>”: The metric value on the training dataset.

    • “<test_dataset>”: The metric value on the testing dataset.

    • “GAP”: The performance gap, calculated as (test_score - train_score).

  • table: DataFrame of metric results

  • options: Dictionary of visualization configurations. Run results.plot() to show all plots, or results.plot(name=...) to display a single plot (see the sketch under Examples); the following names are available:

    • (“roc_auc”, “<train_dataset>”): ROC curve for the training dataset

    • (“roc_auc”, “<test_dataset>”): ROC curve for the testing dataset

    • (“precision_recall”, “<train_dataset>”): precision-recall curve for the training dataset

    • (“precision_recall”, “<test_dataset>”): precision-recall curve for the testing dataset

    • (“confusion_matrix”, “<train_dataset>”): confusion matrix for the training dataset

    • (“confusion_matrix”, “<test_dataset>”): confusion matrix for the testing dataset

Return type:

ValidationResult

Examples

Performance Metrics (Classification)

Performance Metrics (Regression)
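
The sketch below shows how the returned ValidationResult can be inspected for a binary classifier; the metric key "AUC" and the split keys "train" and "test" follow the “<metric_name>”, “<train_dataset>”, and “<test_dataset>” placeholders described under Returns:

results = ts.diagnose_accuracy_table(train_dataset="train", test_dataset="test")

print(results.key)    # "diagnose_accuracy_table"
print(results.table)  # DataFrame with one row per metric and train/test/GAP columns

# results.value is keyed by metric name; each entry holds the train and test
# scores plus their GAP (test_score - train_score). A large gap in the
# unfavorable direction suggests overfitting.
auc = results.value["AUC"]
print(auc["train"], auc["test"], auc["GAP"])

# Binary classification only: show all plots, or pick one by name.
results.plot()
results.plot(name=("roc_auc", "test"))
results.plot(name=("confusion_matrix", "test"))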