modeva.TestSuite.diagnose_residual_analysis

TestSuite.diagnose_residual_analysis(features: str = None, use_prediction: bool = False, dataset: str = 'test', sample_size: int = 2000, random_state: int = 0)

Analyze the relationship between model residuals and a specified feature.

Creates a scatter plot showing the residuals (actual - predicted values) against a chosen feature or target variable. This can help identify patterns or heteroscedasticity in the model’s predictions. For classification tasks, residuals are calculated using predicted probabilities of the positive class.

Parameters:
  • features (str, default=None) – The name of the feature to plot on the x-axis. Can be ignored when use_prediction is True.

  • use_prediction (bool, default=False) – Whether to plot the model prediction (the predicted probability for classification) on the x-axis instead of a feature.

  • dataset ({"main", "train", "test"}, default="test") – Which dataset partition to use for the analysis.

  • sample_size (int, default=2000) – Maximum number of points to plot. If the dataset is larger, a random subsample of this size will be used to improve visualization clarity.

  • random_state (int, default=0) – Random seed for reproducible subsampling.
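
The interaction between sample_size and random_state can be illustrated with a small sketch. This is not Modeva's internal code; the helper name subsample_indices and the use of Python's random module are assumptions made only to show the idea: every row is kept when the data fits within sample_size, otherwise a seeded random subset is drawn so that repeated runs plot the same points.

```python
import random


def subsample_indices(n_rows, sample_size=2000, random_state=0):
    """Hypothetical helper: choose which row indices to plot.

    Returns every index when n_rows <= sample_size; otherwise draws
    sample_size distinct indices with a seeded generator.
    """
    if n_rows <= sample_size:
        return list(range(n_rows))
    rng = random.Random(random_state)  # fixed seed gives a reproducible subset
    return sorted(rng.sample(range(n_rows), sample_size))


# Same seed, same subset: the plot does not change between runs.
idx_a = subsample_indices(10_000, sample_size=2000, random_state=0)
idx_b = subsample_indices(10_000, sample_size=2000, random_state=0)

# A dataset smaller than sample_size is plotted in full.
small = subsample_indices(500)
```

Changing random_state selects a different subset, which can be a quick way to check that a visible pattern is not an artifact of one particular draw.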

Returns:

A container object with the following attributes:

  • key: “diagnose_residual”

  • data: Name of the dataset used

  • model: Name of the model used

  • inputs: Input parameters used for the test

  • value: Dictionary containing the x-axis values and residuals used in the plot

  • table: DataFrame containing the plotted data

  • options: Dictionary of visualization settings for a scatter plot in which the x-axis is the selected feature value and the y-axis is the prediction residual (y - y_hat). Run results.plot() to show this plot.

Return type:

ValidationResult

Notes

For classification models, residuals are calculated as the difference between the actual class labels and the predicted probabilities for the positive class. For regression models, residuals are the difference between actual and predicted values.
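
As a minimal sketch of the residual definition in this note (plain Python, not Modeva's implementation; the function name residuals and the sample values are illustrative assumptions):

```python
def residuals(y_true, y_pred):
    """Element-wise residuals: actual minus predicted (y - y_hat)."""
    return [yt - yp for yt, yp in zip(y_true, y_pred)]


# Regression: y_pred holds predicted values.
reg_res = residuals([3.0, 5.0, 2.5], [2.5, 5.5, 2.5])

# Binary classification: y_true holds 0/1 labels and y_pred holds the
# predicted probability of the positive class.
clf_res = residuals([1, 0, 1], [0.75, 0.25, 0.5])
```

Under this definition, classification residuals always lie in [-1, 1]: they are non-negative for positive-class samples and non-positive for negative-class samples, so a residual scatter plot for a classifier typically shows two separate bands.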

Examples

Residual Analysis (Classification)

Residual Analysis (Regression)
