modeva.TestSuite.explain_pdp

TestSuite.explain_pdp(features: str | Tuple[str] = None, dataset: str = 'test', sample_size: int = 5000, percentiles: Tuple = (0, 1), grid_resolution: int = 20, response_method: str = 'auto', random_state: int = 0)

Calculate and visualize Partial Dependence Plot (PDP) for specified model features.

Partial Dependence Plots (PDPs) show the marginal effect of one or two features on the predicted outcome of a machine learning model. They illustrate how the model's predictions change as a feature varies over its range, while averaging out the effects of all other features. This makes PDPs a useful, model-agnostic tool for understanding how strongly, and in what shape, each feature drives the model's predictions.
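
The averaging described above can be sketched in plain Python. This is a minimal illustration of the general partial dependence idea, not Modeva's implementation; the names `partial_dependence_1d`, `predict`, `rows`, and `grid` are hypothetical and exist only for this example.

```python
from statistics import mean

def partial_dependence_1d(predict, rows, feature_idx, grid):
    """Average prediction when one feature is forced to each grid value."""
    effects = []
    for g in grid:
        preds = []
        for row in rows:
            modified = list(row)        # copy so the original data is untouched
            modified[feature_idx] = g   # overwrite the feature of interest
            preds.append(predict(modified))
        effects.append(mean(preds))     # average out all other features
    return effects

# Toy model (an assumption for illustration): y = 2*x0 + 3*x1
def predict(row):
    return 2 * row[0] + 3 * row[1]

rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
grid = [0.0, 1.0, 2.0]
pd_x0 = partial_dependence_1d(predict, rows, feature_idx=0, grid=grid)
```

For this linear toy model, the resulting curve rises by 2 per unit of x0, which matches the coefficient of x0; the contribution of x1 appears only as a constant offset.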

Parameters:
  • features (str or tuple of str) –

    Name of single feature or tuple of two feature names to analyze their effects on model output.

    • If features=("X1", ) or "X1", visualize the main effect for X1.

    • If features=("X1", "X2"), visualize the interaction between X1 and X2.

    • If features=(("X1", ), ("X2", )), visualize the main effects for X1 and X2 separately (batch mode).

    Note: Batch mode is not supported for 2D effect plots. If None, the 1D effect of every feature is computed.

  • dataset ({"main", "train", "test"}, default="test") – The dataset used for calculating the PDP results.

  • sample_size (int, default=5000) – Number of random samples to use for speeding up calculation. If None, all data points will be used.

  • percentiles (tuple, default=(0, 1)) – The lower and upper percentile used to create the extreme values for the grid. Must be in [0, 1].

  • grid_resolution (int, default=20) – The number of equally spaced points on the grid for each target feature.

  • response_method ({"auto", "decision_function", "predict_proba"}, default="auto") –

    Prediction method to use for binary classification tasks:

    • "auto": Uses "predict_proba" if available, otherwise "decision_function"

    • "predict_proba": Probability of the positive class

    • "decision_function": Model's decision function output

  • random_state (int, default=0) – Random seed for controlling reproducibility in subsampling.
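
To make sample_size, random_state, percentiles, and grid_resolution more concrete, the sketch below shows one plausible way such parameters could interact: a seeded subsample of the data, followed by an equally spaced grid between the lower and upper quantiles of a feature. It is an assumption-laden illustration rather than Modeva's actual code; `quantile`, `make_grid`, and `subsample` are hypothetical helpers, the quantile uses simple linear interpolation, and categorical features are ignored.

```python
import random

def quantile(sorted_vals, q):
    """Linearly interpolated quantile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])

def make_grid(values, percentiles=(0, 1), grid_resolution=20):
    """Equally spaced grid between the lower and upper quantiles."""
    s = sorted(values)
    low, high = quantile(s, percentiles[0]), quantile(s, percentiles[1])
    if grid_resolution == 1:
        return [low]
    step = (high - low) / (grid_resolution - 1)
    return [low + i * step for i in range(grid_resolution)]

def subsample(rows, sample_size=5000, random_state=0):
    """Seeded random subsample; all rows if sample_size is None or too large."""
    if sample_size is None or sample_size >= len(rows):
        return list(rows)
    return random.Random(random_state).sample(rows, sample_size)

values = [float(v) for v in range(101)]  # 0.0, 1.0, ..., 100.0
full_grid = make_grid(values, percentiles=(0, 1), grid_resolution=5)
trimmed_grid = make_grid(values, percentiles=(0.05, 0.95), grid_resolution=5)
sample_a = subsample(values, sample_size=10, random_state=0)
sample_b = subsample(values, sample_size=10, random_state=0)  # same seed, same sample
```

With the default percentiles=(0, 1) the grid spans the full observed range; narrowing it to, say, (0.05, 0.95) trims the extremes, where data are sparse and the averaged predictions tend to be less reliable.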

Returns:

PDP result containing:

  • key: "explain_pdp"

  • data: Name of the dataset used

  • model: Name of the model used

  • inputs: Input parameters used for the analysis

  • value: Dictionary containing

    • "Value": X grid values; either a single 1D-array (1D effect) or a list of two 1D-arrays (2D effect)

    • "Effect": PD values corresponding to the grid values; either a 1D-array (1D effect) or a 2D-array (2D effect)

  • table: DataFrame of PDP results

  • options: Dictionary of visualization configurations for a line (1D numerical), bar (1D categorical), or heatmap (2D) effect plot. Run results.plot() to show all plots, or results.plot(name=xxx) to display a single plot. The following names are available:

    • None: Effect plots of all effects specified in features.

    • "<effect_name>": Effect plot of the selected main effect or pairwise interaction.

Return type:

ValidationResult

Notes

For single features, generates a line or bar plot depending on feature type. For two features, generates a heatmap showing the interaction effects.
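
As a rough illustration of what the two-feature case computes, the sketch below evaluates a toy model on every combination of two grids and averages over the data. It is not Modeva's implementation: `partial_dependence_2d` is a hypothetical helper, and the returned dictionary only mimics the documented "Value" and "Effect" layout, with the orientation of the 2D array (rows follow the first feature) being an assumption.

```python
from statistics import mean

def partial_dependence_2d(predict, rows, idx_a, idx_b, grid_a, grid_b):
    """Average prediction over the data for every (a, b) grid combination."""
    effect = []
    for a in grid_a:
        effect_row = []
        for b in grid_b:
            preds = []
            for row in rows:
                modified = list(row)   # copy so the original data is untouched
                modified[idx_a] = a    # force the first feature
                modified[idx_b] = b    # force the second feature
                preds.append(predict(modified))
            effect_row.append(mean(preds))
        effect.append(effect_row)
    # "Value" holds the two 1D grids, "Effect" the 2D array of averages
    return {"Value": [list(grid_a), list(grid_b)], "Effect": effect}

# Toy model (an assumption): pure x0*x1 interaction plus an unrelated x2
def predict(row):
    return row[0] * row[1] + row[2]

rows = [[0.0, 0.0, 1.0], [1.0, 1.0, 3.0]]
result = partial_dependence_2d(
    predict, rows, 0, 1, grid_a=[0.0, 1.0, 2.0], grid_b=[0.0, 10.0]
)
```

Each cell of the "Effect" array corresponds to one cell of the heatmap. In this toy case the surface equals a*b plus the mean of x2, so the interaction shows up as rows whose slope along the second grid changes with the first feature.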

Examples

Global Explainability
