Model Probability Calibration#

Probability calibration adjusts a model’s predicted probabilities so that they match observed outcome frequencies: among samples assigned a probability of 0.8, roughly 80% should belong to the positive class. This matters in classification tasks where raw model outputs do not accurately reflect confidence.
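One common way to quantify the gap between predicted probabilities and observed frequencies is a binned calibration error. The sketch below is a minimal pure-Python illustration, not part of Modeva; the function name `expected_calibration_error` and the equal-width binning are assumptions made for this example.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average, over equal-width bins, of the gap between the
    mean predicted probability and the observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_p = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_p - pos_rate)
    return ece


# Toy data: ten predictions of 0.8, of which eight are actually positive.
well_calibrated = expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2)
# Toy data: ten predictions of 0.9, of which only five are positive.
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

A value near zero suggests the probabilities can be read as frequencies; a larger value suggests calibration is worth trying.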

  • Platt Scaling (method='sigmoid'): Fits a logistic (sigmoid) function that maps the model's raw outputs to calibrated probabilities. It is parametric, with only two parameters to learn.

  • Isotonic Regression (method='isotonic'): A non-parametric method that fits a monotonically non-decreasing step function. It is more flexible than Platt scaling but needs a sufficiently large calibration set to avoid overfitting.
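To make the difference between the two methods concrete, the following is a rough pure-Python sketch of each idea. It is not Modeva's implementation: the function names (`fit_platt`, `fit_isotonic`, `predict_isotonic`), the gradient-descent settings, and the toy data are all assumptions chosen for illustration, and the isotonic fit assumes distinct scores (ties are not pooled).

```python
import math


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = sigmoid(a * s + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def fit_isotonic(scores, labels):
    """Pool Adjacent Violators: fit a non-decreasing step function.

    Returns a list of (upper_score, calibrated_value) steps.
    """
    blocks = []  # each block is [sum_of_labels, count, max_score]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean exceeds the
        # newest block's mean, which would violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def predict_isotonic(steps, score):
    """Return the value of the first step whose upper bound covers score."""
    for upper, value in steps:
        if score <= upper:
            return value
    return steps[-1][1]  # scores above the fitted range use the last step


# Toy calibration set (illustrative values only).
a, b = fit_platt([-2, -1, -0.5, 0.5, 1, 2], [0, 0, 1, 0, 1, 1])
steps = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 1, 0, 1])
```

The Platt fit learns just two numbers, so it is hard to overfit but can only produce a sigmoid-shaped correction. The isotonic fit can take any monotone shape, which is why it needs more calibration samples to be reliable.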

The calibrate_proba method allows users to fit a calibration model on raw probability outputs.

1. Prepare data and model#

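from modeva import DataSet
from modeva.models import MoXGBClassifier

# Load the TaiwanCredit dataset by name and create a random train/test split
ds = DataSet()
ds.load(name="TaiwanCredit")
ds.set_random_split()

# Fit a shallow XGBoost classifier on the training split
model = MoXGBClassifier(name="Raw XGB", max_depth=2)
model.fit(ds.train_x, ds.train_y)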

2. Calibration#

# Fit a Platt-scaling calibrator on the held-out split
model.calibrate_proba(ds.test_x, ds.test_y, method='sigmoid')

3. Get calibrated predicted probabilities#

With calibration=True, the predict_proba method applies the fitted calibration model to its output:

calibrated_probs = model.predict_proba(ds.test_x, calibration=True)
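One way to check whether calibration helped is to compare a proper scoring rule, such as the Brier score, before and after. The helper below is a minimal pure-Python sketch with hand-made toy numbers, not output from the model above. To apply it to `calibrated_probs`, pass the positive-class probabilities; whether `predict_proba` returns one column per class is an assumption to verify against the Modeva API.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome.

    Lower is better; 0.0 means every prediction was exactly right.
    """
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


# Illustrative toy values only (not produced by any real model).
labels = [1, 0, 1, 0]
raw_toy = [0.95, 0.90, 0.85, 0.10]         # overconfident raw outputs
calibrated_toy = [0.70, 0.65, 0.60, 0.20]  # softened after calibration

raw_score = brier_score(raw_toy, labels)
calibrated_score = brier_score(calibrated_toy, labels)
improved = calibrated_score < raw_score
```

For a fair comparison, compute both scores on data that was not used to fit the calibrator.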

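Probability calibration assumes that the data used to fit the calibrator is representative of the data the model will later score. If the distribution shifts significantly, the calibrated probabilities can drift away from observed frequencies, and the model should be recalibrated on more recent data.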
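A simple way to watch for this kind of drift is to compare the mean predicted probability with the observed positive rate on a recent labeled batch. The sketch below is a hypothetical monitoring helper, not a Modeva feature; the function names and the 0.05 tolerance are assumptions that would need tuning for a real application.

```python
def calibration_gap(probs, labels):
    """Mean predicted probability minus observed positive rate."""
    return sum(probs) / len(probs) - sum(labels) / len(labels)


def needs_recalibration(probs, labels, tolerance=0.05):
    """Flag a batch whose average prediction is off by more than tolerance."""
    return abs(calibration_gap(probs, labels)) > tolerance


# Toy batches (illustrative values only): the model predicts 0.3 throughout.
stable_batch = needs_recalibration([0.3] * 10, [1] * 3 + [0] * 7)
shifted_batch = needs_recalibration([0.3] * 10, [1] * 6 + [0] * 4)
```

This check only captures an overall shift in the base rate; a binned comparison would be needed to detect miscalibration that cancels out on average.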

Examples#