modeva.models.MoGLMTreeBoostRegressor

class modeva.models.MoGLMTreeBoostRegressor(name: str = None, n_estimators=100, max_depth=1, learning_rate=1.0, n_epoch_no_change=5, min_samples_leaf=50, min_impurity_decrease=0, split_custom=None, n_screen_grid=1, n_feature_search=5, n_split_grid=20, reg_lambda=0.1, clip_predict=False, simplified=True, verbose=False, val_ratio=0.2, random_state=0)

GLMTree Boosting regressor using residual-based boosting.

A gradient boosting regressor that uses GLMTree base models to iteratively fit residuals. Each tree has both linear and non-linear components to capture complex relationships while maintaining interpretability.
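The residual-fitting idea described above can be illustrated with a small, dependency-free sketch. This is not Modeva's implementation: it handles a single feature, uses depth-1 trees with a linear fit in each leaf, and applies a ridge-style penalty purely for simplicity. All function names (fit_line, fit_stump, boost) and the toy data are assumptions made for illustration.

```python
import random

def fit_line(xs, rs, reg_lambda=0.1):
    """Least-squares line r ~ a + b*x with a ridge-style penalty on the slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(rs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, rs))
    slope = sxr / (sxx + reg_lambda)
    return mean_r - slope * mean_x, slope

def sse(points, line):
    """Sum of squared errors of a line (a, b) over (x, r) pairs."""
    a, b = line
    return sum((r - (a + b * x)) ** 2 for x, r in points)

def fit_stump(xs, rs, n_split_grid=20, min_samples_leaf=5):
    """Depth-1 tree with a linear model per leaf: (threshold, left_line, right_line)."""
    lo, hi = min(xs), max(xs)
    whole = fit_line(xs, rs)
    best = (float("inf"), hi, whole, whole)  # fallback: no split, one line everywhere
    for k in range(1, n_split_grid):
        t = lo + (hi - lo) * k / n_split_grid  # candidate threshold on an even grid
        left = [(x, r) for x, r in zip(xs, rs) if x <= t]
        right = [(x, r) for x, r in zip(xs, rs) if x > t]
        if min(len(left), len(right)) < min_samples_leaf:
            continue
        left_line = fit_line(*zip(*left))
        right_line = fit_line(*zip(*right))
        total = sse(left, left_line) + sse(right, right_line)
        if total < best[0]:
            best = (total, t, left_line, right_line)
    return best[1:]

def predict_stump(stump, x):
    t, left_line, right_line = stump
    a, b = left_line if x <= t else right_line
    return a + b * x

def boost(xs, ys, n_estimators=20, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    intercept = sum(ys) / len(ys)
    preds = [intercept] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return intercept, stumps, preds

# Toy data: a kinked (piecewise-linear) target with a little noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [abs(x) + 0.5 * x + rng.gauss(0.0, 0.05) for x in xs]

intercept, stumps, preds = boost(xs, ys)
mean_y = sum(ys) / len(ys)
mse_baseline = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse_boosted = sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)
print(f"baseline MSE {mse_baseline:.4f} -> boosted MSE {mse_boosted:.4f}")
```

The split threshold supplies the non-linear part and the per-leaf lines supply the linear part, so every round remains a small, readable piecewise-linear function.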

Parameters:
  • name (str, default=None) – Model identifier name for reference.

  • n_estimators (int, default=100) – Number of boosting rounds (trees) to fit.

  • max_depth (int, default=1) – Maximum tree depth. The model is most interpretable when max_depth=1.

  • learning_rate (float, default=1.0) – Shrinkage rate applied to each tree’s contribution.

  • n_epoch_no_change (int, default=5) – Early stopping patience: training stops if the validation loss does not improve for this many consecutive boosting rounds.

  • min_samples_leaf (int, default=50) – Minimum samples required in a leaf node.

  • min_impurity_decrease (float, default=0) – Minimum required decrease in impurity to split a node.

  • split_custom (dict, default=None) – Custom split points specified per feature.

  • n_screen_grid (int, default=1) – Grid size for initial split point screening.

  • n_feature_search (int, default=5) – Number of features to consider after screening.

  • n_split_grid (int, default=20) – Grid size for fine-grained split point search.

  • reg_lambda (float, default=0.1) – L2 regularization strength parameter.

  • clip_predict (bool, default=False) – Whether to clip predictions to training data range.

  • simplified (bool, default=True) – Use simplified partial linear regression for splits.

  • val_ratio (float, default=0.2) – Proportion of data used for validation.

  • verbose (bool, default=False) – Whether to print training progress.

  • random_state (int, default=0) – Random seed for reproducibility.
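A minimal construction sketch, using only parameter names from the signature above. The values are illustrative rather than recommended, the name "glmtree_boost_demo" and the helper build_model are hypothetical, and Modeva is imported lazily so the snippet loads even where the package is not installed.

```python
# Hyperparameters taken from the documented signature; the values are illustrative only.
params = dict(
    name="glmtree_boost_demo",  # hypothetical identifier
    n_estimators=50,            # fewer boosting rounds than the default of 100
    max_depth=1,                # depth 1 is described above as the most interpretable
    learning_rate=0.5,          # shrink each tree's contribution
    n_epoch_no_change=5,        # early-stopping patience
    val_ratio=0.2,              # share of the data held out for validation
    random_state=0,
)

def build_model(**overrides):
    """Construct the regressor, letting keyword overrides replace the defaults above."""
    from modeva.models import MoGLMTreeBoostRegressor  # requires Modeva to be installed
    return MoGLMTreeBoostRegressor(**{**params, **overrides})
```

With Modeva installed, `model = build_model(learning_rate=0.3)` followed by `model.fit(X, y)` and `model.predict(X)` would follow the method signatures documented on this page.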

estimators_

Fitted GLMTree models.

Type:

list

n_features_in_

Number of input features.

Type:

int

calibrate_interval(X, y, alpha=0.1, max_depth: int = 5)

Fit a conformal prediction model to the given data.

This method computes the model’s prediction interval calibrated to the given data.

If the model is a regressor, the data is split in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and the other 50% is used for calibration.

If the model is a binary classifier, it computes the calibration quantile based on the predicted probabilities for the positive class.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) – Feature matrix used for calibration.

  • y (array-like of shape (n_samples, )) – Target values.

  • alpha (float, default=0.1) – Expected miscoverage for the conformal prediction.

  • max_depth (int, default=5) – Maximum depth of the gradient boosting trees for regression tasks. Only used when task_type is REGRESSION.

Raises:

ValueError – If the model is neither a regressor nor a classifier.
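The regression procedure described above resembles conformalized quantile regression applied to residuals. The sketch below shows that general idea using only the standard library. It rests on stated assumptions: constant empirical quantiles stand in for the gradient boosting quantile regressors, and the conformity score is the usual max(q_lo - r, r - q_hi), which this page does not specify. The names empirical_quantile and calibrate are hypothetical.

```python
import math
import random

def empirical_quantile(values, q):
    """Value at rank ceil(q * n) of the sorted sample, with the rank clamped to [1, n]."""
    s = sorted(values)
    k = min(len(s), max(1, math.ceil(q * len(s))))
    return s[k - 1]

def calibrate(residuals, alpha=0.1, seed=0):
    """Return (lower_offset, upper_offset) to add to a point prediction."""
    rng = random.Random(seed)
    idx = list(range(len(residuals)))
    rng.shuffle(idx)
    half = len(idx) // 2
    fit_part = [residuals[i] for i in idx[:half]]   # 50% for the quantile models
    cal_part = [residuals[i] for i in idx[half:]]   # 50% for calibration

    # Stand-in for the two quantile regressors: constant residual quantiles.
    q_lo = empirical_quantile(fit_part, alpha / 2)
    q_hi = empirical_quantile(fit_part, 1 - alpha / 2)

    # Conformity score: how far a residual falls outside [q_lo, q_hi] (negative if inside).
    scores = [max(q_lo - r, r - q_hi) for r in cal_part]
    n = len(scores)
    level = min(1.0, (1 - alpha) * (1 + 1 / n))     # finite-sample correction
    adjustment = empirical_quantile(scores, level)
    return q_lo - adjustment, q_hi + adjustment

# Toy residuals (y - prediction) drawn from a standard normal distribution.
rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(2000)]
lo, hi = calibrate(residuals, alpha=0.1)

# Empirical coverage on fresh residuals; it should land near 1 - alpha.
fresh = [rng.gauss(0.0, 1.0) for _ in range(5000)]
coverage = sum(lo <= r <= hi for r in fresh) / len(fresh)
print(f"offsets: ({lo:.3f}, {hi:.3f}), coverage: {coverage:.3f}")
```

An interval for a new sample would then be (prediction + lo, prediction + hi). Replacing the constant quantiles with feature-dependent quantile models lets the interval width vary across samples.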

fit(X, y, sample_weight=None)

Fit the GLMTree boosting model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input feature matrix.

  • y (array-like of shape (n_samples,)) – Target values.

  • sample_weight (array-like of shape (n_samples,), default=None) – Weight of each sample.

Returns:

self – Fitted estimator instance.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

load(file_name: str)

Load the model into memory from the file system.

Parameters:

file_name (str) – The path and name of the file.

Return type:

estimator object

predict(X)

Return model predictions by calling the child class’s ‘_predict’ method.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.

Returns:

The (calibrated) final prediction

Return type:

np.ndarray

predict_interval(X)

Predict the prediction interval for the given data based on the conformal prediction model.

The conformal model is fitted by splitting the calibration data: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and the other 50% is used for calibration.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.

Returns:

The lower and upper bounds of the prediction intervals for each sample, in the format [n_samples, 2] for regressors or a flattened array for classifiers.

Return type:

np.ndarray

Raises:

ValueError – If the conformal prediction model has not been fitted with calibrate_interval before calling this method.
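A short sketch of the calibrate-then-predict sequence, written against the two method signatures documented on this page. The helper name interval_bounds is hypothetical, and the code assumes that column 0 of the regressor output holds the lower bound and column 1 the upper bound, which the page implies but does not state outright.

```python
def interval_bounds(model, X_cal, y_cal, X_new, alpha=0.1):
    """Calibrate on held-out data, then return (lower, upper) bounds for X_new."""
    # The conformal model must be fitted before intervals can be requested.
    model.calibrate_interval(X_cal, y_cal, alpha=alpha)
    # Documented output format for regressors: [n_samples, 2].
    intervals = model.predict_interval(X_new)
    lower = [row[0] for row in intervals]  # assumed: first column is the lower bound
    upper = [row[1] for row in intervals]  # assumed: second column is the upper bound
    return lower, upper
```

Calibrating on data that was not used in fit keeps the residuals representative of unseen samples, so passing a separate hold-out set as X_cal and y_cal is the usual choice.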

save(file_name: str)

Save the model to the file system.

Parameters:

file_name (str) – The path and name of the file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
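The <component>__<parameter> naming convention can be illustrated with a small stand-alone sketch. It only mimics how such keys are split on the first double underscore and is not the library's own code; split_nested_params and the component name "model" are hypothetical.

```python
def split_nested_params(params):
    """Group 'component__parameter' keys by component; plain keys go under ''."""
    grouped = {}
    for key, value in params.items():
        component, delim, sub_key = key.partition("__")
        if delim:
            # Nested form: route the remainder of the key to the named component.
            grouped.setdefault(component, {})[sub_key] = value
        else:
            # Plain form: the parameter belongs to the estimator itself.
            grouped.setdefault("", {})[key] = value
    return grouped

routed = split_nested_params({
    "learning_rate": 0.5,
    "model__n_estimators": 200,
    "model__max_depth": 2,
})
print(routed)
```

For a standalone estimator, only the plain form applies, for example `set_params(learning_rate=0.5)`.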