modeva.models.MoGLMTreeBoostRegressor#

class modeva.models.MoGLMTreeBoostRegressor(name: str = None, n_estimators=100, max_depth=1, learning_rate=1.0, n_epoch_no_change=5, min_samples_leaf=50, min_impurity_decrease=0, split_custom=None, n_screen_grid=1, n_feature_search=5, n_split_grid=20, reg_lambda=0.1, clip_predict=False, simplified=True, verbose=False, val_ratio=0.2, random_state=0)#

GLMTree Boosting regressor using residual-based boosting.

A gradient boosting regressor that uses GLMTree base models to iteratively fit residuals. Each tree has both linear and non-linear components to capture complex relationships while maintaining interpretability.
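The residual-fitting loop described above can be sketched in plain Python. This is a minimal illustration, not Modeva's implementation: it assumes a single feature, a depth-1 tree with an ordinary least squares line in each leaf, an initial prediction equal to the target mean, and squared-error loss. The helper names (`fit_line`, `fit_linear_stump`, `predict_stump`, `boost`, `predict`) are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x on one feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope

def predict_stump(stump, x):
    """Evaluate a depth-1 tree whose two leaves each hold a line."""
    threshold, (a_left, b_left), (a_right, b_right) = stump
    if x <= threshold:
        return a_left + b_left * x
    return a_right + b_right * x

def fit_linear_stump(xs, ys, min_samples_leaf=2):
    """Pick the threshold whose two leaf lines give the lowest squared error."""
    whole = fit_line(xs, ys)
    best_stump = (float("inf"), whole, whole)  # fallback: no split at all
    best_sse = sum((y - predict_stump(best_stump, x)) ** 2 for x, y in zip(xs, ys))
    for threshold in sorted(set(xs))[:-1]:
        left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
        right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
        if min(len(left), len(right)) < min_samples_leaf:
            continue
        stump = (threshold, fit_line(*zip(*left)), fit_line(*zip(*right)))
        sse = sum((y - predict_stump(stump, x)) ** 2 for x, y in zip(xs, ys))
        if sse < best_sse:
            best_stump, best_sse = stump, sse
    return best_stump

def boost(xs, ys, n_estimators=3, learning_rate=1.0):
    """Fit each new stump to the residuals left by the current ensemble."""
    base = mean(ys)  # assumption: start from the target mean
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_linear_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

# Toy piecewise-linear target: rises up to x = 4, then falls.
xs = [float(i) for i in range(10)]
ys = [x if x <= 4 else 20 - 2 * x for x in xs]
model = boost(xs, ys)
```

Because each leaf carries its own line, a single split can represent a kink in the target that a constant-leaf stump would need many rounds to approximate.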

Parameters:

name (str, default=None) – Model identifier name for reference.
n_estimators (int, default=100) – Number of boosting rounds (trees) to fit.
max_depth (int, default=1) – Maximum tree depth. The model is most interpretable when max_depth=1.
learning_rate (float, default=1.0) – Shrinkage rate applied to each tree’s contribution.
n_epoch_no_change (int, default=5) – Early stopping patience: training stops if the validation loss does not improve for this many consecutive rounds.
min_samples_leaf (int, default=50) – Minimum samples required in a leaf node.
min_impurity_decrease (float, default=0) – Minimum required decrease in impurity to split a node.
split_custom (dict, default=None) – Custom split points specified per feature.
n_screen_grid (int, default=1) – Grid size for initial split point screening.
n_feature_search (int, default=5) – Number of features to consider after screening.
n_split_grid (int, default=20) – Grid size for fine-grained split point search.
reg_lambda (float, default=0.1) – L1 regularization strength parameter.
clip_predict (bool, default=False) – Whether to clip predictions to training data range.
simplified (bool, default=True) – Use simplified partial linear regression for splits.
val_ratio (float, default=0.2) – Proportion of data used for validation.
verbose (bool, default=False) – Whether to print training progress.
random_state (int, default=0) – Random seed for reproducibility.
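The early stopping behaviour controlled by n_epoch_no_change can be pictured with the sketch below. It is an assumption about the general patience mechanism, not Modeva's actual code: the function name `early_stopping_round` is hypothetical, and details such as tolerance thresholds or restoring the best iteration are not covered.

```python
def early_stopping_round(val_losses, n_epoch_no_change=5):
    """Return the index of the round at which training would stop.

    val_losses holds one validation loss per boosting round. Training
    stops once the loss has failed to improve on the best value seen so
    far for n_epoch_no_change consecutive rounds.
    """
    best = float("inf")
    stale_rounds = 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best, stale_rounds = loss, 0
        else:
            stale_rounds += 1
            if stale_rounds >= n_epoch_no_change:
                return i
    # Patience never ran out: all rounds were used.
    return len(val_losses) - 1

# Hypothetical validation losses measured on the held-out val_ratio split.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.69]
stop_with_patience_3 = early_stopping_round(losses, n_epoch_no_change=3)
stop_with_patience_5 = early_stopping_round(losses, n_epoch_no_change=5)
```

A larger patience tolerates longer plateaus in the validation loss, at the cost of fitting more trees before stopping.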

estimators_#

Fitted GLMTree models.

Type: list

n_features_in_#

Number of input features.

Type: int

calibrate_interval(X, y, alpha=0.1, max_depth: int = 5)#

Fit a conformal prediction model to the given data.

This method computes the model’s prediction interval calibrated to the given data.

If the model is a regressor, the data is split in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) quantile regressions, based on gradient boosting trees, to the model’s residuals, and the remaining 50% is used for calibration.

If the model is a binary classifier, the calibration quantile is computed from the predicted probabilities for the positive class.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Feature matrix used for calibration.
y (array-like of shape (n_samples, )) – Target values.
alpha (float, default=0.1) – Expected miscoverage for the conformal prediction.
max_depth (int, default=5) – Maximum depth of the gradient boosting trees for regression tasks. Only used when task_type is REGRESSION.

Raises:

ValueError – If the model is neither a regressor nor a classifier.
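The split-and-calibrate procedure for regressors can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: it works directly on a list of residuals, assumes symmetric quantile levels alpha / 2 and 1 - alpha / 2, and replaces the gradient boosting quantile regressions with constant empirical quantiles so that it stays dependency-free. The names `empirical_quantile` and `calibrate_interval_sketch` are hypothetical.

```python
import math
import random

def empirical_quantile(values, q):
    """Return the ceil(q * n)-th smallest value, clamped to a valid rank."""
    ordered = sorted(values)
    rank = min(len(ordered), max(1, math.ceil(q * len(ordered))))
    return ordered[rank - 1]

def calibrate_interval_sketch(residuals, alpha=0.1, seed=0):
    """Return (lower_offset, upper_offset) to add to point predictions."""
    shuffled = residuals[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    fit_part, cal_part = shuffled[:half], shuffled[half:]

    # Step 1 (first 50%): fit the lower and upper residual quantiles.
    # Simplification: constant quantiles instead of quantile regression.
    lo = empirical_quantile(fit_part, alpha / 2)
    hi = empirical_quantile(fit_part, 1 - alpha / 2)

    # Step 2 (other 50%): conformity score = distance outside [lo, hi]
    # (negative when the residual lies inside the band).
    scores = [max(lo - r, r - hi) for r in cal_part]
    n = len(scores)
    level = min(1.0, (n + 1) * (1 - alpha) / n)  # finite-sample correction
    q_hat = empirical_quantile(scores, level)

    # Widen (or shrink) the band by the calibrated score quantile.
    return lo - q_hat, hi + q_hat

# Hypothetical residuals (y - prediction) from an already fitted regressor.
rng = random.Random(42)
residuals = [rng.gauss(0.0, 1.0) for _ in range(2000)]
lower_offset, upper_offset = calibrate_interval_sketch(residuals, alpha=0.1)
```

With alpha=0.1 the resulting band is intended to cover roughly 90% of new residuals, provided they are exchangeable with the calibration data.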

fit(X, y, sample_weight=None)#

Fit the GLMTree Boosting model.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input feature matrix.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Weight of each sample.

Returns:

self : Estimator instance.

Return type:

object

get_params(deep=True)#

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

load(file_name: str)#

Load the model into memory from file system.

Parameters: file_name (str) – The path and name of the file.
Return type: estimator object

predict(X)#

Model predictions, calling the child class’s ‘_predict’ method.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
Returns: The (calibrated) final prediction.
Return type: np.ndarray

predict_interval(X)#

Predict the prediction interval for the given data based on the conformal prediction model.

The conformal prediction model is built by splitting the calibration data in half: 50% for fitting lower (alpha / 2) and upper (1 - alpha / 2) quantile regressions, based on gradient boosting trees, to the model’s residuals, and 50% for calibration.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
Returns: The lower and upper bounds of the prediction intervals for each sample, in the format [n_samples, 2] for regressors or a flattened array for classifiers.
Return type: np.ndarray
Raises: ValueError – If fit_conformal has not been called to fit the conformal prediction model before calling this method.
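The regressor output format and the error contract can be illustrated with a small sketch. It assumes the calibrated interval is stored as a pair of offsets around the point prediction, which is a simplification for illustration only; `predict_interval_sketch` and its arguments are hypothetical and not part of the library.

```python
def predict_interval_sketch(predictions, offsets):
    """Return one [lower, upper] pair per sample, i.e. shape [n_samples, 2].

    offsets is a (lower_offset, upper_offset) pair from a prior calibration
    step, or None when no calibration has been performed.
    """
    if offsets is None:
        # Mirrors the documented contract: calibrate before predicting intervals.
        raise ValueError("The conformal prediction model has not been fitted.")
    lower_offset, upper_offset = offsets
    return [[p + lower_offset, p + upper_offset] for p in predictions]

# Hypothetical point predictions and calibrated offsets.
intervals = predict_interval_sketch([1.0, 2.0], offsets=(-0.5, 0.75))
```

Each row of the result pairs the lower bound with the upper bound for the corresponding sample.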

save(file_name: str)#

Save the model into file system.

Parameters: file_name (str) – The path and name of the file.

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance
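The <component>__<parameter> naming convention can be illustrated with a short, dependency-free sketch of how such keys are separated into an estimator's own parameters and those routed to nested components. The function `split_nested_params` and the component name `model` are hypothetical and serve only to show the key format.

```python
def split_nested_params(params):
    """Separate plain keys from '<component>__<parameter>' keys."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        name, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested

own, nested = split_nested_params({
    "learning_rate": 0.5,      # parameter of the outer estimator
    "model__max_depth": 2,     # routed to the component named "model"
    "model__reg_lambda": 0.1,
})
```

Keys without a double underscore apply to the estimator itself, while prefixed keys are forwarded to the named component with the prefix removed.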