modeva.models.MoGLMTreeBoostRegressor#
- class modeva.models.MoGLMTreeBoostRegressor(name: str = None, n_estimators=100, max_depth=1, learning_rate=1.0, n_epoch_no_change=5, min_samples_leaf=50, min_impurity_decrease=0, split_custom=None, n_screen_grid=1, n_feature_search=5, n_split_grid=20, reg_lambda=0.1, clip_predict=False, simplified=True, verbose=False, val_ratio=0.2, random_state=0)#
GLMTree Boosting regressor using residual-based boosting.
A gradient boosting regressor that uses GLMTree base models to iteratively fit residuals. Each tree has both linear and non-linear components to capture complex relationships while maintaining interpretability.
- Parameters:
name (str, default=None) – Model identifier name for reference.
n_estimators (int, default=100) – Number of boosting rounds (trees) to fit.
max_depth (int, default=1) – Maximum tree depth. Model is most interpretable when depth=1.
learning_rate (float, default=1.0) – Shrinkage rate applied to each tree’s contribution.
n_epoch_no_change (int, default=5) – Early stopping patience: training stops if the validation loss does not improve for this many consecutive rounds.
min_samples_leaf (int, default=50) – Minimum samples required in a leaf node.
min_impurity_decrease (float, default=0) – Minimum required decrease in impurity to split a node.
split_custom (dict, default=None) – Custom split points specified per feature.
n_screen_grid (int, default=1) – Grid size for initial split point screening.
n_feature_search (int, default=5) – Number of features to consider after screening.
n_split_grid (int, default=20) – Grid size for fine-grained split point search.
reg_lambda (float, default=0.1) – L1 regularization strength parameter.
clip_predict (bool, default=False) – Whether to clip predictions to training data range.
simplified (bool, default=True) – Use simplified partial linear regression for splits.
val_ratio (float, default=0.2) – Proportion of data used for validation.
verbose (bool, default=False) – Whether to print training progress.
random_state (int, default=0) – Random seed for reproducibility.
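The interaction between val_ratio and n_epoch_no_change can be pictured as patience-based early stopping. The sketch below is a minimal illustration in plain Python, not the library's implementation: the function name rounds_until_stop and the list of validation losses are hypothetical, and the exact improvement criterion used internally is an assumption.

```python
def rounds_until_stop(val_losses, patience=5):
    """Return how many boosting rounds run before patience-based early stopping.

    val_losses: validation loss recorded after each round (hypothetical input).
    patience:   plays the role of n_epoch_no_change (assumed semantics).
    """
    best = float("inf")
    rounds_without_gain = 0
    for round_idx, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Strict improvement resets the patience counter.
            best = loss
            rounds_without_gain = 0
        else:
            rounds_without_gain += 1
            if rounds_without_gain >= patience:
                return round_idx
    # Patience never ran out: all rounds were used.
    return len(val_losses)


losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6]
stopped_at = rounds_until_stop(losses, patience=5)
```

With a larger patience the loop tolerates longer plateaus, at the cost of fitting more trees before giving up.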
- estimators_#
Fitted GLMTree models.
- Type:
list
- n_features_in_#
Number of input features.
- Type:
int
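As a rough picture of residual-based boosting with tree base models that carry linear leaves, the following sketch fits depth-1 trees on a single feature, each with an ordinary least squares line in its two leaves. It is an illustration under simplifying assumptions (one feature, exhaustive split search, no regularization, no early stopping), and every function name in it is hypothetical rather than part of modeva.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x on one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = 0.0 if sxx == 0 else sxy / sxx
    return my - b * mx, b


def sse(xs, ys, line):
    a, b = line
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def fit_linear_stump(xs, ys, min_samples_leaf=2):
    """Depth-1 tree with a linear model per leaf: (threshold, left_line, right_line)."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [(x, y) for x, y in zip(xs, ys) if x <= t]
        right = [(x, y) for x, y in zip(xs, ys) if x > t]
        if min(len(left), len(right)) < min_samples_leaf:
            continue
        lx, ly = zip(*left)
        rx, ry = zip(*right)
        l_line, r_line = fit_line(lx, ly), fit_line(rx, ry)
        loss = sse(lx, ly, l_line) + sse(rx, ry, r_line)
        if best is None or loss < best[0]:
            best = (loss, t, l_line, r_line)
    if best is None:
        # No admissible split: use one line for every sample.
        line = fit_line(xs, ys)
        return float("inf"), line, line
    return best[1:]


def predict_stump(stump, x):
    t, left, right = stump
    a, b = left if x <= t else right
    return a + b * x


def boost(xs, ys, n_estimators=3, learning_rate=1.0):
    """Each round fits a linear stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_linear_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

With max_depth=1 each fitted tree reduces to one threshold and two lines, which is why shallow trees are the easiest configuration to inspect.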
- calibrate_interval(X, y, alpha=0.1, max_depth: int = 5)#
Fit a conformal prediction model to the given data.
This method computes the model’s prediction interval calibrated to the given data.
If the model is a regressor, the data is split in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and 50% is used for calibration.
If the model is a binary classifier, the method computes the calibration quantile based on predicted probabilities for the positive class.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix used for calibration.
y (array-like of shape (n_samples, )) – Target values.
alpha (float, default=0.1) – Expected miscoverage for the conformal prediction.
max_depth (int, default=5) – Maximum depth of the gradient boosting trees for regression tasks. Only used when task_type is REGRESSION.
- Raises:
ValueError – If the model is neither a regressor nor a classifier.
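The regressor branch described above resembles split conformal calibration on residuals. The sketch below illustrates that idea in plain Python under loud assumptions: constant empirical quantiles stand in for the gradient boosting quantile regressors, the symmetric levels alpha / 2 and 1 - alpha / 2 are used, and the function names (empirical_quantile, calibrate_bounds, interval_for) are hypothetical, not the library's API.

```python
import math


def empirical_quantile(values, q):
    """Order-statistic quantile; a stand-in for a fitted quantile regressor."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[idx]


def calibrate_bounds(residuals, alpha=0.1):
    """Use half of the residuals to fit bounds and the other half to calibrate them."""
    half = len(residuals) // 2
    fit_part, cal_part = residuals[:half], residuals[half:]

    # Step 1: lower and upper residual quantiles from the fitting half.
    lower = empirical_quantile(fit_part, alpha / 2)
    upper = empirical_quantile(fit_part, 1 - alpha / 2)

    # Step 2: conformity score, positive when a residual falls outside the band.
    scores = sorted(max(lower - r, r - upper) for r in cal_part)

    # Step 3: finite-sample corrected rank of the (1 - alpha) score quantile.
    n = len(scores)
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    margin = scores[rank - 1]

    # Step 4: widen (or shrink) the band by the calibrated margin.
    return lower - margin, upper + margin


def interval_for(point_prediction, bounds):
    """Shift the residual band around a point prediction."""
    lo, hi = bounds
    return point_prediction + lo, point_prediction + hi
```

By construction, at least a (1 - alpha) fraction of the calibration residuals fall inside the returned band.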
- fit(X, y, sample_weight=None)#
Fit the GLMTree Boosting model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input feature matrix.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Weight of each sample.
- Returns:
self : Estimator instance.
- Return type:
object
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- load(file_name: str)#
Load the model into memory from file system.
- Parameters:
file_name (str) – The path and name of the file.
- Return type:
estimator object
- predict(X)#
Return model predictions by calling the child class’s ‘_predict’ method.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
- Returns:
The (calibrated) final prediction.
- Return type:
np.ndarray
- predict_interval(X)#
Predict the prediction interval for the given data based on the conformal prediction model.
The conformal prediction model splits its calibration data in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and 50% is used for calibration.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
- Returns:
The lower and upper bounds of the prediction intervals for each sample, in the format [n_samples, 2] for regressors or a flattened array for classifiers.
- Return type:
np.ndarray
- Raises:
ValueError – If fit_conformal has not been called to fit the conformal prediction model before calling this method.
- save(file_name: str)#
Save the model into file system.
- Parameters:
file_name (str) – The path and name of the file.
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
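To make the <component>__<parameter> naming concrete, the sketch below groups such keys by component in plain Python. It only illustrates the naming convention and is not how set_params is implemented; the helper name split_nested_params and the component name "model" are hypothetical.

```python
def split_nested_params(params):
    """Group `<component>__<parameter>` keys by component.

    Keys without a double underscore belong to the estimator itself and are
    collected under the empty-string component.
    """
    grouped = {}
    for key, value in params.items():
        # Split at the first double underscore only, so deeper nesting
        # such as a__b__c stays attached to component "a" as "b__c".
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


grouped = split_nested_params({
    "learning_rate": 0.5,        # parameter of the estimator itself
    "model__max_depth": 2,       # parameter of a hypothetical "model" step
    "model__reg_lambda": 0.1,
})
```

Here "learning_rate" would apply to the outer object, while the two "model__" entries would be forwarded to the nested component.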