modeva.models.MoGLMTreeRegressor#

class modeva.models.MoGLMTreeRegressor(name: str = None, max_depth=3, min_samples_leaf=50, min_impurity_decrease=0, split_custom=None, n_screen_grid=1, n_feature_search=10, n_split_grid=20, clip_predict=False, reg_lambda=0.1, simplified=True, random_state=0)#

A tree-based model that fits linear regression models in the leaves.

This model recursively partitions the feature space and fits linear regression models in each leaf node. It combines the interpretability of decision trees with the flexibility of linear regression.
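As a rough illustration of the idea (not Modeva's implementation), the pure-Python sketch below fits a depth-1 tree on a single feature: it tries each candidate threshold, fits an ordinary least-squares line on each side, and keeps the split with the lowest total squared error. The helper names fit_line, fit_stump and predict_stump are hypothetical, and the toy data are invented for the example.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_stump(xs, ys, min_samples_leaf=2):
    """Search candidate thresholds; return (threshold, left_model, right_model)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # the largest value would leave the right leaf empty
        left = [(x, y) for x, y in zip(xs, ys) if x <= t]
        right = [(x, y) for x, y in zip(xs, ys) if x > t]
        if len(left) < min_samples_leaf or len(right) < min_samples_leaf:
            continue
        la, lb, lsse = fit_line(*zip(*left))
        ra, rb, rsse = fit_line(*zip(*right))
        if best is None or lsse + rsse < best[0]:
            best = (lsse + rsse, t, (la, lb), (ra, rb))
    if best is None:
        raise ValueError("no valid split for the given min_samples_leaf")
    return best[1:]


def predict_stump(stump, x):
    """Route x to its leaf and apply that leaf's linear model."""
    t, (la, lb), (ra, rb) = stump
    return la + lb * x if x <= t else ra + rb * x


# Toy data with a kink: y = |x|, which no single line can fit.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [abs(x) for x in xs]
stump = fit_stump(xs, ys)
```

Calling predict_stump(stump, 2.5) then evaluates the right-hand leaf's line. A full tree repeats the same search recursively inside each leaf until max_depth or min_samples_leaf stops it.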

Parameters:

name (str, default=None) – The name of the model.
max_depth (int, default=3) – The maximum depth of the tree.
min_impurity_decrease (float, default=0) – Minimum impurity decrease when splitting a node.
min_samples_leaf (int, default=50) – Minimum number of samples for constructing a leaf node.
split_custom (dict, default=None) – The custom split points for each feature.
n_screen_grid (int, default=1) – The number of candidate split points for rough screening.
n_feature_search (int, default=10) – The number of candidate features selected by rough screening.
n_split_grid (int, default=20) – The number of candidate split points for fine-grained search.
reg_lambda (float, default=0.1) – The L1 regularization strength.
clip_predict (bool, default=False) – Whether to clip predictions that fall outside the range of the predictions on the training data.
simplified (bool, default=True) – Whether to use partial linear regression when searching for the split feature and split points.
random_state (int, default=0) – Determines random number generation for weights and bias initialization.
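The clip_predict option can be pictured as follows. This is a minimal sketch, assuming that clipping means bounding new predictions by the smallest and largest predictions observed on the training data; the function name clip_to_training_range is hypothetical and the library's exact rule may differ.

```python
def clip_to_training_range(train_preds, new_preds):
    """Bound each new prediction by the min and max of the training predictions."""
    lo, hi = min(train_preds), max(train_preds)
    return [min(max(p, lo), hi) for p in new_preds]


train_preds = [1.2, 3.4, 2.8, 0.9]
new_preds = [-5.0, 2.0, 10.0]  # linear leaf models can extrapolate far outside the data
clipped = clip_to_training_range(train_preds, new_preds)
```

Clipping of this kind guards against the extrapolation that linear leaf models can produce for inputs far from the training data, at the cost of flattening predictions at the boundaries.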

tree_#

The internal tree structure.

Type: object

calibrate_interval(X, y, alpha=0.1, max_depth: int = 5)#

Fit a conformal prediction model to the given data.

This method computes the model’s prediction interval calibrated to the given data.

If the model is a regressor, the data is split in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and the other 50% is used for calibration.

If the model is a binary classifier, it computes the calibration quantile based on the predicted probabilities for the positive class.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Feature matrix of the calibration data.
y (array-like of shape (n_samples, )) – Target values.
alpha (float, default=0.1) – Expected miscoverage for the conformal prediction.
max_depth (int, default=5) – Maximum depth of the gradient boosting trees for regression tasks. Only used when task_type is REGRESSION.

Raises:

ValueError – If the model is neither a regressor nor a classifier.
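The regression branch described above follows the general pattern of conformalized quantile regression on residuals. Below is a minimal pure-Python sketch under loud assumptions: constant empirical quantiles stand in for the gradient boosting quantile models, the split is a plain half/half slice, the residuals are simulated, and every function name is hypothetical. It is not the library's code.

```python
import math
import random


def empirical_quantile(values, q):
    """Value at rank ceil(q * n) of the sorted data, with the rank clamped to [1, n]."""
    ordered = sorted(values)
    k = min(len(ordered), max(1, math.ceil(q * len(ordered))))
    return ordered[k - 1]


def calibrate(residuals, alpha=0.1):
    """Return (q_lo, q_hi, margin) from residuals y - model_prediction."""
    half = len(residuals) // 2
    fit_part, cal_part = residuals[:half], residuals[half:]
    # Assumption: constant quantiles replace the lower/upper quantile regressors.
    q_lo = empirical_quantile(fit_part, alpha / 2)
    q_hi = empirical_quantile(fit_part, 1 - alpha / 2)
    # Conformity score: signed distance by which a residual falls outside [q_lo, q_hi].
    scores = [max(q_lo - r, r - q_hi) for r in cal_part]
    # Finite-sample corrected quantile level used in split conformal prediction.
    level = min(1.0, (1 - alpha) * (1 + 1 / len(scores)))
    margin = empirical_quantile(scores, level)
    return q_lo, q_hi, margin


def interval(pred, q_lo, q_hi, margin):
    """Widen (or shrink) the quantile band by the calibrated margin."""
    return pred + q_lo - margin, pred + q_hi + margin


rng = random.Random(0)
residuals = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # simulated residuals
q_lo, q_hi, margin = calibrate(residuals, alpha=0.1)
lower, upper = interval(5.0, q_lo, q_hi, margin)
```

The margin comes from the held-out half, so the resulting band targets coverage of roughly 1 - alpha on new data even when the quantile estimates themselves are imperfect.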

get_params(deep=True)#

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

load(file_name: str)#

Load the model into memory from the file system.

Parameters: file_name (str) – The path and name of the file.
Return type: estimator object

predict(X)#

Return the model predictions by calling the child class’s ‘_predict’ method.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
Returns: The (calibrated) final prediction.
Return type: np.ndarray

predict_interval(X)#

Predict the prediction interval for the given data based on the conformal prediction model.

The underlying conformal model is calibrated by splitting the data in half: 50% is used to fit lower (alpha / 2) and upper (1 - alpha / 2) gradient boosting trees-based quantile regressions to the model’s residuals, and the other 50% is used for calibration.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
Returns: The lower and upper bounds of the prediction intervals for each sample, in the format [n_samples, 2] for regressors or a flattened array for classifiers.
Return type: np.ndarray
Raises: ValueError – If fit_conformal has not been called to fit the conformal prediction model before calling this method.
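Once intervals are available in the [n_samples, 2] layout described above, a common sanity check is the empirical coverage on held-out data, which should land near 1 - alpha. A minimal sketch with made-up numbers follows; empirical_coverage and mean_width are hypothetical helpers, not library functions.

```python
def empirical_coverage(intervals, y_true):
    """Fraction of targets that fall inside their [lower, upper] interval."""
    hits = sum(1 for (lo, hi), y in zip(intervals, y_true) if lo <= y <= hi)
    return hits / len(y_true)


def mean_width(intervals):
    """Average interval width; narrower is better at equal coverage."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)


# Illustrative values only: one [lower, upper] row per sample.
intervals = [[0.5, 1.5], [1.0, 3.0], [2.0, 2.5], [4.0, 6.0]]
y_true = [1.0, 3.5, 2.2, 5.0]
coverage = empirical_coverage(intervals, y_true)
width = mean_width(intervals)
```

Coverage well below 1 - alpha on fresh data suggests the calibration set is not representative of the data being predicted.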

save(file_name: str)#

Save the model to the file system.

Parameters: file_name (str) – The path and name of the file.

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance
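The <component>__<parameter> naming scheme can be illustrated without any library. The sketch below uses a hypothetical helper, route_params, that groups keyword arguments by the text before the first double underscore. It only mimics the naming convention and is not the estimator's actual code; the component name scaler is invented for the example.

```python
def route_params(**params):
    """Split flat keyword arguments into own and per-component parameters."""
    own, nested = {}, {}
    for key, value in params.items():
        name, sep, sub_key = key.partition("__")
        if sep:  # "component__parameter" targets a nested object
            nested.setdefault(name, {})[sub_key] = value
        else:    # a plain name targets the estimator itself
            own[key] = value
    return own, nested


own, nested = route_params(max_depth=4, reg_lambda=0.5, scaler__with_mean=False)
```

Here max_depth and reg_lambda would apply to the estimator itself, while with_mean would be forwarded to the nested component named scaler.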