modeva.models.MoMoEClassifier
- class modeva.models.MoMoEClassifier(name: str = None, n_clusters: int = 10, centroids: ndarray = None, calibration: bool = True, cv=3, feature_names: list = None, cluster_features: list = None, expert: str = 'xgboost', *args, **kwargs)
A Mixture of Experts (MoE) classifier that combines multiple expert models for classification tasks.
This classifier creates a weighted ensemble of expert models, where each expert specializes in different regions of the input space. The final prediction is computed by combining expert predictions weighted by the gating network’s outputs.
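The weighting described above can be illustrated for a single sample with a minimal sketch. It assumes the gating outputs are non-negative and are normalized to sum to one before blending; the names `combine_experts`, `gate_weights`, and `expert_probs` are hypothetical and are not part of the Modeva API, whose internal computation may differ.

```python
def combine_experts(gate_weights, expert_probs):
    """Blend per-expert positive-class probabilities for one sample.

    gate_weights: non-negative gating outputs, one per expert (assumed).
    expert_probs: each expert's predicted probability of the positive class.
    """
    if len(gate_weights) != len(expert_probs):
        raise ValueError("one gating weight is needed per expert")
    total = sum(gate_weights)
    if total <= 0:
        raise ValueError("gating weights must sum to a positive value")
    # Normalize so the weights form a convex combination of expert outputs.
    return sum(w / total * p for w, p in zip(gate_weights, expert_probs))


# The sample sits mostly in expert 0's region, so that expert dominates.
blended = combine_experts([0.8, 0.2], [0.9, 0.1])
```

Because the weights are normalized, the blended value always lies between the smallest and largest expert probability.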
- Parameters:
name (str, default=None) – Identifier for the model instance.
n_clusters (int, default=10) – Number of expert models (clusters) to create.
centroids (np.ndarray, default=None) – Pre-defined cluster centers of shape (n_clusters, n_features). If provided, skips the clustering step and uses these centers directly.
calibration (bool, default=True) – Whether to calibrate the gating network’s probability estimates using cross-validation.
cv (int, cross-validation generator or iterable, default=3) – Cross-validation strategy for probability calibration. Can be:
int: number of folds for K-Fold cross-validation.
cross-validation generator: custom splitting strategy.
iterable: yields (train, test) splits as indices.
feature_names (list or None, default=None) – The list of feature names. If None, will use “X0”, “X1”, “X2”, etc.
cluster_features (list or None, default=None) – The list of feature names used for clustering. If None, will use all features for clustering.
expert ({"xgboost", "lightgbm", "catboost"}, default="xgboost") – The backend estimator used for each cluster.
*args – Variable length argument list passed to the underlying expert model (XGBoost by default).
**kwargs – Arbitrary keyword arguments passed to the underlying expert model (XGBoost by default).
- calibrate_interval(X, y, alpha=0.1)
Fit a conformal prediction model to the given data.
This method computes the model’s prediction interval calibrated to the given data.
It computes the calibration quantile based on predicted probabilities for the positive class.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
y (array-like of shape (n_samples, )) – Target values.
alpha (float, default=0.1) – Expected miscoverage for the conformal prediction.
- Raises:
ValueError – If the model is neither a regressor nor a classifier.
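As an illustration of what a calibration quantile is, the following sketch applies the standard split-conformal recipe to a binary classifier. It is an assumption about the general technique, not a description of Modeva's internals: the nonconformity score is taken as one minus the probability assigned to the true class, and `conformal_quantile` is a hypothetical helper name.

```python
import math


def conformal_quantile(pos_probs, y_true, alpha=0.1):
    """Split-conformal threshold from a held-out calibration set (binary case)."""
    # Nonconformity score: one minus the probability given to the true class.
    scores = sorted(
        1.0 - (p if y == 1 else 1.0 - p) for p, y in zip(pos_probs, y_true)
    )
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)), capped at n.
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return scores[rank - 1]


# Positive-class probabilities and labels for nine calibration samples.
pos_probs = [0.9, 0.8, 0.3, 0.2, 0.6, 0.4, 0.95, 0.1, 0.7]
y_true = [1, 1, 0, 0, 1, 0, 1, 0, 0]
q_hat = conformal_quantile(pos_probs, y_true, alpha=0.25)
```

A smaller `alpha` pushes the rank toward the largest score, which raises the threshold and makes later prediction sets larger.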
- calibrate_proba(X, y, sample_weight=None, method='sigmoid')
Fit the calibration method on the model’s predictions.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
y (np.ndarray of shape (n_samples, )) – Ground truth labels.
sample_weight (array-like, shape (n_samples,), default=None) – Sample weights.
method ({'sigmoid', 'isotonic'}, default='sigmoid') – The calibration method.
'sigmoid': Platt’s method, i.e., fit a logistic regression on the predicted probabilities and y.
'isotonic': Fit an isotonic regression on the predicted probabilities and y.
- Returns:
self – The calibrated estimator.
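To make the 'sigmoid' option concrete, here is a minimal sketch of Platt-style calibration: a one-dimensional logistic regression fitted on raw predicted probabilities by plain gradient descent. The helper names `fit_platt` and `apply_platt`, the learning rate, and the iteration count are all illustrative assumptions; the library's own fitting procedure is not documented here and may differ.

```python
import math


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            # Derivative of the log loss with respect to the linear term is p - y.
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def apply_platt(score, a, b):
    """Map a raw score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Raw (uncalibrated) probabilities and ground-truth labels; not separable.
raw = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = fit_platt(raw, labels)
low, high = apply_platt(0.1, a, b), apply_platt(0.9, a, b)
```

The sigmoid map is monotone, so calibration of this kind changes the probability values but not the ranking of samples.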
- decision_function(X, calibration: bool = True)
Computes the decision function for the given input data.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
calibration (bool, default=True) – If True, use the calibrated probabilities when calibration has been performed; otherwise, use the raw probabilities.
- Returns:
logit_prediction – Array of (calibrated) logit predictions.
- Return type:
array, shape (n_samples,) or (n_samples, n_classes)
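The term "logit prediction" can be read as the log-odds of a probability. The sketch below shows that transform for the binary case, under the assumption that the decision function is the log-odds of the positive-class probability; the source does not spell out the exact relationship, and the `logit` helper and its clipping constant are illustrative.

```python
import math


def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped so the logarithm stays finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


# Probabilities below 0.5 map to negative scores, above 0.5 to positive ones.
scores = [logit(p) for p in (0.1, 0.5, 0.9)]
```

Under this reading, thresholding the decision score at zero is equivalent to thresholding the probability at 0.5.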
- fit(X, y, sample_weight=None)
Fits the estimator to the provided data.
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data.
y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – Target values.
sample_weight (array-like, shape (n_samples,), default=None) – Sample weights.
- Returns:
self – Fitted model instance.
- Return type:
object
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
mapping of string to any
- load(file_name: str)
Load the model into memory from the file system.
- Parameters:
file_name (str) – The path and name of the file.
- Return type:
estimator object
- predict(X, calibration: bool = True)
Return model predictions by calling the child class’s ‘_predict’ method.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
calibration (bool, default=True) – If True, use the calibrated probabilities when calibration has been performed; otherwise, use the raw probabilities.
- Returns:
The (calibrated) final prediction.
- Return type:
np.ndarray
- predict_interval(X)
Predict the prediction set for the given data based on the conformal prediction model.
This method computes the model prediction interval (regression) or prediction sets (classification) using conformal prediction.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
- Returns:
The lower and upper bounds of the prediction intervals for each sample, in the format [n_samples, 2] for regressors or a flattened array for classifiers.
- Return type:
np.ndarray
- Raises:
ValueError – If fit_conformal has not been called to fit the conformal prediction model before calling this method.
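For classification, a conformal prediction set is the collection of labels that pass a calibrated threshold. The sketch below shows the usual construction for a binary problem, assuming a threshold `q_hat` obtained beforehand from calibration data and a nonconformity score of one minus the class probability. `prediction_set` is a hypothetical helper; the array layout that Modeva actually returns is not reproduced here.

```python
def prediction_set(pos_prob, q_hat):
    """Return every label whose nonconformity score is within the threshold."""
    class_probs = {0: 1.0 - pos_prob, 1: pos_prob}
    # A label is kept when one minus its probability does not exceed q_hat.
    return [label for label, p in class_probs.items() if 1.0 - p <= q_hat]


confident = prediction_set(0.95, q_hat=0.4)  # only label 1 qualifies
ambiguous = prediction_set(0.55, q_hat=0.6)  # both labels qualify
```

A set with both labels signals that the model cannot separate the classes for that sample at the requested coverage level.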
- predict_proba(X, calibration: bool = True)
Predict (calibrated) probabilities for X.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Feature matrix for prediction.
calibration (bool, default=True) – If True, return the calibrated probabilities when calibration has been performed; otherwise, return the raw probabilities.
- Returns:
The (calibrated) predicted probabilities.
- Return type:
np.ndarray
- save(file_name: str)
Save the model to the file system.
- Parameters:
file_name (str) – The path and name of the file.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params (dict) – Estimator parameters.
- Returns:
self
- Return type:
estimator instance
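The <component>__<parameter> convention can be illustrated with a small stand-alone sketch. The classes `Expert` and `Wrapper` and the function `set_nested_params` are hypothetical stand-ins written for this example; they only mimic the naming convention and are not the estimator's actual implementation.

```python
class Expert:
    """Hypothetical inner component with one tunable parameter."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth


class Wrapper:
    """Hypothetical outer object holding a nested component."""

    def __init__(self, n_clusters=10, expert=None):
        self.n_clusters = n_clusters
        self.expert = expert if expert is not None else Expert()


def set_nested_params(obj, **params):
    """Route 'component__parameter' keys to the named sub-object."""
    for key, value in params.items():
        name, sep, sub_key = key.partition("__")
        if not hasattr(obj, name):
            raise ValueError(f"invalid parameter {name!r}")
        if sep:
            # Everything after the first double underscore goes one level down.
            set_nested_params(getattr(obj, name), **{sub_key: value})
        else:
            setattr(obj, name, value)
    return obj


model = set_nested_params(Wrapper(), n_clusters=5, expert__max_depth=6)
```

Splitting on the first double underscore only means deeper nesting, such as a__b__c, is resolved one level at a time.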