modeva.models.ModelTunePSO

class modeva.models.ModelTunePSO(dataset, model)

Bases: object

A class for hyperparameter tuning using the Particle Swarm Optimization (PSO) algorithm.

run(param_bounds: Dict, param_types: Dict = None, dataset: str = 'train', n_iter: int = 10, n_particles: int = 10, metric: str | Tuple = None, n_jobs: int = None, cv=None, error_score=nan, random_state: int = 0)

Particle Swarm Optimization (PSO) for model tuning.

This method performs hyperparameter optimization using PSO search on the specified model and dataset. It evaluates the model’s performance with the provided metrics under cross-validation and returns the optimization history as a ValidationResult.
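To make the roles of n_iter, n_particles, and random_state concrete, here is a minimal sketch of a generic PSO loop over a box-constrained search space. It is not Modeva's implementation: the function name `pso_minimize`, the `inertia`, `cognitive`, and `social` coefficients, and the toy `sphere` objective are all assumptions chosen for illustration. In a tuning setting, the objective would presumably be the cross-validated score of the first metric.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_iter=10, n_particles=10,
                 inertia=0.7, cognitive=1.5, social=1.5, random_state=0):
    """Generic PSO over the box [lower, upper] (illustrative, not Modeva's code).

    Returns (best_position, best_value, history of the best value per iteration).
    """
    rng = np.random.default_rng(random_state)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Start particles uniformly inside the bounds, with zero velocity.
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)

    # Each particle remembers its own best position; the swarm shares a global best.
    pbest_pos = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    history = [gbest_val]

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity blends momentum, pull toward the personal best, and pull toward the global best.
        vel = (inertia * vel
               + cognitive * r1 * (pbest_pos - pos)
               + social * r2 * (gbest_pos - pos))
        # Move, then clip back into the search bounds.
        pos = np.clip(pos + vel, lower, upper)

        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]

        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
        history.append(gbest_val)

    return gbest_pos, gbest_val, history


def sphere(x):
    """Toy objective with its minimum at (1, 1)."""
    return float(np.sum((x - 1.0) ** 2))

best_pos, best_val, history = pso_minimize(sphere, [-5, -5], [5, 5],
                                           n_iter=30, n_particles=10)
```

Each iteration costs n_particles objective evaluations, so the total budget in this sketch is n_particles * (n_iter + 1) evaluations, which is why both settings trade runtime against solution quality.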

Parameters:
  • param_bounds (dict) –

    Dictionary mapping parameter names (str) to search ranges: a [min, max] list for numerical parameters, or a list of categories for categorical parameters. For example,

    {'max_depth': [1, 5],
     'n_estimators': [100, 200],
     'eta': [0.01, 1.0],
     'tree_method': ["exact", "approx", "hist"],
     'centroids': [np.repeat(ds.train_x.min(0).reshape(1, -1), repeats=n_clusters, axis=0).ravel(),
                   np.repeat(ds.train_x.max(0).reshape(1, -1), repeats=n_clusters, axis=0).ravel()]
     }
    

    A parameter can also be a 1D array, as with 'centroids' above, where the min and max bounds are themselves arrays.

  • param_types (dict, default=None) –

    Dictionary mapping parameter names (str) to parameter types. Available types are “float”, “int”, and “categorical”. For example,

    {'max_depth': "int",
     'n_estimators': "int",
     'tree_method': "categorical"}
    

    This argument is optional and need not cover every key in param_bounds; a parameter whose type is not specified is treated as float.

  • dataset ({"main", "train", "test"}, default="train") – The dataset split used for model fitting.

  • n_iter (int, default=10) – Number of iterations of PSO. n_iter trades off runtime vs quality of the solution.

  • n_particles (int, default=10) – Number of particles in each iteration of PSO.

  • metric (str or tuple, default=None) – The performance metric(s). If None, we calculate the MSE, MAE, and R2 for regression; ACC, AUC, F1, LogLoss, and Brier for classification. Note that only the first one is used as the optimization objective.

  • cv (int, cross-validation generator or an iterable, default=None) –

    Determines the cross-validation splitting strategy. Possible inputs for cv are:

    • None, to use the default 5-fold cross validation,

    • integer, to specify the number of folds in a (Stratified)KFold,

    • CV splitter,

    • An iterable yielding (train, test) splits as arrays of indices.

  • n_jobs (int, default=None) – Number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See Glossary for more details.

  • error_score ('raise' or numeric, default=np.nan) – Value to assign to the score if an error occurs in estimator fitting. If set to ‘raise’, the error is raised. If a numeric value is given, FitFailedWarning is raised. This parameter does not affect the refit step, which will always raise the error.

  • random_state (int, default=0) – The random seed for reproducibility.
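As an illustration of the parameter formats described above, the following sketch builds param_bounds and param_types with NumPy only. The random `train_x` array stands in for `ds.train_x`, and `n_clusters`, `resolve_types`, and `index_splits` are hypothetical names introduced here, not part of Modeva. The helper mirrors the documented rule that unspecified types default to float, and the generator shows one way to produce an iterable of (train, test) index arrays for cv.

```python
import numpy as np

# Stand-in training features (hypothetical; replaces ds.train_x from the example above).
rng = np.random.default_rng(0)
train_x = rng.normal(size=(200, 3))
n_clusters = 4

# Per-feature minima and maxima, repeated once per cluster and flattened to 1D.
centroid_lower = np.repeat(train_x.min(0).reshape(1, -1), repeats=n_clusters, axis=0).ravel()
centroid_upper = np.repeat(train_x.max(0).reshape(1, -1), repeats=n_clusters, axis=0).ravel()

param_bounds = {
    "max_depth": [1, 5],                         # numerical: [min, max]
    "n_estimators": [100, 200],                  # numerical: [min, max]
    "eta": [0.01, 1.0],                          # numerical, no declared type
    "tree_method": ["exact", "approx", "hist"],  # categorical: list of categories
    "centroids": [centroid_lower, centroid_upper],  # 1D array bounds
}
param_types = {
    "max_depth": "int",
    "n_estimators": "int",
    "tree_method": "categorical",
}

def resolve_types(bounds, types=None):
    """Hypothetical helper: fill in 'float' for any parameter without a declared type."""
    types = types or {}
    return {name: types.get(name, "float") for name in bounds}

resolved = resolve_types(param_bounds, param_types)

def index_splits(n_samples, n_folds=5, seed=0):
    """Hypothetical generator yielding (train, test) index arrays, usable as a cv iterable."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, n_folds)
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train, test

splits = list(index_splits(len(train_x), n_folds=5))
```

Here `resolved` reports "float" for both 'eta' and 'centroids', since neither appears in param_types.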

Returns:

A container object with the following components:

  • key: “model_tune_pso”

  • data: Name of the dataset used

  • model: Name of the model used

  • inputs: Input parameters

  • value: Dictionary containing the optimization history

  • table: Tabular format of the optimization history

  • options: Dictionary of visualization configurations. Run results.plot() to show all plots, or results.plot(name=xxx) to display a single plot. The following names are available:

    • “parallel”: Parallel plot of the hyperparameter settings and final performance.

    • “(<parameter>, <metric>)”: Bar plot showing the performance metric against parameter values.

Return type:

ValidationResult
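A minimal usage sketch follows, based only on the signatures documented on this page. The wrapper `tune_with_pso` is a hypothetical name, and `ds` and `model` are assumed to be an already-prepared Modeva dataset and model; how they are created is outside the scope of this page. The import is placed inside the function so the snippet can be defined without Modeva installed.

```python
def tune_with_pso(ds, model, param_bounds, param_types=None):
    """Hypothetical wrapper: run PSO tuning and return the ValidationResult.

    `ds` and `model` are assumed to be a prepared Modeva dataset and model.
    """
    from modeva.models import ModelTunePSO  # lazy import; requires Modeva

    tuner = ModelTunePSO(dataset=ds, model=model)
    result = tuner.run(
        param_bounds=param_bounds,
        param_types=param_types,
        dataset="train",
        n_iter=10,
        n_particles=10,
        metric=("MSE", "MAE"),  # only the first metric is the optimization objective
        cv=5,
        random_state=0,
    )
    return result

# Typical follow-up, per the Returns section above:
#   result.plot()                 # show all plots
#   result.plot(name="parallel")  # show only the parallel plot
```

The metric names "MSE" and "MAE" assume a regression model; for classification, names such as "ACC" or "AUC" from the metric description would apply instead.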

Examples

Particle Swarm Optimization Search
