Mixture of Experts (MoE) Classification#

Installation

# To install the required package, use the following command:
# !pip install modeva

Authentication

# To authenticate, use the following commands (for full access, replace the code below with your own token):
# from modeva.utils.authenticate import authenticate
# authenticate(auth_code='eaaa4301-b140-484c-8e93-f9f633c8bacb')

Import required modules

from modeva import DataSet
from modeva import TestSuite
from modeva.models import MoMoEClassifier

Load and prepare dataset for classification

ds = DataSet()
ds.load(name="TaiwanCredit")
ds.set_random_split()
ds.set_target("FlagDefault")

Train models#

model = MoMoEClassifier(max_depth=2)
model.fit(ds.train_x, ds.train_y)
MoMoEClassifier(base_score=None, booster=None, callbacks=None,
                colsample_bylevel=None, colsample_bynode=None,
                colsample_bytree=None, device=None, early_stopping_rounds=None,
                enable_categorical=False, eval_metric=None, feature_types=None,
                gamma=None, grow_policy=None, importance_type=None,
                interaction_constraints=None, learning_rate=None, max_bin=None,
                max_cat_threshold=None, max_cat_to_onehot=None,
                max_delta_step=None, max_depth=2, max_leaves=None,
                min_child_weight=None, missing=nan, monotone_constraints=None,
                multi_strategy=None, n_estimators=None, n_jobs=None,
                name='MoMoEClassifier', num_parallel_tree=None, ...)
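Conceptually, a mixture-of-experts classifier routes each sample through a gating function that assigns softmax weights to several expert models, then combines the experts' class probabilities with those weights. A minimal numpy sketch of this idea (illustrative only; the gating and expert parameters here are random placeholders, not Modeva's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))          # 5 samples, 3 features

# Hypothetical gating network: softmax over per-expert scores
W_gate = rng.normal(size=(3, 2))     # 2 experts
scores = X @ W_gate
gates = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each expert outputs a probability of the positive class
expert_probs = np.column_stack([
    1 / (1 + np.exp(-(X @ rng.normal(size=3)))) for _ in range(2)
])

# Final prediction: gate-weighted average of expert probabilities
p = (gates * expert_probs).sum(axis=1)
print(p.shape)  # (5,)
```

The gating weights for a single sample are what the local MoE weight plots below visualize.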


Basic accuracy analysis#

ts = TestSuite(ds, model)
results = ts.diagnose_accuracy_table()
results.table
        AUC      ACC      F1       LogLoss  Brier
train   0.8680   0.8464   0.5604   0.3622   0.1120
test    0.7765   0.8195   0.4561   0.4269   0.1334
GAP    -0.0915  -0.0269  -0.1043   0.0647   0.0214
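The table reports discrimination (AUC), accuracy, F1, and the probability-calibration losses on each split, with GAP = test minus train. Two of these metrics are easy to verify by hand: the Brier score is the mean squared difference between predicted probabilities and binary labels, and log loss is the binary cross-entropy. A quick numpy check on toy values (not the TaiwanCredit predictions):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0])
p_hat  = np.array([0.9, 0.2, 0.6, 0.7, 0.4])

# Brier score: mean squared error between probabilities and labels
brier = np.mean((p_hat - y_true) ** 2)

# Log loss: binary cross-entropy
logloss = -np.mean(y_true * np.log(p_hat)
                   + (1 - y_true) * np.log(1 - p_hat))

print(round(brier, 4), round(logloss, 4))  # 0.092 0.3414
```

Note the negative GAP on AUC, ACC, and F1 (test below train) and the positive GAP on the two losses both point the same way: a modest amount of overfitting.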


Local MOE weights interpretation#

results = ts.interpret_local_moe_weights()
results.plot()


Data drift test between cluster 1 and the remaining samples#

results = ts.interpret_moe_cluster_analysis()
data_results = ds.data_drift_test(**results.value[1]["data_info"],
                                  distance_metric="PSI",
                                  psi_method="uniform",
                                  psi_bins=10)
data_results.plot("summary")
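The Population Stability Index (PSI) compares a feature's binned distribution in one group against another; with `psi_method="uniform"`, the bins are equal-width. A self-contained numpy sketch of the standard PSI formula (this mirrors the textbook definition and may differ in detail from Modeva's implementation):

```python
import numpy as np

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index with equal-width ('uniform') bins."""
    lo = min(expected.min(), actual.min())
    hi = max(expected.max(), actual.max())
    edges = np.linspace(lo, hi, bins + 1)
    # Bin proportions for each group; eps avoids log(0) on empty bins
    p = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    q = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((p - q) * np.log(p / q)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, 1000)
b = rng.normal(0.5, 1.0, 1000)   # shifted distribution -> larger PSI
print(psi(a, a), psi(a, b))
```

Identical samples give a PSI of zero, while the half-sigma shift produces a clearly nonzero value; a common rule of thumb treats PSI above 0.1 as moderate drift and above 0.25 as large.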


Interpret feature importance#

results = ts.interpret_fi()

Expert No. 0

results.plot("0")


Interpret effect importance#

results = ts.interpret_ei()

Expert No. 0

results.plot("0")


Interpret effects#

results = ts.interpret_effects(features="PAY_1")

Expert No. 0

results.plot("0")


Effects for all experts

results.plot("all")


Cluster performance analysis#

results = ts.interpret_moe_cluster_analysis()
results.plot()


Distribution difference summary between cluster-0 and the rest

data_results = ds.data_drift_test(**results.value[0]["data_info"],
                                  distance_metric="PSI",
                                  psi_method="uniform",
                                  psi_bins=10)
data_results.plot("summary")


Distributional difference for PAY_1 between cluster-0 and the rest

data_results.plot(("density", "PAY_1"))


Local feature importance analysis#

results = ts.interpret_local_fi(dataset='train', sample_index=1)

Expert No. 0

results.plot("0")


Local effect importance analysis#

results = ts.interpret_local_ei(dataset='train', sample_index=1)

Expert No. 0

results.plot("0")


Total running time of the script: (0 minutes 38.270 seconds)
