Low-code Example: Experimentation Based on SimuCredit Data
Setting up Modeva
In [1]:
## ----------------------------------------------------------------
## Install or update packages (recommended to run in Terminal)
## ----------------------------------------------------------------
!pip show modeva
# !pip uninstall modeva
# !pip install modeva
Name: MoDeva
Version: 1.0.11
Summary: Integrated tool for model development and validation.
Home-page:
Author:
Author-email: admin@modeva.ai
License: proprietary
Location: C:\Users\s6416\anaconda3\envs\python-3.11\Lib\site-packages\modeva-1.0.11-py3.11.egg
Requires: catboost, dill, httpx, ipython, ipyvuetify, ipywidgets, lightgbm, lime, mlflow, mocharts, momentchi2, notebook, numpy, pandas, pyswarms, python-dateutil, scikit-learn, scikit-learn-extra, scipy, shap, supabase, torch, tqdm, umap-learn, xgboost
Required-by:
In [2]:
# To authenticate, run the command below. For full access, replace the token with your own.
from modeva.utils.authenticate import authenticate
# authenticate(auth_code='eaaa4301-b140-484c-8e93-f9f633c8bacb')
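Since the token has to be replaced with your own, one option is to keep it out of the notebook and read it from the environment. The sketch below is only an illustration: the variable name `MODEVA_AUTH_CODE` and the helper `read_auth_code` are hypothetical and not part of Modeva; only `authenticate(auth_code=...)` comes from the cell above.

```python
import os

# Hypothetical environment variable name; Modeva itself does not define it.
AUTH_ENV_VAR = "MODEVA_AUTH_CODE"


def read_auth_code(env_var: str = AUTH_ENV_VAR) -> str:
    """Return the auth code stored in the environment, or raise if it is missing."""
    code = os.environ.get(env_var, "").strip()
    if not code:
        raise RuntimeError(f"Set {env_var} before running this notebook.")
    return code


# Usage, mirroring the commented-out call above:
# from modeva.utils.authenticate import authenticate
# authenticate(auth_code=read_auth_code())
```

This keeps the token out of saved notebook files and version control, at the cost of one extra setup step per machine.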
In [3]:
## ----------------------------------------------------------------
## Optional: Clear Modeva-mlflow directory
## This will clear existing Modeva datasets and experiments
## ----------------------------------------------------------------
# from modeva.utils.mlflow import clear_mlflow_home
# clear_mlflow_home()
Registry Hub
- Low-code panel for managing the dataset and experiment registry. You can return to this panel at any time.
- Modeva requires authentication each time it runs (contact admin@modeva.ai to request the sequence number).
In [4]:
from modeva.dashboard.api import registry_hub
registry_hub()
✓ Auth code found in local storage.
✓ Auth code saved to local storage for future use.
Authenticating Modeva...
✓ License is active and valid.
✓ Authenticated successfully!
Out[4]:
Data Loading
In [5]:
## --------------------------------------------------------
## Optional: create OOD dataset, save as CSV, then
## upload as Extra data via registry_hub
## --------------------------------------------------------
# from modeva.data.utils.loading import load_builtin_data
# df = load_builtin_data("SimuCredit")
from modeva import DataSet
ds = DataSet()
ds.load_registered_data(name="Demo-SimuCredit")
df = ds.to_df()
df[(df['Gender'] == 0) & (df['Race'] == 0)].to_csv('SimuCredit_OOD1.csv', index=False)
df[(df['Gender'] == 1) & (df['Race'] == 1)].to_csv('SimuCredit_OOD2.csv', index=False)
# Go to registry_hub() above to upload OOD csv files as Extra to Demo-SimuCredit
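The two OOD files above are plain pandas boolean-mask slices of the registered data. As a minimal sketch of how those masks behave, the toy frame below stands in for SimuCredit (its values are made up for illustration; `remainder` is a name introduced here and is not used by the notebook).

```python
import pandas as pd

# Toy stand-in for the SimuCredit frame; the values are invented for illustration.
df = pd.DataFrame({
    "Gender": [0, 0, 1, 1, 0, 1],
    "Race":   [0, 1, 0, 1, 0, 1],
    "Status": [0, 1, 1, 1, 0, 0],
})

# Each comparison needs its own parentheses because & binds tighter than ==.
ood1_mask = (df["Gender"] == 0) & (df["Race"] == 0)
ood2_mask = (df["Gender"] == 1) & (df["Race"] == 1)

ood1 = df[ood1_mask]
ood2 = df[ood2_mask]
# Rows that fall into neither slice.
remainder = df[~(ood1_mask | ood2_mask)]

# The two masks are mutually exclusive, so the three pieces partition the frame.
print(len(ood1), len(ood2), len(remainder))
```

Note that in the cell above the slices are written to CSV but not removed from the registered dataset, so the main data still contains those rows.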
New Experiment and Data Processing
In [6]:
from modeva.dashboard.experiment import Experiment
exp = Experiment(name='Exp20240901-SimuCredit')
## Optional - run exp.clear() to clear existing records in an existing experiment
# exp.clear()
In [7]:
# Load Demo-SimuCredit data
# exp.data_load('Demo-SimuCredit')
## Optional - View the 'main' dataset
exp.ds
Out[7]:
index | Mortgage | Balance | Amount Past Due | Delinquency | Inquiry | Open Trade | Utilization | Gender | Race | Status |
---|---|---|---|---|---|---|---|---|---|---|
0 | 139734.22 | 2717.87 | 0.00 | 0 | 0 | 0 | 0.785162 | 0 | 0 | 0 |
1 | 243359.62 | 193.60 | 0.00 | 0 | 0 | 0 | 0.254759 | 0 | 0 | 1 |
2 | 187784.19 | 395.05 | 0.00 | 0 | 1 | 0 | 0.360995 | 0 | 0 | 1 |
3 | 594626.89 | 180.94 | 0.00 | 0 | 0 | 0 | 0.128144 | 0 | 1 | 1 |
4 | 166771.42 | 1241.13 | 0.00 | 0 | 0 | 0 | 0.702958 | 0 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
19995 | 226806.34 | 670.99 | 0.00 | 0 | 1 | 0 | 0.922122 | 0 | 1 | 1 |
19996 | 308625.65 | 3223.94 | 0.00 | 0 | 0 | 0 | 0.989716 | 1 | 1 | 1 |
19997 | 375035.34 | 133.05 | 131.15 | 1 | 0 | 0 | 0.092523 | 0 | 0 | 1 |
19998 | 165377.42 | 2256.07 | 0.00 | 0 | 0 | 0 | 0.630330 | 1 | 0 | 1 |
19999 | 299811.81 | 2420.01 | 1461.61 | 1 | 0 | 0 | 0.899019 | 1 | 1 | 0 |
20000 rows × 10 columns
In [8]:
exp.data_summary()
Out[8]:
In [9]:
exp.eda_2d()
Out[9]:
In [10]:
exp.eda_3d()
Out[10]:
In [11]:
exp.eda_multivariate()
Out[11]:
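The report at the end of this notebook records a Correlation-Result computed with `'method': 'pearson'`. As a rough illustration of what that statistic measures, the sketch below computes a Pearson coefficient by hand for `Balance` and `Utilization`, using only the five rows displayed in `Out[7]`. The helper `pearson` is written here for illustration and is not a Modeva function; with five rows the value will differ from the full-data figure of 0.569253 shown in the report.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Rows 0-4 of the table shown in Out[7].
balance = [2717.87, 193.60, 395.05, 180.94, 1241.13]
utilization = [0.785162, 0.254759, 0.360995, 0.128144, 0.702958]

r = pearson(balance, utilization)
print(round(r, 3))
```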
In [12]:
exp.data_process()
Out[12]:
Model Training and Tuning
In [13]:
exp.model_train()
Out[13]:
In [14]:
exp.model_tune()
Out[14]:
In [15]:
exp.model_leaderboard()
Out[15]:
Model Explainability
In [16]:
exp.test_explainability()
Out[16]:
Model Testing
In [17]:
exp.model_test()
Out[17]:
In [18]:
exp.test_weakness()
Out[18]:
Model Benchmarking
In [19]:
exp.model_compare()
Out[19]:
Factsheet and Report
In [20]:
exp.list_testsuite()
Out[20]:
index | Name | Start Time | Tags |
---|---|---|---|
0 | Resilience-Result | 2025-04-25 10:41:03 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
1 | Robustness-Result | 2025-04-25 10:41:02 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
2 | Reliability-Result | 2025-04-25 10:41:02 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
3 | Performance-Result | 2025-04-25 10:41:02 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
4 | Local-Result | 2025-04-25 10:40:54 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
5 | Global-Result | 2025-04-25 10:40:54 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
6 | PCA-Result | 2025-04-25 10:33:52 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
7 | Correlation-Result | 2025-04-25 10:33:51 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
8 | EDA3D-Result | 2025-04-25 10:33:47 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
9 | EDA2D-Result | 2025-04-25 10:33:40 | {'dataset': 'Demo-SimuCredit_md_Exp20240901-Si... |
In [21]:
exp.report()
Out[21]:
testsuite-generated Report: Exp20240901-SimuCredit
Resilience-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'dataset': 'test', 'method': 'worst-sample', 'metric': 'AUC', 'alphas': None, 'n_clusters': 10, 'random_state': 0}
index | AUC |
---|---|
0.1 | 0 |
0.2 | 0 |
0.3 | 0.108096 |
0.4 | 0.320467 |
0.5 | 0.489588 |
0.6 | 0.612286 |
0.7 | 0.700268 |
0.8 | 0.76505 |
0.9 | 0.812749 |
1.0 | 0.847003 |
Robustness-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'dataset': 'test', 'metric': 'AUC', 'n_repeats': 10, 'perturb_features': None, 'perturb_method': 'quantile', 'noise_levels': [0.1, 0.2, 0.3, 0.4], 'threshold': 0.1, 'random_state': 0}
index | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 |
---|---|---|---|---|---|
0 | 0.847003 | 0.839262 | 0.832116 | 0.820499 | 0.810964 |
1 | 0.847003 | 0.837907 | 0.825354 | 0.812914 | 0.800196 |
2 | 0.847003 | 0.838945 | 0.827252 | 0.817592 | 0.804479 |
3 | 0.847003 | 0.842406 | 0.830484 | 0.823731 | 0.81147 |
4 | 0.847003 | 0.841401 | 0.834411 | 0.821951 | 0.815282 |
5 | 0.847003 | 0.839763 | 0.827642 | 0.816602 | 0.807143 |
6 | 0.847003 | 0.838488 | 0.829598 | 0.820741 | 0.805865 |
7 | 0.847003 | 0.84043 | 0.830037 | 0.821196 | 0.811785 |
8 | 0.847003 | 0.838517 | 0.826133 | 0.814623 | 0.807822 |
9 | 0.847003 | 0.838031 | 0.825894 | 0.81224 | 0.800883 |
Reliability-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'train_dataset': 'test', 'test_dataset': 'test', 'test_size': 0.5, 'alpha': 0.1, 'max_depth': 5, 'width_threshold': 0.1, 'random_state': 0}
index | Avg.Width | Avg.Coverage |
---|---|---|
0 | 1.329 | 0.9055 |
Performance-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'features': 'Status', 'use_prediction': False, 'dataset': 'test', 'sample_size': 2000, 'random_state': 0}
index | AUC | ACC | F1 | LogLoss | Brier |
---|---|---|---|---|---|
train | 0.851216 | 0.770625 | 0.79919 | 0.473889 | 0.156377 |
test | 0.847003 | 0.7735 | 0.799202 | 0.481288 | 0.157991 |
GAP | -0.004213 | 0.002875 | 0.000012 | 0.007399 | 0.001614 |
Local-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'dataset': 'test', 'sample_idx': 0}
Global-Result
Data:
Demo-SimuCredit_md
Model:
XGBoost
Inputs:
{'dataset': 'test', 'feature': 'Utilization'}
PCA-Result
Data:
Demo-SimuCredit_md
Model:
None
Inputs:
{'features': None, 'n_components': 5, 'dataset': 'main', 'sample_size': None, 'categorical_encoding': 'ordinal', 'random_state': 0}
index | PC1 | PC2 | PC3 | PC4 | PC5 |
---|---|---|---|---|---|
0 | -0.52621 | 1.537869 | -1.348023 | -0.889494 | 0.241198 |
1 | -1.352178 | -0.755415 | -0.245939 | 0.197904 | -0.104968 |
2 | -0.480909 | -0.828502 | -0.750361 | -0.331259 | -0.132894 |
3 | -1.454527 | -0.987473 | 3.341614 | 0.443364 | -0.139922 |
4 | -0.806076 | 0.639831 | -1.108934 | -0.652187 | 0.579872 |
5 | -1.160295 | -0.067055 | 0.123406 | -0.333044 | 0.347544 |
6 | -0.996448 | 0.478285 | 1.758346 | -0.18093 | -0.391355 |
7 | 0.109891 | 0.407718 | -0.229437 | 1.07627 | 0.203646 |
8 | 2.432506 | 0.261122 | -1.394709 | 0.075158 | -0.631031 |
9 | -0.857556 | 0.561863 | 1.005858 | -0.721243 | 0.921994 |
10 | 0.131925 | 2.054675 | 0.827086 | -0.480118 | 0.672351 |
11 | 1.071816 | 0.859586 | -0.590065 | 0.891606 | 1.039925 |
12 | -1.147963 | -0.061632 | 0.130431 | -0.060046 | -0.268612 |
13 | -0.470855 | 0.138432 | -0.426058 | 1.039255 | -0.663208 |
14 | 0.069752 | -1.654225 | 0.299082 | 1.799283 | 0.640279 |
Correlation-Result
Data:
Demo-SimuCredit_md
Model:
None
Inputs:
{'features': None, 'dataset': 'main', 'method': 'pearson', 'sample_size': None, 'random_state': 0}
index | Mortgage | Balance | Amount Past Due | Delinquency | Inquiry | Open Trade | Utilization | Gender | Race | Status |
---|---|---|---|---|---|---|---|---|---|---|
Mortgage | 1 | 0.005831 | -0.000986 | 0.010549 | -0.002618 | 0.00038 | 0.003176 | 0.130764 | 0.115965 | 0.129838 |
Balance | 0.005831 | 1 | 0.465837 | 0.010322 | 0.003535 | 0.000986 | 0.569253 | 0.12668 | 0.112226 | 0.125015 |
Amount Past Due | -0.000986 | 0.465837 | 1 | 0.384252 | 0.230951 | 0.1676 | 0.260424 | 0.109107 | 0.097218 | 0.13266 |
Delinquency | 0.010549 | 0.010322 | 0.384252 | 1 | 0.614043 | 0.453562 | 0.003164 | 0.00021 | 0.000048 | 0.097463 |
Inquiry | -0.002618 | 0.003535 | 0.230951 | 0.614043 | 1 | 0.737649 | 0.002582 | 0.000557 | 0.000157 | 0.058305 |
Open Trade | 0.00038 | 0.000986 | 0.1676 | 0.453562 | 0.737649 | 1 | 0.000353 | 0.000405 | 0.000365 | 0.038481 |
Utilization | 0.003176 | 0.569253 | 0.260424 | 0.003164 | 0.002582 | 0.000353 | 1 | 0.130823 | 0.11596 | 0.129884 |
Gender | 0.130764 | 0.12668 | 0.109107 | 0.00021 | 0.000557 | 0.000405 | 0.130823 | 1 | 0.004251 | 0.107363 |
Race | 0.115965 | 0.112226 | 0.097218 | 0.000048 | 0.000157 | 0.000365 | 0.11596 | 0.004251 | 1 | 0.178891 |
Status | 0.129838 | 0.125015 | 0.13266 | 0.097463 | 0.058305 | 0.038481 | 0.129884 | 0.107363 | 0.178891 | 1 |
EDA3D-Result
Data:
Demo-SimuCredit_md
Model:
None
Inputs:
{'feature_x': 'Mortgage', 'feature_y': 'Balance', 'feature_z': 'Status', 'feature_color': None, 'dataset': 'main', 'sample_method': 'random', 'sample_size': 200, 'random_state': 0}
EDA2D-Result
Data:
Demo-SimuCredit_md
Model:
None
Inputs:
{'feature_x': 'Mortgage', 'feature_y': 'Status', 'feature_color': None, 'dataset': 'main', 'sample_method': 'random', 'sample_size': 200, 'smoother_order': 2, 'random_state': 0}
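Two of the report tables above can be cross-checked with a few lines of plain Python. The sketch below assumes that the GAP row of Performance-Result is test minus train (the reported numbers are consistent with that reading) and summarizes Robustness-Result by averaging the AUC over the 10 repeats at each noise level. The variable names are introduced here for illustration; the values are copied from the report.

```python
from math import isclose
from statistics import mean

# Performance-Result rows, copied from the report.
train = {"AUC": 0.851216, "ACC": 0.770625, "F1": 0.79919, "LogLoss": 0.473889, "Brier": 0.156377}
test = {"AUC": 0.847003, "ACC": 0.7735, "F1": 0.799202, "LogLoss": 0.481288, "Brier": 0.157991}
reported_gap = {"AUC": -0.004213, "ACC": 0.002875, "F1": 0.000012, "LogLoss": 0.007399, "Brier": 0.001614}

# Assumption: GAP = test - train for every metric.
gap = {metric: test[metric] - train[metric] for metric in train}
gap_matches = all(isclose(gap[m], reported_gap[m], abs_tol=1e-9) for m in gap)
print("GAP equals test - train:", gap_matches)

# Robustness-Result: one row per repeat, one column per noise level.
noise_levels = [0.0, 0.1, 0.2, 0.3, 0.4]
robustness_rows = [
    [0.847003, 0.839262, 0.832116, 0.820499, 0.810964],
    [0.847003, 0.837907, 0.825354, 0.812914, 0.800196],
    [0.847003, 0.838945, 0.827252, 0.817592, 0.804479],
    [0.847003, 0.842406, 0.830484, 0.823731, 0.81147],
    [0.847003, 0.841401, 0.834411, 0.821951, 0.815282],
    [0.847003, 0.839763, 0.827642, 0.816602, 0.807143],
    [0.847003, 0.838488, 0.829598, 0.820741, 0.805865],
    [0.847003, 0.84043, 0.830037, 0.821196, 0.811785],
    [0.847003, 0.838517, 0.826133, 0.814623, 0.807822],
    [0.847003, 0.838031, 0.825894, 0.81224, 0.800883],
]

# zip(*rows) transposes the table so that each tuple holds one noise level.
mean_auc = {level: mean(col) for level, col in zip(noise_levels, zip(*robustness_rows))}
# AUC lost relative to the unperturbed baseline (noise level 0.0).
auc_drop = {level: mean_auc[0.0] - auc for level, auc in mean_auc.items()}

for level in noise_levels:
    print(f"noise={level:.1f}  mean AUC={mean_auc[level]:.6f}  drop={auc_drop[level]:.6f}")
```

Averaging over repeats gives one number per noise level, which is easier to compare across models than the raw 10-row table.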
In [22]:
exp.export_report()