modeva.DataSet.eda_3d

DataSet.eda_3d(feature_x: str, feature_y: str, feature_z: str, feature_color: str = None, dataset: str = 'main', sample_size: int = 1000, random_state: int = 0)

Creates an interactive 3D scatter plot visualization for exploring relationships between three features.

This function generates a 3D scatter plot from the three specified features, with an optional fourth feature represented by color. Large datasets are subsampled before plotting, and the color feature may be numerical or categorical. The visualization is powered by the mocharts library and provides interactive features such as tooltips and adjustable viewports.

Parameters:

feature_x (str) – Name of the feature to be plotted on the x-axis.
feature_y (str) – Name of the feature to be plotted on the y-axis.
feature_z (str) – Name of the feature to be plotted on the z-axis.
feature_color (str, optional) – Name of the feature used for coloring points. If numerical, creates a color gradient; if categorical, creates distinct colors per category.
dataset ({"main", "train", "test"}, default="main") – Specifies which dataset partition to visualize.
sample_method ({"random"}, default="random") – Method used for subsampling the data. Currently only supports random sampling.
sample_size (int, optional, default=1000) – Maximum number of points to plot. If the dataset has more rows than this, points are randomly sampled. Set to None to use all points.
random_state (int, default=0) – Seed for random number generator used in sampling.
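The interaction between sample_size and random_state described above can be sketched as follows. This is an illustration of the documented behavior only, not modeva's implementation; the helper name subsample_indices is hypothetical, and it uses Python's standard random module rather than whatever sampler the library uses internally.

```python
import random


def subsample_indices(n_rows, sample_size=1000, random_state=0):
    """Pick which row indices to plot (hypothetical helper, not part of modeva)."""
    # sample_size=None means "use all points"; a dataset that already fits
    # within sample_size is also kept whole.
    if sample_size is None or n_rows <= sample_size:
        return list(range(n_rows))
    # A seeded generator makes the same rows come back on every call.
    rng = random.Random(random_state)
    return sorted(rng.sample(range(n_rows), sample_size))


# 5000 rows are reduced to at most 1000 plotted points.
picked = subsample_indices(5000, sample_size=1000, random_state=0)
```

Calling the helper twice with the same random_state returns the same indices, which is why a fixed seed gives a reproducible plot.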

Returns:

A container object with the following components:

key: “data_eda_3d”
data: Name of the dataset used
inputs: Dictionary of input parameters
options: Dictionary of visualization configuration for the 3D scatter plot. Run results.plot() to show this plot.

Return type:

ValidationResult
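A minimal sketch of a call, using only the parameters listed in the signature above. The wrapper name plot_eda_3d is hypothetical, and ds is assumed to be an already loaded DataSet instance; how the data gets loaded is outside the scope of this page.

```python
def plot_eda_3d(ds, x, y, z, color=None, sample_size=1000, random_state=0):
    """Call eda_3d on a DataSet-like object and return its result container.

    Hypothetical convenience wrapper; `ds` is assumed to expose the
    eda_3d method documented on this page.
    """
    return ds.eda_3d(
        feature_x=x,
        feature_y=y,
        feature_z=z,
        feature_color=color,      # optional fourth feature shown as color
        dataset="main",           # or "train" / "test"
        sample_size=sample_size,  # None plots every point
        random_state=random_state,
    )
```

With a loaded dataset, usage would look like results = plot_eda_3d(ds, "x_col", "y_col", "z_col", color="group_col") followed by results.plot(), where the column names are placeholders for features in your own data.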

Examples

Exploratory Data Analysis