modeva.DataSet.get_data

DataSet.get_data(dataset: str = 'main', active_sample: bool = False, active_feature: bool = False)

Get the preprocessed data as an np.ndarray containing all variables (X, y, sample_weight, etc.).

By default, all samples are returned, regardless of any changes to active_sample_index; set active_sample=True to restrict the output to the active rows. This method is designed for data preprocessing steps. For modeling-related steps, use get_X_y_data instead.

Parameters:
  • dataset ({"main", "train", "test"}, default="main") – The name of the data split. It can also be any other manually registered data split, if one exists. Use get_data_list to check all available data splits.

  • active_sample (bool, default=False) – If True, return only the active rows (samples).

  • active_feature (bool, default=False) – If True, return only the active columns (features).

Return type:

np.ndarray containing the preprocessed version of the given data split.
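
The effect of the two flags can be illustrated with a small NumPy sketch. This is not Modeva's implementation: ToyDataSet, active_rows, and active_cols are hypothetical names introduced only for illustration, and the sketch assumes that active rows and columns are stored as integer positions. It only mirrors the documented behavior, where the full array is returned by default and each flag narrows it along one axis.

```python
import numpy as np


class ToyDataSet:
    """Hypothetical stand-in that mimics the documented flags (not Modeva code)."""

    def __init__(self, splits, active_rows, active_cols):
        # splits: dict mapping split name -> 2-D array (rows = samples, columns = variables)
        self._splits = splits
        # Assumption: active rows are stored per split as integer positions.
        self._active_rows = active_rows
        # Assumption: active columns are shared across splits, as integer positions.
        self._active_cols = active_cols

    def get_data(self, dataset="main", active_sample=False, active_feature=False):
        data = self._splits[dataset]
        if active_sample:
            # Keep only the active rows (samples).
            data = data[self._active_rows[dataset], :]
        if active_feature:
            # Keep only the active columns (features).
            data = data[:, self._active_cols]
        return data


# A 5-sample, 4-variable toy split.
main = np.arange(20, dtype=float).reshape(5, 4)
ds = ToyDataSet(
    splits={"main": main},
    active_rows={"main": np.array([0, 2, 4])},
    active_cols=np.array([0, 1, 3]),
)

full = ds.get_data()                                         # all rows, all columns
rows_only = ds.get_data(active_sample=True)                  # active rows only
both = ds.get_data(active_sample=True, active_feature=True)  # active rows and columns

print(full.shape, rows_only.shape, both.shape)
```

With the real library, the equivalent calls would presumably be made on a loaded DataSet instance, with the split name passed as dataset.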