ibmfl.data

Base Class

class ibmfl.data.data_handler.DataHandler(**kwargs)[source]

Base class to load and pre-process data.

__init__(**kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

abstract get_data(**kwargs)[source]

Access the local dataset and return the training and testing dataset as a tuple.

Parameters

kwargs

Returns

tuple. (training_set, testing_set)
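As an illustration of the get_data contract, here is a minimal sketch of a concrete handler. It assumes a stand-in base class so the snippet runs without ibmfl installed; in real use you would subclass ibmfl.data.data_handler.DataHandler instead. The names DataHandlerStandIn and InMemoryDataHandler, and the samples and split arguments, are hypothetical and not part of the library.

```python
from abc import ABC, abstractmethod


class DataHandlerStandIn(ABC):
    """Stand-in for ibmfl.data.data_handler.DataHandler (assumption:
    only the documented get_data contract is mirrored here)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs

    @abstractmethod
    def get_data(self, **kwargs):
        """Return the tuple (training_set, testing_set)."""


class InMemoryDataHandler(DataHandlerStandIn):
    """Hypothetical handler that splits an in-memory list of samples."""

    def __init__(self, samples, split=0.8, **kwargs):
        super().__init__(**kwargs)
        self.samples = list(samples)
        self.split = split

    def get_data(self, **kwargs):
        # The first `split` fraction is used for training, the rest for testing.
        cut = int(len(self.samples) * self.split)
        return self.samples[:cut], self.samples[cut:]


handler = InMemoryDataHandler(samples=range(10), split=0.8)
training_set, testing_set = handler.get_data()
```

Because get_data is abstract, the base class cannot be instantiated directly; only subclasses that implement it can.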

get_preprocessor(sample_data_schema, preprocessor_name, **kwargs)[source]

Set the preprocessor of the data handler class to the requested type. The supported preprocessors are normalizer, standardscaler and minmaxscaler, all based on the sklearn.preprocessing module. The preprocessor can then be applied to the party’s local dataset via its transform method.

Parameters
  • sample_data_schema (np.array) – Provided data with only feature values, used to initialize the preprocessor. The dataset is assumed to have shape (num_samples, num_features).

  • preprocessor_name (str) – The requested preprocessor name in lowercase.

  • kwargs (dict) – Additional parameters to obtain the preprocessor.

Returns

None
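The three preprocessor names correspond to classes in sklearn.preprocessing, so the idea can be sketched directly with scikit-learn. The lookup table and the build_preprocessor helper below are hypothetical; the library's internal mapping may differ. Unlike the documented method, which returns None and stores the preprocessor on the handler, this sketch returns the fitted object to stay self-contained.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

# Hypothetical lookup from the documented lowercase names to sklearn classes.
PREPROCESSORS = {
    "normalizer": Normalizer,
    "standardscaler": StandardScaler,
    "minmaxscaler": MinMaxScaler,
}


def build_preprocessor(sample_data, preprocessor_name, **kwargs):
    """Fit the requested sklearn preprocessor on feature-only data."""
    try:
        cls = PREPROCESSORS[preprocessor_name]
    except KeyError:
        raise ValueError(
            f"Unsupported preprocessor: {preprocessor_name!r}"
        ) from None
    return cls(**kwargs).fit(sample_data)


# Feature-only data with shape (num_samples, num_features).
features = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
scaler = build_preprocessor(features, "minmaxscaler")
scaled = scaler.transform(features)  # each column rescaled to [0, 1]
```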

get_statistics_of_training_data(sample_data_schema, lst_stats_name, **kwargs)[source]

Return the statistics of the local training dataset that are named in the provided list of statistics names.

Parameters
  • sample_data_schema (np.array) – Provided data with only feature values. The dataset is assumed to have shape (num_samples, num_features).

  • lst_stats_name (list of str) – A list of statistics names, all in lowercase form, for example, [‘min’], [‘mean’, ‘variance’], etc.

  • kwargs (dict) – Additional parameters to obtain the statistics.

Returns

The requested statistics based on the local dataset.

Return type

dict
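To make the name-to-statistic lookup concrete, the sketch below computes per-feature statistics with the Python standard library only. The STAT_FUNCS table and the column_statistics helper are hypothetical, and the use of the population variance is an assumption; the library may support a different set of names or a different variance definition.

```python
import statistics

# Hypothetical mapping from lowercase statistic names to per-column functions.
STAT_FUNCS = {
    "min": min,
    "max": max,
    "mean": statistics.fmean,
    "variance": statistics.pvariance,  # assumption: population variance
}


def column_statistics(rows, lst_stats_name):
    """Compute each requested statistic for every feature column."""
    # Transpose (num_samples, num_features) rows into feature columns.
    columns = list(zip(*rows))
    return {
        name: [STAT_FUNCS[name](col) for col in columns]
        for name in lst_stats_name
    }


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
stats = column_statistics(rows, ["min", "mean"])
```

The result is a dict keyed by statistic name, matching the documented return type.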

Reinforcement Learning Environment Handler

Module where data handlers are implemented.

class ibmfl.data.env_data_handler.EnvDataHandler(**kwargs)[source]

Base class to load data and environment for reinforcement learning.

abstract get_data(**kwargs)[source]

Read the training data and test data for reinforcement learning.

abstract get_env_class_ref() → ibmfl.data.env_spec.EnvHandler[source]

Get the environment reference for the RL trainer. The instance is created in the model class as part of trainer initialization.

Environment Handler Base Class

EnvHandler for OpenAI gym interface

class ibmfl.data.env_spec.EnvHandler(data=None, env_config=None)[source]

Base class for the environment handler of reinforcement learning algorithms.

__init__(data=None, env_config=None)[source]

Initializes an EnvHandler object

Parameters

env_config (dict) – Start state configuration of environment

abstract render(mode='human')[source]

Render one frame of the environment

abstract reset()[source]

Resets the state of the environment and returns an initial observation.

Returns

observation (object) – The initial observation.

abstract step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, done, info).

Parameters

action (object) – An action provided by the agent.

Returns
  • observation (object) – The agent’s observation of the current environment.

  • reward (float) – Amount of reward returned after the previous action.

  • done (bool) – Whether the episode has ended, in which case further step() calls will return undefined results.

  • info (dict) – Auxiliary diagnostic information (helpful for debugging, and sometimes learning).
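The reset and step contract can be illustrated with a small, self-contained environment. CountdownEnv and the "start" key in env_config are hypothetical; the class does not inherit from ibmfl.data.env_spec.EnvHandler so that the snippet runs without the library, but it follows the same documented method signatures.

```python
class CountdownEnv:
    """Hypothetical environment with the documented interface:
    reset() -> observation, step(action) -> (observation, reward, done, info)."""

    def __init__(self, data=None, env_config=None):
        # "start" is an assumed configuration key, used only in this sketch.
        self.start = (env_config or {}).get("start", 3)
        self.state = self.start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        self.state -= action
        done = self.state <= 0
        reward = 1.0 if done else 0.0
        info = {"remaining": max(self.state, 0)}
        return self.state, reward, done, info

    def render(self, mode="human"):
        print(f"state={self.state}")


env = CountdownEnv(env_config={"start": 3})
observation = env.reset()
done = False
steps = 0
while not done:
    observation, reward, done, info = env.step(1)
    steps += 1

# The episode has ended, so the caller resets before stepping again.
observation = env.reset()
```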

Pandas Data Handler

class ibmfl.data.pandas_data_handler.PandasDataHandler(**kwargs)[source]

Base class to load and pre-process data.

__init__(**kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract get_data()[source]

Read data and return as Pandas data frame.

Returns

A dataset structure

Return type

pandas.core.frame.DataFrame

abstract get_dataset_info(**kwargs)[source]

Read and extract data information

Returns

Information about the dataset (e.g., a dictionary that contains the list of features).

Return type

dict

get_min(dp_flag=False, **kwargs)[source]

Return the minimum value of each feature across all samples. Assumes the dataset is loaded as a pandas.DataFrame with shape (num_samples, num_features).

Parameters
  • dp_flag (boolean) – Flag for requesting a differentially private answer. Defaults to False.

  • kwargs (dict) – Dictionary of differential privacy arguments for computing the minimum value of each feature across all samples, e.g., epsilon and delta, etc.

Returns

A vector of shape (1, num_features) that stores the minimum value of each feature across all samples.

Return type

pandas.Series where each entry matches the original type of the corresponding feature.
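For the non-private case (dp_flag=False), the per-feature minimum corresponds to what pandas computes with DataFrame.min. The sketch below uses plain pandas on a made-up frame; the column names and values are illustrative only, and the differentially private path is not shown because its mechanism is not described here.

```python
import pandas as pd

# Illustrative dataset with shape (num_samples, num_features).
df = pd.DataFrame(
    {
        "age": [34, 51, 27],
        "income": [52000.0, 48000.5, 61000.0],
    }
)

# Column-wise minimum across all samples; the result is a pandas.Series
# indexed by feature name.
feature_min = df.min()
```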