opendataval.dataval.DataOob#

class opendataval.dataval.DataOob(*args, **kwargs)#

Data Out-of-Bag data valuation implementation.

Input evaluation metrics are valid only if they compare a single data point's label against several predictions of that point. Examples include accuracy and L2 distance.
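As a minimal sketch of such a metric, a callable matching the evaluate(y, y_hat) signature documented below could look like the following; the torch tensor types and one-hot label encoding are assumptions for illustration, not requirements of the library:

    import torch

    def pointwise_accuracy(y: torch.Tensor, y_hat: torch.Tensor) -> float:
        # Valid for Data-OOB: it scores each data point's label against
        # any number of out-of-bag predictions of that same point.
        return (y.argmax(-1) == y_hat.argmax(-1)).float().mean().item()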

References#

Kwon, Yongchan, and James Zou. "Data-OOB: Out-of-bag Estimate as a Simple and Data-Efficient Data Value." Proceedings of the 40th International Conference on Machine Learning (ICML), 2023. arXiv:2304.07718.

Parameters#

num_models : int, optional

Number of models to bag/aggregate, by default 1000

proportion : float, optional

Proportion of data points in the in-bag sample. sample_size = len(dataset) * proportion, by default 1.0

random_state : RandomState, optional

Random initial state, by default None

__init__(num_models: int = 1000, proportion: float = 1.0, random_state: RandomState | None = None)#
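A minimal construction sketch; the values below are illustrative, with defaults as shown in the signature above:

    from numpy.random import RandomState

    from opendataval.dataval import DataOob

    # 500 bagged models, 80% of the dataset in each in-bag sample, fixed seed
    oob = DataOob(num_models=500, proportion=0.8, random_state=RandomState(42))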

Methods

__init__([num_models, proportion, random_state])

evaluate(y, y_hat)

Evaluate performance of the specified metric between label and predictions.

evaluate_data_values()

Return data values for each training data point.

input_data(x_train, y_train, x_valid, y_valid)

Store and transform input data for Data Out-Of-Bag Evaluator.

input_fetcher(fetcher)

Input data from a DataFetcher object.

input_metric(metric)

Input the evaluation metric.

input_model(pred_model)

Input the prediction model.

input_model_metric(pred_model, metric)

Input the prediction model and the evaluation metric.

setup(fetcher[, pred_model, metric])

Input model, metric, and data into the Data Evaluator.

train(fetcher[, pred_model, metric])

Store and transform data, then train the model to predict data values.

train_data_values(*args, **kwargs)

Train the model to predict data values.
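Putting the methods above together, a hedged end-to-end sketch: the toy tensors and the LogisticRegression wrapper with an (input_dim, num_classes) signature are assumptions for illustration; only the DataOob methods used are documented on this page.

    import torch
    from numpy.random import RandomState

    from opendataval.dataval import DataOob
    from opendataval.model import LogisticRegression  # assumed torch model wrapper

    # Toy binary classification data with one-hot labels (shapes are assumptions)
    x_train, x_valid = torch.randn(100, 5), torch.randn(30, 5)
    y_train = torch.nn.functional.one_hot(torch.randint(2, (100,)), 2).float()
    y_valid = torch.nn.functional.one_hot(torch.randint(2, (30,)), 2).float()

    # Pointwise accuracy, valid per the metric requirement described above
    def accuracy(y, y_hat):
        return (y.argmax(-1) == y_hat.argmax(-1)).float().mean().item()

    oob = DataOob(num_models=200, random_state=RandomState(0))
    oob.input_model_metric(LogisticRegression(input_dim=5, num_classes=2), accuracy)
    oob.input_data(x_train, y_train, x_valid, y_valid)
    oob.train_data_values()
    values = oob.evaluate_data_values()  # one out-of-bag value per training point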

Attributes

Evaluators

data_values

Cached data values.