opendataval.dataval.InfluenceSubsample#

class opendataval.dataval.InfluenceSubsample(*args, **kwargs)#

Implementation of influence computed through subsamples.

Compute the influence of each training example on the validation dataset through a closely related subsampled-influence approximation.

References#

V. Feldman and C. Zhang. What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation. NeurIPS, 2020.

Parameters#

num_models : int, optional

Number of models to fit on random subsets in order to estimate data values, by default 1000

proportion : float, optional

Proportion of data points in each sampled subset; the cardinality of each subset is \(p \cdot \text{num\_points}\). By default 0.7, as specified by V. Feldman and C. Zhang

random_state : RandomState, optional

Random initial state, by default None

__init__(num_models: int = 1000, proportion: float = 0.7, random_state: RandomState | None = None)#
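
The estimator follows the subsampled-influence idea of Feldman and Zhang: fit many models on random subsets of the training data, then score each training point by the gap in validation performance between models whose subset contained the point and models whose subset did not. The following is a minimal, self-contained sketch of that idea using scikit-learn; it is not the library's implementation, and names such as subsampled_influence and utility are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def subsampled_influence(x_train, y_train, x_valid, y_valid,
                         num_models=1000, proportion=0.7, seed=0):
    # Sketch of subsampled influence, not opendataval's implementation.
    rng = np.random.default_rng(seed)
    n = len(x_train)
    size = int(proportion * n)  # cardinality of each subset: p * num_points
    in_subset = np.zeros((num_models, n), dtype=bool)
    utility = np.zeros(num_models)

    for t in range(num_models):
        idx = rng.choice(n, size=size, replace=False)  # random subset S_t
        in_subset[t, idx] = True
        model = LogisticRegression(max_iter=1000).fit(x_train[idx], y_train[idx])
        utility[t] = accuracy_score(y_valid, model.predict(x_valid))

    # Influence of point i: mean utility of models that saw i minus
    # mean utility of models that did not.
    influence = np.empty(n)
    for i in range(n):
        influence[i] = utility[in_subset[:, i]].mean() - utility[~in_subset[:, i]].mean()
    return influence

# Toy usage on synthetic data (small num_models to keep the run quick).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
x_tr, x_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
print(subsampled_influence(x_tr, y_tr, x_va, y_va, num_models=50)[:5])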

Methods

__init__([num_models, proportion, random_state])

evaluate(y, y_hat)

Evaluate performance of the specified metric between labels and predictions.

evaluate_data_values()

Return data values for each training data point.

input_data(x_train, y_train, x_valid, y_valid)

Store and transform input data for Influence Subsample Data Valuation.

input_fetcher(fetcher)

Input data from a DataFetcher object.

input_metric(metric)

Input the evaluation metric.

input_model(pred_model)

Input the prediction model.

input_model_metric(pred_model, metric)

Input the prediction model and the evaluation metric.

setup(fetcher[, pred_model, metric])

Input the model, metric, and data into the Data Evaluator.

train(fetcher[, pred_model, metric])

Store and transform data, then train model to predict data values.

train_data_values(*args, **kwargs)

Train the model to predict data values.

Attributes

Evaluators

data_values

Cached data values.
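
Once train_data_values has run, evaluate_data_values() (and the cached data_values attribute) yields one score per training point. A common follow-up, sketched below with a placeholder array standing in for real output, is to rank training points and inspect the lowest-valued ones as candidates for noisy or mislabeled data.

import numpy as np

# Placeholder standing in for the array returned by evaluate_data_values();
# a real run produces one influence score per training point.
data_values = np.random.default_rng(0).normal(size=100)

ranking = np.argsort(data_values)  # ascending: least valuable first
lowest = ranking[:10]              # candidates for noisy or mislabeled points
highest = ranking[-10:]            # most beneficial training points
print("Lowest-valued indices:", lowest)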