opendataval.dataval.InfluenceSubsample#

class opendataval.dataval.InfluenceSubsample(*args, **kwargs)#

Implementation of influence computed through subsamples.

Compute the influence of each training example on the validation dataset through a closely related subsampled-influence approximation.

References#

V. Feldman and C. Zhang. What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation. NeurIPS, 2020.

Parameters#

num_models : int, optional

Number of models to fit on random subsets in order to estimate data values, by default 1000

proportion : float, optional

Proportion of data points in each sampled subset; the cardinality of each subset is \(p \cdot \text{num\_points}\). By default 0.7, as specified by V. Feldman and C. Zhang

random_state : RandomState, optional

Random initial state, by default None

__init__(num_models: int = 1000, proportion: float = 0.7, random_state: RandomState | None = None)#
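
The estimator follows the subsampled-influence idea of Feldman and Zhang: fit many models on random subsets of the training data, then score each training point by the gap in validation performance between models whose subset contained the point and models whose subset did not. The following is a minimal, self-contained sketch of that idea using scikit-learn; it is not the library's implementation, and names such as subsampled_influence and utility are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def subsampled_influence(x_train, y_train, x_valid, y_valid,
                         num_models=1000, proportion=0.7, seed=0):
    # Sketch of subsampled influence, not opendataval's implementation.
    rng = np.random.default_rng(seed)
    n = len(x_train)
    size = int(proportion * n)  # cardinality of each subset: p * num_points
    in_subset = np.zeros((num_models, n), dtype=bool)
    utility = np.zeros(num_models)

    for t in range(num_models):
        idx = rng.choice(n, size=size, replace=False)  # random subset S_t
        in_subset[t, idx] = True
        model = LogisticRegression(max_iter=1000).fit(x_train[idx], y_train[idx])
        utility[t] = accuracy_score(y_valid, model.predict(x_valid))

    # Influence of point i: mean utility of models that saw i minus
    # mean utility of models that did not.
    influence = np.empty(n)
    for i in range(n):
        influence[i] = utility[in_subset[:, i]].mean() - utility[~in_subset[:, i]].mean()
    return influence

# Toy usage on synthetic data (small num_models to keep the run quick).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
x_tr, x_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
print(subsampled_influence(x_tr, y_tr, x_va, y_va, num_models=50)[:5])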

Methods

__init__([num_models, proportion, random_state])

evaluate(y, y_hat)

Evaluate performance of the specified metric between labels and predictions.

evaluate_data_values()

Return data values for each training data point.

input_data(x_train, y_train, x_valid, y_valid)

Store and transform input data for Influence Subsample Data Valuation.

input_fetcher(fetcher)

Input data from a DataFetcher object.

input_metric(metric)

Input the evaluation metric.

input_model(pred_model)

Input the prediction model.

input_model_metric(pred_model, metric)

Input the prediction model and the evaluation metric.

setup(fetcher[, pred_model, metric])

Input the model, metric, and data into the Data Evaluator.

train(fetcher[, pred_model, metric])

Store and transform data, then train model to predict data values.

train_data_values(*args, **kwargs)

Train the model to predict data values.

Attributes

Evaluators

data_values

Cached data values.
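
Once train_data_values has run, evaluate_data_values() (and the cached data_values attribute) yields one score per training point. A common follow-up, sketched below with a placeholder array standing in for real output, is to rank training points and inspect the lowest-valued ones as candidates for noisy or mislabeled data.

import numpy as np

# Placeholder standing in for the array returned by evaluate_data_values();
# a real run produces one influence score per training point.
data_values = np.random.default_rng(0).normal(size=100)

ranking = np.argsort(data_values)  # ascending: least valuable first
lowest = ranking[:10]              # candidates for noisy or mislabeled points
highest = ranking[-10:]            # most beneficial training points
print("Lowest-valued indices:", lowest)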