opendataval.dataval.InfluenceSubsample
- class opendataval.dataval.InfluenceSubsample(*args, **kwargs)
Implementation of influence computed through subsampling.
Compute the influence of each training example on the validation dataset using the closely related subsampled influence estimate.
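As a rough guide to what "subsampled influence" means here, the estimate can be written as a difference of two averages. The notation below is an assumption introduced for illustration and does not come from this page: \(S_1, \dots, S_m\) are the random training subsets, and \(U(S_k)\) is the validation performance of a model fit on \(S_k\).

\[
\widehat{\mathrm{infl}}(i)
= \frac{1}{\lvert \{k : i \in S_k\} \rvert} \sum_{k \,:\, i \in S_k} U(S_k)
\;-\;
\frac{1}{\lvert \{k : i \notin S_k\} \rvert} \sum_{k \,:\, i \notin S_k} U(S_k)
\]

Under this reading, a training point whose presence tends to raise validation performance gets a positive value, and a harmful point gets a negative one.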
References
Parameters
- num_models : int, optional
Number of models to fit in order to estimate data values, by default 1000
- proportion : float, optional
Proportion of data points in each sample; the cardinality of each subset is \(p \cdot \mathit{num\_points}\), by default 0.7 as specified by V. Feldman and C. Zhang
- random_state : RandomState, optional
Random initial state, by default None
- __init__(num_models: int = 1000, proportion: float = 0.7, random_state: RandomState | None = None)
Methods
- __init__([num_models, proportion, random_state])
- evaluate(y, y_hat)
Evaluate performance of the specified metric between label and predictions.
- evaluate_data_values()
Return data values for each training data point.
- input_data(x_train, y_train, x_valid, y_valid)
Store and transform input data for Influence Subsample Data Valuation.
- input_fetcher(fetcher)
Input data from a DataFetcher object.
- input_metric(metric)
Input the evaluation metric.
- input_model(pred_model)
Input the prediction model.
- input_model_metric(pred_model, metric)
Input the prediction model and the evaluation metric.
- setup(fetcher[, pred_model, metric])
Inputs model, metric and data into Data Evaluator.
- train(fetcher[, pred_model, metric])
Store and transform data, then train model to predict data values.
- train_data_values(*args, **kwargs)
Trains model to predict data values.
Attributes
- Evaluators
- data_values
Cached data values.
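The procedure this class describes (fit many models on random subsets, then compare validation performance with and without each point) can be illustrated with a minimal, standard-library-only sketch. This is not the opendataval implementation: `fit_centroids`, `accuracy`, and `influence_subsample` are hypothetical helpers, the nearest-centroid model and the toy data are assumptions chosen to keep the example small, and the subset size is assumed to be `round(proportion * num_points)`.

```python
import random
from statistics import mean


def fit_centroids(xs, ys):
    """Toy stand-in for a prediction model: per-class mean of a 1-D feature."""
    return {
        label: mean(x for x, y in zip(xs, ys) if y == label)
        for label in set(ys)
    }


def accuracy(centroids, xs, ys):
    """Fraction of points whose nearest centroid carries the true label."""
    hits = 0
    for x, y in zip(xs, ys):
        pred = min(centroids, key=lambda label: abs(x - centroids[label]))
        hits += pred == y
    return hits / len(xs)


def influence_subsample(x_train, y_train, x_valid, y_valid,
                        num_models=1000, proportion=0.7, seed=None):
    """Sketch of subsampled influence: mean performance of models trained
    with point i minus mean performance of models trained without it."""
    rng = random.Random(seed)
    n = len(x_train)
    k = round(proportion * n)  # assumed subset cardinality
    perf_with = [[] for _ in range(n)]
    perf_without = [[] for _ in range(n)]

    for _ in range(num_models):
        subset = rng.sample(range(n), k)  # draw without replacement
        members = set(subset)
        model = fit_centroids([x_train[i] for i in subset],
                              [y_train[i] for i in subset])
        perf = accuracy(model, x_valid, y_valid)
        for i in range(n):
            (perf_with[i] if i in members else perf_without[i]).append(perf)

    # A point that was never included (or never excluded) gets 0.0 here.
    return [
        mean(a) - mean(b) if a and b else 0.0
        for a, b in zip(perf_with, perf_without)
    ]


# Toy data: two well-separated classes; the last training point is mislabeled.
x_train = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0, 10.0]
y_train = [0, 0, 0, 1, 1, 1, 0]
x_valid = [1.0, 2.0, 5.5, 9.0]
y_valid = [0, 0, 1, 1]

values = influence_subsample(x_train, y_train, x_valid, y_valid,
                             num_models=500, proportion=0.7, seed=0)
worst = values.index(min(values))
print(worst, values[worst])
```

On this toy data the mislabeled point (index 6) receives the lowest value, because every subset that contains it misclassifies the validation point at 5.5. With a real dataset and model, the number of fitted models trades runtime against the variance of the estimate.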