opendataval.dataval.InfluenceSubsample
- class opendataval.dataval.InfluenceSubsample(*args, **kwargs)
Implementation of influence computed through subsamples.
Compute the influence of each training example on the validation dataset through the closely related subsampled influence.
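The idea of subsampled influence can be illustrated with a minimal, standard-library-only sketch. It is not the opendataval implementation: the toy majority-label "model", the `accuracy` helper, and the function name `subsampled_influence` are all hypothetical, and rounding the subset size is an assumption. The sketch fits one model per random subset and scores each training point as the mean validation score of models that saw it minus the mean score of models that did not.

```python
import random
from statistics import mean


def majority_label(labels):
    """Toy stand-in for a model: predict the most frequent training label."""
    return max(sorted(set(labels)), key=labels.count)


def accuracy(prediction, y_valid):
    """Fraction of validation labels equal to the constant prediction."""
    return sum(y == prediction for y in y_valid) / len(y_valid)


def subsampled_influence(y_train, y_valid, num_models=1000, proportion=0.7, seed=0):
    """Estimate each training point's influence on validation performance.

    influence[i] = mean score of models whose subset contained point i
                   - mean score of models whose subset did not contain it
    """
    rng = random.Random(seed)
    n = len(y_train)
    subset_size = round(proportion * n)  # assumption: size is rounded
    scores_in = [[] for _ in range(n)]
    scores_out = [[] for _ in range(n)]

    for _ in range(num_models):
        # Draw a subset without replacement and "fit" the toy model on it.
        subset = set(rng.sample(range(n), subset_size))
        prediction = majority_label([y_train[i] for i in subset])
        score = accuracy(prediction, y_valid)
        # Record the score on the "in" or "out" side of every point.
        for i in range(n):
            (scores_in if i in subset else scores_out)[i].append(score)

    # A side with no recorded scores contributes 0.0 to the difference.
    return [
        (mean(s_in) if s_in else 0.0) - (mean(s_out) if s_out else 0.0)
        for s_in, s_out in zip(scores_in, scores_out)
    ]


# Four training points agree with the validation labels, two do not.
influences = subsampled_influence([1, 1, 1, 1, 0, 0], [1, 1, 1, 1], num_models=500)
```

In this toy setup, points whose label matches the validation labels should receive positive values and the mismatched points negative values.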
References
Parameters
- num_models (int, optional)
Number of models to fit when estimating data values, by default 1000
- proportion (float, optional)
Proportion of data points in each sample; each subset has cardinality \(p \cdot n\), where \(p\) is the proportion and \(n\) is the number of training points. By default 0.7, as specified by V. Feldman and C. Zhang
- random_state (RandomState, optional)
Random initial state, by default None
- __init__(num_models: int = 1000, proportion: float = 0.7, random_state: RandomState | None = None)
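To make the role of `proportion` and a seeded random state concrete, here is a small sketch of drawing one subset. The helper `draw_subset` is hypothetical, it uses Python's `random.Random` rather than the `RandomState` type named in the signature, and rounding the subset size to the nearest integer is an assumption, since the documentation only gives the product of proportion and number of points.

```python
import random


def draw_subset(num_points, proportion=0.7, rng=None):
    """Return sorted indices of one random subset, drawn without replacement."""
    rng = rng or random.Random()
    subset_size = round(proportion * num_points)  # assumption: rounded
    return sorted(rng.sample(range(num_points), subset_size))


# With 10 training points and the default proportion, each subset has 7 indices.
first = draw_subset(10, 0.7, random.Random(42))
# Reusing the same seed reproduces the same subset.
second = draw_subset(10, 0.7, random.Random(42))
```

Passing an explicitly seeded generator is what makes repeated runs comparable; leaving it unset gives a different subset on each call.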
Methods
- __init__([num_models, proportion, random_state])
- evaluate(y, y_hat)
Evaluate performance of the specified metric between label and predictions.
- evaluate_data_values()
Return data values for each training data point.
- input_data(x_train, y_train, x_valid, y_valid)
Store and transform input data for Influence Subsample Data Valuation.
- input_fetcher(fetcher)
Input data from a DataFetcher object.
- input_metric(metric)
Input the evaluation metric.
- input_model(pred_model)
Input the prediction model.
- input_model_metric(pred_model, metric)
Input the prediction model and the evaluation metric.
- setup(fetcher[, pred_model, metric])
Inputs model, metric and data into Data Evaluator.
- train(fetcher[, pred_model, metric])
Store and transform data, then train model to predict data values.
- train_data_values(*args, **kwargs)
Trains model to predict data values.
Attributes
- Evaluators
- data_values
Cached data values.
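The method summaries suggest a natural call order: supply data, supply the model and metric, train, then read the values. The sketch below shows that order using only method names listed on this page. The call order itself is an assumption inferred from those summaries, `run_valuation` is a hypothetical helper, and `StubEvaluator` is a stand-in that only records calls, not the opendataval class; the argument types the real class expects are not stated here.

```python
def run_valuation(evaluator, x_train, y_train, x_valid, y_valid, pred_model, metric):
    """Drive an evaluator through the documented steps, in the assumed order."""
    evaluator.input_data(x_train, y_train, x_valid, y_valid)
    evaluator.input_model_metric(pred_model, metric)
    evaluator.train_data_values()
    return evaluator.evaluate_data_values()


class StubEvaluator:
    """Hypothetical stand-in (NOT the library class); it only records calls."""

    def __init__(self):
        self.calls = []
        self.num_points = 0

    def input_data(self, x_train, y_train, x_valid, y_valid):
        self.calls.append("input_data")
        self.num_points = len(x_train)

    def input_model_metric(self, pred_model, metric):
        self.calls.append("input_model_metric")

    def train_data_values(self):
        self.calls.append("train_data_values")

    def evaluate_data_values(self):
        self.calls.append("evaluate_data_values")
        # One placeholder value per training point.
        return [0.0] * self.num_points


stub = StubEvaluator()
values = run_valuation(stub, [[0.0], [1.0]], [0, 1], [[0.5]], [1], None, None)
```

With a real evaluator, the returned sequence would hold one data value per training point, as described for `evaluate_data_values()`.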