opendataval.dataval.ShapEvaluator

class opendataval.dataval.ShapEvaluator(*args, **kwargs)

Abstract class for all semivalue-based methods of computing data values.

Parameters

sampler (Sampler, optional): Sampler used to compute the marginal contributions, by default GrTMCSampler (TMC-Shapley with a Gelman-Rubin statistic terminator). Samplers can be found in opendataval/margcontrib/sampler.py. Additional arguments are passed to the sampler's constructor.
gr_threshold (float, optional): Convergence threshold for the Gelman-Rubin statistic, by default 1.05. Computing exact Shapley values is NP-hard, so they are estimated by MCMC sampling.
max_mc_epochs (int, optional): Maximum number of outer epochs of MCMC sampling, by default 100
models_per_epoch (int, optional): Number of models to fit per epoch before checking GR convergence, by default 100
min_models (int, optional): Minimum number of samples before checking MCMC convergence, by default 1000
min_cardinality (int, optional): Minimum cardinality of a training set, must be passed as a kwarg, by default 5
cache_name (str, optional): Unique cache_name of the model, used to cache marginal contributions. Set to None to disable caching, by default “”, which is replaced with a value unique to the object
random_state (RandomState, optional): Random initial state, by default None
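
The sketch below illustrates how gr_threshold, max_mc_epochs, models_per_epoch and min_models could interact in a sampling loop. It uses the textbook Gelman-Rubin statistic on scalar chains and the Python standard library only; it is not the library's implementation, which may differ in detail. The names gelman_rubin, sample_until_converged, draw and num_chains are hypothetical and exist only for this example.

```python
import random
import statistics


def gelman_rubin(chains):
    """Textbook Gelman-Rubin statistic for equal-length chains of scalars."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


def sample_until_converged(draw, num_chains=4, gr_threshold=1.05,
                           max_mc_epochs=100, models_per_epoch=100,
                           min_models=1000):
    """Draw samples epoch by epoch; stop once the GR statistic is small.

    Assumes models_per_epoch is divisible by num_chains so that all
    chains keep the same length.
    """
    chains = [[] for _ in range(num_chains)]
    total = 0
    for _ in range(max_mc_epochs):
        for i in range(models_per_epoch):
            chains[i % num_chains].append(draw())
            total += 1
        # Convergence is only checked after at least min_models samples.
        if total >= min_models and gelman_rubin(chains) < gr_threshold:
            break
    return chains, total


# Stand-in for one marginal-contribution sample (one model fit).
rng = random.Random(0)
chains, total = sample_until_converged(lambda: rng.gauss(0.0, 1.0))
```

With the defaults above, sampling never stops before min_models samples and never runs past max_mc_epochs * models_per_epoch samples.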

Methods

`__init__`([sampler])
`compute_weight`()	Compute the weights for each cardinality of training set.
`evaluate`(y, y_hat)	Evaluate performance of the specified metric between label and predictions.
`evaluate_data_values`()	Return data values for each training data point.
`input_data`(x_train, y_train, x_valid, y_valid)	Store and transform input data for semi-value samplers.
`input_fetcher`(fetcher)	Input data from a DataFetcher object.
`input_metric`(metric)	Input the evaluation metric.
`input_model`(pred_model)	Input the prediction model.
`input_model_metric`(pred_model, metric)	Input the prediction model and the evaluation metric.
`setup`(fetcher[, pred_model, metric])	Inputs model, metric and data into Data Evaluator.
`train`(fetcher[, pred_model, metric])	Store and transform data, then train model to predict data values.
`train_data_values`(*args, **kwargs)	Uses the sampler to train models, compute marginal contributions and find data values.
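
To make the relationship between marginal contributions, compute_weight and evaluate_data_values more concrete, here is a minimal sketch of a semivalue computed by plain permutation sampling (no truncation and no Gelman-Rubin check). It assumes that a data value is the weighted sum, over training-set cardinalities, of a point's average marginal contribution, and that the Shapley value weights every cardinality uniformly by 1/n. The functions sample_marginal_contribs, shapley_weights, data_values and toy_utility are hypothetical and are not part of opendataval.

```python
import random


def sample_marginal_contribs(utility, num_points, num_perms, rng):
    """Average marginal contribution of each point at each cardinality.

    Entry [i][k] is the mean change in utility observed when point i was
    added as the (k + 1)-th element of a random permutation.
    """
    sums = [[0.0] * num_points for _ in range(num_points)]
    counts = [[0] * num_points for _ in range(num_points)]
    for _ in range(num_perms):
        perm = list(range(num_points))
        rng.shuffle(perm)
        subset, prev = [], utility([])
        for k, point in enumerate(perm):
            subset.append(point)
            curr = utility(subset)
            sums[point][k] += curr - prev
            counts[point][k] += 1
            prev = curr
    return [
        [s / c if c else 0.0 for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]


def shapley_weights(num_points):
    """Uniform weight per cardinality, which yields the Shapley value."""
    return [1.0 / num_points] * num_points


def data_values(marg_contrib, weights):
    """Weighted sum of per-cardinality marginal contributions per point."""
    return [sum(m * w for m, w in zip(row, weights)) for row in marg_contrib]


# Toy additive utility: each point has a fixed worth, so the exact
# Shapley value of point i is simply worth[i].
worth = [0.5, 0.1, 0.3, 0.1]


def toy_utility(subset):
    return sum(worth[i] for i in subset)


rng = random.Random(0)
marg = sample_marginal_contribs(toy_utility, len(worth), 200, rng)
values = data_values(marg, shapley_weights(len(worth)))
```

Other semivalues can be expressed in this sketch by swapping shapley_weights for a different per-cardinality weighting while keeping the sampled marginal contributions unchanged.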

Attributes

`Evaluators`
`data_values`	Cached data values.