opendataval.dataval.ShapEvaluator

class opendataval.dataval.ShapEvaluator(*args, **kwargs)

Abstract class for all semivalue-based methods of computing data values.

Attributes

sampler : Sampler, optional

Sampler used to compute the marginal contributions. By default, TMC-Shapley sampling with a Gelman-Rubin statistic terminator is used. Samplers are found in opendataval/margcontrib/sampler.py.
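To make "marginal contribution" concrete, the following is a minimal sketch of permutation-based Monte Carlo estimation, the idea underlying TMC-Shapley. It is not the library's sampler: truncation and the Gelman-Rubin check are omitted, and the names `permutation_shapley`, `points` and `utility` are hypothetical, introduced only for illustration.

```python
import random


def permutation_shapley(points, utility, num_permutations=200, seed=0):
    """Estimate Shapley-style values by averaging marginal contributions.

    `points` is a list of hashable training-point ids. `utility` maps a list
    of ids to a score and stands in for "fit a model on this subset and
    evaluate the metric". Both names are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    totals = {p: 0.0 for p in points}
    for _ in range(num_permutations):
        order = list(points)
        rng.shuffle(order)  # one random ordering of the training points
        subset = []
        prev_score = utility(subset)
        for p in order:
            subset.append(p)
            score = utility(subset)
            # Marginal contribution of p: change in utility when p is added.
            totals[p] += score - prev_score
            prev_score = score
    return {p: total / num_permutations for p, total in totals.items()}


# Toy additive utility: each point contributes a fixed amount to the score.
worth = {"a": 1.0, "b": 2.0, "c": -0.5}
values = permutation_shapley(list(worth), lambda s: sum(worth[p] for p in s))
```

With an additive utility, every ordering yields the same marginal contribution for a point, so the estimate matches each point's own worth. With a real model the estimate varies between orderings, which is why a convergence check is needed.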

Parameters

sampler : Sampler, optional

Sampler used to compute the marginal contributions. Samplers can be found in opendataval/margcontrib/sampler.py. By default GrTMCSampler, in which case the additional arguments are passed to the sampler's constructor.

gr_threshold : float, optional

Convergence threshold for the Gelman-Rubin statistic. Computing Shapley values exactly is NP-hard, so they are approximated with MCMC sampling, by default 1.05

max_mc_epochs : int, optional

Maximum number of outer epochs of MCMC sampling, by default 100

models_per_epoch : int, optional

Number of models to fit per epoch before checking Gelman-Rubin convergence, by default 100

min_models : int, optional

Minimum samples before checking MCMC convergence, by default 1000

min_cardinality : int, optional

Minimum cardinality of a training set. Must be passed as a keyword argument, by default 5

cache_name : str, optional

Unique cache_name of the model, used to cache marginal contributions. Set to None to disable caching, by default “”, which is replaced with a value unique to the object

random_state : RandomState, optional

Random initial state, by default None
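The convergence parameters above refer to the Gelman-Rubin statistic. As a rough illustration, the sketch below computes the textbook form of the statistic for several equally long chains of scalar estimates, using only the standard library. The function name `gelman_rubin` and the example chains are hypothetical, and the library's own computation may differ in detail (for instance in how it aggregates over data points).

```python
import math
import statistics


def gelman_rubin(chains):
    """Return the Gelman-Rubin statistic for equally long chains of scalars."""
    num_chains = len(chains)
    chain_len = len(chains[0])
    if num_chains < 2 or chain_len < 2:
        raise ValueError("need at least 2 chains with at least 2 samples each")
    if any(len(chain) != chain_len for chain in chains):
        raise ValueError("all chains must have the same length")

    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances (must be non-zero).
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = chain_len * statistics.variance(chain_means)
    # Pooled estimate of the variance of the target quantity.
    pooled = (chain_len - 1) / chain_len * within + between / chain_len
    return math.sqrt(pooled / within)


GR_THRESHOLD = 1.05  # mirrors the documented default of gr_threshold

agreeing = [[0.9, 1.1, 1.0, 1.2], [1.0, 1.2, 0.9, 1.1]]
diverging = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]

r_agreeing = gelman_rubin(agreeing)
r_diverging = gelman_rubin(diverging)
converged = r_agreeing < GR_THRESHOLD
```

Chains that agree give a statistic at or below the threshold, so sampling could stop. Chains with very different means give a much larger value, signalling that more epochs are needed.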

__init__(sampler: Sampler | None = None, *args, **kwargs)

Methods

__init__([sampler])

compute_weight()

Compute the weights for each cardinality of training set.

evaluate(y, y_hat)

Evaluate performance of the specified metric between label and predictions.

evaluate_data_values()

Return data values for each training data point.

input_data(x_train, y_train, x_valid, y_valid)

Store and transform input data for semi-value samplers.

input_fetcher(fetcher)

Input data from a DataFetcher object.

input_metric(metric)

Input the evaluation metric.

input_model(pred_model)

Input the prediction model.

input_model_metric(pred_model, metric)

Input the prediction model and the evaluation metric.

setup(fetcher[, pred_model, metric])

Inputs model, metric and data into Data Evaluator.

train(fetcher[, pred_model, metric])

Store and transform data, then train model to predict data values.

train_data_values(*args, **kwargs)

Uses the sampler to train models, compute marginal contributions, and derive data values.
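Semivalue methods differ mainly in how they weight marginal contributions by training-set cardinality, which is the role of compute_weight(). The sketch below shows two well-known weightings (uniform for Shapley, binomial for Banzhaf) and how a data value could be formed as a weighted sum. The function names and the `marginals` layout are assumptions for illustration. They do not reproduce the library's internal arrays or normalization.

```python
import math


def shapley_weights(num_points):
    """Uniform weight for every cardinality, as in the Shapley value."""
    return [1.0 / num_points] * num_points


def banzhaf_weights(num_points):
    """Binomial weights: subsets of size j are weighted by how many exist."""
    n = num_points
    return [math.comb(n - 1, j) / 2 ** (n - 1) for j in range(n)]


def semivalue(marginals, weights):
    """Combine per-cardinality marginal contributions into data values.

    marginals[i][j] is assumed to hold the average marginal contribution of
    point i when it joins a subset of size j (a hypothetical layout).
    """
    return [sum(w * m for w, m in zip(weights, row)) for row in marginals]


# Two points, three cardinalities: point 0 always helps, point 1 only
# helps when joining subsets of size 1.
marginals = [[1.0, 1.0, 1.0], [0.0, 3.0, 0.0]]
shapley_values = semivalue(marginals, shapley_weights(3))
banzhaf_values = semivalue(marginals, banzhaf_weights(3))
```

Both weightings sum to one, so a point whose contribution is constant across cardinalities receives the same value under either. They disagree only when the contribution depends on the subset size, as for the second point here.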

Attributes

Evaluators

data_values

Cached data values.