opendataval.dataval.GrTMCSampler
- class opendataval.dataval.GrTMCSampler(*args, **kwargs)
Truncated Monte Carlo (TMC) sampler with a Gelman-Rubin terminator, for semivalue-based methods of computing data values.
Evaluators that share marginal contributions should share a single sampler.
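To make the idea of truncated marginal contributions concrete, here is a minimal pure-Python sketch of a single TMC pass over one random permutation. It is an illustration under stated assumptions, not the library's implementation: `tmc_marginals`, `toy_utility`, `weights`, and the `tolerance` argument are all hypothetical names, and the truncation rule (stop once the coalition's utility is within `tolerance` of the full-set utility) is the usual one for truncated Monte Carlo Shapley estimation.

```python
import random


def tmc_marginals(n_points, utility, tolerance=0.01, rng=None):
    """One truncated Monte Carlo pass along a single random permutation.

    `utility` maps a list of point indices to a score. All names here are
    hypothetical; this is a sketch, not opendataval's code.
    """
    rng = rng or random.Random(0)
    order = list(range(n_points))
    rng.shuffle(order)

    full_utility = utility(order)
    marginals = [0.0] * n_points
    coalition = []
    prev = utility(coalition)

    for idx in order:
        # Truncate: once the coalition scores about as well as the full set,
        # the remaining points keep a marginal contribution of zero.
        if abs(full_utility - prev) <= tolerance:
            break
        coalition.append(idx)
        curr = utility(coalition)
        marginals[idx] = curr - prev
        prev = curr
    return marginals


# Toy additive utility: each point contributes a fixed weight.
weights = [0.5, 0.3, 0.2, 0.0, 0.0]


def toy_utility(subset):
    return sum(weights[i] for i in subset)


marginals = tmc_marginals(len(weights), toy_utility, tolerance=1e-9)
```

With an additive utility, each point's marginal contribution equals its own weight regardless of the permutation, which makes the sketch easy to check by hand. A real sampler would average such passes over many permutations.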
Parameters
- gr_threshold : float, optional
Convergence threshold for the Gelman-Rubin statistic. Computing exact Shapley values is NP-hard, so they are estimated with MCMC sampling until the statistic indicates convergence, by default 1.05
- max_mc_epochs : int, optional
Maximum number of outer epochs of MCMC sampling, by default 100
- models_per_epoch : int, optional
Number of models to fit per epoch before checking GR convergence, by default 100
- min_models : int, optional
Minimum number of samples before checking MCMC convergence, by default 1000
- min_cardinality : int, optional
Minimum cardinality of a training set; must be passed as a keyword argument, by default 5
- cache_name : str, optional
Unique cache_name of the model, used to cache marginal contributions. Set to None to disable caching. By default "", which is replaced by a value unique to the object
- random_state : RandomState, optional
Random initial state, by default None
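The role of gr_threshold can be illustrated with the textbook Gelman-Rubin statistic, which compares between-chain and within-chain variance. The sketch below uses only the standard library. `gelman_rubin`, `agreeing`, and `disagreeing` are hypothetical names, and the library's own computation (for example, how samples are split into chains or aggregated across data points) may differ.

```python
import math
import statistics


def gelman_rubin(chains):
    """Textbook Gelman-Rubin statistic for equal-length chains of scalars.

    A sketch for illustration; not necessarily what GrTMCSampler computes.
    """
    n = len(chains[0])
    # W: average of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


GR_THRESHOLD = 1.05  # same value as the documented default

# Two chains that sample the same values: the statistic is small.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
# Two chains centred far apart: the statistic is large.
disagreeing = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]

converged_agreeing = gelman_rubin(agreeing) < GR_THRESHOLD
converged_disagreeing = gelman_rubin(disagreeing) < GR_THRESHOLD
```

Under this reading, sampling would continue (up to max_mc_epochs epochs) while the statistic stays at or above the threshold, and the check would only start after min_models samples have been collected.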
- __init__(gr_threshold: float = 1.05, max_mc_epochs: int = 100, models_per_epoch: int = 100, min_models: int = 1000, min_cardinality: int = 5, cache_name: str | None = '', random_state: RandomState | None = None)
Methods
- __init__([gr_threshold, max_mc_epochs, ...])
- compute_marginal_contribution(*args, **kwargs): Compute the marginal contributions for semivalue-based data evaluators.
- set_coalition(coalition): Initializes storage to find the marginal contribution of each data point.
- set_evaluator(value_func): Sets the evaluator function used to evaluate the utility of a coalition.
Attributes
- CACHE: Cached marginal contributions.
- GR_MAX: Default maximum Gelman-Rubin statistic.