opendataval.dataval.knnshap package#
Submodules#
opendataval.dataval.knnshap.knnshap module#
- class opendataval.dataval.knnshap.knnshap.KNNShapley(*args, **kwargs)#
Bases: DataEvaluator, ModelLessMixin
Data valuation using KNNShapley implementation.
KNNShapley uses the model-less mixin, so no underlying prediction model can be specified for the DataEvaluator. A pretrained embedding model can still be specified.
References#
Parameters#
- k_neighbors : int, optional
Number of neighbors to group the data points, by default 10
- batch_size : int, optional
Batch size of tensors to load at a time during training, by default 32
- embedding_model : Model, optional
Pre-trained embedding model used by DataEvaluator, by default None
- random_state : RandomState, optional
Random initial state, by default None
- evaluate_data_values() -> ndarray#
Return data values for each training data point.
Computes data values using KNN Shapley data valuation.
Returns#
- np.ndarray
Predicted data values/selection for each training input data point
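As a rough illustration of what a KNN Shapley computation can look like, the sketch below uses the sorted-neighbor recursion commonly used for KNN Shapley. It is not taken from this class. The function name `knn_shapley`, its arguments, the use of plain Python lists instead of tensors, and Euclidean distance are all assumptions made for this example.

```python
import math


def knn_shapley(x_train, y_train, x_valid, y_valid, k_neighbors=10):
    """Hypothetical helper: KNN Shapley values averaged over validation points."""
    n = len(x_train)
    values = [0.0] * n
    for x_v, y_v in zip(x_valid, y_valid):
        # Rank training points from nearest to farthest (Euclidean distance assumed).
        order = sorted(range(n), key=lambda i: math.dist(x_train[i], x_v))
        # 1.0 where the training label equals the validation label, else 0.0.
        match = [1.0 if y_train[i] == y_v else 0.0 for i in order]
        score = [0.0] * n
        # Base case: the farthest training point.
        score[n - 1] = match[n - 1] / n
        # Recurse from the second-farthest point inward to the nearest one.
        for pos in range(n - 2, -1, -1):
            rank = pos + 1  # 1-based rank of the point at this position
            score[pos] = score[pos + 1] + (
                (match[pos] - match[pos + 1]) / k_neighbors
            ) * min(k_neighbors, rank) / rank
        # Map scores back to original training indices and average.
        for pos, i in enumerate(order):
            values[i] += score[pos] / len(x_valid)
    return values


# Toy data: two training points share the validation label, one does not.
demo_values = knn_shapley(
    x_train=[[0.0], [1.0], [5.0]],
    y_train=[0, 0, 1],
    x_valid=[[0.2]],
    y_valid=[0],
    k_neighbors=1,
)
print(demo_values)  # [0.5, 0.5, 0.0]
```

In this toy case the two training points that share the validation label split the credit equally, and the mislabeled far point receives zero.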
- match(y: Tensor) -> Tensor#
Returns \(1.\) for all matching rows and \(0.\) otherwise.
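A minimal sketch of row matching follows. It assumes that "matching" means every entry of a row equals the corresponding entry of a reference label row, for example with one-hot labels. The name `match_rows` and the use of plain lists instead of tensors are hypothetical and not part of this package.

```python
def match_rows(y_rows, y_ref):
    """Hypothetical helper: 1.0 for rows equal to y_ref entry by entry, else 0.0."""
    ref = list(y_ref)
    return [1.0 if list(row) == ref else 0.0 for row in y_rows]


# One-hot labels: rows 0 and 2 equal the reference row, row 1 does not.
print(match_rows([[1, 0], [0, 1], [1, 0]], [1, 0]))  # [1.0, 0.0, 1.0]
```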
- train_data_values(*args, **kwargs)#
Trains model to predict data values.
Computes KNN Shapley data values, following the implementation referenced below. Ignores all positional and keyword arguments.
References#
[1] PyTorch implementation <https://github.com/AI-secure/Shapley-Study/blob/master/shapley/measures/KNN_Shapley.py>