opendataval.dataval.KNNShapley

class opendataval.dataval.KNNShapley(*args, **kwargs)

Data valuation using the KNN Shapley method.

KNN Shapley is a model-less data evaluator. This means no underlying prediction model can be specified for the DataEvaluator; a pretrained embedding model can be specified instead.
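To illustrate why no prediction model is needed, here is a minimal pure-Python sketch of the KNN Shapley recursion described by Jia et al. (2019), which relies only on distances and labels. It is not the library's implementation: the function name `knn_shapley`, the use of Euclidean distance on raw features, and the plain-list inputs are all assumptions made for illustration, and the library may differ in details such as batching and embeddings.

```python
import math


def knn_shapley(x_train, y_train, x_valid, y_valid, k_neighbors=10):
    """Hypothetical helper: average KNN Shapley value per training point."""
    n = len(x_train)
    values = [0.0] * n
    for x_v, y_v in zip(x_valid, y_valid):
        # Rank training points from nearest to farthest from this validation point.
        order = sorted(range(n), key=lambda i: math.dist(x_train[i], x_v))
        # 1.0 where the ranked training label matches the validation label.
        hit = [1.0 if y_train[i] == y_v else 0.0 for i in order]
        score = [0.0] * n
        # The farthest point starts the recursion.
        score[n - 1] = hit[n - 1] / n
        # Walk toward the nearest point; rank is the 1-based sorted position.
        for j in range(n - 2, -1, -1):
            rank = j + 1
            weight = min(k_neighbors, rank) / (k_neighbors * rank)
            score[j] = score[j + 1] + (hit[j] - hit[j + 1]) * weight
        # Map scores back to the original training order and average.
        for j, i in enumerate(order):
            values[i] += score[j] / len(x_valid)
    return values


# Two class-0 points near the validation point, one distant class-1 point.
example_values = knn_shapley(
    x_train=[[0.0], [1.0], [5.0]],
    y_train=[0, 0, 1],
    x_valid=[[0.2]],
    y_valid=[0],
    k_neighbors=1,
)
```

With `k_neighbors=1`, the two nearby class-0 points split the credit evenly (0.5 each) and the distant class-1 point receives 0.0, so the values sum to the 1-nearest-neighbor utility of the full training set.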

Parameters

k_neighbors : int, optional

Number of neighbors to group the data points, by default 10

batch_size : int, optional

Batch size of tensors to load at a time during training, by default 32

embedding_model : Model, optional

Pre-trained embedding model used by DataEvaluator, by default None

random_state : RandomState, optional

Random initial state, by default None

__init__(k_neighbors: int = 10, batch_size: int = 32, embedding_model: Model | None = None, random_state: RandomState | None = None)

Methods

__init__([k_neighbors, batch_size, ...])

embeddings(*tensors)

Return embeddings for the input tensors.

evaluate_data_values()

Return data values for each training data point.

input_data(x_train, y_train, x_valid, y_valid)

Store and transform input data for DataEvaluator.

input_fetcher(fetcher)

Input data from a DataFetcher object.

match(y)

Return \(1.\) for all matching rows and \(0.\) otherwise.

setup(fetcher[, pred_model, metric])

Input model, metric, and data into the DataEvaluator.

train(fetcher[, pred_model, metric])

Store and transform data, then train model to predict data values.

train_data_values(*args, **kwargs)

Trains model to predict data values.
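The `match(y)` entry in the list above describes a row-wise indicator. A minimal pure-Python sketch of that idea follows; the helper name `match_rows`, its two-argument form, and the assumption that label rows are compared against stored target rows (for example one-hot validation labels) are hypothetical and are not taken from the library.

```python
def match_rows(y_rows, target_rows):
    """Hypothetical helper: 1.0 where a label row equals its target row, else 0.0."""
    return [
        1.0 if list(row) == list(target) else 0.0
        for row, target in zip(y_rows, target_rows)
    ]


# One-hot labels: rows 0 and 2 agree with their targets, row 1 does not.
labels = [[1, 0], [0, 1], [1, 0]]
targets = [[1, 0], [1, 0], [1, 0]]
indicator = match_rows(labels, targets)
```

Comparing whole rows rather than single entries means a one-hot label counts as a match only when every position agrees.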

Attributes

Evaluators

data_values

Cached data values.