opendataval.dataval.DVRL#

class opendataval.dataval.DVRL(*args, **kwargs)#

Data valuation using reinforcement learning class, implemented with PyTorch.

References#

hidden_dimint, optional: Hidden dimensions for the RL Multilayer Perceptron Value Estimator (VE) (details in DataValueEstimatorRL class), by default 100
layer_numberint, optional: Number of hidden layers for the Value Estimator (VE), by default 5
comb_dimint, optional: After concat inputs how many layers, much less than hidden_dim, by default 10
rl_epochsint, optional: Number of training epochs for the VE, by default 1000
rl_batch_sizeint, optional: Batch size for training the VE, by default 32
lrfloat, optional: Learning rate for the VE, by default 0.01
thresholdfloat, optional: Search rate threshold, the VE may get stuck in certain bounds close to \([0, 1]\), thus outside of \([1-threshold, threshold]\) we encourage searching, by default 0.9
devicetorch.device, optional: Tensor device for acceleration, by default torch.device(“cpu”)
random_stateRandomState, optional: Random initial state, by default None

__init__(hidden_dim: int = 100, layer_number: int = 5, comb_dim: int = 10, rl_epochs: int = 1000, rl_batch_size: int = 32, lr: float = 0.01, threshold: float = 0.9, device: device = device(type='cpu'), random_state: RandomState | None = None)#

Methods

`__init__`([hidden_dim, layer_number, ...])
`evaluate`(y, y_hat)	Evaluate performance of the specified metric between label and predictions.
`evaluate_data_values`()	Return data values for each training data point.
`input_data`(x_train, y_train, x_valid, y_valid)	Store and transform input data for DVRL.
`input_fetcher`(fetcher)	Input data from a DataFetcher object.
`input_metric`(metric)	Input the evaluation metric.
`input_model`(pred_model)	Input the prediction model.
`input_model_metric`(pred_model, metric)	Input the prediction model and the evaluation metric.
`setup`(fetcher[, pred_model, metric])	Inputs model, metric and data into Data Evaluator.
`train`(fetcher[, pred_model, metric])	Store and transform data, then train model to predict data values.
`train_data_values`(*args[, num_workers])	Trains model to predict data values.

Attributes

`Evaluators`
`data_values`	Cached data values.