opendataval.dataloader.mix_labels

opendataval.dataloader.mix_labels(fetcher: DataFetcher, noise_rate: float = 0.2) → dict[str, ndarray]

Mixes y_train labels of a DataFetcher, adding noise to data.

Given n unique labels, each selected label is shifted forward by between 1 and n-1 positions in the set of unique labels, wrapping around at the end. Because the shift is never 0 or n, the noisy label always differs from the original.
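
The shifting scheme described above can be sketched in plain NumPy. This is a minimal illustration, not opendataval's implementation: the function name `shift_labels`, its `seed` argument, and the use of `np.random.default_rng` are assumptions made for the example, and it handles only 1-D label arrays.

```python
import numpy as np


def shift_labels(y, noise_rate=0.2, seed=0):
    """Hypothetical helper: return a noisy copy of ``y`` and the changed indices."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)

    # classes holds the sorted unique labels; mapping[i] is the position of y[i] in classes
    classes, mapping = np.unique(y, return_inverse=True)
    n = len(classes)
    if n < 2:
        raise ValueError("need at least two unique labels to add label noise")

    # Pick which samples receive a noisy label, without repeats
    num_noisy = round(len(y) * noise_rate)
    noisy_idx = rng.choice(len(y), size=num_noisy, replace=False)

    # A shift in [1, n-1] taken modulo n can never land on the original class
    shift = rng.integers(1, n, size=num_noisy)

    noisy = y.copy()
    noisy[noisy_idx] = classes[(mapping[noisy_idx] + shift) % n]
    return noisy, noisy_idx


labels = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2, 0])
noisy, idx = shift_labels(labels, noise_rate=0.3, seed=0)
```

Drawing the shift from 1 to n-1 rather than sampling a replacement label directly avoids a rejection loop: every draw is valid, so each selected sample is guaranteed a label different from its original one.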

Parameters

fetcher : DataFetcher

DataFetcher object holding the data to which label noise is added

noise_rate : float

Proportion of labels to add noise to, by default 0.2

Returns

dict[str, np.ndarray]

Dictionary of updated data points:

  • “y_train” – Training labels after mixing

  • “y_valid” – Validation labels after mixing

  • “noisy_train_indices” – Indices of the training samples whose labels were mixed