torchaudio.datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e., they implement the __getitem__ and __len__ methods. They can therefore all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:

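import torch
import torchaudio

# Download the YESNO dataset into the current directory if it is not already there.
yesno_data = torchaudio.datasets.YESNO('.', download=True)
data_loader = torch.utils.data.DataLoader(yesno_data,
                                          batch_size=1,
                                          shuffle=True,
                                          num_workers=4)  # worker processes; any non-negative int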

The following datasets are available: VCTK and YESNO.

Datasets

All the datasets share a nearly identical API. They all accept two common arguments, transform and target_transform, which transform the input and the target respectively.
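The way these two arguments are applied can be sketched with a small stand-in class. This is not torchaudio's implementation: ToyAudioDataset, peak_normalize and to_label_name are hypothetical names invented for illustration, and audio is modelled as a plain Python list of floats rather than a torch.Tensor so the sketch runs without downloading anything. It only shows the general pattern, assuming each callable is applied to its half of the sample inside __getitem__.

```python
class ToyAudioDataset:
    """Hypothetical stand-in that mimics the transform/target_transform pattern."""

    def __init__(self, samples, transform=None, target_transform=None):
        self.samples = samples                    # list of (audio, target) pairs
        self.transform = transform                # applied to the audio
        self.target_transform = target_transform  # applied to the target

    def __getitem__(self, index):
        audio, target = self.samples[index]
        if self.transform is not None:
            audio = self.transform(audio)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return audio, target

    def __len__(self):
        return len(self.samples)


def peak_normalize(audio):
    """Scale the signal so its largest absolute value is 1 (silence is left as-is)."""
    peak = max(abs(x) for x in audio)
    return [x / peak for x in audio] if peak else list(audio)


def to_label_name(target):
    """Map an integer class index to a readable label (labels are made up)."""
    return "yes" if target == 1 else "no"


dataset = ToyAudioDataset(
    samples=[([0.5, -1.0, 0.25], 1), ([0.0, 2.0, -4.0], 0)],
    transform=peak_normalize,
    target_transform=to_label_name,
)
audio, target = dataset[1]  # both callables run when the sample is fetched
```

Because the callables run on every access, the stored samples stay unmodified. With the real datasets, a transform such as transforms.Spectrogram would take the place of peak_normalize.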

VCTK

class torchaudio.datasets.VCTK(root, downsample=True, transform=None, target_transform=None, download=False, dev_mode=False)[source]

VCTK Dataset. alternate url

Parameters
  • root (str) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • downsample (bool, optional) – Whether to downsample the signal (Default: True)

  • transform (Callable, optional) – A function/transform that takes in a raw audio signal and returns a transformed version, e.g., transforms.Spectrogram. (Default: None)

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it. (Default: None)

  • download (bool, optional) – If true, downloads the dataset from the internet and puts it in the root directory. If the dataset is already downloaded, it is not downloaded again. (Default: False)

  • dev_mode (bool, optional) – If true, clean up is not performed on downloaded files. Useful to keep raw audio and transcriptions. (Default: False)

__getitem__(index)[source]
Parameters

index (int) – Index

Returns

The output tuple (audio, target) where target is index of the target class.

Return type

Tuple[torch.Tensor, int]

YESNO

class torchaudio.datasets.YESNO(root, transform=None, target_transform=None, download=False, dev_mode=False)[source]

YesNo Hebrew Dataset.

Parameters
  • root (str) – Root directory of dataset where processed/training.pt and processed/test.pt exist.

  • transform (Callable, optional) – A function/transform that takes in a raw audio signal and returns a transformed version, e.g., transforms.Spectrogram. (Default: None)

  • target_transform (Callable, optional) – A function/transform that takes in the target and transforms it. (Default: None)

  • download (bool, optional) – If true, downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, it is not downloaded again. (Default: False)

  • dev_mode (bool, optional) – If true, clean up is not performed on downloaded files. Useful to keep raw audio and transcriptions. (Default: False)

__getitem__(index)[source]
Parameters

index (int) – Index

Returns

The output tuple (audio, target) where target is index of the target class.

Return type

Tuple[torch.Tensor, int]
