TTSDataset
The TTSDataset is a collection of (text, mel_spectrogram) pairs.
- The data consists of just one dataset: train
- The train dataset is a torch.utils.data.Dataset instance providing
  - __len__: number of utterances in the dataset;
  - __getitem__: return the requested utterance as a dictionary with keys:
    - "text": the text of an utterance as a string,
    - "mel_spectrogram": the mel spectrogram of an utterance with shape [length, n_mels];
  - char_vocab: a npfl138.Vocabulary instance with the character mapping.
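The interface described above can be illustrated with a small stand-in. This is only a sketch: `ToyTTSDataset` and `ToyElement` are hypothetical names, not part of npfl138, and the mel spectrogram is represented by nested Python lists rather than whatever array type the real dataset returns, so that the example has no dependencies.

```python
from typing import TypedDict


class ToyElement(TypedDict):
    """Hypothetical element type mirroring the documented dictionary keys."""
    text: str
    mel_spectrogram: list[list[float]]  # shape [length, n_mels]


class ToyTTSDataset:
    """Hypothetical stand-in that follows the __len__/__getitem__ interface."""

    def __init__(self, utterances: list[tuple[str, int]], n_mels: int = 4) -> None:
        # Each utterance is given as (text, number of spectrogram frames).
        self._utterances = utterances
        self._n_mels = n_mels

    def __len__(self) -> int:
        # Number of utterances in the dataset.
        return len(self._utterances)

    def __getitem__(self, index: int) -> ToyElement:
        # Return the requested utterance as a dictionary; the spectrogram
        # here is a zero-filled placeholder of shape [length, n_mels].
        text, length = self._utterances[index]
        mel = [[0.0] * self._n_mels for _ in range(length)]
        return {"text": text, "mel_spectrogram": mel}


toy = ToyTTSDataset([("hello", 7), ("hi", 3)])
for i in range(len(toy)):
    element = toy[i]
    mel = element["mel_spectrogram"]
    # Print the text together with the (length, n_mels) shape.
    print(element["text"], (len(mel), len(mel[0])))
```

Code written against these two methods and the two dictionary keys should carry over to the train dataset, provided the list-based shape checks are replaced by the corresponding array operations.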
npfl138.datasets.tts_dataset.TTSDataset
Source code in npfl138/datasets/tts_dataset.py
PAD
class-attribute
instance-attribute
PAD: int = 0
The index of the padding token in the vocabulary.
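A padding index of 0 is typically used when batching character sequences of different lengths. The following is a minimal sketch of that idea, not the library's own batching code: `pad_batch` is a hypothetical helper, and the character indices are made up for illustration (in practice they would come from char_vocab).

```python
PAD = 0  # same value as the documented TTSDataset.PAD


def pad_batch(sequences: list[list[int]], pad: int = PAD) -> tuple[list[list[int]], list[int]]:
    """Right-pad index sequences to a common length (hypothetical helper).

    Returns the padded batch and the original lengths, which are needed
    later to mask out the padded positions.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [pad] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# Made-up character indices; index 0 is reserved for padding.
batch, lengths = pad_batch([[5, 3, 8], [7], [2, 9]])
print(batch)
print(lengths)
```

Reserving index 0 for padding means that no real character may map to 0, so padded positions can be recognized by comparing against PAD.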
Element
class-attribute
instance-attribute
The type of a single dataset element, i.e., a single utterance.
Dataset
Bases: Dataset
__len__
__len__() -> int
Return the number of utterances in the dataset.
__getitem__
Return the index-th element of the dataset as a dictionary.
__init__
Load the dataset from the given filename, downloading it if necessary.