GigaSpeech
A 10,000-hour multi-domain English ASR corpus covering audiobooks, podcasts, and YouTube
Duration
10,000 hours
Languages
1
Sample Rate
16 kHz
Published
2021-06
Description
1. 10,000 hours of high-quality labeled audio for supervised training, and 40,000 hours of total audio for semi-supervised and unsupervised training
2. Sourced from audiobooks, podcasts, and YouTube, covering both read and spontaneous speaking styles
3. Proposes a novel forced alignment and segmentation pipeline to create sentence segments and filter out low-quality transcriptions
4. Provides 5 training subsets of different scales: 10h, 250h, 1000h, 2500h, and 10000h
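As a rough illustration of how the five training subsets relate to one another, the sketch below picks the smallest subset that meets a given hours budget. The short subset names (XS, S, M, L, XL) are an assumption taken from the GigaSpeech paper rather than from this page, and `smallest_subset_covering` is a hypothetical helper, not part of any official toolkit.

```python
# Training subset sizes in hours, as listed in the description.
# The short names (XS, S, M, L, XL) are assumed from the GigaSpeech paper.
SUBSET_HOURS = {"XS": 10, "S": 250, "M": 1000, "L": 2500, "XL": 10000}


def smallest_subset_covering(hours_needed: float) -> str:
    """Return the name of the smallest subset with at least `hours_needed` hours.

    Hypothetical helper for illustration only.
    """
    # Walk the subsets from smallest to largest and stop at the first fit.
    for name, hours in sorted(SUBSET_HOURS.items(), key=lambda kv: kv[1]):
        if hours >= hours_needed:
            return name
    raise ValueError(f"no subset provides {hours_needed} hours of audio")


if __name__ == "__main__":
    for budget in (5, 100, 1000, 3000):
        print(budget, "hours ->", smallest_subset_covering(budget))
```

Starting with the smallest subset that fits is a common way to debug a training recipe cheaply before scaling up to the full 10,000 hours.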
Language Details
| Language | Duration |
|---|---|
| English | 10,000 hours |
Publisher