WenetSpeech4TTS
A 12,800-hour Mandarin TTS corpus refined from WenetSpeech, offering multi-tier quality subsets
Duration
12,800 hours
Languages
1
Sample Rate
16 kHz
Published
2024-06
Description
1. Refined from the open-source WenetSpeech dataset and optimized for TTS tasks
2. The optimization pipeline includes segment boundary adjustment, audio denoising and enhancement, elimination of speaker mixing within segments, and more accurate ASR transcription
3. Divided into Premium, Standard, Basic, and Rest quality subsets based on DNSMOS P.808 scores
4. Effectiveness validated with VALL-E and NaturalSpeech 2 models
5. Licensed under CC BY 4.0, with an additional restriction to non-commercial research use
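The DNSMOS-based tiering described above can be sketched as a simple threshold lookup. This is a minimal illustration, not the dataset's actual tooling: the cut-off values (4.0, 3.8, 3.6), the `TIER_THRESHOLDS` table, and the `assign_tier` function are assumptions introduced here, since this card does not state the thresholds. The sketch also assumes that segments without a score fall into Rest.

```python
from typing import Optional

# Hypothetical cut-offs for illustration only; this card does not state them.
# Ordered from the strictest tier to the most lenient.
TIER_THRESHOLDS = [("Premium", 4.0), ("Standard", 3.8), ("Basic", 3.6)]


def assign_tier(dnsmos: Optional[float]) -> str:
    """Return the highest tier whose threshold the score strictly exceeds.

    Segments with no score, or a score at or below every threshold,
    are placed in "Rest" (an assumption of this sketch).
    """
    if dnsmos is not None:
        for name, threshold in TIER_THRESHOLDS:
            if dnsmos > threshold:
                return name
    return "Rest"


if __name__ == "__main__":
    for score in (4.2, 3.9, 3.7, 3.0, None):
        print(score, assign_tier(score))
```

The sketch assigns each segment only its highest tier. Whether the published subsets are disjoint or nested (for example, Standard also containing every Premium segment) should be checked against the dataset's own documentation.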
Language Details
| Language | Duration |
|---|---|
| Mandarin Chinese | 12,800 hours |
Publisher