LLaSO-Align
Speech-text alignment training corpus for the LLaSO framework: 12 million English speech-text pairs covering diverse speech scenarios
Duration
None
Languages
1
Sample Rate
16 kHz
Published
2025-08
Description
1. Alignment training component of the LLaSO open-source framework, containing 12 million speech-text pairs
2. Data sources include GigaSpeech (conversational speech), LibriSpeech (read narrative), LJ Speech (audiobooks), MLS (multilingual speech), VCTK (accented English)
3. Covers multiple domains including conversation, narrative, audiobooks, and accented speech
4. Unified into JSON-format ASR alignment tasks using 18 instruction templates
5. Audio uniformly resampled to 16 kHz and converted to 128-channel mel spectrograms
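To illustrate how a speech-text pair might be wrapped as a JSON-format ASR alignment task, here is a minimal sketch. The field names (`instruction`, `audio`, `sample_rate`, `answer`), the file path, and the three template strings are assumptions made for illustration; the actual LLaSO-Align schema and the wording of its 18 templates are not given in this description.

```python
import json
import random

# Hypothetical instruction templates. The corpus is described as using 18;
# their actual wording is not stated here, so these are placeholders.
TEMPLATES = [
    "Transcribe the speech into text.",
    "Write down exactly what the speaker says.",
    "Convert this audio clip to a written transcript.",
]

def make_alignment_sample(audio_path, transcript, rng):
    """Wrap one speech-text pair as an instruction-style ASR record.

    The keys below are an assumed schema, not the published one.
    """
    return {
        "instruction": rng.choice(TEMPLATES),  # one template drawn at random
        "audio": audio_path,                   # path to the 16 kHz audio clip
        "sample_rate": 16000,                  # matches the stated 16 kHz resampling
        "answer": transcript,                  # target text for the ASR task
    }

rng = random.Random(0)  # seeded so the template choice is reproducible
sample = make_alignment_sample("clips/example_0001.wav", "hello world", rng)
print(json.dumps(sample, indent=2))
```

Drawing the instruction at random from a fixed pool is one plausible way to vary the prompt wording across otherwise identical transcription tasks.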
Language Details
| Language | Duration |
|---|---|
| English | None |
Publisher