Description

1. 400 speakers from various accent regions across China

2. Training set: 120,098 utterances (340 speakers); validation set: 14,326 utterances (40 speakers); test set: 7,176 utterances (20 speakers)

3. Recorded with high-fidelity microphones in quiet indoor environments, then downsampled to 16 kHz

4. Manual transcription accuracy above 95%

5. Apache 2.0 open-source license
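
As a quick sanity check on the split sizes listed above, the following minimal sketch adds up the utterance and speaker counts and computes each split's share of the corpus. The figures are copied from the description; the `SPLITS` dictionary and the `summarize` helper are illustrative names, not part of any official AISHELL tooling.

```python
# Split sizes as listed in the description: (utterances, speakers).
SPLITS = {
    "train": (120_098, 340),
    "validation": (14_326, 40),
    "test": (7_176, 20),
}


def summarize(splits):
    """Return total utterances, total speakers, and each split's utterance share."""
    total_utts = sum(utts for utts, _ in splits.values())
    total_spk = sum(spk for _, spk in splits.values())
    shares = {name: utts / total_utts for name, (utts, _) in splits.items()}
    return total_utts, total_spk, shares


total_utts, total_spk, shares = summarize(SPLITS)
print(f"{total_utts:,} utterances from {total_spk} speakers")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of utterances")
```

Under these figures the three splits cover 141,600 utterances from 400 speakers, split roughly 85/10/5 between training, validation, and test.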

Language Details

Language	Duration
Mandarin Chinese	178 hours

Publisher

Beijing Shell Shell Technology Co., Ltd. (AISHELL Foundation)

License & Commercial Use

Resources

AISHELL-1