AISHELL-4
An 8-channel microphone array speech dataset for conference scenarios, supporting speech enhancement, separation, recognition, and speaker diarization
Duration
120 hours
Languages
1
Sample Rate
16 kHz
Published
2021-04
Description
1. 211 real meeting recordings, 4-8 speakers per session, approximately 60 speakers in total
2. Recorded using an 8-channel circular microphone array
3. Contains real meeting acoustic characteristics: short pauses, speech overlap, rapid speaker turns, noise, etc.
4. The only Chinese meeting conversational speech dataset
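As a rough illustration of what 8-channel, 16 kHz array audio looks like to downstream code, the sketch below writes a small synthetic multichannel file and splits it back into per-channel sample lists using only the Python standard library. It makes several assumptions that are not stated in this card: the audio is treated as 16-bit PCM WAV, the file is a synthetic stand-in rather than real AISHELL-4 data, and the helper names `write_dummy_recording` and `split_channels` are hypothetical. The dataset's actual file format and directory layout may differ.

```python
import struct
import tempfile
import wave
from pathlib import Path

NUM_CHANNELS = 8       # array size stated in the description
SAMPLE_RATE = 16_000   # 16 kHz, as listed in the card
SAMPLE_WIDTH = 2       # assumption: 16-bit PCM samples


def write_dummy_recording(path, num_frames):
    """Write a synthetic 8-channel WAV; each sample value equals its channel index."""
    frame = struct.pack(f"<{NUM_CHANNELS}h", *range(NUM_CHANNELS))
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(NUM_CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(frame * num_frames)


def split_channels(path):
    """Return (sample_rate, channels), where channels[c] is the sample list of channel c."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != SAMPLE_WIDTH:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_ch = wf.getnchannels()
        rate = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)
    # WAV stores frames interleaved: ch0, ch1, ..., ch7, ch0, ch1, ...
    samples = struct.unpack(f"<{n_ch * n_frames}h", raw)
    channels = [list(samples[c::n_ch]) for c in range(n_ch)]
    return rate, channels


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wav_path = Path(tmp) / "dummy_session.wav"  # hypothetical file name
        write_dummy_recording(wav_path, num_frames=SAMPLE_RATE)
        rate, channels = split_channels(wav_path)
        duration_s = len(channels[0]) / rate
        print(f"{len(channels)} channels, {rate} Hz, {duration_s:.1f} s")
```

For real recordings, an array library such as NumPy would usually be a better fit than Python lists; the standard library is used here only to keep the sketch dependency-free.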
Language Details
| Language | Duration |
|---|---|
| Mandarin Chinese | 120 hours |
Publisher