AISHELL-5
First open-source in-car multi-channel multi-speaker Chinese speech dataset with 100+ hours of real in-vehicle conversations for speaker diarization and ASR
Duration
100 hours
Languages
1
Sample Rate
16 kHz
Published
2025-05
Description
1. Recorded in a hybrid vehicle, with far-field microphones placed at the front and each speaker wearing a high-fidelity close-talk microphone
2. 165 speakers participated, none with a noticeable accent
3. 2-4 speakers were randomly seated at four positions in the car and engaged in unrestricted free conversation
4. Over 100 hours in total: 94 h training, 3.3 h validation, and two test sets
5. Far-field audio contains 4 channels; the training set additionally includes close-talk audio
6. Also provides a large-scale noise dataset for speech simulation research
7. CC BY-SA 4.0 license
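
The format figures above (4 far-field channels, 16 kHz) can be checked against a downloaded file before it is fed to a diarization or ASR pipeline. The following is a minimal sketch using only Python's standard `wave` module. It assumes the recordings are 16-bit PCM WAV files, which this card does not state; the helper names and the demo file are hypothetical, and the demo uses a synthetic silent file rather than real AISHELL-5 audio.

```python
import os
import tempfile
import wave

EXPECTED_RATE = 16_000   # sample rate stated on this card
EXPECTED_CHANNELS = 4    # far-field channel count stated on this card


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return channels, rate, duration


def matches_far_field_format(path):
    """True if the file has the channel count and sample rate listed above."""
    channels, rate, _ = describe_wav(path)
    return channels == EXPECTED_CHANNELS and rate == EXPECTED_RATE


def write_silent_wav(path, channels, rate, seconds):
    """Write a silent 16-bit PCM file; a stand-in for a real recording."""
    n_frames = int(rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM (assumption)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * n_frames)


# Demo on a synthetic file, not on dataset audio.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_far_field.wav")
write_silent_wav(demo_path, EXPECTED_CHANNELS, EXPECTED_RATE, 0.5)
demo_info = describe_wav(demo_path)
demo_ok = matches_far_field_format(demo_path)
print(demo_info, demo_ok)
```

To use it on real data, pass the path of a far-field recording to `matches_far_field_format`. Close-talk files in the training set would presumably have a different channel count and should be checked against their own expected value.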
Language Details
| Language | Duration |
|---|---|
| Mandarin Chinese | 100 hours |
Publisher