MagicData-RAMC
A 180-hour richly annotated Mandarin conversational speech dataset covering 15 diverse domains
Duration
180 hours
Languages
1
Sample Rate
16 kHz
Published
2022-03
Description
1. Contains 180 hours of conversational speech recorded by native Mandarin speakers via mobile phones
2. Conversations categorized into 15 domains with topic labels, ranging from technology to daily life
3. Provides precise transcriptions and speaker voice activity timestamps
4. Supports ASR, speaker diarization, topic detection, keyword search, TTS, and other tasks
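
As a rough planning aid, the listed duration (180 hours) and sample rate (16 kHz) can be turned into an estimate of the uncompressed audio size. The sketch below assumes 16-bit mono PCM, which this listing does not state; the function name `raw_pcm_bytes` and its parameters are illustrative only, and the actual download size depends on the container and compression used.

```python
# Back-of-the-envelope size estimate for the raw audio.
# Duration and sample rate come from the listing; bit depth and channel
# count are ASSUMPTIONS (16-bit, mono) and may not match the release.

HOURS = 180
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM
CHANNELS = 1          # assumption: mono


def raw_pcm_bytes(hours, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Return the size in bytes of uncompressed PCM audio."""
    seconds = hours * 3600
    return seconds * sample_rate_hz * bytes_per_sample * channels


total_samples = HOURS * 3600 * SAMPLE_RATE_HZ
total_bytes = raw_pcm_bytes(HOURS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS)

print(f"{total_samples:,} samples, about {total_bytes / 1e9:.1f} GB uncompressed")
```

Under these assumptions the corpus comes to roughly 20.7 GB of raw samples; a different bit depth or channel count scales the figure proportionally.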
Language Details
| Language | Duration |
|---|---|
| Mandarin Chinese | 180 hours |
Publisher