SeniorTalk
First open-source Mandarin natural conversation speech dataset for super-aged seniors (75+), with rich multi-dimensional annotations
Duration
55.53 hours
Languages
1
Sample Rate
16 kHz
Published
2025-03
Description
1. Total duration of 55.53 hours, containing 101 natural conversation recordings from 202 Chinese seniors aged 75 to 85
2. Speakers from 16 provinces in China, 67 males and 135 females, with rich regional and accent diversity
3. Recorded on mobile devices (70% Android, 30% iOS), covering topics such as health, pets, and retirement life
4. Includes 8 annotation dimensions: speaker attributes (age, gender, hometown), timestamps, transcription text, accent intensity (0-3 scale), overlapping speech, and special audio event markers
5. Contains 60,029 utterances, supporting speaker verification, speaker diarization, speech recognition, and speech editing tasks
6. Released under CC BY-NC-SA 4.0 license for academic research only
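
The figures quoted in the description can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes the 55.53 hours is the total audio length across all recordings. If that total includes silence or non-speech segments, the per-utterance figure is an upper bound rather than a true mean utterance length.

```python
# Figures copied from the description above; names are hypothetical.
TOTAL_HOURS = 55.53
NUM_RECORDINGS = 101
NUM_SPEAKERS = 202
NUM_MALE = 67
NUM_FEMALE = 135
NUM_UTTERANCES = 60_029


def summarize():
    """Derive per-recording and per-utterance averages from the quoted totals."""
    total_seconds = TOTAL_HOURS * 3600
    return {
        # Two speakers per recording is consistent with two-party conversations.
        "speakers_per_recording": NUM_SPEAKERS / NUM_RECORDINGS,
        "minutes_per_recording": total_seconds / 60 / NUM_RECORDINGS,
        # Upper bound if the total duration includes silence.
        "seconds_per_utterance": total_seconds / NUM_UTTERANCES,
        "female_share": NUM_FEMALE / NUM_SPEAKERS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, each recording averages roughly 33 minutes with two speakers, and each utterance averages about 3.3 seconds.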
Language Details
| Language | Duration |
|---|---|
| Mandarin Chinese | 55.53 hours |
Publisher