VietMed
A Vietnamese medical domain ASR dataset with 16 hours of labeled speech and 2,200 hours of unlabeled speech, covering all ICD-10 disease groups
Duration
2216 hours
Languages
1
Sample Rate
8 kHz
Published
2024-04
Description
1. Contains 16 hours of labeled medical speech, 1,000 hours of unlabeled medical speech, and 1,200 hours of unlabeled general-domain speech
2. Covers all ICD-10 disease groups and includes all Vietnamese accents
3. Provides pre-trained and fine-tuned models
4. Published at LREC-COLING 2024 (Oral)
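
As a quick sanity check, the subset durations listed above can be summed to confirm they match the stated total of 2216 hours. This is only an illustrative sketch: the dictionary keys are hypothetical labels chosen here, not names used by the dataset itself.

```python
# Hours per subset, taken from the description above.
# The key names are hypothetical labels, not official split names.
subset_hours = {
    "labeled_medical": 16,
    "unlabeled_medical": 1000,
    "unlabeled_general": 1200,
}

total_hours = sum(subset_hours.values())
unlabeled_hours = (
    subset_hours["unlabeled_medical"] + subset_hours["unlabeled_general"]
)
labeled_share = subset_hours["labeled_medical"] / total_hours

print(total_hours)              # 2216
print(unlabeled_hours)          # 2200
print(f"{labeled_share:.2%}")   # 0.72%
```

The labeled portion is well under 1% of the total audio, which is consistent with the dataset also shipping pre-trained models alongside fine-tuned ones.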
Language Details
| Language | Duration |
|---|---|
| Vietnamese | 2216 hours |
Publisher