Primewords Chinese Corpus Set 1
Identifier: SLR47
Summary: Mandarin Chinese corpus released by Shanghai Primewords Co. Ltd. (www.primewords.cn), containing 100 hours of speech data.
Category: Speech
License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
Downloads (use a mirror closer to you):
primewords_md_2018_set1.tar.gz [9.0G] (speech data and transcripts)
Mirrors:
[US]
[EU]
[CN]
About this resource:
The corpus was recorded on smartphones by 296 native Chinese speakers. Transcription accuracy is above 98% at a 95% confidence level. It is free for academic use.
The mapping between transcripts and utterances is given in JSON format.
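As a rough illustration of how the JSON mapping might be consumed, the sketch below builds a lookup from audio file name to transcript text. The layout of the JSON file is not described on this page, so the sketch assumes a top-level array of objects with a "file" field (audio file name) and a "text" field (transcript); both key names are assumptions and are exposed as parameters so they can be adjusted after inspecting the actual file.

```python
import json
from pathlib import Path


def build_mapping(records, file_key="file", text_key="text"):
    """Map each utterance's audio file name to its transcript.

    `file_key` and `text_key` are assumed field names, not documented here.
    """
    return {record[file_key]: record[text_key] for record in records}


def load_transcripts(json_path, file_key="file", text_key="text"):
    """Read the transcript JSON (assumed to be a top-level array) from disk."""
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return build_mapping(records, file_key, text_key)
```

A call such as load_transcripts("transcript.json") would then return a dictionary keyed by file name; the path here is a placeholder for wherever the transcript file sits after the archive is unpacked.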
You can cite the data using the following BibTeX entry:
@misc{primewords_201801,
  title={Primewords Chinese Corpus Set 1},
  author={Primewords Information Technology Co., Ltd.},
  year={2018},
  note={\url{https://www.primewords.cn}}
}
Contact: Yinghui Liu, yinghui_liu@primewords.cn
External URLs: https://www.primewords.cn