Free English Corpus and Language Challenge -- Speechocean
Summary: A free 8.2-hour English speech recognition corpus provided by Speechocean, and an oriental language recognition challenge co-organized by Speechocean and Tsinghua University.
License: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
About this resource:
This dataset has been retracted at the request of the Speechocean company.
- This is an 8.2-hour English speech recognition corpus recorded on cell phones (iOS or Android).
- The corpus contains 6393 utterances from 20 speakers, recorded in a quiet office environment.
- Transcription files are included, and the sentence transcription accuracy is higher than 98%.
- It is free to use for academic purposes.
- This corpus is a subset of a larger corpus (1147 hours). Please contact us if you are interested.
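For a rough sense of scale, the figures quoted above can be turned into per-utterance and per-speaker averages. The sketch below is only illustrative arithmetic: the function name corpus_stats is hypothetical, and it assumes the 8.2 hours is the total audio duration spread evenly across utterances and speakers, which the description does not state.

```python
# Figures quoted in the corpus description above.
TOTAL_HOURS = 8.2
NUM_UTTERANCES = 6393
NUM_SPEAKERS = 20
PARENT_CORPUS_HOURS = 1147


def corpus_stats(total_hours, num_utterances, num_speakers):
    """Return simple averages derived from the headline corpus figures.

    Assumption: total_hours is the full audio duration, and utterances
    are treated as evenly distributed across speakers.
    """
    total_seconds = total_hours * 3600
    return {
        "avg_utterance_seconds": total_seconds / num_utterances,
        "avg_utterances_per_speaker": num_utterances / num_speakers,
        "avg_minutes_per_speaker": total_hours * 60 / num_speakers,
    }


stats = corpus_stats(TOTAL_HOURS, NUM_UTTERANCES, NUM_SPEAKERS)

# Share of the larger 1147-hour corpus that this free subset represents.
subset_fraction = TOTAL_HOURS / PARENT_CORPUS_HOURS

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
print(f"subset_fraction: {subset_fraction:.2%}")
```

Under these assumptions, an utterance averages roughly 4.6 seconds, each speaker contributes about 320 utterances (around 25 minutes of audio), and the free subset is well under 1% of the larger corpus.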
About the Oriental Language Recognition Challenge (OLR 2020):
- This challenge, co-organized by Speechocean and Tsinghua University, aims to advance language recognition technology for oriental languages. Following the success of the past four OLR challenges, the 2020 edition is now launching and is more challenging and more interesting.
- In the past year, dozens of well-known companies and universities, such as Samsung, Alibaba, and the University of Tokyo, participated in this challenge.
- This year, 188 hours of speech recognition corpora covering 18 languages are free for every participant.
- Home page of the challenge: http://cslt.riit.tsinghua.edu.cn/mediawiki/index.php/OLR_Challenge_2020