BibleTTS
Identifier: SLR129
Summary: A large, high-fidelity, multilingual, and uniquely African speech corpus
Category: Speech
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Downloads (use a mirror closer to you):
akuapem-twi.tgz [16G] (Akuapem Twi speech and text) Mirrors: [US] [EU] [CN]
asante-twi.tgz [15G] (Asante Twi speech and text) Mirrors: [US] [EU] [CN]
ewe.tgz [19G] (Ewe speech and text) Mirrors: [US] [EU] [CN]
hausa.tgz [21G] (Hausa speech and text) Mirrors: [US] [EU] [CN]
lingala.tgz [13G] (Lingala speech and text) Mirrors: [US] [EU] [CN]
yoruba.tgz [6.3G] (Yoruba speech and text) Mirrors: [US] [EU] [CN]
About this resource:
This repository contains the data for the six aligned languages of the BibleTTS corpus (Asante Twi, Akuapem Twi, Ewe, Hausa, Lingala, Yoruba).
This data has been automatically verse-aligned and filtered for TTS training.
Each .tgz file contains speech files for individual verses and the corresponding transcripts, organized into the standardized splits for each language (train, dev, test). Files in each split are grouped into subdirectories by book.
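As an illustration of the layout described above, the following minimal Python sketch pairs each audio file with its transcript and groups the pairs by split and book. It rests on assumptions not stated on this page: that an extracted archive has the shape <split>/<book>/<verse>.flac, and that each transcript sits next to its audio file with the same name and a .txt extension. The function name index_corpus is hypothetical. Check an extracted archive and adjust before relying on it.

```python
from collections import defaultdict
from pathlib import Path


def index_corpus(root):
    """Map (split, book) -> list of (audio_path, transcript_path) pairs.

    Assumes a layout of .../<split>/<book>/<verse>.flac with a sibling
    <verse>.txt transcript; both are assumptions, not documented facts.
    """
    root = Path(root)
    index = defaultdict(list)
    for audio in sorted(root.rglob("*.flac")):
        transcript = audio.with_suffix(".txt")  # assumed transcript naming
        if not transcript.exists():
            continue  # skip audio without a matching transcript
        parts = audio.relative_to(root).parts
        if len(parts) < 3:
            continue  # not nested as <split>/<book>/<file>
        split, book = parts[-3], parts[-2]
        index[(split, book)].append((audio, transcript))
    return dict(index)
```

Calling index_corpus on the directory where an archive was extracted returns a dictionary keyed by (split, book), which makes it easy to count verses per book or to build a training manifest.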
The speech data is distributed as FLAC files in the original 48 kHz mono format; you may want to resample it for TTS training.
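One possible way to resample is to call the ffmpeg command-line tool from Python, as in the sketch below. It assumes ffmpeg is installed and on the PATH. The 22050 Hz default is only an illustrative target, not a rate recommended by this page, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def ffmpeg_resample_cmd(src, dst, sample_rate=22050):
    """Build an ffmpeg command that resamples src to sample_rate Hz.

    -y overwrites dst if it exists; -ar sets the output sample rate.
    The 22050 Hz default is an assumed example target.
    """
    return ["ffmpeg", "-y", "-i", str(src), "-ar", str(sample_rate), str(dst)]


def resample(src, dst, sample_rate=22050):
    """Resample one file with ffmpeg, creating the output directory if needed."""
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(ffmpeg_resample_cmd(src, dst, sample_rate), check=True)
```

Writing the resampled files to a separate output directory keeps the original 48 kHz audio intact, so a different target rate can be tried later without downloading the archive again.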
For more information, see:
- [dataset paper] Interspeech 2022 paper describing the corpus and its creation
- [project page] Masakhane-io project website with TTS models and samples
Citation: If you use the BibleTTS corpus in your work, please cite the dataset paper:
@inproceedings{meyer2022bibletts,
  title={BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus},
  author={Josh Meyer and David Adelani and Edresson Casanova and Alp {\"O}ktem and Daniel Whitenack and Julian Weber and Salomon Kabongo Kabenamualu and Elizabeth Salesky and Iroro Orife and Colin Leong and Perez Ogayo and Chris Chinenye Emezue and Jonathan Mukiibi and Salomey Osei and Apelete Agbolo and Victor Akinode and Bernard Opoku and Olanrewaju Samuel and Jesujoba Alabi and Shamsuddeen Hassan Muhammad},
  booktitle={Interspeech},
  publisher={{ISCA}},
  year={2022},
  url={https://arxiv.org/pdf/2207.03546.pdf}
}