Google Teams Up with African Universities to Launch WAXAL, Landmark AI Dataset for Local Languages

Google has partnered with leading African universities and research institutions to launch WAXAL, a large-scale open speech dataset aimed at transforming artificial intelligence tools for African languages and closing the continent’s digital voice gap.

The dataset covers 21 Sub-Saharan African languages, including Hausa, Yoruba, Igbo, Swahili, Luganda, and Acholi, and is designed to support more than 100 million speakers who have historically been excluded from voice-powered technologies due to the lack of quality language data.

WAXAL contains 1,250 hours of transcribed natural speech alongside over 20 hours of high-quality studio recordings, providing a robust foundation for building voice assistants, transcription services, and realistic synthetic voices tailored to African contexts.

While voice-enabled technologies are now commonplace globally, Africa’s more than 2,000 languages remain largely underrepresented in AI systems. This gap has limited access to speech-based tools across key sectors such as education, healthcare, agriculture, and commerce.

Developed over a three-year period with funding from Google, WAXAL seeks to change that narrative by putting African languages and communities at the centre of AI innovation.

“The ultimate impact of WAXAL is empowerment,” said Aisha Walcott-Bryant, Head of Google Research Africa. “This dataset provides students, researchers, and entrepreneurs with the foundation to build technology in their own languages and on their own terms, finally reaching over 100 million people.”

A defining feature of the initiative is community-led data ownership. African partner institutions, including Makerere University (Uganda), the University of Ghana, and Digital Umuganda (Rwanda), led the data collection process and retain ownership of the dataset, a departure from many global AI projects.

“For AI to truly work in Africa, it must speak our languages and understand our realities,” said Joyce Nakatumba-Nabende, Senior Lecturer at Makerere University. “WAXAL gives researchers the high-quality data needed to build speech technologies that reflect our communities.”

At the University of Ghana, more than 7,000 volunteers contributed their voices to the project. Isaac Wiafe, an Associate Professor at the institution, noted that the dataset is already opening new pathways for innovation in health, education, and agriculture.

With WAXAL, Google and its African partners hope to accelerate inclusive AI development and ensure that Africa’s languages are no longer left behind in the global digital conversation.

 
