Exclusive: Donors commit $10M to include African languages in AI models
The Gates Foundation and other donors from the AI for Development Funders Collaborative are pledging $10 million to ensure AI models are more inclusive of African languages. Devex spoke with experts on why that matters.
By Catherine Cheney // 10 February 2025
The rise of artificial intelligence has the potential to accelerate progress on a range of global development goals. But a key challenge stands in the way: most AI models, and particularly large language models, or LLMs, are predominantly trained in just two languages: English and Mandarin. This excludes the vast majority of the world’s 7,000 languages.
This not only limits the accessibility of AI tools for billions of people but also risks perpetuating digital inequities, leaving marginalized communities out of the AI revolution. In response, a growing coalition of donors, including the Gates Foundation and donors from Germany, the United Kingdom, and Canada, is working to ensure that AI models are trained on the languages spoken by most of the world. They call themselves the AI for Development Funders Collaborative.
They want to ensure that the 1.3 billion people across the African continent, where many languages are primarily spoken and not written, can benefit from the transformative potential of AI tools, from strengthening health care delivery to improving educational outcomes and expanding access to financial services.
At this week’s Paris AI Action Summit on Feb. 10 and 11, donors from the collaborative are pledging $10 million to support African-led efforts to ensure AI models are more inclusive of African languages.
Zameer Brey, deputy director of technology diffusion at the Gates Foundation, told Devex this is just a starting point, as it will take $50 million over the next four years to “unlock the language capability of about 40 African languages on the continent.”
The sudden dismantling of the U.S. Agency for International Development, a former member of the AI collaborative, is also increasing the sense of urgency among donors to align efforts and pool resources in order to localize AI through languages and ensure these technologies benefit the global majority.
Why AI needs to ‘speak’ local languages
Gaps in AI language, not just in different languages but also in slang and other cultural nuances, could lead AI models to be misleading, or even dangerous, Brey said.
He cited the example of a Gates Foundation grantee in South Africa developing an HIV-focused chatbot for adolescent girls and women. They discovered that LLMs failed to recognize a word that is commonly used to describe a sexually transmitted disease in males. So girls and women seeking information about STDs or HIV might have received incorrect or irrelevant responses from the chatbot.
If these models aren’t useful, users will lose trust in them, Brey added. The lack of training data in African languages has made it difficult to develop AI applications that require deep localized nuances.
“Imagine a farmer in Kenya who is fluent in Swahili but does not read English well,” said Balthas Seibold, co-lead of FAIR Forward - Artificial Intelligence for All, an initiative of Germany’s Federal Ministry for Economic Cooperation and Development implemented by GIZ, which will provide technical support for the $10 million pledged at the AI Action Summit.
“How can she benefit from any real-time digital advice to plant her crops or be warned early on about a drought or impending flood, if all AI-powered information is in English?”
There are more than 2,000 languages spoken in Africa, he added, and FAIR Forward will only work if AI “understands” these languages.
GIZ has implemented a number of Gates-funded projects, from setting up an agriculture information exchange platform that provides smallholder farmers in Kenya with advisory services in local languages to working with Mozilla on open voice data and technology for the East African languages Kinyarwanda, Kiswahili, and Luganda.
Several of these projects take aim not only at the language barrier, but also at the other challenges that stand in the way of democratizing access to AI tools, including cost, access to devices, and literacy.
Philanthropy’s role
Over the past year, the AI for Development Funders Collaborative has been working to align resources and strategies. In addition to their new $10 million funding commitment, at the Paris summit, these donors will share more about their strategy to develop open-source datasets, AI models, and applications in African languages.
“One of the largest data gaps in Africa is language,” said Laurent Elder, manager of the information and networks program of Artificial Intelligence for Development. He cited a recent study commissioned by the International Development Research Centre finding that there is more than 2,500 times more English data on the internet than data in all local African languages combined. “Linguistic inclusion is very important to us as we see the potential of language in so many different applications of AI.”
Masakhane, a grassroots organization focused on natural language processing, or NLP, research “in African languages, for Africans, by Africans,” has been working with funders on their strategies to ensure that AI technology is developed by and for African communities.
“Masakhane takes a participatory approach,” said Tajuddeen Gwadabe, a leading researcher in the initiative. “The person interested in annotation, the person interested in building the model, the person working on evaluating the model and checking the quality of the translation, everyone is regarded as an important component of that work.”
Masakhane is in talks with the AI for Development Funders Collaborative about serving as a potential hub for African NLP research.
The Gates Foundation’s strategy
The Gates Foundation has engaged with Big Tech, including Microsoft, OpenAI, and Google, about diversifying the languages their models are interfacing with, while also supporting locally developed AI models.
In 2023, the Gates Foundation announced $5 million in grants to support equitable access to AI technologies in low- and middle-income countries. A few lessons emerged from those projects: the need for compute capacity, or the amount of processing power and data storage capabilities on the continent, access to talent, and language diversification, Brey said.
In 2024, the Gates Foundation made a number of investments related to localizing AI languages, including grants to the University of Pretoria, Data Science Nigeria, and Maseno University.
“Our strategy is to try and also get use case specific nomenclature and language so that you’re able to drive a use case on HIV or sexual reproductive health or smallholder farmers, and so to try and be more and more targeted with something specific,” Brey said. “The way we are thinking about it is to develop a corpus that will be a global good, and so to make this available for model builders writ large.”
Brey said a key challenge moving forward is to find technically smart ways to optimize data collection because it’s expensive to collect and annotate hundreds of thousands of hours for every single language.
Challenges and the road ahead: From Paris to Kigali
While investments in localizing AI languages are critical, more needs to be done to build AI talent on the ground, says Uyi Stewart, chief data and technology officer at Data.org.
“We need an integrated approach to AI investments,” he said.
Without trained AI practitioners in Africa, many of these efforts risk becoming extractive, with language datasets being downloaded and used primarily by Big Tech rather than benefiting local communities, Stewart explained.
He also reiterated his call for a public-private partnership for AI localization similar to Gavi, the Vaccine Alliance, “a true global platform,” where donors, the private sector, and local implementers collaborate in a structured and sustainable way rather than through scattered and siloed projects.
Beyond the AI summit this week in Paris, the Global AI Summit on Africa in April is expected to be a critical milestone for galvanizing broader commitments, including from the private sector.
“We see the Paris summit as really a stepping stone,” said Brey. “On the road to Kigali, we hope we will be able to galvanize lots more support, and by that stage, you know, have more partners.”
As the AI for Development Funders Collaborative seeks a range of partners to help them shape a more equitable future for AI, Brey said that if Big Tech were to really embrace this agenda to localize AI languages, these efforts would move faster and have a greater impact.
Catherine Cheney is the Senior Editor for Special Coverage at Devex. She leads the editorial vision of Devex’s news events and editorial coverage of key moments on the global development calendar. Catherine joined Devex as a reporter, focusing on technology and innovation in making progress on the Sustainable Development Goals. Prior to joining Devex, Catherine earned her bachelor’s and master’s degrees from Yale University, and worked as a web producer for POLITICO, a reporter for World Politics Review, and special projects editor at NationSwell. She has reported domestically and internationally for outlets including The Atlantic and the Washington Post. Catherine also works for the Solutions Journalism Network, a nonprofit organization that supports journalists and news organizations in reporting on responses to problems.