Update Includes 8 New African Languages, Joining Existing Nigerian Languages Like Fulani, Hausa, Igbo, Kanuri, Tiv and Yoruba.
Gwamcee News
Google has announced a major expansion of Google Translate, adding 110 new languages to the platform.
The update is part of Google’s 1,000 Languages Initiative, which uses AI models to support the 1,000 most spoken languages around the world, and represents a significant step towards breaking down language barriers and fostering communication across diverse cultures. The additions include eight languages from across Africa, which join Hausa, Igbo, Yoruba, Fulani, Kanuri, and Tiv, the Nigerian languages already supported by Google Translate.
Taiwo Kola-Ogunlade, Communications and Public Affairs Manager for West Africa at Google, highlighted the importance of this initiative: “Our mission is to enable everyone, everywhere, to understand the world and express themselves across languages. With the addition of these 110 new languages, including many from Africa, we’re opening up new opportunities for over half a billion people to connect and communicate.”
Africa, with its rich linguistic diversity, is a key focus of this expansion. The addition of numerous African languages underscores Google’s commitment to supporting underrepresented languages and amplifying voices from across the continent.
Kola-Ogunlade further explained the complexities involved in language selection: “A lot of consideration goes into new language additions for Google Translate, ranging from which languages to include to the use of specific spellings. Many languages do not have a single, standard form, so learning the specific dialect that is spoken the most in an area is more feasible. Our approach has been to prioritise the most commonly used varieties of each language.”
The latest expansion utilises the PaLM 2 large language model and follows the addition of 24 languages in 2022 using Zero-Shot Machine Translation. PaLM 2 enables Translate to learn more efficiently those languages that are closely related to one another or have various distinct dialects. Google collaborated extensively with native speakers to ensure accuracy and prioritise the most commonly used varieties of each language.
The 110 new languages represent over 614 million speakers worldwide, covering around 8% of the world’s population. This includes major world languages with over 100 million speakers, languages spoken by small Indigenous communities, and languages undergoing revitalisation efforts.
Key African languages now supported by Google Translate:
Middle Africa: Kikongo
Eastern Africa: Luo, Swati, Venda
Western Africa: Fon, Wolof
Southern Africa: Swati, Ndebele
Notably, this update marks Google’s largest expansion of African languages to date. Highlights among the newly supported languages, from Africa and beyond, include:
Afar is a tonal language spoken in Djibouti, Eritrea and Ethiopia. Of all the languages in this launch, Afar had the most volunteer community contributions.
Cantonese is one of the most requested languages for Google Translate. Because Cantonese often overlaps with Mandarin in writing, it’s tricky to find data and train models.
Manx is the Celtic language of the Isle of Man. It almost went extinct with the death of its last native speaker in 1974. But thanks to an island-wide revival movement, there are now thousands of speakers.
NKo is a standardised form of the West African Manding languages that unifies many dialects into a common language. Its unique alphabet was invented in 1949, and it has an active research community that develops resources and technology for it today.
Punjabi (Shahmukhi) is the variety of Punjabi written in Perso-Arabic script (Shahmukhi), and is the most spoken language in Pakistan.
Tamazight (Amazigh) is a Berber language spoken across North Africa. Although there are many dialects, the written form is generally mutually understandable. It’s written in Latin script and Tifinagh script, both of which Google Translate supports.
Tok Pisin is an English-based creole and the lingua franca of Papua New Guinea. If you speak English, try translating into Tok Pisin — you might be able to make out the meaning!