Google Translate now includes Portuguese from Portugal

In 2022, Google added 24 new languages using zero-shot machine translation, in which a machine learning model learns to translate into another language without ever seeing an example, and announced the “1,000 Languages Initiative, a commitment to building AI [artificial intelligence] that will support more than a thousand languages spoken in the world,” Google recalls.

“Now, we're using AI to expand the range of supported languages” and, “thanks to the large language model PaLM 2, we're starting to roll out 110 new languages to Google Translate, our biggest expansion ever,” including Portuguese from Portugal, the company says in an online post.

In other words, Google Translate will now differentiate between Portuguese variants (Portugal vs. Brazil).

“From Cantonese to Kekchi, these new languages represent more than 614 million speakers, enabling translation for about 8% of the world’s population,” Google says.

The company adds that about a quarter of the new languages “are originally from Africa and represent our largest expansion of African languages to date, including Fon, Kikongo, Luo, Ga, Swati, Venda and Wolof.”

Among the languages now supported by Google Translate is Afar, a tonal language spoken in Djibouti, Eritrea, and Ethiopia. “Among all the languages in this launch, Afar received the largest number of volunteer contributions from the community,” Google highlights.

Next comes Cantonese, which has long been “one of the most requested languages on Google Translate.”

Other examples include Manx, the Celtic language of the Isle of Man, which nearly went extinct when its last native speaker died in 1974, but “thanks to an island-wide revival movement, there are now thousands of speakers,” and NKo, a standardized form of the West African Manding languages that unites several dialects into a common language.

“Its unique alphabet was invented in 1949 and has an active research community today developing the resources and technology for it,” Google says in its post.

There is also Punjabi (Shahmukhi), a variety of Punjabi written in the Perso-Arabic script (Shahmukhi) that is the most widely spoken language in Pakistan; Tamazight, a Berber language spoken in North Africa; and Tok Pisin, a “creole language of English origin” and the lingua franca of Papua New Guinea.

Languages “have enormous diversity: regional variations, dialects, and different spelling patterns,” and in fact, “many languages do not have a standard form, so it is impossible to choose a ‘right’ variety.”

“But our approach was to prioritize the most frequently used varieties in each language,” Google adds.

“PaLM 2 was a key piece in this puzzle, helping the translator learn languages that are closely related to each other more efficiently, including languages closely related to Hindi, such as Awadhi and Marwadi, and French Creoles, such as Seychellois Creole and Mauritian Creole,” the company explains.

As technology evolves “and we continue to partner with expert linguists and native speakers, over time we will support more language variations and spelling conventions.”

By Chris Skeldon
