Laura Romig, Brown University Class of 2025, Language Ambassador
A few weeks ago, researchers at Meta announced that, using artificial intelligence, they had developed the first speech-to-speech translator for Hokkien, a language widely spoken in southeastern Fujian in China, as well as across Southeast Asia as a result of the Chinese diaspora. This new translator is part of a broader project at Meta to create a universal speech-to-speech translator using machine learning.
Although it is spoken by millions of people throughout East Asia and the world, Hokkien lacks a standard written form; it has mostly been passed down orally within families. Writing systems for Hokkien include a makeshift use of Chinese characters and a Latin-based romanization, neither of which can wholly encompass the spoken language. Because Hokkien is primarily spoken, it presents a unique challenge for translation. Meta has attempted to solve that problem with an AI-developed system that first translates the spoken sounds into Mandarin Chinese characters, and then translates those into English. While the translator can only handle one sentence at a time, it is still an accomplishment that opens up questions about the future of translation and the potential of speech translation.
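To make that two-stage design concrete, here is a minimal Python sketch of the cascade described above: Hokkien speech is first mapped to Mandarin characters as an intermediate written form, and that text is then translated into English. Every function name and lookup table here is a hypothetical stand-in; Meta's actual system uses trained neural models at each stage, not dictionaries.

```python
# A minimal sketch of the cascaded design described above. All names and
# lookup tables are hypothetical stand-ins for trained neural models.

def hokkien_speech_to_mandarin_text(utterance_id: str) -> str:
    # Stage 1 (stub): recognize Hokkien speech and emit Mandarin Chinese
    # characters as an intermediate written form. Real input would be audio.
    fake_recognizer = {"utterance_001": "你好嗎"}
    return fake_recognizer.get(utterance_id, "")

def mandarin_text_to_english(text: str) -> str:
    # Stage 2 (stub): translate the intermediate Mandarin text into English.
    fake_translator = {"你好嗎": "How are you?"}
    return fake_translator.get(text, "")

def translate_hokkien_utterance(utterance_id: str) -> str:
    # The cascade chains the two stages, so any error in stage 1 propagates
    # into stage 2: one reason cascaded systems are hard to get right.
    return mandarin_text_to_english(hokkien_speech_to_mandarin_text(utterance_id))

print(translate_hokkien_utterance("utterance_001"))  # "How are you?"
```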
Just like text translators such as Google Translate, automated speech translation has several inherent pitfalls that make it precarious for high-stakes situations such as international diplomacy. And although both text and speech translation technology have improved enormously over time, doubt remains over whether they will ever match the nuance and understanding of a human translator. Is advanced machine learning the inevitable future of translation and language? It makes sense that Meta, a tech-focused company aiming to build a virtual world, thinks so.
However, machine learning (in short, the process by which a computer iteratively builds a better and better algorithm for a task) typically relies on massive data sets. If a language with fewer speakers or fewer recorded sources lacks data sets large enough to reliably build a translation program, then this 'universal' speech-to-speech translation project is likely to fail speakers of that language. Without researchers dedicated to a language, as Peng-Jen Chen at Meta is to Hokkien, that language may not receive the research and attention it needs, or its systems may be built on incomplete, biased data sets.
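As a toy illustration of why the size of the data set matters, the sketch below "trains" a naive lookup-based translator on a two-sentence parallel corpus; anything outside that tiny data set is simply untranslatable. The corpus and phrases are invented for illustration and bear no relation to how a real neural system is trained.

```python
# Toy illustration with invented data: a "translator" built from a tiny
# parallel corpus. Any phrase it has never seen is untranslatable, a crude
# stand-in for how scarce data fails speakers of low-resource languages.

tiny_parallel_corpus = [
    ("lí hó", "hello"),       # Hokkien romanization to English, illustrative
    ("to-siā", "thank you"),
]

# "Training" here is just memorizing the pairs in a lookup table.
lookup = {source: target for source, target in tiny_parallel_corpus}

def naive_translate(phrase: str) -> str:
    # Anything outside the training data cannot be translated at all.
    return lookup.get(phrase, "<unknown: not in training data>")

print(naive_translate("lí hó"))          # hello
print(naive_translate("chia̍h-pá bōe"))  # <unknown: not in training data>
```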
Equity and inclusion in computing technology are not new issues for Hokkien or other primarily spoken languages. Some versions of the Hokkien writing system combine standard Chinese characters with other, Hokkien-specific characters, and some of those latter characters are not part of Unicode, the international standard for encoding text. So computers can have trouble processing written Hokkien, simply because its characters aren't included in the international standard.
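One quick way to see the problem in practice, sketched below using only Python's standard library: characters that never made it into Unicode are often written with Private Use Area code points, which have no official character name. The specific code point U+E123 is an arbitrary placeholder chosen for illustration.

```python
# Check whether each character in a string is an assigned Unicode code
# point. Hokkien-specific characters that are not encoded in Unicode are
# often written with Private Use Area code points, which have no official
# name and so fail this check.

import unicodedata

def check_encoding_coverage(text: str) -> None:
    for ch in text:
        try:
            print(f"U+{ord(ch):04X} {ch!r}: {unicodedata.name(ch)}")
        except ValueError:
            # Unassigned or Private Use Area code point: the Unicode standard
            # (as this Python build knows it) defines no character here.
            print(f"U+{ord(ch):04X} {ch!r}: not an assigned Unicode character")

# "汝" and "好" are encoded CJK ideographs used in written Hokkien; U+E123 is
# an arbitrary Private Use Area code point standing in for an unencoded
# Hokkien-specific character.
check_encoding_coverage("汝好" + chr(0xE123))
```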
Language technology holds great promise as a means of connecting people around the world, but it has some common pitfalls, often around inclusion. Meta and other companies will likely continue to research and develop language translation and transcription technologies in the coming years. As this happens, it's important for us, as consumers of these companies' products and potential users of new technologies, to pay attention to which languages are being studied, and how.
For more, check out this deep dive into the project at Meta, and this review of a book about the history and cultural context of Hokkien as a migrating language.