The MATERIAL world: US Intelligence embarks on ambitious translation project
The Intelligence Advanced Research Projects Activity (IARPA), a branch of the U.S. Office of the Director of National Intelligence, has announced that it will launch a new program in October focused on the development of an “English-in, English-out” information retrieval system.
The program is called Machine Translation for English Retrieval of Information in Any Language, MATERIAL for short. It will attempt to create a tool that allows users to access articles translated to English from thousands of languages spoken around the world, including “low-resource languages,” languages that are not widely studied or are in danger of fading out.
According to IARPA, MATERIAL’s goal is essentially to create a highly sophisticated translator that would allow intelligence analysts to focus on the most crucial parts of the data and significantly reduce the amount of time spent deciphering the language.
Although the project is primarily meant to aid national security efforts, if successful, it will have astounding implications for the scientific community. A system capable of translating documents on a wide range of topics would improve access to studies and research from around the globe that might have otherwise never been acknowledged by the English-speaking community.
Critics of the pervasive Anglo-centrism in the sciences may not agree that an English-in, English-out system would be beneficial to the field overall, considering that scientists around the world are already pressured to publish their most important work in English regardless of their native tongue.
But the research published by MATERIAL may make important contributions to the creation of more precise translation systems that would aid in converting journals from English to a variety of languages, including those considered low-resource.
Possessing the ability to translate work from a variety of languages would help prevent unnecessary replication of studies and keep researchers up to date on the most important developments in their field. Though this idea seems straightforward enough to generations familiar with Google Translate, the process of creating a single system capable of translating even parts of complicated documents like research papers is incredibly complex.
Similar projects have been attempted in the past, and advanced translation programs already exist to facilitate access to work in a variety of languages, but none can currently boast the range or scope to which MATERIAL aspires.
According to IARPA, MATERIAL is unique because it seeks “to drastically decrease the time and data needed to field systems capable of fulfilling an English-in, English-out task” for every language, not just the ones familiar to Google Translate.
Unfortunately, MATERIAL’s uniqueness may not offer enough to overcome its near-Herculean challenge of synthesizing data from thousands of languages into one integrated system. In other words, though the project is idealistic, its completion is a long way off.
In the meantime, we can only imagine what a world unhindered by language barriers will look like, and hope that new ideas are not lost in translation.