This is a reprint of a post published on the SDL website. It is closely related to a previous post on the difficulties and challenges of building Russian to English MT systems.
Many may not be aware that several members of the SDL MT team have significant research and development credentials and pioneered the data-driven approach to MT system development. They initially commercialized Statistical MT, and the roster of the original research team reads like a Who's Who of the MT research community. The original principals at Google MT and the developer of Moses both come from the original Language Weaver team, and today members of that team have a hand in every major MT initiative in the US.
The SDL MT team also has the unusual experience of working closely with linguists and translators on a long-term basis, and thus has ongoing exposure to human MT quality assessments for most of the systems it builds. In localization use cases, MT systems need to be assessed for their suitability for post-editing (PEMT), and that is best done by human review and assessment rather than by BLEU scores, the metric of choice for most research teams in the industry.
Thus, this accomplishment with the Russian NMT system rests on deeply trusted human assessments that have been refined over 10 years of ongoing practice. The professional translators who do these assessments are part of the permanent team of around 2,000 translators at SDL. These assessments are more reliable than automated metrics, which can often be gamed or manipulated, or whose test sets may simply be known in advance when they should by definition be blind, i.e. the system should not have trained on them. The claim to have cracked the problem is therefore based on a very high score on a measure that the research team has come to find more reliable than BLEU and the many other automated metrics that are used in tandem.
======
The SDL Machine Translation (MT) research team recently announced that our latest machine learning innovations and development strategies with Neural MT have resulted in a significant leap forward. In an extensive suite of comparative experiments run to verify and validate the results, the output of our Russian to English MT system outperformed all industry standards, setting a benchmark for Russian to English machine translation. Over 90% of the system’s output has been labelled as perfect translation by professional Russian-English translators.
Those who have been monitoring the progress of Neural MT systems may be aware that Russian to English has been a particularly challenging direction for MT developers.
“It was the Russian language that first inspired the science and research behind machine translation,” said Adolfo Hernandez, CEO, SDL. “Since then it has always been a major challenge for the community. SDL has deployed breakthrough research strategies to master these difficult languages, and support the global expansion of its enterprise customers. We have pushed the boundaries and raised the performance bar even higher, and we are now paving the way for leadership in other complex languages.”
The linguistic properties and intricacies of the Russian language relative to English make it particularly challenging for MT systems to model. Russian is a highly inflected language with different syntax, grammar, and word order compared to English. Given the complexities created by these differences between Russian and English, raising the translation quality has been an ongoing focus of the SDL Machine Learning R&D team.
SDL Neural MT Russian to English results
Much of the enthusiasm for Neural MT is driven by the fluency and naturalness of its output: it produces a large number of sentences that read as if a human had written them. Early results with Neural MT often show output that human evaluators consider clearly better, even though established MT evaluation metrics such as the BLEU score may show only nominal improvements, or none.
The improvements seen in human assessment are most noticeable in fluency and word order. However, the most common automatic quality metric used by MT developers during the R&D phase is still the BLEU score, so it is important to incorporate human assessments into the scoring methodology.
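To make the gap between human judgment and n-gram metrics concrete, here is a minimal sketch of clipped n-gram precision, the core ingredient of BLEU. It is not the full metric (it omits the brevity penalty and the geometric mean over n-gram orders), and the whitespace tokenization and function names are illustrative assumptions. The two sentences are one of the sample pairs quoted in this post, with the human translation used as the reference.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference,
    with each n-gram's credit clipped to its count in the reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

# Sample pair quoted in this post (punctuation left attached to tokens).
mt_output = ("Whistler, located in British Columbia, is easily accessible "
             "from Vancouver by car or plane.")
human_ref = ("Whistler, British Columbia, is quickly accessible "
             "from Vancouver by road or air.")

p1 = clipped_precision(mt_output, human_ref, 1)  # unigram precision
p2 = clipped_precision(mt_output, human_ref, 2)  # bigram precision
print(f"unigram precision: {p1:.3f}")
print(f"bigram precision:  {p2:.3f}")
```

A translation that is acceptable but words things differently from the reference ("car or plane" versus "road or air") loses n-gram matches, which is one reason an overlap metric can understate improvements that human evaluators readily see.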
The strategy adopted by the SDL researchers was to use professional human assessment as a primary means to assess the MT quality. We wanted to know the human perceived translation quality of SDL Neural MT, and understand how it compared to the human perceived translation quality of an actual human translation. SDL builds custom MT engines for production use on a regular basis, and has developed an accurate and reliable evaluation methodology to assess the quality of MT output that minimizes human bias.
A team of professional Russian to English translators was shown translations in random order, with nothing to identify their origin, drawn from any of the following sources:
- A human translation of the test set by a professional translator
- State-of-the-art Rule-Based MT output
- State-of-the-art Statistical MT output
- SDL Neural MT output
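The blind, randomized setup described above can be sketched roughly as follows. This is not SDL's evaluation tooling: the labels, function names, and the yes/no "human-equivalent" rating are hypothetical, chosen only to show how translations from several sources can be pooled, shuffled, and stripped of their origin before evaluators see them.

```python
import random
from dataclasses import dataclass

@dataclass
class BlindItem:
    item_id: int        # opaque id shown to the evaluator
    source: str         # source-language sentence
    translation: str    # translation to be judged; origin is not stored here

def build_blind_set(outputs, seed=0):
    """outputs: dict mapping an origin label to a list of
    (source_sentence, translation) pairs.
    Returns (items, key): shuffled items without origin labels, plus a
    separate answer key mapping item_id -> origin label."""
    rng = random.Random(seed)
    pool = [(origin, src, tgt)
            for origin, pairs in outputs.items()
            for src, tgt in pairs]
    rng.shuffle(pool)
    items, key = [], {}
    for item_id, (origin, src, tgt) in enumerate(pool):
        items.append(BlindItem(item_id, src, tgt))
        key[item_id] = origin
    return items, key

def share_rated_human_equivalent(ratings, key, origin):
    """ratings: dict item_id -> bool (judged equivalent to human quality).
    Returns the fraction of items from `origin` that were rated True."""
    ids = [item_id for item_id, o in key.items() if o == origin]
    return sum(ratings[item_id] for item_id in ids) / len(ids)

# Toy data with placeholder strings, one sentence per origin.
outputs = {
    "human": [("src-1", "human translation")],
    "rbmt": [("src-1", "rule-based output")],
    "smt": [("src-1", "statistical output")],
    "nmt": [("src-1", "neural output")],
}
items, key = build_blind_set(outputs, seed=42)
```

Keeping the answer key separate from the items is the point of the design: evaluators only ever see `items`, and the origin labels are joined back in after the ratings are collected.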
The SDL research shows that its Neural MT system outperformed all industry standards, setting a benchmark for Russian to English machine translation, with 95% of the system’s output labelled as equivalent to a human translation in terms of quality by professional Russian-English translators.
Left to Right: Amos Kariuki, Ling Tsou, Dragos Munteanu, Samad Echihabi, Quinn Lam, Wes Feely, and William Tambellini
“With over fifteen years of research and innovation in machine translation, our scientists and engineers took up the challenge to bring Neural MT to the next level,” said Samad Echihabi, Head of Machine Learning R&D, SDL. “We have been evolving, optimizing and adapting our neural technology to deal with highly complex translation tasks such as Russian to English, with phenomenal results. A machine running SDL’s Neural MT technology can now produce translations of Russian text virtually indistinguishable from what Russian-English bilingual humans can produce.”
SDL’s latest Neural MT technology is optimized for both accuracy and fluency and provides a powerful paradigm to deal with morphologically rich and complex languages. While the focus of the SDL tests and measurements was the Russian to English system, the strategies deployed by the SDL team are expected to be compatible with, and of benefit to, other complex and morphologically rich languages.
It is interesting to note that the best Russian-English SMT systems, even after 10+ years of research, were only marginally better than the best Russian-English RBMT systems, if at all. This points to the significant challenge presented by the Russian to English language combination, and explains why RBMT systems have been preferred by many industrial users until quite recently. The new SDL Neural MT system is very likely to accelerate the transition away from RBMT.
Why is Russian difficult for MT?
Russian has always been considered one of the most difficult languages in MT, mostly because it is very different linguistically from English. Russian differs from English significantly in inflection, morphology, word order and gender associations with nouns.
Inflection
Unlike English, Russian is a highly inflected language. Suffixes on nouns mark 6 distinct cases, which determine the role of the noun in the sentence (whether it’s the subject, the direct object, the indirect object, something being possessed, something used as an instrument, or the object of a preposition). For example, all of these are different forms of the word “book.”
That’s 12 forms of the same word, which are used depending on what role the word is playing in the sentence. “But they’re not all distinct; you can have the same form for different roles, like the singular genitive & the plural nominative,” says Wes Feely, Senior Computational Linguist, SDL.
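As an illustration of the point about shared forms, the sketch below lays out the standard declension of книга ("book") across the six cases and two numbers, then groups the twelve slots by surface form. The data structure and function name are illustrative only.

```python
CASES = ["nominative", "genitive", "dative",
         "accusative", "instrumental", "prepositional"]

# Standard declension of книга ("book"), listed in the order of CASES.
KNIGA = {
    "singular": ["книга", "книги", "книге", "книгу", "книгой", "книге"],
    "plural":   ["книги", "книг", "книгам", "книги", "книгами", "книгах"],
}

def slots_by_form(paradigm):
    """Map each surface form to the list of (case, number) slots it fills."""
    forms = {}
    for number, row in paradigm.items():
        for case, form in zip(CASES, row):
            forms.setdefault(form, []).append((case, number))
    return forms

forms = slots_by_form(KNIGA)
total_slots = sum(len(slots) for slots in forms.values())
ambiguous = {form: slots for form, slots in forms.items() if len(slots) > 1}

print(f"{total_slots} slots, {len(forms)} distinct forms")
for form, slots in ambiguous.items():
    print(form, "->", slots)
```

For an MT system the ambiguity cuts both ways: one English word maps to many Russian forms, and a single Russian form such as книги can signal several different grammatical roles that must be resolved from context.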
Additionally, like Spanish or French, every noun has a gender. The word for “book” is feminine, but this is an arbitrary categorization. “There’s no reason why a book (книга kníga) is feminine and why a table (стол stól) is masculine,” explains Wes. “But it matters because the case suffixes are different for each gender (masculine, feminine, or neuter). So while there are 12 different forms of the word ‘book’ and 12 different forms of the word ‘table’, they don’t share the same set of suffixes. When adjectives modify nouns, they need to agree with the noun, taking the same (or similar) suffix.”
Also, like Spanish or French, verbs conjugate depending on tense (past vs. non-past), person (I vs. you vs. he/she/it), number (singular vs. plural), etc. So one verb may have several different forms, as well.
Word order
In English, we use word order to accomplish the same thing as the suffixes on nouns in Russian. Because Russian has these case markings, its word order is much freer. For example, these are all acceptable ways of saying “I went to the shop.”
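A small sketch can enumerate such orderings. It assumes the sentence is «я пошёл в магазин» (one common way to say "I went to the shop"; the exact wording is an assumption) and applies a single constraint: the preposition в must come directly before магазин.

```python
from itertools import permutations

# Assumed example sentence: "I went to (the) shop".
WORDS = ["я", "пошёл", "в", "магазин"]

def valid_orders(words, prep="в", noun="магазин"):
    """Return every ordering of `words` in which `prep` sits
    immediately before `noun`."""
    orders = []
    for perm in permutations(words):
        i = perm.index(prep)
        if i + 1 < len(perm) and perm[i + 1] == noun:
            orders.append(" ".join(perm))
    return orders

orders = valid_orders(WORDS)
for order in orders:
    print(order)
print(len(orders), "orderings")
```

The enumeration only checks the position of the preposition. It says nothing about which orderings sound natural, archaic, or emphatic, which is exactly the kind of nuance a translation system has to learn from data.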
Essentially, all orderings are possible, except that the preposition “to” (в v) must precede the word for “shop” (магазин magazin). You can imagine that as sentences get longer, the number of possible sentence order structures increases. There are some limits on this: some orders in this example sound strange or archaic, and others are only used to emphasize where you’re going or who is going. But there are certainly more ways of saying the same thing than in English, which is stricter in its word order.
Sample Output from SDL Russian to English Neural MT system
SDL’s latest Neural MT technology is able to deal with all the Russian language challenges described above and can produce fluent and accurate translations. Below are some examples from the new SDL Neural MT Russian-English system.
Russian | До Уистлера, расположенного в провинции Британская Колумбия, можно быстро добраться от Ванкувера на автомобиле или самолете. |
SDL Neural MT English Output | Whistler, located in British Columbia, is easily accessible from Vancouver by car or plane. |
Human English Output | Whistler, British Columbia, is quickly accessible from Vancouver by road or air. |
Russian | Фестивали, спа, рестораны и бары сочетаются с бесконечными возможностями досуга на свежем воздухе, делая Уистлер идеальным местом, где вы можете отдохнуть и расслабиться. |
SDL Neural MT English Output | Festivals, spa, restaurants and bars combine with endless outdoor activities, making Whistler the ideal place to relax and unwind. |
Human English Output | Festivals, spas, restaurants and bars combine with endless outdoor activities to make Whistler the ultimate place to escape and unwind. |
Russian | Директор оперативного управления Международного комитета Красного Креста Доминик Стиллхарт сообщил в воскресенье на пресс-конференции в Сане, что с 27 апреля по 13 мая в стране умерло от холеры 115 человек. |
SDL Neural MT English Output | The Director of Operations of the International Committee of the Red Cross, Dominic Stillhart, reported on Sunday at a press conference in Sana’a that 115 people died from cholera from April 27 to May 13. |
Human English Output | On Sunday, Dominik Stillhart, director of operations of the International Committee of the Red Cross, said during a press conference in Sana’a that 115 people died from cholera in the country between April 27 and May 13. |
Russian | Рынок акций США в среду, вероятнее всего, начнет торговую сессию умеренным ростом на 0,4-0,5% в рамках коррекции к падению предыдущего дня, вызванному напряжением в торговых отношениях между США, Китаем и некоторыми другими странами. |
SDL Neural MT English Output | The US stock market is likely to start its trading session on Wednesday 0.4-0.5% as part of an adjustment to the fall of the previous day caused by the tension in trade relations between the United States, China and some other countries. |
Human English Output | The US stock market on Wednesday, will most likely start a trade session with a moderate growth of 0.4-0.5% as part of the correction to the fall of the previous day caused by tension in trade relations between the USA, China and some other countries. |
It is important to qualify that these results reflect only generic MT systems translating generic sentences such as those shown above. The SDL research noted that the generic Neural MT system did not perform as well on domain-specific data. As a supplier of MT solutions to the enterprise, SDL will typically adapt MT systems to the unique needs of each enterprise customer’s domain. This adaptation is an especially challenging task with Neural MT models. In other experiments with domain-adapted MT systems, the SDL research team noted further improvements in perceived quality, and documented that adaptation of the SDL Neural technology provided a 30% improvement over the generic neural engine on domain-specific data.
SDL Neural MT for the enterprise
The SDL Neural MT breakthrough comes soon after several other announcements related to SDL’s ongoing research and development, and to its substantial progress in taking Neural MT from a research environment to a deployable, enterprise-ready technology.
“It is remarkable to see such a leap in translation quality with SDL’s latest neural technology. We are currently working on transitioning this advancement from our R&D lab to our enterprise customers,” said Quinn Lam, Senior Product Manager, SDL. “Planned for release this summer, the latest version of SDL Enterprise Translation Server (ETS) will be powered by this fully productized state-of-the-art Neural technology. Stay tuned!”
Another key requirement for successful MT adoption in the enterprise is the ability to get the system to learn and adapt to enterprise-specific linguistic requirements and preferences. This has been especially challenging with Neural MT technology, where until now it has been difficult to do without undermining fluency and output quality. SDL researchers recently figured out how to augment Neural MT with dictionary capabilities. This means enterprises can easily adapt SDL Neural MT across multiple departments that have differing terminology, yet still maintain the translation fluency that this latest generation of MT technology is acclaimed for.
SDL’s latest dictionary feature sets a new industry standard for user control over automated translations, allowing users across the enterprise to use different dictionaries without impacting the quality of the translations. SDL ETS Dictionary capabilities include:
- Controls that allow an enterprise to enforce multiple terminology and translation preferences for the same word, something that is necessary when different departments have their own interpretations of the same word or term.
- Easy implementation of preferred terminology and personalization by any user with no upfront technical knowledge or training required.
- Deployment of multiple dictionaries in a single engine at the same time, allowing multiple departments with differing needs to optimize the MT engine differently.
- Terminology preferences that can be changed and modified on an ongoing basis to accommodate changing business and communication priorities.
Left to Right: Gonzalo Iglesias, Bill Byrne, Eva Hasler & Adrià De Gispert
SDL now has several other initiatives underway and will continue to introduce new features and capabilities emerging from its Neural MT research over the coming year as it brings innovative research ideas from the lab to the production deployment arena. As Samad added, “While there is great excitement about Neural MT, it is clear that as we explore further the science, we already see signs that we will continue to make progress and we look forward to bringing the most relevant innovations to market for our customers.”