Monday, July 24, 2017

The Ongoing Neural Machine Translation Momentum

This is largely a guest post by Manuel Herranz of Pangeanic, slightly abbreviated and edited from the original to make it more informational and less promotional. Last year we saw Facebook announce that they were going to shift all their MT infrastructure to a Neural MT foundation as rapidly as possible; this was later followed by NMT announcements from SYSTRAN, Google, and Microsoft. In the months since, we have seen that many MT technology vendors have also jumped onto the NMT bandwagon, some with more conviction than others. The view for those who can go right into the black box and modify things (SDL, MSFT, GOOG, FB and possibly SYSTRAN) is, I suspect, quite different from that of those who use open source components and have to perform a "workaround" on the output of these black box components. Basically, I see two clear camps amongst MT vendors:

Those who are shifting to NMT as quickly as possible (e.g. SYSTRAN)
Those who are being much more selective and either "going hybrid = SMT+NMT" or building both PB-SMT and NMT engines and choosing the better one (e.g. Iconic).

Pangeanic probably falls in the first group based on the enthusiasm in this post. Whenever there is a paradigm shift in MT methodology the notion of "hybrid" invariably comes up. A lot of people who don't understand the degree of coherence needed in the underlying technology generally assume this is a better way. Also, I think that sometimes the MT practitioner has too much investment sunk into the old approach and is reluctant to completely abandon the old for the new. SMT took many years to mature, and what we see today is an automated translation production pipeline that includes multiple models (translation, language, reordering, etc.) together with pre- and post-processing of translation data. The term hybrid is sometimes used to describe this overall pipeline because data can be linguistically informed at some of these pipeline steps.

When SMT first emerged, many problems were noticed (relative to the old RBMT model), and it has taken many years to resolve some of them. The solutions that worked for SMT will not necessarily work for NMT and, in fact, there is good reason to believe they will not. This is mostly because the pattern matching technology in SMT is quite different, and is much better understood and more evident than in NMT. The pattern detection and learning that happens in NMT is much more mysterious and unclear at this point. We are still learning what levers to pull to make adjustments and fix weird problems that we see. What can be carried forward easily are the data preparation, data and corpus analysis, and data quality measures that have been built over time. NMT is a machine learning (pattern matching) technology that learns from the data that you show it. Thus far, that data is limited to translation memory and glossaries.

I am somewhat skeptical about the "hybrid NMT" stuff being thrown around by some vendors. The solutions to NMT problems and challenges are quite different (from PB-SMT), and it makes much more sense to me to go completely one way or the other. I understand that some NMT systems do not yet exceed PB-SMT performance levels, and thus it is logical and smart to continue using the older systems in such a case. But given the weight of NMT research and actual user experience in 2017, I think the evidence is pretty clear that NMT is the way forward across the board. It is a question of when, rather than if, for most languages. Adaptive MT might be an exception in the professional use scenario because it is learning in real time if you work with SDL or Lilt. While hybrid RBMT and SMT made some sense to me, hybrid SMT+NMT does not make any sense to me and triggers blips on my bullshit radar, as it reeks of marketing-speak rather than science. However, I do think that Adaptive MT built with an NMT foundation might be viable, and could very well be the preferred model for MT for years to come in post-editing and professional translator use scenarios. It is also my feeling that as these more interactive MT/TM capabilities become more widespread, the relative value of pure TM tools will decline dramatically. But I am also going to bet that an industry outsider will drive this change, simply because real change rarely comes from people with sunk costs and vested interests. And surely somebody will come up with a better workbench for translators than standard TM matching, one which provides translation suggestions continuously and learns from ongoing interactions.

I am going to bet that the best NMT systems will come from those who go "all in" with NMT and solve NMT deficiencies without resorting to force-fitting old SMT paradigm remedies on NMT models or trying to go "hybrid", whatever that means.

The research data shared by all those who describe their NMT experience is of immense value, as it helps everybody else move forward faster. I have summarized some of this in previous posts: The Problem with BLEU and Neural Machine Translation, An Examination of the Strengths and Weaknesses of Neural Machine Translation, and Real and Honest Quality Evaluation Data on Neural Machine Translation. The various posts on SYSTRAN's PNMT and the recent review of SDL's NMT also describe many of the NMT challenges.

In addition to the research data from Pangeanic in this post, there is also this from Iconic and ADAPT, where they basically state that mature PB-SMT systems will still outperform NMT systems in the use-case scenarios they tested, and finally, the reconstruction strategy pointed out by Lilt, whose results are shown in the chart below. This approach apparently improves overall quality and also seems to handle long sentences better in NMT than others have reported. I have seen other examples of "evidence" where SMT outperforms NMT, but I am wary of citing references where the research is not transparent or properly identified.

Source: Neural Machine Translation with Reconstruction

This excerpt from a recent TAUS post is also interesting, and points out that finally, the data is essential to making any of this work:

Google Director of Research Peter Norvig said recently in a video about the future of AI/ML in general that although there is a growing range of tools for building software (e.g. the neural networks), “we have no tools for dealing with data." That is: tools to build data, and correct, verify, and check them for bias, as their use in AI expands. In the case of translation, the rapid creation of an MT ecosystem is creating a new need to develop tools for “dealing with language data” – improving data quality and scope automatically, by learning through the ecosystem. And transforming language data from today’s sourcing problem (“where can I find the sort of language data I need to train my engine?”) into a more automated supply line.

For me this statement by Norvig is a pretty clear indication that perhaps the greatest value-add opportunities for NMT come from understanding, preparing and tuning the data that ML algorithms learn from. In the professional translation market where MT output quality expectations are the highest, it makes sense that data is better understood and prepared. I have also seen that the state of the aggregate "language data" within most LSPs is pretty bad, maybe even atrocious. It would be wonderful if the TMS systems could help improve this situation and provide a richer data management environment to enable data to be better leveraged for machine learning processes. To do this we need to think beyond organizing data for TM and projects, but at this point, we are still quite far from this. Better NMT systems will often come from better data, which is only possible if you can rapidly understand what data is most relevant (using metadata) and can bring it to bear in a timely and effective way. There is also an excessive focus on TM in my opinion. Focus on the right kind of monolingual corpus can also provide great insight, and help to drive strategies to generate and manufacture the "right kind" of TM to drive MT initiatives further. But this all means that we need to get more comfortable working with billions of words and extracting what we need when a customer situation arises.

===============

The Pangeanic Neural Translation Project

So, time to recap and describe our experience with neural machine translation with tests into 7 languages (Japanese, Russian, Portuguese, French, Italian, German, Spanish), and how Pangeanic has decided to shift all its efforts into neural networks and leave the statistical approach as a support technology for hybridization.

We selected training sets from our SMT engines as clean data to train the same engines with the same data and run parallel human evaluation between the output of each system (existing statistical machine translation engines) and the new engines produced by neural systems. We are aware that if data cleaning was very important in a statistical system, it is even more so with neural networks. We could not add additional material because we wanted to be certain that we were comparing exactly the same data but trained with two different approaches.

A small percentage of bad or dirty data can have a detrimental effect on SMT systems, but if it is small enough, statistics will take care of it and won’t let it feed through the system (although it can also have a far worse side effect, which is lowering statistics all over certain n-grams).

We selected the same training data for languages which we knew were performing very well in SMT (French, Spanish, Portuguese) as well as those that have been known to researchers and practitioners as “the hard lot”: Russian as the example of a morphologically very rich language and Japanese as a language with a radically different grammatical structure, where re-ordering (that’s what hybrid systems have done) has proven to be the only way to improve.

Japanese neural translation tests

Let’s concentrate first on the neural translation results in Japanese as they represent the quantum leap in machine translation we all have been waiting for. These results were presented at TAUS Tokyo last April. (See our previous post TAUS Tokyo Summit: improvements in neural machine translation in Japanese are real).

We used a large training corpus of 4.6 million sentences (that is nearly 60 million running words in English and 76 million in Japanese). In vocabulary terms, that meant 491,600 English words and 283,800 character-words in Japanese. Yes, our brains are able to “compute” all that much and even more, if we add all types of conjugations, verb tenses, cases, etc. For testing purposes, we did what one is supposed to do in order not to inflate percentage scores and took out 2,000 sentences before training started. This is standard in all customization – a small sample is taken out so that the resulting engine is tested on the kind of text it is likely to encounter. Any developer including the test corpus in the training set is likely to achieve very high scores (and will boast about it). But BLEU scores have always been about checking domain engines within MT systems, not across systems (among other things because the training sets have always been different, so a corpus containing many repetitions or the same or similar sentences will obviously produce higher scores). We also made sure that no sentences were repeated, and even similar sentences were stripped out of the training corpus in order to achieve as much variety as possible. This may produce lower scores compared to other systems, but the results are cleaner and progress can be monitored very easily. This has been the way in academic competitions and has ensured good-quality engines over the years.
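The hold-out procedure described above can be sketched roughly as follows. This is a minimal illustration, not Pangeanic's actual tooling: the function name, the normalization used to spot repeats, and the fixed seed are all assumptions, and it only removes exact repeats (the near-duplicate filtering mentioned above is not shown).

```python
import random


def dedupe_and_split(pairs, test_size=2000, seed=13):
    """Drop repeated sentence pairs, then hold out `test_size` pairs for testing.

    `pairs` is a list of (source, target) tuples. Returns (train, test).
    Hypothetical helper: only exact repeats (after case and whitespace
    normalization of the source) are removed; similar-sentence filtering
    would need a separate fuzzy-matching step.
    """
    seen = set()
    unique = []
    for src, tgt in pairs:
        key = (" ".join(src.lower().split()), " ".join(tgt.split()))
        if key not in seen:
            seen.add(key)
            unique.append((src, tgt))

    if test_size >= len(unique):
        raise ValueError("test_size must be smaller than the deduplicated corpus")

    # Shuffle with a fixed seed so the same test set can be reused
    # when comparing the SMT and NMT engines.
    rng = random.Random(seed)
    rng.shuffle(unique)
    return unique[test_size:], unique[:test_size]
```

Because the test pairs are removed before training, neither engine has seen them, which is what keeps the automatic scores from being inflated.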

The standard automatic metric in SMT did not detect much difference between the output in NMT and the output in SMT.

However, WER was showing a new and distinct tendency.

NMT shows better results in longer sentences in Japanese. SMT seems to be more certain in shorter sentences (training a 5 n-gram system)
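For readers unfamiliar with WER, it is the word-level edit distance between the MT output and the reference, divided by the reference length. The sketch below is a generic implementation under the assumption that both strings are already tokenized with spaces; the post does not say how the Japanese output was segmented, so that step is left out.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: edit distance over tokens / reference length.

    Assumes whitespace-tokenized input. For Japanese, the text would
    first need to be segmented by a tokenizer (not shown here).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1      # reference token missing from output
            insertion = curr[j - 1] + 1  # extra token in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Lower is better, and computing it separately for short and long sentences is one way to surface the kind of length-dependent tendency described here.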

And this new distinct tendency is what we picked up when the output was evaluated by human linguists. We used Japanese LSP Business Interactive Japan to rank the output from a conservative point of view, from A to D, A being human quality translation, B a very good output that only requires a very small percentage of post-editing, C an average output where some meaning can be extracted but serious post-editing is required, and D a very low-quality translation with no meaning. Interestingly, our trained statistical MT systems performed better than the neural systems in sentences shorter than 10 words. We can assume that statistical systems are more certain in these cases, when they are only dealing with simple sentences with enough n-grams giving evidence of a good matching pattern.

We created an Excel sheet (below) for human evaluators with the original English to the left with the reference translation. The neural translation followed. Two columns were provided for the rating and then the statistical output was provided.


Neural-SMT EN>JP ranking comparison showing the original English, the reference translation, the neural MT output and the statistical system output to the right

German, French, Spanish, Portuguese and Russian Neural MT results

The shocking improvement came from the human evaluators themselves. The trend pointed to 90% of sentences being classed as A (perfect, naturally flowing translations) or B (containing all the meaning, with only minor post-editing required). The shift is remarkable in all language pairs, including Japanese, moving from an “OK experience” to a remarkable acceptance. In fact, only 6% of sentences were classed as a D (“incomprehensible/unintelligible”) in Russian, 1% in French and 2% in German. Portuguese was independently evaluated by translation company Jaba Translations.

This trend is not particular to Pangeanic only. Several presenters at TAUS Tokyo pointed to ratings around 90% for Japanese using off-the-shelf neural systems compared to carefully crafted hybrid systems. Systran, for one, confirmed that they are focusing only on neural research/artificial intelligence and throwing away years of rule-based work, statistical and hybrid efforts.

Systran’s position is meritorious and very forward thinking. Current papers and some MT providers still resist the fact that, despite all the work we have done over the years, Multimodal Pattern Recognition has gained the upper hand. It was only computing power and the use of GPUs for training that was holding it back.

Neural networks: Are we heading towards the embedment of artificial intelligence in the translation business?

BLEU may not be the best indication of what is happening with the new neural machine translation systems, but it is an indicator. We were aware of other experiments and results by other companies pointing in a similar direction. Still, although the initial results may have made us think that there was no use to it, BLEU is a useful indicator – and in any case, it was always an indicator of an engine’s behavior, not a true measure of one overall system versus another. (See the Wikipedia article https://en.wikipedia.org/wiki/Evaluation_of_machine_translation).

Machine translation companies and developers face a dilemma as they have to do without the research, connectors, plugins and automatic measuring techniques and build new ones. Building connectors and plugins is not so difficult. Changing the core from Moses to a neural system is another matter. NMT is producing amazing translations, but it is still pretty much a black box. Our results show that some kind of hybrid system using the best features of an SMT system is highly desirable and academic research is moving in that direction already – as happened with SMT itself some years ago.

Yes, the translation industry is at the peak of the neural networks hype. But looking at the whole picture and how artificial intelligence (pattern recognition) is being applied in several other areas in order to produce intelligent reports, tendencies, and data, NMT is here to stay – and it will change the game for many, as more content needs to be produced cheaply with post-editing, at light speed, when good machine translation is good enough. Amazon and Alibaba are not investing millions in MT for nothing – they want to reach people in their language with a high degree of accuracy and at a speed human translators cannot match.

Manuel Herranz is the CEO of Pangeanic. Collaboration with Valencia’s Polytechnic research group and the Computer Science Institute led to the creation of the PangeaMT platform for translation companies. He worked as an engineer for Ford machine tool suppliers and Rolls Royce Industrial and Marine, handling training and documentation from the buyer’s side when translation memories had not yet appeared in the LSP landscape. After joining a Japanese group in the late 90’s, he became Pangeanic’s CEO in 2004 and began his machine translation project in 2008, creating the first command-line versions of the first commercial application of Moses (Euromatrixplus). Pangeanic was the first LSP in the world to implement open source Moses successfully in a commercial environment, including re-training features and tag handling before they became standard in the Moses community.

Tuesday, July 18, 2017

Linguistic Quality Assurance in Localization – An Overview

This is a post by Vassilis Korkas on the quality assurance and quality checking processes being used in the professional translation industry. (I still find it really hard to say localization, since that term is really ambiguous to me, as I spent many years trying to figure out how to deliver properly localized sound through digital audio platforms. To me, localized sound = cellos from the right and violins from the left of the sound stage. I have a strong preference for instruments to stay in place on the sound stage for the duration of the piece.)

As the volumes of translated content increase, the need for automated production lines also grows. The industry is still laden with products that don't play well with each other, and buyers should insist that vendors of the various tools that they use enable and allow easy transport and downstream processing of any translation-related content. From my perspective, automation in the industry is also very limited, and there is a huge need for human project management because tools and processes don't connect well. Hopefully, we will start to see this scenario change. I also hope that the database engines for these new processes are much smarter about NLP and much more ready to integrate machine learning elements, as this too will allow the development of much more powerful, automated, and self-correcting tools.

As an aside, I thought this chart was very interesting, (assuming it is actually based on some real research), and shows why it is much more worthwhile to blog than to share content on LinkedIn, Facebook or Twitter. However, the quality of the content does indeed matter and other sources say that high quality content has an even longer life than shown here.

Source: @com_unit_inside

Finally, CNBC had this little clip describing employment growth in the translation sector where they state: "The number of people employed in the translation and interpretation industry has doubled in the past seven years." Interestingly, this is exactly the period where we have seen the use of MT also dramatically increase. Apparently, they conclude that technology has also helped to drive this growth.

The emphasis in the post below is mine.

==========

In pretty much any industry these days, the notion of quality is one that seems to crop up all the time. Sometimes it feels like it’s used merely as a buzzword, but more often than not quality is a real concern, both for the seller of a product or service and the consumer or customer. In the same way, quality appears to be omnipresent in the language services industry as well. Obviously, when it comes to translation and localization, the subject of quality has rather unique characteristics compared to other services; ultimately, however, it is the expected goal in any project.

In this article, we will review what the established practices are for monitoring and achieving linguistic quality in translation and localization, examine what the challenges are for linguistic quality assurance (LQA) and also attempt to make some predictions for the future of LQA in the localization industry.

Quality assessment and quality assurance: same book, different pages

Despite the fact that industry standards have been around for quite some time, in practice, terms such as ‘quality assessment’ and ‘quality assurance’, and sometimes even ‘quality evaluation’, are often used interchangeably. This may be due to a misunderstanding of what each process involves but, whatever the reason, this practice leads to confusion and could create misleading expectations. So, let us take this opportunity to clarify:

[Translation] Quality Assessment (TQA) is the process of evaluating the overall quality of a completed translation by using a model with pre-determined values which can be assigned to a number of parameters used for scoring purposes. Such models are the LISA, the MQM, the DQF, etc.
Quality Assurance “[QA] refers to systems put in place to pre-empt and avoid errors or quality problems at any stage of a translation job”. (Drugan, 2013: 76)

Quality is an ambiguous concept in itself and making ‘objective’ evaluations is a very difficult task. Even the most rigorous assessment model requires subjective input by the evaluator who is using it. When it comes to linguistic quality, in particular, we would be looking to improve on issues that have to do with punctuation, terminology and glossary compliance, locale-specific conversions and formatting, consistency, omissions, untranslatable items and others. It is a job that requires a lot of attention to detail and strict adherence to rules and guidelines – and that’s why LQA (most aspects of it, anyway) is a better candidate for ‘objective’ automation.

Given the volume of translated words in most localization projects these days, it is practically prohibitive in terms of time and cost to have in place a comprehensive QA process, which would safeguard certain expectations of quality both during and after translation. Therefore it is very common that QA, much like TQA, is reserved for the post-translation stage. A human reviewer, with or without the help of technology, will be brought in when the translation is done and will be asked to review/revise the final product. The obvious drawback of this process is that significant time and effort could be saved if somehow revision could occur in parallel with the translation, perhaps by involving the translator herself with the process of tracking errors and making these corrections along the way.

The fact that QA only seems to take place ‘after the fact’ is not the only problem, however. Volumes are another challenge – too many words to revise, too little time and too expensive to do it. To address this challenge, Language Service Providers (LSPs) use sampling (the partial revision of an agreed small portion of the translation) and spot-checking (the partial revision of random excerpts of the translation). In both cases, the proportion of the translation that is checked is about 10% of the total volume of translated text, and that is generally considered agreeable to be able to say whether the whole translation is good or not. This is an established and accepted industry practice that was created out of necessity. However, one doesn’t need to have a degree in statistics to appreciate that this small sample, whether defined or random, is hardly big enough to reflect the quality of the overall project.
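To make the spot-checking practice concrete, here is a minimal sketch of drawing a random selection of roughly 10% of the segments for review. The function name and the default seed handling are assumptions for illustration only; real tools may select excerpts quite differently (for example, contiguous passages rather than individual segments).

```python
import random


def spot_check_indices(num_segments, fraction=0.10, seed=None):
    """Pick a random subset of segment positions for manual review.

    Hypothetical helper: returns sorted, 0-based segment indices covering
    about `fraction` of the job (at least one segment for non-empty jobs).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if num_segments <= 0:
        return []
    sample_size = min(num_segments, max(1, round(num_segments * fraction)))
    rng = random.Random(seed)
    # Sample without replacement so no segment is reviewed twice.
    return sorted(rng.sample(range(num_segments), sample_size))
```

Whatever the selection method, every segment outside the returned indices goes unreviewed, which is the limitation the paragraph above points to.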

The progressive increase of the volumes of text translated every year (also reflected in the growth of the total value of the language service industry, as seen below) and the increasing demands for faster turnaround times make it even harder for QA-focused technology to catch up. The need for automation is greater than ever before.


Source: Common Sense Advisory (2017)

Today we could classify QA technologies into three broad groups:

built-in QA functionality in CAT tools (offline and online),
stand-alone QA tools (offline),
custom QA tools developed by LSPs and translation buyers (mainly offline).

Built-in QA checks in CAT tools range from the completely basic to the quite sophisticated, depending on which CAT tool you’re looking at. Stand-alone QA tools are mainly designed with error detection/correction capabilities in mind, but there are some that use translation quality metrics for assessment purposes – so they’re not quite QA tools as such. Custom tools are usually developed in order to address specific needs of a client or a vendor who happens to be using a proprietary translation management system or something similar. This obviously presupposes that the technical and human resources are available to develop such a tool, so this practice is rather rare and exclusive to large companies that can afford it.

Consistency is king – but is it enough?

Terminology and glossary/wordlist compliance, empty target segments, untranslated target segments, segment length, segment-level inconsistency, different or missing punctuation, different or missing tags/placeholders/symbols, different or missing numeric or alphanumeric structures – these are the most common checks that one can find in a QA tool. On the surface at least, this looks like a very diverse range that should cover the needs of most users. All these are effectively consistency checks. If a certain element is present in the source segment, then it should also exist in the target segment. It is easy to see why this kind of “pattern matching” can be easily automated and translators/reviewers certainly appreciate a tool that can do this for them a lot more quickly and accurately than they can.
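A few of the checks listed above can be sketched as simple source/target comparisons. This is an illustrative sketch only, not how any particular QA tool works: the regular expressions, the issue labels, and the function name are all assumptions.

```python
import re

# Hypothetical patterns: digits with optional . or , groups, and
# XML-like tags or {0}-style placeholders.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>|\{\d+\}")


def consistency_issues(source, target):
    """Return a list of issue labels for one source/target segment pair."""
    if not target.strip():
        return ["empty target"]

    issues = []
    if source.strip() == target.strip():
        issues.append("untranslated target")
    # "If an element is present in the source, it should exist in the target."
    if sorted(NUMBER_RE.findall(source)) != sorted(NUMBER_RE.findall(target)):
        issues.append("number mismatch")
    if sorted(TAG_RE.findall(source)) != sorted(TAG_RE.findall(target)):
        issues.append("tag mismatch")
    # Compare end punctuation only when the source ends with punctuation.
    src_end = source.rstrip()[-1:]
    if src_end in ".!?:;" and src_end != target.rstrip()[-1:]:
        issues.append("end punctuation mismatch")
    return issues
```

Note that a purely string-based comparison like this reports a number mismatch for "1.5 kg" versus "1,5 kg", even when the target locale correctly uses a decimal comma.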

Despite the obvious benefits of these checks, the methodology on which they run has significant drawbacks. Consistency checks are effectively locale-independent and that creates false positives (the tool detects an error when there is none), also known as “noise”, and false negatives (the tool doesn’t detect an error when there is one). Noise is one of the biggest shortcomings of QA tools currently available and that is because of the lack of locale specificity in the checks provided. It is in fact rather ironic that the benchmark for QA in localization doesn’t involve locale-specific checks. To be fair, in some cases users are allowed to configure the tool in greater depth and define such focused checks on their own (either through existing options in the tools or with regular expressions).

Source: XKCD

But, this makes the process more intensive for the user and it comes as no surprise that the majority of users of QA tools never bother to do that. Instead, they perform their QA duties relying on the sub-optimal consistency checks which are available by default.
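As an illustration of what a locale-specific check might look like, the sketch below converts each source number to the form expected in the target locale before comparing. The locale table, the regular expression, and the function names are assumptions made for this example; it handles only decimal separators, not thousands separators, dates, or units.

```python
import re

# Hypothetical, deliberately tiny table of decimal separators per locale.
DECIMAL_SEPARATOR = {"en-US": ".", "en-GB": ".", "de-DE": ",", "es-ES": ",", "fr-FR": ","}

NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")


def expected_target_number(number, source_locale, target_locale):
    """Rewrite a source number using the target locale's decimal separator."""
    return number.replace(DECIMAL_SEPARATOR[source_locale],
                          DECIMAL_SEPARATOR[target_locale])


def locale_number_issues(source, target, source_locale, target_locale):
    """Flag source numbers whose locale-adjusted form is missing in the target."""
    remaining = NUMBER_RE.findall(target)
    issues = []
    for number in NUMBER_RE.findall(source):
        expected = expected_target_number(number, source_locale, target_locale)
        if expected in remaining:
            remaining.remove(expected)  # each target number matches only once
        else:
            issues.append(f"expected {expected!r} in target for source number {number!r}")
    return issues
```

With en-US to de-DE, a target containing "1,5" for a source "1.5" passes silently, while a target that keeps "1.5" is flagged. This addresses both the noise and the missed error that a locale-independent comparison would produce.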

Linguistic quality assurance is (not) a holistic approach

In practice, for the majority of large-scale localization projects, only post-translation LQA takes place, mainly due to time pressure and associated costs – an issue we touched on earlier in connection with the practice of sampling. The larger implications of this reality are that:

a) effectively we should be talking about quality control rather than quality assurance, as everything takes place after the fact; and
b) quality assurance becomes a second-class citizen in the world of localization. This contradicts everything we see and hear about the importance of quality in the industry, where both buyers and providers of language services prioritise quality as a prime directive.

As already discussed, the technology does not always help. CAT tools with integrated QA functionality have a lot of issues with noise, and that is unlikely to change anytime soon because this kind of functionality is not a priority for a CAT tool. On the other hand, stand-alone QA tools with more extensive functionality work independently, which means that any potential ‘collaboration’ between stand-alone QA tools and CAT tools can only be achieved in a cumbersome and intermittent workflow: complete the translation, export it from the CAT tool, import the bilingual file in the QA tool, run the QA checks, analyse the QA report, go back to the CAT tool, find the segments which have errors, make corrections, update the bilingual file and so on.

The continuously growing demand in the localization industry for the management of increasing volumes of multilingual content in pressing timelines and the compliance with quality guidelines means that the challenges described above will have to be addressed soon. As the trends of online technologies in translation and localization become stronger, there is an implicit understanding that existing workflows will have to be uncomplicated in order to accommodate future needs in the industry. This can indeed be achieved with the adoption of bolder QA strategies and more extensive automation. The need in the industry for a more efficient and effective QA process is here now and it is pressing. Is there a new workflow model which can produce tangible benefits both in terms of time and resources? I believe there is, but it will take some faith and boldness to apply it.

Get ahead of the curve

In the last few years, the translation technology market has been marked by substantial shifts in the market shares occupied by offline and online CAT tools respectively, with the online tools rapidly gaining more ground. This trend is unlikely to change. At the same time, the age-old problems of connectivity and compatibility between different platforms will have to be addressed one way or another. For example, slowly transitioning to an online CAT tool while still using the same offline QA tool from your old workflow is as inefficient as it is irrational, especially in the long run.

A deeper integration between CAT and QA tools also has other benefits. The QA process can move up a step in the translation process. Why have QA only in post-translation when you can also have it in-translation? (And it goes without saying that pre-translation QA is also vital, but it would apply to the source content only so it’s a different topic altogether.) This shift is indeed possible by using API-enabled applications – which are in fact already standard practice for the majority of online CAT tools. There was a time when each CAT tool had its own proprietary file formats (as they still do), and then the TMX and TBX standards were introduced and the industry changed forever, as it became possible for different CAT tools to “communicate” with each other. The same will happen again, only this time APIs will be the agent of change.


Source: API Academy

Looking further ahead, there are also some other exciting ideas which could bring about truly innovative changes to the quality assurance process. The first one is the idea of automated corrections. Much in the same way that a text can be pre-translated in a CAT tool when a translation memory or a machine translation system is available, in a QA tool which has been pre-configured with granular settings it would be possible to “pre-correct” certain errors in the translation before a human reviewer even starts working on the text. With a deeper integration scenario in a CAT tool, an error could be corrected in a live QA environment the moment a translator makes that error.
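The "pre-correct" idea can be sketched as a small set of pre-configured rules applied to the target text, with a log of what was changed so that the reviewer can still audit it. The rules and names below are hypothetical examples, not the behavior of any existing QA tool.

```python
import re

# Hypothetical "safe to auto-correct" rules: (label, pattern, replacement).
# The second rule is locale-dependent: it would be wrong for French,
# where a space is expected before ; : ! ?
RULES = [
    ("collapse repeated spaces", re.compile(r" {2,}"), " "),
    ("remove space before punctuation", re.compile(r" +([,.;:!?])"), r"\1"),
]


def pre_correct(target):
    """Apply each rule in order; return (corrected_text, applied_rules).

    `applied_rules` is a list of (label, number_of_replacements) so that
    every automatic change remains visible to the human reviewer.
    """
    applied = []
    for label, pattern, replacement in RULES:
        target, count = pattern.subn(replacement, target)
        if count:
            applied.append((label, count))
    return target, applied
```

In a live, integrated setup the same function could run each time a translator confirms a segment, rather than once over the finished file.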

This kind of advanced automation in LQA could be taken even a step further if we consider the principles of machine learning. Access to big data in the form of bilingual corpora which have been checked and confirmed by human reviewers makes the potential of this approach even more likely. Imagine a QA tool that collects all the corrections a reviewer has made and all the false positives the reviewer has ignored, and then processes all that information and learns from it. With every new text processed, the machine learning algorithms make the tool more accurate in what it should and should not consider to be an error. The possibilities are endless.
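As a very rough stand-in for that learning loop, the sketch below simply counts how often reviewers accept or ignore each type of check and stops reporting checks that turn out to be mostly noise. A real system would presumably use an actual machine learning model with richer features; the class name, thresholds, and counting approach here are assumptions chosen to keep the example short.

```python
from collections import defaultdict


class FeedbackFilter:
    """Suppress QA checks that reviewers have mostly dismissed as false positives."""

    def __init__(self, min_observations=5, min_precision=0.2):
        self.min_observations = min_observations
        self.min_precision = min_precision
        self.accepted = defaultdict(int)  # reviewer made a correction
        self.ignored = defaultdict(int)   # reviewer dismissed the warning

    def record(self, check_name, was_real_error):
        """Store one reviewer decision for a reported issue."""
        if was_real_error:
            self.accepted[check_name] += 1
        else:
            self.ignored[check_name] += 1

    def precision(self, check_name):
        """Share of this check's reports that were real errors, or None if unseen."""
        total = self.accepted[check_name] + self.ignored[check_name]
        return self.accepted[check_name] / total if total else None

    def should_report(self, check_name):
        """Keep reporting until there is enough evidence that a check is noise."""
        total = self.accepted[check_name] + self.ignored[check_name]
        if total < self.min_observations:
            return True
        return self.precision(check_name) >= self.min_precision
```

Each reviewed text adds more decisions through record(), so the set of reported checks adapts over time without anyone editing the configuration by hand.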

Despite the various shortcomings of current practices in LQA, the potential is there to streamline and improve processes and workflows alike, so much so that quality assurance will not be seen as a “burden” anymore, but rather as an inextricable component of localization, both in theory and in practice. It is up to us to embrace the change and move forward.

Reference
Drugan, J. (2013) Quality in Professional Translation: Assessment and Improvement. London: Bloomsbury.

---------------

Vassilis Korkas is the COO and a co-founder of lexiQA. Following a 15-year academic career in the UK, in 2015 he decided to channel his expertise in translation technologies, technical translation and reviewing into a new tech company. In lexiQA he is now involved with content development, product management, and business operations.

Note
This is the abridged version of a four-part article series published by the author on lexiQA’s blog: Part 1 – Part 2 – Part 3 – Part 4

This link will also provide specific details on the lexiQA product capabilities.

Tuesday, July 11, 2017

Translation Industry Perspective: Utopia or Dystopia?

This is a follow-up post by Luigi Muzii on the evolving future of the "professional translation industry". His last post has already attracted a lot of attention based on Google traffic rankings. In my view, Luigi provides great value to the discussion on "the industry" with his high-level criticism of dubious industry practices, since much of what he points to is clearly observable fact. Bullshit marketing speak is a general problem across industries, but Luigi homes in on some of the terms that are thrown around at localization conferences and in the industry. You may choose to disagree with him, or perhaps see that there are good reasons to start a new, more substantive discussion. Among other things, Luigi challenges the improper usage of the term "agile" in the localization world in this post. The concept of agile comes from the software development world and refers, most often, to the rapid prototyping, testing and production implementation of custom software development projects.

Agile software development is a set of principles for software development in which requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. It promotes adaptive planning, evolutionary development, early delivery, and continuous improvement, and it encourages rapid and flexible response to change.

To apply this concept to translation production work is indeed a stretch in my view. (Gabor, can you provide me a list of who uses this word (agile) the most on their websites?) While there is a definite change in the kind of projects and the manner in which translation projects are defined and processed today, using terms from the software industry to describe minor evolutionary changes to very archaic process and production models is indeed worth questioning. The notion of "augmented translation" is also somewhat silly in a world where only ~50% of translators use the 1990s technology called translation memory, a database technology that is archaic at best.

It is my feeling that step one to make a big leap forward is to shift the focus from the segment level to the corpus level. Step two is to focus on customer conversation text rather than documentation text that few ever read. Step three is to have proper metadata and build more robustness into this leverageable linguistic asset.

MT is already the dominant means to translate content in the world today, but few LSPs or translators really know how to use it skillfully. Change in the professional translation world is slow, especially if it is evolutionary, and involves the skillful use of new technology (i.e. not DIY MT or DIY TMS). In my long-term observation of how the industry has responded to and misused MT, I can attest to this. Those few who get MT right will, I think, very likely be the leaders who define the new agenda, as tightly integrated MT and HT work is a key to rapid response, business process agility (not "agile"), and continuous improvement. Effectiveness is closely related to developing meaningful new technology-savvy translation process skills, which few invest in, and thus many are likely to be caught in the cross-hairs of new power players who might enter the market and change the rules.

Given the recent rumors of Amazon developing MT technology services, we should not be too surprised if, in the next few years, a new force emerges in "professional translation" that from the outset properly integrates MT + HT + Outsourced Labor (Super Duper Mechanical Turk) with continuous improvement machine learning and AI infrastructure to deliver an equivalent translation product at a fraction of the cost of an LSP like Lionbridge or TransPerfect, for example. They are already building MT engines across a large number of subject domains, so they have deep knowledge of how to do this on the billions-of-words per month scale, and they are also the largest provider of cloud services today. As I pointed out last year, the players who make the most money from machine translation are companies outside the translation industry. Amazon has already displaced Sears and several other major retailers, and they have the right skills to pull this off if they really wanted to. Check out the chart on retail store closings that is largely driven by AMZN.

Even if they succeed in taking only 5-10% of the "translation market", it would still make them a multi-billion dollar LSP that could handle 3 word or 3 billion word projects into 10 languages with equal ease, and do this with minimal need to labor through project management meetings and discussions about quality. It might also be the most automated and continuous improvement modus operandi we have ever seen. So, think of a highly automated and web-based customer interaction language translation service that has Google-scale MT with better output quality across 50 subject domains and an AI backbone, and the largest global network of human translators and editors who get paid on delivery and are given a translator workbench that enhances and assists actual translation work at every step of the way. Think of a workbench that integrates corpus analysis, TM, MT, dictionaries, concordance, synonym lookup, and format management all in one single interface, and which makes translation industry visions of "augmented translation" look like toys.

So get your shit together boys and girls cause the times they are a-changing.

The highlights and emphasis in this post and most of the graphics are my choices. I have also added some comments within Luigi's text in purple italics. The Dante quote below is really hard to translate. I will update it if somebody can suggest a better translation.

-----------------

Vuolsi così colà dove si puote/ciò che si vuole, e più non dimandare.

It is so willed there where is power to do/That which is willed; and farther question not.

Merriam-Webster defines technology as “the practical application of knowledge especially in a particular area.” The Oxford Dictionary defines it as “the application of scientific knowledge for practical purposes, especially in industry.” The Cambridge Dictionary defines technology as “(the study and knowledge of) the practical, especially industrial, use of scientific discoveries.” Collins defines technology as the “methods, systems, and devices which are the result of scientific knowledge being used for practical purposes”. More extensively, Wikipedia defines technology as “the collection of techniques, skills, methods, and processes used in the production of goods or services or in the accomplishment of objectives.”

This should be enough to mop away the common misconception that technology is limited to physical devices. In fact, according to Encyclopedia Britannica, hard technology is concerned with physical devices, and soft technology is concerned with human and social factors. Hard technologies cannot do without the corresponding soft technologies, which, however, are hard to acquire because they depend on human knowledge that is obtained through instruction, application, and experience. Technology is also divided into basic and high.

That said, language, to all intents and purposes, is a technology. A soft technology, and a basic one, yet highly sophisticated.

Why this long introduction on technology? Because we have been experiencing an exponential technological evolution for over half a century that we can hardly master. We have been adapting fast, as usual, but every day less.

This exponential technological evolution is the daughter of the Apollo program, whose upshot has been universally acknowledged as the greatest achievement in human history. It stimulated practically every area of technology.

Some of the most important technological innovations from the Apollo program were popularized in the '80s and the '90s, and even the so-called translation industry is, in some ways, a spinoff of that season.

Translation Industry Evolution

Indeed, if the birth of the translation profession as we know it today can be traced back to the years between the two world wars of the last century, with the development of world trade, the birth of the translation industry can be set around the late 1980s with the spread of personal computing and office automation.

The products in this category were aimed at new customers, SMEs and SOHO, rather than the usual customers, the big companies that had the resources and the staff to handle bulky and complex systems. These products could be sold to a larger public, even overseas, but for worldwide sales to be successful, they had to speak the languages of the target countries. Translation then received a massive boost, and the computer software industry was soon confronted with the problem of adapting its increasingly multifaceted products to local markets.

The translation industry as we know it today is then the abrupt evolution of a century-old single person practice into multi-person shops. As a matter of fact, intermediaries (the translation agencies) existed even before tech companies helped translation become a global business, but their scope and ambition were strictly local. They were mostly multiservice centers, and their marketing policy was to essentially renew an ad on the local yellow pages every year.

With the large software companies, the use of translation memories (TMs) also burst onto the scene. The software industry saw in the typical TM feature of finding repetitions, a way to cut translation costs.

So far, TMs have been the greatest and possibly the single disruptive innovation in translation. As SDL’s Paul Filkin recently recalled, TMs were the application of the research of Alan Melby and his team at Brigham Young University in the early ‘80s.

Unable to bear the overhead that the large volumes from big-budget clients were generating, translation vendors devised a universal way to recover from profit loss: asking their own vendors for discounts, regardless of the nature of the jobs.

In the late 1990s, Translation Management Systems (TMSs) began to spread; they were the only other innovation, far less important and much less impactful than TMs.

At the end of the first decade of the 2000s, free online machine translation (MT) engines started releasing “good-enough” outputs. Since the surge in demand for global content over the last three decades has resulted in a far greater need to translate content than there is talent available, MT has been growing steadily and exponentially, to the point that today machines translate over 150 billion words per day, 300 times more than humans, in over 200 combinations, serving more than 500 million users every month. (Actually, I am willing to bet the daily total is in excess of 500 billion words. KV)
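As a quick sanity check on what these figures imply, the arithmetic can be worked through in a few lines. This sketch assumes "300 times more than humans" means machines produce 300 times the human volume; the variable names are just for illustration.

```python
machine_words_per_day = 150e9   # figure quoted in the text
machine_to_human_ratio = 300    # assumption: machines produce 300x the human volume

human_words_per_day = machine_words_per_day / machine_to_human_ratio
machine_share = machine_words_per_day / (machine_words_per_day + human_words_per_day)

print(f"{human_words_per_day:,.0f}")  # 500,000,000
print(f"{machine_share:.2%}")         # 99.67%
```

Under that reading, human translators would account for roughly half a billion words per day, well under one percent of the combined total.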

We are now on the verge of full automation of translation services. Three main components of the typical workflow might, indeed, be almost fully automated: production, management, and delivery. Production could be almost fully automated with MT; TMSs have almost fully automated translation management and delivery. Why almost? Because the translation industry is not immune to waves and hype, but it is largely and historically very conservative, a little reactive, and therefore a “late” adopter of technology. Clear evidence of this is the infatuation with the agile methodology, and the consequent excitement affecting some of the most prominent industry players. Of course, prominence does not necessarily mean competence.

In fact, agile is rather a brand name, with the associated marketing hype, and as such is more of a management fad with a limited lifespan. Localization processes are hardly suited to agile methodology, given its typical approach and process. If it is true that an old dog cannot be taught new tricks, then for agile to be viable, a century-old teaching and practicing attitude would have to be profoundly reformed. Also, although agile has become the most popular software project management paradigm, it is not known to have really improved software quality, which is generally considered low. (Here is a website called http://agileisbullshit.tumblr.com/ that documents the many problems of this approach. KV) In contrast, the translation industry has always claimed to be strongly focused on and committed to quality. If quality is the main concern for translation buyers, this possibly means that most vendors are still far from achieving a constant level of appreciable quality. In fact, while lists of security defects for major software companies show high levels of open deficiencies, the complaints of translation users and customers around the world say that the industry works poorly.

Raising the bar, increasing the stakes, pushing the boundary always a little further, are all motives for the adoption of a new working methodology like agile. These motives translate into more, faster and cheaper, but not necessarily better. Indeed, higher speed, greater agility, and lower cost of processes are supposed to make reworks and retrofitting expedient.

Anyway, flooding websites, blogs, presentations, and events with paeans praising the wonders of a methodology that is supposed to be fashionable is not just ludicrous, it is of no help. Mouthing trendy words without knowing much about their meaning and the underlying concepts may seem like an effective marketing strategy, but, in the end, it is going to hurt when the recipients realize that this only disguises the actual backwardness and ignorance.

The explosion of content has been posing serious translation issues to emerging global companies. The relentless relocation of businesses on the Web made DevOps and continuous delivery the new paradigms, pushing the demand for translation automation even further.

Many in the translation community speak and act as if they were, and always will be, living in an imaginary and indefinitely remote place that possesses highly desirable or nearly perfect qualities for its inhabitants. They see the future, however it is depicted, as an imaginary place where people lead dehumanized and often fearful lives.

In the meantime, according to a survey presented a few weeks ago in an article in the MIT Technology Review, there’s a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years. Specifically, researchers predict AI will outperform humans in translating languages by 2024, writing high-school essays by 2026, writing a bestselling book by 2049, and working as a surgeon by 2053.

After all, innovation and translation have always been strange bedfellows. Innovations come from answering new questions, while the translation community has been struggling with the same old issues for centuries. Not surprisingly, any innovation in the translation industry is and will most certainly be sustaining innovation, perpetuating the current dimensions of performance.

Nevertheless, despite the fear that robots will destroy jobs and leave people unemployed, the market for translation technologies is growing. Translation Luddites, however, are convinced that translation technologies will not endanger translation jobs anytime soon, and point rather to the lack of skilled professionals.

Indeed, the translation industry resembles a still life painting, with every part of it seemingly immutable. A typical part of this painting is quality assessment, still following the costly and inefficient inspection-based error-counting approach and the related red-pen syndrome.

In this condition of increasing automation and disintermediation, a tradeoff on quality seems the most easily predictable scenario. As in the software industry, increasing speed and agility while controlling costs could make reworks and retrofits acceptable. MT will be spreading more and more and post-editing will be the ferry to the other banks of global communication, allowing direct transit between points at a capital cost much lower than bridges or tunnels. MT is not Charon, though.

Charon as depicted by Michelangelo in his fresco The Last Judgment in the Sistine Chapel

The key question regarding post-editing is how much longer it will be necessary or even requested. Right now, most translation jobs are small, rushed, basic, and unpredictable in frequency, and yet production models are still basically the same as fifty years ago, despite the growing popularity of TMSs and other automation tools. This means that the same amount of time is required for managing big and tiny projects, as translation project management still hinges on the same rigid paradigms borrowed from tradition.

The most outstanding forthcoming innovations in this area will be confidence scoring and data-driven resource allocation. They have already been implemented and will be further improved when enough quality data becomes available. In fact, confidence scoring is almost useless if scores cannot be compared first with benchmarks and later with actual results. Benchmarks can only come from project history, while results must be properly measured, and the measures must be understood before they can be read and then classified.
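A minimal sketch of what comparing scores "first with benchmarks and later with actual results" could look like is given below. The project history, the 0-1 scales, and the function names are all hypothetical; the point is only that a raw confidence score becomes useful once it is corrected by how past predictions compared with measured outcomes.

```python
from statistics import mean

# Hypothetical project history: (predicted confidence, measured quality), both on a 0-1 scale.
history = [(0.9, 0.85), (0.8, 0.70), (0.6, 0.65), (0.7, 0.60)]

def benchmark_bias(records):
    """Average amount by which past predictions overstated the measured quality."""
    return mean(predicted - actual for predicted, actual in records)

def adjusted_confidence(raw_score, records):
    """Correct a new raw score by the bias seen in project history, clamped to [0, 1]."""
    return min(1.0, max(0.0, raw_score - benchmark_bias(records)))

print(round(benchmark_bias(history), 2))             # 0.05
print(round(adjusted_confidence(0.75, history), 2))  # 0.7
```

Without the history there is nothing to calibrate against, which is why the benchmark has to come from properly measured past projects.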

This is not yet in the skillset of most LSPs and is far, very far, from being taught in translation courses or in translator training programs.

However, this is where human judgment will remain valuable for a long time. Not quality assessment, which is still not objective enough. Information asymmetry will remain a major issue, as there will always be a language pair totally outside the scope of any customer, who then has no way of knowing whether the product matches the promises made. Indeed, human assessment of translation quality, if based on the current universal approach, implies the use of a reference model, although implicit. In other words, everyone who is requested to evaluate a translation does it based on his/her own ideal.

MT capability will be integrated into all forms of digital communication, and MT itself will soon become a commodity. This will further make post-editing replace translation memory leveraging as the primary production environment in industrial translation in the next few years. This also means that, in the immediate future, the urge for post-editing of MT could escalate and find translation industry players unprepared.

In fact, the quality of the MT output has been steadily improving, and now it is quite impressive. This is what most translators should be afraid of: that expectations of professional translators will keep increasing.

With machines soon being better at almost everything humans do, translation companies will have to rethink their business. Following the exponential pace of evolution, MT will soon leave little room for translation business. This does not mean that human translations will not be necessary any longer. Simply that today’s 1 percent will shrink even further, much further. Humans will most probably be required where critical decisions must be made. This is precisely the kind of situation where information asymmetry plays a central role, in those cases where one party has no way of knowing if the product received from the other party would match the promises, for example when a translation should be handled as evidence in court.

With technology making it possible to match resources, find the most suitable MT engine for a particular content, predict quality, etc., human skills will have to change. Already today, no single set of criteria guarantees an excellent translation, and the quality of people alone has little to do with the services they render and the fees they charge.

This implies that vendor management (VM) will be an increasingly crucial function. Assessing vendors, of all kinds, will require skills and tools that have never been developed in translation courses. Today, vendor managers are mostly linguists who have developed VM competence on their own. Most of the time they cannot dedicate all their time and effort to vendor assessment and management, and are forced to do their best with spreadsheets, without having the chance to attend HRM or negotiation courses. Vendor management systems (VMSs) have been around for quite some time now, but they are still unknown to most LSPs. And yet, translation follows a typical outsourcing supply chain, down to freelancers.

So, translation industry experts, authorities, and players should stop bullshitting. True, the industry has been growing more or less steadily, even in a time of general crisis, but the translation business still only accounts for a meager 1 percent of the total. In other words, when translation buyers are deciding to waive the zero-translation option and have all or most content translated, the growth is still linear.

Agile in translation is not the only mystification via marketing-speak being used in the localization business. Now it is the turn of “augmented translation” and “lights-out project management.” (Lights Out Management (LOM) is the ability for a system administrator to monitor and manage servers by remote control.) Borrowing terms (not concepts) from other fields is clearly meant to disguise crap, look cool, and astonish the audience, but trying to look cool does not necessarily mean being cool. In the end, it can make it seem that one does not really know what she/he is talking about. Even trendy models are shaped by precise rules and roles: using them only as magic words may backfire.

Nonetheless, this bad habit does not seem to decline even a bit. Indeed, it still dominates industry events.

Localization World, for example, is supposed to be the world’s premier conference when it comes to unveiling new translation technology and trends. Yet most of the over 400 participants gathered in Barcelona seemed to have spent their time in parties and social activities, while room topics strayed quite far away from the conference theme of continuous delivery and the associated technologies and trends, despite the fact that the demand for better automation and more advanced tools is growing steadily. Maybe it is true that the social aspect is what conferences are for, but then why pick a theme and lay out presentations and discussions?

Presentations revolve around the usual arguments, widely and repeatedly dealt with before and after the event, and are often slavish repetitions of commercial propositions. Questions and comments are usually not meant to be challenging or to generate debate, stimulating and enriching though that would be. Triviality rules, because no one is willing to burn material that is intended to be presented at other times to different audiences.

Anyway, change is coming fast and, once again, the translation industry is about to be found unprepared when the effects of the next innovation mess it up. So, it is time for LSPs, and their customers, to rethink their translation business and awaken from the drowsiness with which they have always received innovations. Also, jobs are changing quickly and radically too, and the gap to bridge between education and business, which is already large, will grow even wider. It is making less and less sense to imagine for one’s own children a future in translation as a profession, and this is going to make it harder and harder to find young talent willing and able to work with the abundance of technology, data and solutions available in the industry, however fantastic.

That said, it won’t be long before “skilled in machine learning” becomes the new “proficient in Excel”. Yet right now very few in the translation community are concretely doing something about this. Choosing an ML algorithm will soon be as simple as selecting a template in Microsoft Word, but so far, very few translation graduates and even professional translators seem that proficient. In Word, of course.

Luigi Muzii has been in the "translation business" since 1982 and has been a business consultant since 2002, in the translation and localization industry through his firm. He focuses on helping customers choose and implement best-suited technologies and redesign their business processes for the greatest effectiveness of translation and localization related work.

This link provides access to his other blog posts.

Monday, July 3, 2017

A Closer Look at SDL's new MT announcements

SDL recently announced some new initiatives in their machine translation product SDL Enterprise Translation Server (ETS). As with most press releases, there is very little specific detail in the release itself, and my initial impression was that there really was not much news here other than the mention that they are also doing Neural MT. Thus, I approached my SDL contacts and asked if they would be willing to share some more information with me for this blog, for the reader base who were curious about what this announcement really means. I initially set out with questions that were very focused on NMT, but the more I learned about SDL ETS, the more I felt that it was worth more attention. The SDL team were very forthcoming, and shared interesting material with me, thus allowing me to provide a clearer picture in this post of the substance behind this release, (which I think Enterprise Buyers, in particular, should take note of), and which I have summarized below.

The press release basically focuses on two things:

An update to SDL ETS 7.4, “a secure Machine Translation (MT) platform for regulated industries,” (though I am not sure why it would not apply to any large global enterprise, especially those that depend on a more active dialogue with customers like eCommerce),
The availability of Neural Machine Translation (NMT) technology on this secure and private MT platform.

Why is the SDL ETS announcement meaningful?

The SDL ETS platform is an evolution of the on-premises MT product offering that has been in use in Government environments for over 15 years now and has been widely used in national security and intelligence agencies in particular, across the NATO bloc of countries. Given the nature of national security work, the product has had to be rock solid, and as support-free as possible, as anti-terrorist analysts are not inclined to, or permitted to, make calls for technical support to the MT vendor. Those of you who have struggled through clunky, almost-working-MT-onsite-software from other MT vendors, who are less prepared for this low-support-requirement use-case, will probably appreciate the value of SDL's long-term experience in servicing this market need.

As we have seen of late, determined hackers can break into both government and corporate networks and do so on a regular basis. This graphic and interactive visualization is quite telling in how frequently hackers succeed, and how frequently large data sets of allegedly private data are accidentally exposed. So it is understandable that when the new SDL management surveyed the priorities of large global enterprises, they found that "Data Security and Data Privacy" were a key concern for many executives across several industries.

In a world where MT is a global resource, and 500 billion words a day are being translated across the various public MT portals, data security and privacy have to be a concern for any responsible executive and any serious corporate governance initiative. While the public MT platforms have indeed made MT ubiquitous, they also generally reserve the right to run machine learning algorithms on our data, to try and pick up useful patterns from our usage and continue to improve their services. Today, MT is increasingly used to translate large volumes of customer and corporate communications, and it is likely that most responsible executives would rather not share the intimate details of their customer and intra-corporate global communications with the public MT services where privacy could be compromised.

If you think that your use of MT is not monitored or watched, at least at a machine learning level, you should perhaps take a look at the following graphic. This link provides a summary of what they collect.

The notion that these MT services are “free” is naive, and we cannot really be surprised that the public MT services try to capitalize on what they can learn from the widespread use of their MT services. The understanding gained from watching user behavior not only helps improve the MT technology, it also provides a basis to boost advertising revenues, since an MT service provider has detailed metrics on what people translate, and search on, in different parts of the world.

To adapt the original ETS platform to the different needs of the global enterprise market, SDL had to add several features and capabilities that were not required for national security information triage applications, where MT was invariably an embedded component service, interacting with other embedded components like classification and text analytics in a larger information analysis scenario. The key enhancements added are for the broader enterprise market, where MT can be an added corporate IT service for many different kinds of applications, and the MT service needs direct as well as embedded access. The new features include the following:

A redesigned and intuitive interface that improves the user experience for product installation, administration as well as ongoing operation and management to respond to changing needs. As the GUI is web-based, no installations are required on individual user machines. Users and Admins can easily get up to speed using SDL ETS via its Web GUI.

The new browser-based user interface includes features like Quick Translate, Browse Translate, Host Management, and User Management

Scalable architecture accommodates low and high translation throughput demands. A load balancer automatically distributes client requests and manages the available MT resources, keeping throughput and translation services synchronized efficiently.
Time to deployment is minimized with various kinds of installation automation. SDL ETS can be deployed swiftly without the need to install any extra third-party software components manually. SDL ETS services automatically restart upon system restart as they are automatically installed as OS services for both Windows and Linux. (This is in contrast to most Moses based solutions in the market.)
User roles & authentication

Enables user access via permission-based login and/or authentication against the corporate central Active Directory via LDAP.

Scaling and managing SDL ETS deployments are made easy with centralized Host Management. Admins no longer need to access individual ETS servers and modify configuration files. Setup can be done via the SDL ETS Web GUI’s Host Management module and includes things like loading custom dictionaries for specific applications.
Includes state-of-the-art Neural Machine Translation technology, offering leading-edge technology for the highest quality machine translation output
Highly tuned MT engines that reflect the many years of MT developer engagement with SDL human translation services, and ongoing expert linguistic feedback that is a driving force behind higher quality base translations
Ease of access through an MS-Office Plug-in and a rich REST API for integration with other applications and workflows
Enhanced language detection capability

Supports the automatic detection of over 80 languages and 150 writing scripts.
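To make the load-balancing idea in the feature list a little more concrete, here is a minimal sketch of one common dispatch policy: send each request to the host with the fewest requests in flight. This is purely illustrative. The class name, the host names, and the least-loaded policy itself are my assumptions, and nothing here describes how SDL ETS actually distributes requests.

```python
class LeastLoadedBalancer:
    """Toy dispatcher: route each request to the host with the fewest
    in-flight requests. Hypothetical illustration, not the SDL ETS design."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one MT host is required")
        # Number of requests currently being processed per host.
        self.active = {host: 0 for host in hosts}

    def acquire(self):
        """Pick the least-loaded host and count one more request on it."""
        # On ties, min() returns the first host in insertion order.
        host = min(self.active, key=self.active.get)
        self.active[host] += 1
        return host

    def release(self, host):
        """Mark one request on `host` as finished."""
        if self.active[host] > 0:
            self.active[host] -= 1


# Hypothetical host names, for illustration only.
balancer = LeastLoadedBalancer(["mt-host-a", "mt-host-b"])
first = balancer.acquire()    # both idle, so the first host is chosen
second = balancer.acquire()   # the other host is now the least loaded
balancer.release(first)       # the first request finishes
third = balancer.acquire()    # the first host is least loaded again
```

A production balancer would also need health checks, timeouts, and some awareness of which engines are loaded on which host, all of which this sketch leaves out.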

How does SDL ETS differ from other MT on-premise solutions?

Based on my experience with, and observation of other on-premise MT systems, I think it is fair to say that the SDL ETS features are a significant step forward for the translation industry in bringing the capabilities of industrial strength MT up to modern enterprise IT standards and needs. In #americanbuffoon-speak we might even say it is tremendous and bigly good.

Based on what I was able to gather from my conversations, here is a list of distinctive features that come to mind. Note that most of these current updates relate to the improved UX, the elegant simplicity of SDL ETS, and the ease of ongoing management of changing needs and requirements from an enterprise customer viewpoint.

More scalable and elastic, and easier for the customer to manage without calling in technical experts from the MT vendor
Ease of administration and ongoing management and maintenance of different corporate applications that interact with MT translation services
Powered by SDL’s proprietary PB-SMT & NMT technologies
Efficiency of architecture – fewer servers needed for the same amount of work

A closer look at SDL’s Neural MT

Except for the NMT efforts at Facebook, Google, and Microsoft, most of the NMT initiatives we hear about in the translation industry today are based on open source solutions built around the Torch and Theano frameworks. While using open source allows an MT practitioner to get started quickly, it also means that they have to submit to the black box nature of the framework. Very few are going to be able to go into the source code and fundamentally alter the logic and mechanics of the framework without potentially damaging or destabilizing the basic system. The NMT systems that practitioners are able to develop are only as good as the data they use, or their ability to modify the open source codebase.

In contrast to this, the SDL NMT core engine is owned and developed 100% in-house by SDL, which allows much greater flexibility and deep underlying control of the basic logic and data processing flows of the NMT framework. This deeper control also allows a developer more alternatives in dealing with NMT challenges like limited vocabulary, performance/speed and changing the machine learning strategies and techniques as the market and the deep learning technology evolve e.g. switching from recurrent neural networks (RNN) to convolutional neural networks (CNN) deep learning strategies as Facebook just did.

My sense, based on my limited understanding, is that owning your NMT code base very likely affords more powerful control options than open source alternatives allow, because problem areas can be approached at a more fundamental level, in the well-understood source code, rather than using workarounds to handle problematic output of open source black-box components. It is also possible that owning and understanding the code base results in longer-term integrity and stability. This is probably why the big 3 choose to develop their own code base over using open source components and foundations. The SDL system architecture reflects 15+ years of experience with data-driven MT and is designed to allow rapid response to emerging changes in machine learning technology, like a possible change to CNN from the current RNN + Attention approach that everybody is using.

In my conversations with SDL technical team members, it became apparent that they have a much greater ability to address several different NMT problem areas:

Vocabulary – SDL has multiple strategies to handle this issue in many different use scenarios – both when the source data universe is known, and also when it is unknown and developers wish to minimize unknown word occurrences.
Neural Babble – NMT systems often produce strange output that the SDL developers call neural babble. One such scenario is when the output produces the same phrases, mysteriously repeated multiple times. SDL has added heuristics to develop corrective strategies to reduce and eliminate this and other errant occurrences. This is an area that open source NMT systems will be unable to resolve easily and will need to add pre- and post-processing sequences to manage.
Speed/Performance issues can be better managed since the code base is owned and understood, so it is even possible to make changes to the decoder if needed. SDL is testing and optimizing NMT performance on a range of GPUs (Cheap, Mid-range & Premium) to ensure that their client base has well understood and well-tested deployment options.
Rapid Productization of Deep Learning Innovation: Owning the code base also means that SDL could easily change from the current deep learning approach (RNN) to new deep learning approaches like CNN, which may prove to be much more promising and efficient for many applications that need better production performance. This agility and adaptability can only come from deep understanding and control of how the fundamentals of the NMT system work.
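To illustrate the kind of post-processing workaround that the "neural babble" item alludes to, here is a minimal sketch of one possible heuristic: collapsing a phrase that is immediately repeated in the output. This is my own illustrative example, not SDL's method. The function name and the n-gram limits are assumptions, and a real system would need much more care.

```python
def collapse_repeats(tokens, min_ngram=2, max_ngram=4):
    """Collapse immediately repeated n-grams down to a single copy.

    Only n-grams with min_ngram <= n <= max_ngram are considered, so
    single-word repeats like "very very" survive under the defaults.
    Hypothetical heuristic for illustration only.
    """
    out = []
    for tok in tokens:
        out.append(tok)
        changed = True
        while changed:
            changed = False
            for n in range(min_ngram, max_ngram + 1):
                # Does the last n-gram duplicate the n-gram just before it?
                if len(out) >= 2 * n and out[-n:] == out[-2 * n:-n]:
                    del out[-n:]  # drop the duplicate copy
                    changed = True
                    break
    return out


babble = "the cat sat on the mat on the mat on the mat .".split()
cleaned = " ".join(collapse_repeats(babble))
```

With the defaults, the repeated "on the mat" in the sample sentence is reduced to one occurrence. The obvious weakness is that legitimate repetition of a multi-word phrase would be collapsed too, which is one reason fixes applied outside the decoder tend to be blunt instruments.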

The NMT customization and adaptation options are currently being explored and benchmarked against well understood PB-SMT systems. Initial results are providing great insight into specific data combination and pruning strategies that result in the best custom NMT system output. SDL's long-term experience building thousands of custom systems should be invaluable in driving the development of superior custom NMT solutions. The research methodology used to investigate this follows best practices (i.e. they are careful and conservative, unlike the over-the-top claims by Google) and we should expect that all production NMT systems will be significantly superior to most other alternatives. While I am not at liberty to share details of the benchmark comparisons, I can say that the improvements are significant and particularly promising in language combinations that are especially important to global enterprises. The SDL team is also especially careful in making public claims about improved productivity and quality (unlike some MT vendors in the market), and is gathering multiple points of verification from both internal and customer tests to validate initial results, which are very promising.

I expect that they will also start (or possibly have already started) exploring linking their very competitive Adaptive MT capabilities with high-quality NMT engines. I look forward to learning more about their NMT experience in production customer environments.

Happy 4th of July to my US friends, and here is an unusual presentation of the national anthem that is possibly more fitting for the current leadership.
