Wednesday, September 18, 2013

Understanding MT Customization

While we have reached a point in time where many more people realize that machine translation (MT) produces the best results when it is properly customized, what customization actually means is still not well understood. 

There is a significant difference between shallow customization and deep customization in terms of the impact on the MT system’s output quality. The quality of output in turn has a direct impact on the potential business leverage and return on investment. There is a growing number of MT vendors, but very few real MT developers in the market today, and deep expertise is the key differentiator that leads directly to better output and better productivity. It is important for anyone considering purchasing an MT solution to understand the difference between the two types of vendors.

Generally, MT developers have created either Rule-Based Machine Translation (RBMT) or Statistical Machine Translation (SMT) systems with hands-on coding at the deepest levels of the core MT engine and its surrounding technologies. Thus they are likely to have insight into how and why an MT engine works the way it does. They are also more likely to be able to coax an engine to produce better quality output by applying the optimal corrective actions to improve on initial results.

In contrast, most Do-It-Yourself (DIY) MT vendors provide little, if any, real innovation and focus on simplifying and packaging a collection of open source tools into a web-based offering. Their primary emphasis is on simplifying the interface to these open source tools and enabling a user to build a basic MT system with user data instantly. I would characterize this approach as shallow customization. When real understanding of the engine technology and data is required, few have the skills needed to improve this initial MT engine quality on an ongoing basis, and even fewer are able to make it reach levels of quality that provide real competitive advantage.

When evaluating MT vendors, there are a few simple things that anyone considering purchasing an MT offering should understand:

Is your MT vendor a serious developer of MT technology or do they simply provide/package other third-party or open source MT technology?
There are only a very small number of companies developing commercial enterprise-class MT. Most MT vendors are users or packagers of third-party technology. Many do not have the depth of understanding to do anything but the simplest and shallowest MT customization tasks. These vendors will often present themselves as experts and sometimes claim to be technology agnostic. Some Language Service Providers (LSPs) that have a few years’ experience using open source or third-party RBMT systems are presenting themselves as MT experts. Be wary of any vendor that claims deep experience in multiple MT technologies. Advanced skills in any MT technology require long-term investment and long-term experience to get to any kind of distinctive expertise. Getting good results from any of these approaches requires very different skill sets, and independent and unique expertise must be developed for each approach. The notion of a standard set of MT development skills that works anywhere and everywhere is a myth.

Any MT vendor that does not have a strong and experienced human skill and human steering component in the customization process will always deliver lower quality results. 

Does your MT vendor use a Clean Data SMT or Dirty Data SMT strategy?
The Clean Data SMT approach was pioneered by Asia Online in 2008. Most MT vendors do not have the technology or the rigorous data analysis and data cleaning processes to deliver a Clean Data SMT approach, and so take the easier Dirty Data SMT approach. Clean Data SMT has many benefits, such as more rapid improvement from post-editing and the ability to manage and control terminology so that it is consistent. Dirty Data SMT by its very nature is unpredictable and inconsistent, and is therefore difficult to manage and much slower to improve with corrective feedback.

Does your MT vendor claim that MT is easy?

Some MT vendors claim that MT is not complex. One DIY MT vendor even likens those who claim that MT is complex to monkeys. The reality is that running an open source MT solution or using an “upload and pray” solution like that of many DIY MT vendors has become very easy.

Building an instant MT engine is not the same as delivering a production quality MT system that provides production efficiency. Indeed, a significant number of DIY custom MT engines deliver translation quality well below the quality of Google.
Delivering high quality MT requires skill, a deep understanding of the different approaches to MT and the inner workings of the technology, a deep understanding of the data used to engineer the customization process, and a range of tools, skills and knowledge that permit optimization to deliver the highest possible quality. Each engine has its own unique requirements; after all, the point of customizing is to match the translation output to a particular customer's writing style and audience. This can only be achieved with human cognition and guidance and cannot be fully automated.

The impact of real expertise is clear. Asia Online customers can speak on the record of achieving productivity gains greater than 300%, while DIY MT vendors typically claim that they can deliver productivity gains of 20-40%, if any at all.

Does your MT vendor give you control of the data and the process?
Many MT vendors today provide very limited control of core data elements and typically rely on a simple “upload and pray” web interface that promises instant results. They generally lack the ability to manage, control and normalize the data used to customize an MT engine and generally do not have any data analysis or data manufacturing capabilities. A developer like Asia Online provides multiple levels of control during both the development and translation processes, which enables much better output quality and thus higher productivity.

What is expected of you as a user in order to customize an MT engine?
If the answer is nothing more than uploading your translation memories (TMs), then a red flag should already be raised. Machine translation can be very high quality when managed with expertise, but expecting good results without any knowledge investment and real expertise is not realistic.

Just as in managing any quality-focused human translation project, special tools, processes and expertise are required to get better results.

Any custom MT technology that does not require your involvement in steering the customization process will deliver considerably lower quality output - often worse than anybody could do with Google or Bing. MT systems that produce good quality output require human steering, guidance and control. This is possible with today's technology, but does require more effort than just uploading some translation memories.

How much effort does it take and how quickly can the customized engine improve after the first version?
Dirty Data SMT systems offered by DIY MT vendors require significant amounts of new data to improve the system after an initial system is in place, usually around 25% of the total training data that the custom MT engine is built on. So if your engine has 3 million segments provided by your MT vendor and 200,000 segments provided by you, you will need roughly 800,000 new segments to see a noticeable improvement in quality. Getting this much additional data is usually beyond the reach of nearly all users of MT systems. As the customization approach is Dirty Data SMT, errors are very difficult to trace and correct. The standard means to correct issues is to add more data and hope that the problem is resolved.
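
As a rough sketch of the arithmetic behind that rule of thumb, the snippet below applies the approximate 25% figure to the combined size of the training data. The function name, the parameter names and the exact fraction are my own illustrative assumptions, not a formula published by any vendor.

```python
def segments_needed(vendor_segments: int, user_segments: int,
                    fraction: float = 0.25) -> int:
    """Estimate the new segments needed for a noticeable improvement.

    Assumes the rule of thumb that the new data should amount to
    roughly `fraction` of the total data the engine was trained on.
    """
    total_training_data = vendor_segments + user_segments
    return round(total_training_data * fraction)


if __name__ == "__main__":
    # Example figures from the text: 3 million vendor segments
    # plus 200,000 segments supplied by the user.
    needed = segments_needed(3_000_000, 200_000)
    print(f"Approximate new segments required: {needed:,}")
```

The point of the sketch is simply that the requirement scales with the whole training set, not with the much smaller portion that the user contributed.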

Clean Data SMT systems such as Asia Online’s Language Studio™ can learn and improve with just a few thousand edits, and every edit counts. Terminology is consistent, and there are tools to identify common problems ahead of time and means to automatically resolve them. Data manufacturing is also applied to amplify edits and corrective feedback and to ensure they are applied to the engine in a broader set of contexts. The cause of errors can quickly be traced and the errors can be rectified using a number of problem analysis tools. The resulting improvement is rapid and noticeable even with a very small effort by a single person.

Bottom Line: Creating a high quality custom MT engine requires deep expertise, control and broad experience, elements that are usually not present in the "upload and pray" approach provided in a DIY MT model. Developing high quality MT is complex and in 2013 is still an expertise-based affair.

To simply upload a translation memory and expect high MT quality to come out is wishful thinking. A computer cannot automatically know your preferred terminology, vocabulary choices, writing style, target audience and purpose. Just like a human translation project, achieving quality requires effort, time, management and skill. 

Customizing an MT engine to produce “near-human” output quality levels is possible, and there are many proof points where 50% or more of the raw MT segments required no editing at all - i.e. they were “perfect” - with many of the remaining segments having minor issues that could be quickly edited. A fully customized MT engine built on the Clean Data SMT approach consistently delivers productivity gains of 150%-300% (sometimes even greater). The long-term ROI impact is clear relative to the meager productivity that instant MT approaches sometimes produce.

MT in 2013 is still a complex affair that requires deep expertise and collaboration with experts if your intention is to build long-term business leverage through more efficient translation production processes. There is no advantage to a system that any of your competitors could create instantly and there is no value or business advantage to just dabbling with MT.

“When conceiving the idea of Moses, the primary goal was to foster research and advance the state of MT in academia by providing a de facto base from which to innovate from.

Currently the vast majority of interesting MT research and advancements still takes place in academia. Without open source toolkits such as Moses, all the exciting work would be done by the Google’s and Microsoft’s of the world, as is the case in related fields such as information retrieval or automatic speech recognition.

As a platform for academic research, Moses provides a strong foundation. However, Moses was not intended to be a commercial MT offering. There are considerable amounts of additional functionality, beyond providing a web based user interface for Moses, that are not included in Moses that are essential in order to offer a strong and innovative commercial MT platform.”
Professor Philipp Koehn, University of Edinburgh, Chief Scientist, Asia Online

Addendum: This post triggered a strong reaction from Manuel Herranz at Pangeanic and I am including a response I made on his blog in case it does not make it through the approval process there.

My primary point in my blog posting is that expertise, long-term experience and a real understanding of how the technology works are necessary and critical to get the best results. Most DIY users do not have these characteristics and thus are very likely to get much lower quality results. Pointing this out is, to my mind, not equivalent to "bad mouthing competition". I am simply comparing approaches and pointing out the value implications.

Also, while I claim that expertise does matter, I do not suggest that Asia Online is the only company with this expertise. There are several other MT experts, including RBMT developers like Systran and a specialist like Tauyou in Spain.

I do believe that MT technology is complex enough that it does require specialization, and that developing real competence with MT is difficult enough that it is unlikely to be successfully done by a company whose primary business is being a translation agency. It is clear that you disagree. I am also pointing out that the value received by a customer is very likely to be lower for a DIY user. I can understand that you may have a different opinion to mine and assure you that my observations are not born of virulence.

Historically we saw many LSPs develop their own TMS systems too, but most people in the industry would concede that the best TMS systems have come from companies that focus and specialize in the development of these tools, e.g. MemoQ, Memsource, Across, XTM, etc. We have also seen the SDL acquisitions of software companies like Idiom, LW and Trados result in what most perceive as reduced customer responsiveness, quality and commitment to these products. Buying critical production infrastructure from a competitor generally does not make sense in any industry, and thus we have seen the momentum slow down on all the SDL software acquisitions. IMO, specialization matters, and with technology this complex, one will get the best results using technology developed and managed by specialists for the foreseeable future.

Anyway, I wish you peace and health.