Monday, September 12, 2011

Understanding Where Machine Translation (MT) Makes Sense

One of the reasons that many find MT threatening, I think, is the claim by some MT enthusiasts that it will do EXACTLY the same work that was previously done by multiple individuals in the “translate-edit-proof” chain, without the humans of course. To the best of my knowledge this is not possible today, even though one may produce an occasional sentence where this does indeed happen. If you want final output that is indistinguishable from competent human translation, then you are going to have to use the human “edit-proof” chain to make this happen.

Some in the industry have attempted to restate the potential of MT from Fully Automated High Quality Translation - FAHQT (Notice how that sounds suspiciously like f*&ked?) to Fully Automated Useful Translation – FAUT. However, in some highly technical domains carefully customized MT systems can actually outperform exclusively human-based production, simply because it is not possible to find as many competent technical translators as are required to get the work done.

We have seen that both Google and Bing have gotten dramatically better since they switched from RbMT to statistical data-driven approaches, but the free MT solutions have yet to deliver really compelling quality outside of some Romance languages, and the quality is usually far from competent human translation. They also offer very little in terms of control, even if you are not concerned about the serious data privacy issues that their use brings. It is usually worthwhile for professionals to work with specialists who can help them customize these systems for the specific purpose they are intended for. MT systems evolve and can get better with small amounts of corrective feedback if they are designed from the outset to do this. Somebody who has built thousands of MT systems, across many language combinations, is likely to offer more value and skill than most can get from building a handful of systems with tools like Moses, or from the limited dictionary-building input possible with many RbMT systems. And how much better can customized systems get than the free systems? Depending on the data volume and quality, the gain can range from small but very meaningful improvements to significantly better overall quality.

So where does MT make most sense? Given that there is a significant effort required to customize an MT system, it usually makes most sense when you have ongoing high volume, dynamically created source data, and tolerant users, or some combination thereof. It is also important to understand that the higher the quality requirements, the greater the need for human editing and proofing. The graphic below elaborates on this.

While MT is unlikely to replace human beings in any application where quality is really important, there are a growing number of cases that show that MT is suitable for:
  • Highly repetitive content where productivity gains with MT can exceed what is possible with TM alone
  • Content that would just not get translated otherwise
  • Content for which human translation is simply unaffordable
  • High value content that is changing every hour and every day but has a short shelf life
  • Knowledge content that facilitates and enhances the global spread of critical knowledge, especially for health and social services
  • Content that is created to enhance and accelerate communication with global customers who prefer a self-service model
  • Real-time customer conversations in social networks and customer support scenarios
  • Content that does not need to be perfect but just approximately understandable

So while there are some who would say that MT can be used anywhere and everywhere, I would suggest that a better fit for professional use is where you have ongoing volume and dynamic but high-value source content that can enhance international initiatives. To my mind, customized MT does not make sense for one-time, small localization projects where the customization effort cannot be leveraged frequently. Free online MT might still prove of some value in these cases, to boost productivity, but as language service providers learn to better use and steer MT, I expect they will provide translators access to “highly customized internal systems” for project work, and the value to the translators will be very similar to the value provided by high quality translation memory. Simply put – it can and will boost productivity even for things like user documentation and software interfaces.

It is worth understanding that while “good” MT systems can enhance translator productivity in traditional localization projects, they can also enable completely new kinds of translation projects that have larger volumes and much more dynamic content. While we can expect that these systems will continue to improve in quality, they are not likely to produce TEP-equivalent output. I expect that these new applications will be a major source of work in the professional translation industry, but they will require production models that differ from traditional TEP production.
[image]  [image]

However, we are still at a point in time where there is not a lot of clarity on what post-editing, linguistic steering and MT engine refinement really involve. They do in fact involve many of the same things that are of value in standard localization processes, e.g. unknown word resolution, terminological consistency, DNT (do not translate) lists and style adjustments. They also increasingly include new kinds of linguistic steering designed to “train” the MT system to learn from historical error patterns and corrections. Unfortunately, many of the prescriptions on post-editing principles available on LinkedIn and translator forums are either tied to older-generation (RbMT) systems that really cannot improve much beyond a very limited point, or are tied to one specific MT system. In the age of data-driven systems new approaches are necessary, and we have only just begun to define them.

These new hybrid systems also allow translators and linguists to create linguistic and grammar rules around the pure data patterns. Hopefully we will see much better “user-friendly” post-editing environments that bring powerful error detection and correction utilities into much more linguistically logical and proactive feedback loops. These tools can only emerge if more savvy translators are involved (and properly compensated) and we successfully liberate translators from the horrors of file and format conversions and other non-linguistic tedium that translation today requires. This shift to a mostly linguistic focus could also be much easier with better data interchange standards. The best examples of these, so far, come from Google and Microsoft rather than the translation industry.
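As a concrete illustration, two of the mechanical post-editing checks mentioned here, a DNT list and a terminology-consistency check, can be sketched in a few lines of Python. The term lists, sentences and function names below are hypothetical examples for illustration, not any particular tool's API:

```python
# Two mechanical QA checks often run on MT output before human post-editing.
# DNT_TERMS and GLOSSARY are illustrative stand-ins for real project resources.

DNT_TERMS = {"PowerPoint", "XLIFF"}  # strings that must survive translation untouched
GLOSSARY = {"machine translation": "traduction automatique"}  # source term -> required target term

def check_dnt(source: str, target: str) -> list[str]:
    """Return DNT terms present in the source but missing from the target."""
    return [t for t in DNT_TERMS if t in source and t not in target]

def check_terminology(source: str, target: str) -> list[str]:
    """Return required target terms that the MT output failed to use."""
    return [tgt for src, tgt in GLOSSARY.items()
            if src in source.lower() and tgt not in target.lower()]

source = "Machine translation of the XLIFF file"
mt_output = "Traduction de la machine du fichier XLIFF"  # flawed hypothetical MT output
errors = check_dnt(source, mt_output) + check_terminology(source, mt_output)
# errors flags the glossary violation for the post-editor to fix
```

A real post-editing environment would of course handle morphology and inflection rather than exact string matches, but the feedback-loop idea is the same.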


Speaking of standards, possibly the most worthwhile of the initiatives focusing on translation data interchange standards is meeting in Warsaw later this month. The XLIFF symposium is, IMO, the most concrete and practical standards discussion going on at the moment, and it includes academics, LSPs, TAUS, tools vendors and large buyers sharing experiences. The future is all about data flowing in and out of translation processes, and we all stand to benefit from real, robust standards that work for all constituencies.


  1. FAUT?
    it all of a sudden calls back to mind FAULT: not exactly a great adv idea to me ...

    Joking aside, I have the strong impression that you have definitely changed the "register" since your first, warmer posts on MT just a few months ago

  2. The point is that the essence of doing a translation job has been altered tremendously, from a mental exercise into serving other functions such as layout, formal constraints, terminology alignment, etc. Restoring the initial conditions would do a lot of good to quality. Separating spell check, form allocation, etc. would increase productivity, as would pre-editing and checking, rather than post-editing as it is done now.

  3. @Claudio

    I am not sure what you mean; I have been using pretty much the same message for almost two years now. In fact, the list in this post of where MT can be used is copied from a blog post from early 2010.

    The FAHQT and FAUT acronyms are both unfortunate and I first saw them in TAUS presentations.

    Thank you for your comments

  4. In the list of cases suitable for MT I would also add situations where speed is essential. Real-time scenarios were already mentioned, but there are also a lot of near-real-time scenarios where "somewhat reliable information fast" is better than "100% reliable information too late".

  5. Nice post, Kirti!
    Although you list a variety of scenarios in which MT should be considered, my experience lies primarily in the delivery of high quality translation.
    As you note, think of MT as an extension to TM and embrace the fact that your goal is not to replace the human.
    My customers have seen significant performance gains by applying a 3 step process: 1) TM leverage, 2) MT application, and 3) human post editing.
    Essentially, the MT step provides the linguist with additional proposals from which to work – hence allowing them to deliver the same level of quality in a shorter period of time.
    Lastly, some linguists have felt threatened by MT. However, if they embrace the technology and treat it just like TM, they'll have a completely different perspective. Generally, my customers are taking the savings derived from MT and turning it into more translation requests - resulting in the same or more work for the linguists.
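The three-step process described in the last comment (TM leverage, then MT, then human post-editing) can be sketched as a simple fallback chain. This is an illustrative sketch only: the translation memory is a toy exact-match dictionary, and mt_translate is a hypothetical stand-in for a call to a real customized engine:

```python
# Sketch of the commenter's three-step pipeline:
# 1) TM leverage, 2) MT application, 3) flag for human post-editing.

translation_memory = {"Hello": "Bonjour"}  # toy exact-match TM for illustration

def mt_translate(segment: str) -> str:
    """Hypothetical MT engine stub; a real system would call a customized engine."""
    return f"[MT draft] {segment}"

def translate(segment: str) -> tuple[str, bool]:
    """Return a proposal and whether it still needs human post-editing."""
    if segment in translation_memory:       # 1) TM hit: trusted, no editing needed
        return translation_memory[segment], False
    return mt_translate(segment), True      # 2) MT proposal -> 3) queue for post-editing
```

The point of the sketch is the division of labor: the TM and MT steps only generate proposals, and the boolean flag is what routes MT drafts to the human post-editor rather than treating them as final output.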