Wednesday, August 28, 2013

Understanding ROI with Machine Translation Technology

As the use of machine translation gathers momentum in the professional translation world, it is interesting to see that much of the essential economic rationale for effective deployment of this technology is still somewhat muddy and unclear. MT use on its own does not guarantee ROI. The MT system has to produce output that actually improves the production process to generate meaningful ROI. Poor quality MT damages goodwill and reputation, and is also a waste of time, effort and money. There are many in the industry who continue to view machine translation (MT) with the same “project-oriented mentality” that is common with traditional translation work, which sometimes involves the production use of translation memory (TM). However, MT is fundamentally different, and I think it requires a different mindset in design and technology deployment to maximize economic benefits. TM is often used as a way to reduce costs, and MT is often seen as a new way to lower costs without any real understanding of its viability and value in the translation production process. This sub-optimal use of MT has resulted in translator resistance and results that are often less than impressive.

To elaborate, a project orientation is something that makes sense and works well with the cottage industry approaches that typify historical translation project work, and most CAT tools and translation memory technology fit well within this paradigm. This project approach is widely prevalent in the language service provider world, where teams and tools are assembled to get a translation job or project done as jobs come in. There is little specialization in terms of subject matter knowledge of the translated content, and there are very few LSPs who develop any kind of domain specialization or in-depth subject matter expertise. Every translation job is seen as being of equal weight, and the general objective is to quickly assemble a team (translators and reviewers) to get source content converted to target content, using TM if available. The best service providers build a base of trusted translators with whom they work on a regular basis to ensure “quality”, and institute processes to minimize errors and standardize the production process. In this world it is very easy to replace an LSP (even large ones like SDL and Lionbridge), since their key value add is project management and, sometimes, translation management software. LSPs are thus always vulnerable to being switched out for a predatory competitor passing by.

MT, on the other hand, requires a different perspective. In most cases, the successful use of MT in professional translation work is the equivalent of building a production line. In general, one would (should) not build a production line for a single project unless one expected to do many similar kinds of projects. One should only build production lines for projects that one expects will have large volumes or where one expects repeat business on an ongoing basis. Production lines require specialization, and it is unlikely that a single production line will do everything well. Having a well-functioning production line will give the producer a cost advantage, but good, efficient and effective production lines are never created instantly. They always require investment, refinement and uncommon expertise. The greater the efficiency of the production line, the greater the cost advantage, which can result in meaningful barriers to competition. MT done right can provide long-term cost and competitive advantage. While the “free” engines offer useful value in some languages, these systems are usually not considered of adequate quality to be useful in professional translation settings where the final deliverable is work that looks like it was done by human translators. Many of the MT systems in use today are developed by naïve LSPs who send very low quality output to post-editors and expect them to fix it for lower than standard rates. Thus the huge and justifiable hue and cry in the translator community about mind-numbing post-editing work.

Thus, when we look at the current adoption of MT in the professional translation world we frequently see the following:
  • A focus on the initial outlay for MT experimentation that often leads to adoption of “free” and open source technology as the initial focus is only on start-up costs,
  • A rush to Moses and a large number of substandard MT systems that produce output lower in quality than the “free” MT from Google and Microsoft (all the other free engines are hardly worth the bother),
  • Some who claim “expertise” are merely building simple dictionaries for RbMT systems (that have been around for 50 years with little or no quality advancement in that period) or throwing data blindly into an instant Moses setup,
  • Very little general awareness of the deep expertise and experience required to tune, adjust and modify MT systems to meet business production requirements at a meaningful level of utility,
  • Some LSPs claim to provide MT services for other LSPs, which to me is the equivalent of Ford asking Kia to build cars for them. It makes very little sense to me why an LSP would go to SDL (or any other LSP) to get them to build MT systems for them. (A death wish? Reckless at least.) Those who do this might want to listen to this Zappa song. (Warning: some might consider this NSFW)
What is common to all of these early initiatives is that they focus on a relatively low initial investment, they very rarely involve or require any deep expertise, and they essentially provide no real barrier to competition, since these efforts can be easily duplicated or adopted by any competitor. Interestingly, the technology has gotten good enough that even naïve attempts can sometimes produce productivity improvements of 5 to 10 percent.

However, I think that maximizing the economic benefit from MT technology requires all or at least some of the following:
  • A clear understanding of your production efficiency before you use MT. It is difficult to understand the impact of MT on the translation production process if you do not have an understanding of the baseline efficiency.
  • A clear subject domain focus rather than the generic “anything and everything” focus that so many MT initiatives start off with, as this domain focus is a critical requirement for producing higher quality output,
  • Deep MT engine development expertise to ensure that the MT system output is of the highest possible quality. The quality of the output is directly related to productivity improvements in the production process and is the essence of the economic benefit of using this technology. This also usually means that most do-it-yourself (DIY) efforts will not be your best foot forward; they actually result in lower ROI and higher total cost of ownership (TCO), and frequently end in failure,
  • Repeat and regular use of “good” MT engines will result in greater ROI. This means that MT is not a great strategy for a single project that may never happen again unless it is of substantive volume. (This seems pretty obvious but you would be amazed how often this is overlooked.) The more work you do in a single domain, the greater the benefit, the greater the leverage, and the more useful it is to have an efficient production line,
  • Manage expectations with customers, translators and editors, and ensure that compensation and benefits are equitably shared so that win-win scenarios are created.

Thus a simple formula to maximize ROI (from my somewhat biased perspective) would include the following:
  1. A clear subject domain focus that could have multiple sub-domains. For example, an automotive domain MT system could be further refined for different clients and specific types of content.
  2. Work with an expert to develop these engines, as long-term experience and deep expertise really do matter and the knowledge required to do this well is not easily or quickly acquired.
  3. Focus on getting more clients in this domain and demonstrate the cost and timeliness advantage that a good MT system would provide. For the customer this would mean lower cost, faster turnaround at equal or better (yes better) overall quality. Building a large customer cohort will also enable a service provider to develop real subject matter expertise and provide value beyond the basic project management and translation workflow management.
  4. Continue to invest in refining the MT systems so that the engines produce continuously improving output. This will positively affect your future cost and turnaround times as it is much faster to process very high quality MT output.
  5. Expand the use of good quality MT engines to new types of content that would not get translated otherwise. Thus, in the automotive scenario it would be possible to translate all kinds of internal product discussion related documents and emails, competitor websites, trade journals and international market feedback and coverage. While this new type of content may not go through the same quality assurance and post-editing, it could still be quite useful for monitoring international markets and sharing more information with dealer and customer networks, even in a raw-MT format. There is much evidence from the information technology sector that current support content and background technical information is valuable in building customer happiness.
  6. Shared benefits so that new processes are more easily accepted.

A basic rule of thumb is that the efficiency of the MT system (the productivity impact) matters much more than initial start-up costs. 
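To make this rule of thumb concrete, here is a minimal back-of-the-envelope sketch. Every number in it is hypothetical and chosen only for illustration: the per-word rate, the start-up costs and the productivity gains are assumptions, not measurements from any real MT system. It also assumes that a productivity gain translates directly into a proportionally lower post-editing cost per word, which is a simplification.

```python
def total_cost(startup_cost, words, base_rate_per_word, productivity_gain):
    """Start-up outlay plus the cost of producing `words` words.

    productivity_gain is the fractional increase in throughput over the
    no-MT baseline, e.g. 0.10 means 10% more words per hour.
    """
    # Assumption: if throughput rises by (1 + gain), cost per word
    # falls by the same factor.
    effective_rate = base_rate_per_word / (1.0 + productivity_gain)
    return startup_cost + words * effective_rate


def break_even_words(startup_a, gain_a, startup_b, gain_b, base_rate_per_word):
    """Volume at which system B becomes cheaper overall than system A.

    Returns None if B never catches up (B is no more productive than A).
    """
    rate_a = base_rate_per_word / (1.0 + gain_a)
    rate_b = base_rate_per_word / (1.0 + gain_b)
    if rate_b >= rate_a:
        return None
    # A negative result would mean B is cheaper from the first word.
    return max(0.0, (startup_b - startup_a) / (rate_a - rate_b))


if __name__ == "__main__":
    BASE_RATE = 0.10  # hypothetical cost per word without MT

    # Hypothetical "cheap" DIY system: no start-up cost, 10% gain.
    # Hypothetical expert-built system: 20,000 start-up cost, 50% gain.
    volume = 5_000_000
    diy = total_cost(0, volume, BASE_RATE, 0.10)
    expert = total_cost(20_000, volume, BASE_RATE, 0.50)
    crossover = break_even_words(0, 0.10, 20_000, 0.50, BASE_RATE)

    print(f"DIY total cost:    {diy:,.0f}")
    print(f"Expert total cost: {expert:,.0f}")
    print(f"Break-even volume: {crossover:,.0f} words")
```

With these made-up inputs, the more expensive system pays for itself at roughly 825,000 words and is clearly cheaper at the 5 million word volume. The exact figures do not matter; the point is that the per-word term dominates the start-up term once volumes are large and repeated.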

I am willing to bet that a language service provider who provides superior MT solutions will be considered a much more valuable partner by most corporate customers who are interested in sharing and monitoring business related content in international settings. While it is easy to duplicate the capability of most DIY MT systems, the quality of expert-managed and expert-developed systems is hard to match. Such systems usually provide much greater productivity than DIY efforts, which reduces the TCO and in most cases provides a clear long-term barrier against competing service providers. MT is still a very complex undertaking in 2013, and the systems that produce the best ROI will very likely require expert guidance and input. Remember that long-term advantages only come after you have built a distinctive advantage in your MT systems, and thus the key to the highest long-term ROI is an MT system that produces the highest quality output possible.



  1. I see your point, but Asia Online's platform, for example, is touted as delivering higher quality based on a "better" feedback process from human editors than other MT engines, which will lead to better quality (eventually). But how do you convince a client to invest time and energy into helping build a system that initially, admittedly, produces lower quality? If it's not publication quality, then it's basically "gist", so why would I want a level of accuracy that's higher than Google Translate, but still not publication quality? And how do I store the translated content, from a client perspective? Do I tag them "not particularly accurate"? "Not for distribution"? I think MT will eventually become a great tool for all translators, and for people who are looking for "gist", but I see that happening through huge databases like GT, that employ the use of tags for specific content. Not for custom-built engines.

    1. Hi Sean

      In most cases even the initial engines produce better output than one would get from Google/Bing. The reason that you might still find this useful is that it helps get a translation job done faster even if it is not "publication quality" yet. Remember MT is used to expedite a translation project not replace translators.

      In professional settings a really good engine will produce say 60% of all new segments at a quality level that requires no further editing, but the remaining 40% still needs correction. The point of using MT is that you get the whole job done FASTER and at a lower overall cost because you used MT. Subsequent projects in the same domain will likely be even faster as the corrective feedback is sent back to the engine. This allows the service provider to lower prices to their customers but still maintain a competitive quality level and also respond with faster turnaround.

      Usually it is not necessary to store MT output or you could possibly store them by version so that you can understand the error patterns and plan ahead to avoid them for the next round.

      Platforms like Asia Online offer a degree of control and customization, not to mention data security and privacy, that cannot be matched by GT (at this time). The people at Asia Online also share their MT development experience and expertise learned from developing thousands of MT engines across many language combinations to ensure that their customers get the best possible results with the data available.

  2. Kirti, your point #5 here violates some of the most basic principles of domain selectivity and controlled language that you otherwise promote, though that isn't necessarily obvious. Trawling "internal product discussion related documents and emails, competitor websites, trade journals and international market feedback and coverage" with MT is almost guaranteed to give you a stream of garbage results likely to disappoint. Ditto for some of the support databases I've seen with content added by underpaid staff barely cognizant of how riddled their own native language use is with slang, internal abbreviations, insider references and basic linguistic atrocities. I'll never rid myself of the memory of discovering one such MT attempt which involved murdering the user's mother. The really funny part was the vigorous attempt to defend the use of that translation afterward.

    You describe the basic problem of the unprofessional and counterproductive approach of Linguistic Sausage Producers to MT quite well, but I doubt that a few true words are likely to be heard by the lemmings rushing to answer the call of those who tell them to "get on the MT boat or drown". There are too many willing to sacrifice any common sense to let manipulators in an "advisory" do their thinking for them, and ultimately I suppose this is a good thing. Let them enjoy the full long-term economic benefit of such developments and we'll see whether parallel evolutions produce more viable forms of work and life.

    1. Kevin

      You may be right about the product discussion content in some cases. I have been involved with cases where there was very careful customization done to ensure key terms related to product discussions were properly translated and understood, and in such cases MT can provide some useful visibility to content that would otherwise only be available to those who spoke the source language.

      I agree that just running random product related content is not likely to be useful to anyone.

      MT is complex so even well-intentioned people can do it badly sometimes because they miss some key variables. For those rushing to jump on the bandwagon just so they can say "we do MT", the results are almost guaranteed to be unfortunate and painful for those poor souls who have to clean it up.

  3. Interesting and I bet this has a lot of potential. I am glad that you shared this post. Still not many of us understood what this is, this post really helped. Thanks!

  4. A new article "Understanding Return On Investment (ROI) and Total Cost of Ownership (TCO) for Machine Translation" has just been published at

  5. Excellent article. Very interesting to read. I really love to read such a nice article. Thanks! keep rocking.