Thursday, June 14, 2012

Thoughts on an MT technology presentation at ALC New Orleans, May 2012

This is a guest post by Huiping Iler, whom I had the pleasure of meeting in New Orleans last month, where she gave a very interesting presentation on how to increase the intrinsic value of an LSP firm. She runs a language services firm that is one of the growing fold of LSPs with direct experience of post-editing MT output, and she sees an increasing role for MT in the future of her business. I should add that while her own feedback on my presentation here is quite flattering, others commented through the regular feedback process that my slides were too dense and information filled, and one even felt that my presentation was a “thinly disguised sales pitch”. (I assure you Sir, it was not.) It is difficult to find a balance that makes sense to everybody, and all feedback is valuable. The pictures below come from the wonderful photographic eye of Rina Ne’eman and were taken during her visit to New Orleans.

It was a real delight listening to Kirti Vashee from Asia Online present on the ROI of Machine Translation – Scoping and Measuring MT. The presentation took place at the most recent annual Association of Language Companies conference, held in New Orleans on May 16-20, 2012.

Kirti pointed out that:

  • Much of today’s business content is dynamic and continuously flowing.
  • The need for real-time multilingual content cannot be met by human translators alone due to cost and time constraints.
  • Machine translation (MT), especially statistical machine translation, is gaining traction among enterprises that have large amounts of data to translate.
  • IT companies and travel review sites are examples of early adopters of statistical MT.
  • Compared to general or free MT tools such as Google, an enterprise MT tool and service like Asia Online is highly customizable and adaptable to unique customer needs.
  • It gives clients much more control over terminology, non-translatable terms, vocabulary choice and writing style. As a result, it produces much higher accuracy and translation quality, especially in highly specialized and focused domains.

This echoes the feedback I heard from one of wintranslation’s enterprise clients, who has been using statistical MT for the last few years. Our translation team has been tasked with post-editing and with providing corrective feedback to the client’s MT engineering team for continuous improvement.

According to translators who have mastered the art of editing machine translation, post editing raw output requires a different skill set than the traditional editing of human translations. 

To start with, text selected for MT often tends to be “low visibility.” Kirti gave the example of a travel review site where four- and five-star hotel reviews are human translated, while lower-star hotel reviews are machine translated with some or no human post-editing.

Other examples of low visibility text include car service manuals that not everybody reads, or web-based support content. High visibility (and typically low volume) text, such as marketing communications, rarely if ever gets selected for machine translation.

When translating low visibility text, particularly in technical communication, it is more important for the text to be technically accurate than stylish. This is a case where a translation that sounds awkward but is technically correct is acceptable, as long as translation efficiency is maximized without hurting accuracy.

But translators new to post-editing may be tempted to edit the text not only for accuracy but also for flow and style. This leads them to spend more time than necessary on the text, and they are also more likely to complain about the quality of the MT output. After all, style and flow are not the strengths of MT; speed and consistency are. It is important to agree with the human post editors on what is good enough (i.e. technical accuracy only, not style). Improved productivity and lower cost are very important to clients using MT. The best post editors understand this and can deliver a high number of edited words per hour that meet quality standards.

One of wintranslation’s MT post editors commented, “When I have to review a translation, either done by a human or by a machine, I do not try to make it sound like if I wrote it. I mostly correct errors, terminology inconsistencies, awkward style, problems with conveying the intended meaning and issues that really bother me. If we are able to have that mindset, then it will be less cumbersome to review machine-translated text. If we have the tendency to rewrite the translation, then the editing will be time-consuming and cumbersome.” It sums up the ideal attitude a post editor should have.

Consistency is one of machine translation’s core strengths. When the engine is set up properly, non-translatable text, such as numbers, acronyms and product names, is reliably consistent throughout the translation. This is an area where MT can outperform human translators.

For example:
Source: Migration information for JKJ 5.x
MT Target: Información sobre migraciones para JKJ 5.x
When a post editor reviews this text, she/he knows for sure that “JKJ 5.x” is correct and does not have to worry about it being rendered as “JKJ 6.x” or “JKJ 5.s.” This is not always the case when reviewing human translations, where the editor always has to double-check the product name, version and so on.
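
The same check can be automated as a safety net on either human or machine output. Below is a minimal Python sketch, assuming that "protected" tokens can be recognized by a simple pattern (all-caps codes and dotted version strings). The pattern and the function name are illustrative assumptions, not part of any particular MT product.

```python
import re

# Hypothetical pattern for protected tokens: all-caps codes such as "JKJ"
# and dotted version strings such as "5.x". A real project would more likely
# use a client-supplied list of non-translatable terms.
PROTECTED = re.compile(r"\b[A-Z]{2,}\b|\b\d+(?:\.\w+)+\b")

def missing_protected_tokens(source: str, target: str) -> list[str]:
    """Return protected tokens found in the source but absent from the target."""
    # Plain substring matching: crude, but enough to flag obvious slips.
    return [tok for tok in PROTECTED.findall(source) if tok not in target]

source = "Migration information for JKJ 5.x"
print(missing_protected_tokens(source, "Información sobre migraciones para JKJ 5.x"))  # []
print(missing_protected_tokens(source, "Información sobre migraciones para JKJ 6.x"))  # ['5.x']
```

An empty result means every protected token survived; anything else tells the editor where to look first.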

The absence of spelling errors in machine translated text is a distinct advantage that saves time. But it is a good practice to spellcheck the translation before delivery, because post editors could have introduced typos while inputting corrections.

When a post editor finds an error pattern, communicating it to the client helps train the engine and improve future results. For example, in one MT text, the term “wireless” was always translated into Spanish as “productos inalámbricos,” which in most cases is wrong. The post editor quickly identified and fixed the error. Because the error occurred often enough to be a pattern, it was submitted to the client for a dictionary update. This and other types of pattern-based corrective work can greatly enhance the overall production efficiency of post-editing work.
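
One way to decide when a correction is "often enough to be a pattern" is simply to count it. The sketch below assumes post editors log each fix as a pair of (MT output, corrected text); the log format, the threshold of three, and the sample corrections are all hypothetical and only illustrate the idea.

```python
from collections import Counter

def recurring_corrections(corrections, min_count=3):
    """Return (correction, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter(corrections)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]

# Hypothetical log of (MT output, post editor's fix) pairs.
log = [
    ("productos inalámbricos", "inalámbrico"),
    ("productos inalámbricos", "inalámbrico"),
    ("productos inalámbricos", "inalámbrico"),
    ("red de área", "red de área local"),
]

for (mt_text, fix), n in recurring_corrections(log):
    print(f"{n}x: {mt_text!r} -> {fix!r}")
```

Corrections that clear the threshold are candidates to send to the MT engineering team; one-off fixes stay in the document and go no further.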

When words are not in the right order in the translated text, it is best for the post editor to simply drag them to the right place, so that he/she does not have to retype them and then delete them from the wrong place. This saves time.

Source: Where to buy ABC Anti-Theft Service related products.
MT Target: Dónde comprar ABC contra robo servicio productos relacionados.
In this case, “productos relacionados” needs to be moved towards the beginning of the sentence, so the post editor just highlights the two words and drags them to the right place. She also needs to move the word “servicio” and make a few quick fixes.
Final Target: Dónde comprar productos relacionados con el servicio ABC contra robo.

When the upfront linguistic setup work has been inadequate, or there is a lack of ongoing communication between the post-editing team and the MT engineers, the result is a lot of frustration for MT editors and unnecessary delays.

For example, a Brazilian Portuguese translator noticed that the MT software was often using European Portuguese vocabulary even though the text was intended for the Brazilian market (there are significant spelling and vocabulary differences between Brazilian Portuguese and European Portuguese).

For instance, acção (should be ação), gestores de projectos (should be gerenciadores de projetos), etc.
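
A variant list like this can be turned into a quick pre-delivery check. The Python sketch below is seeded only with the two examples above; the dictionary name and function are assumptions for illustration, and a real list would be maintained by the linguists on the project.

```python
# Hypothetical European -> Brazilian Portuguese variant list,
# seeded with the two examples mentioned in the text.
PT_PT_TO_PT_BR = {
    "acção": "ação",
    "gestores de projectos": "gerenciadores de projetos",
}

def flag_european_variants(text: str) -> list[tuple[str, str]]:
    """Return (found variant, suggested Brazilian form) pairs for a target text."""
    lowered = text.lower()
    return [(pt, br) for pt, br in PT_PT_TO_PT_BR.items() if pt in lowered]

print(flag_european_variants("Acção necessária para gestores de projectos"))
```

Each hit is also a candidate item for the feedback sent back to the MT engineering team.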

She asked why the machine translation software “was not told” about that. The ability to provide feedback to the MT system is a key ingredient in getting better results and raising editor productivity and satisfaction. The best MT results come from close collaboration between the engineering and linguistic post-editing teams.

The translator mentioned above also found inconsistencies in the translation of key terms such as product names. “Green Power Management” was translated as Energia verde Management, Verdes gerenciamento de energia, and Verdes poder Management. Some editing of the translation memory to reduce such inconsistency would speed up the post-editing process a lot.
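
Inconsistencies of this kind are easy to surface mechanically before post-editing starts. The sketch below assumes the term pairs have already been extracted from the translation memory as (source term, target rendering) tuples; that extraction step and the function name are hypothetical.

```python
from collections import defaultdict

def inconsistent_terms(entries):
    """Map each source term to its distinct renderings, keeping only terms with more than one."""
    renderings = defaultdict(set)
    for source_term, target_term in entries:
        renderings[source_term].add(target_term)
    return {term: sorted(variants)
            for term, variants in renderings.items()
            if len(variants) > 1}

# The three renderings reported by the translator.
tm_entries = [
    ("Green Power Management", "Energia verde Management"),
    ("Green Power Management", "Verdes gerenciamento de energia"),
    ("Green Power Management", "Verdes poder Management"),
]

print(inconsistent_terms(tm_entries))
```

The output is a short list of terms that need one agreed rendering, which can then be fixed once in the translation memory rather than repeatedly in the text.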

Productivity gains vary from language to language. In Spanish and Portuguese, for example, where MT has made more inroads, one can expect a productivity increase of as much as 50% in words translated per hour, assuming the MT engine has been properly set up and trained. Gains are harder to come by in Asian languages.
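
It is worth spelling out what a 50% gain in words per hour means for turnaround time, since it is a one-third reduction in hours, not a halving. The worked example below uses a hypothetical baseline of 500 words per hour and a hypothetical 10,000-word job; only the 50% figure comes from the paragraph above.

```python
def hours_needed(word_count: int, words_per_hour: float) -> float:
    """Hours needed to process word_count words at a given hourly throughput."""
    return word_count / words_per_hour

def time_saved_fraction(productivity_gain: float) -> float:
    """Fraction of time saved when throughput rises by productivity_gain (0.5 means 50%)."""
    return 1 - 1 / (1 + productivity_gain)

baseline_wph = 500                  # hypothetical throughput without MT
job_words = 10_000                  # hypothetical job size
boosted_wph = baseline_wph * 1.5    # the 50% gain cited above

print(hours_needed(job_words, baseline_wph))            # 20.0
print(round(hours_needed(job_words, boosted_wph), 1))   # 13.3
print(round(time_saved_fraction(0.5), 3))               # 0.333
```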

There is a real and imminent opportunity for translation companies to offer real-time translation services for select types of content that are out of reach for human translation due to time and cost. The linguistic training of statistical translation engines and the development of MT post editors are key pieces in realizing that opportunity.

On a side note, I cannot help but notice that Mr. Vashee’s passion for MT and his sharing of expertise are contagious. He is one of the finest craftsmen in the sales and marketing field of technology and translation. An empathetic communicator, he is always able to see things through his clients’ eyes. In the company of translation company owners, he presents possibilities for using a tool like Asia Online to generate new revenue and create differentiation (ask which translation company owner doesn’t like to hear that). He satisfies the data-driven analytical types with numbers and return on investment measured in quality metrics and dollars. He has an amazing ability to stay insightful and relevant in a conversation while sticking to his value proposition. He is an outstanding marketer and an entrepreneur’s dream pitch man.

About Huiping Iler:
Huiping Iler is the president of wintranslation™, a Canadian-based translation company specializing in information technology and financial services. wintranslation has been coordinating post editing of machine translated text for the last several years.