Tuesday, November 29, 2011

Wanted: A Fair and Simple Compensation Scheme for MT Post-Editing

Fair and equitable compensation for MT post-editors is important to the ongoing momentum of MT, so I would like to introduce some people who have examined this issue and have made an attempt (however imperfect) at developing a solution. The initial response to many such initiatives often seems to be criticism of how the approach fails. I am hoping that the dialogue on these ideas can rise above this and offer more constructive and pragmatic advice or feedback, so that the approach can continue to evolve toward more widely accepted levels of accuracy. The MemSource approach measures the effort after the work is done. Used together with other initiatives that attempt to provide some measure of the post-editing task a priori, I think it could have great value in developing new compensation models that make sense to all the stakeholders in the professional translation world. It is important to develop new ways to measure MT quality and post-editing difficulty, as post-editing will become increasingly common in the professional translation world.

This is a guest post by David Canek, CEO of MemSource Technologies. I have not edited David’s article other than selecting some phrases that I felt were worth highlighting for a reader who skims the page.

Throughout 2011, MemSource, a provider of a cloud-based translation environment and CAT tool, ran a number of workshops exploring the impact of machine translation on the traditional translation workflow. We had lots of debates with translation buyers, LSPs and translators on machine translation post-editing, and specifically on how it should be compensated. We shared our findings at Localization World 2011 in Barcelona and thought it might be interesting to share them here as well, on the eMpTy Pages blog.

Translation Buyers and MT

While the majority of translation buyers still need to discover machine translation, there are many organizations whose progress with MT goes beyond the pilot phase. The innovators, among them many software companies, have successfully used machine translation to make the traditional translation process more efficient. One headache still remains: a simple and fair compensation scheme for machine translation post-editing. Today, a flat reduction of the “normal” translation rate is typically negotiated with the vendor, disregarding the actual effort the translator spends on post-editing a specific document, let alone a specific segment. This can be rather imprecise, even unfair, as MT quality can vary significantly from document to document and, of course, from segment to segment.

Translators and MT

There is a myth that all translators dislike machine translation post-editing. In fact, many translators adopted MT post-editing as their standard translation workflow long before anyone asked them to do so. They themselves chose to use MT because it helped them increase their productivity. Then, some years later, they were approached by their LSP/client regarding MT. Perhaps it went like this:

Dear translator,
We have introduced this great new technology called machine translation. It will help you speed up your translation and, by the way, we will cut your rates by 30%.
All the best...

Of course, none of the translators could be happy with this news. The innovative translators, already using MT to speed up their translations, would not be happy because nothing would change for them except that their rates would get cut. The less innovative also had no reason to be happy: they had to adapt to a new translation method and their rates got cut, without any guarantee that the new translation workflow would actually speed up their translation process.

LSPs and MT

Language service providers, generally speaking, have not been quick to adopt machine translation. This may come as a surprise, as LSPs should be most interested in slashing their costs with intelligent use of MT. However, LSPs, it seems, face specific obstacles that make MT adoption far from simple. In contrast to translation buyers, LSPs have to cope with limited resources while tackling multiple language pairs and subject domains across all of their clients. Training a custom MT engine in this context is challenging. The available online MT services, such as Google Translate or Microsoft Translator, are perceived by many LSPs as inadequate, mainly because of “confidentiality” concerns. The growing minority of LSPs that have started using custom MT engines report mixed results but are generally quite optimistic about the output.

Getting the right MT technology in place is important but not enough. LSPs need to make sure that there is ROI on the new technology. That means they need to modify their translation workflow to include machine translation and, most of all, make sure the new workflow makes translating faster, i.e. cheaper. This means that they will have to renegotiate rates with their translators. All of this is far from trivial and, if not done carefully, can do more harm than good.

Fair Compensation for MT Post-editing

MT is an innovative technology that will eventually (though not equally across all language pairs and domains) make human translation faster, i.e. cheaper. It is important that all stakeholders benefit from this increased efficiency: Translation buyers, LSPs and translators.

Above all, compensation for MT post-editing should be fair. There are different ways to achieve this. Some translation buyers run regular productivity tests and, based on the results, apply a flat discount on translations supported by MT (I believe Autodesk has a fairly sophisticated approach to this). At MemSource we have tried to come up with a different, perhaps complementary, approach, which is based on the editing distance between the MT output and the post-edited translation. It is, indeed, quite simple. We call this the Post-editing Analysis. In fact, this approach is an extension of the traditional “TRADOS discount scheme”, which long ago became a standard method in the translation industry for analyzing translation memory matches and the related discounts.

Post-editing Analysis: How It Works

When a translation for a segment can be retrieved from translation memory (a 100% match), the translation rate for that segment is reduced, typically to just 10% of the normal rate. A similar approach can be applied to MT post-editing. If the MT output for a segment is approved by the post-editor as correct, then we can say we have a 100% match and the post-editing rate for that segment should be very low. If, on the other hand, the post-editing effort is heavy and the machine translated output needs to be completely rewritten for a segment, a full translation rate should be paid. In the post-editing analysis there is, of course, an entire scale ranging from 0% to 100% when calculating the similarity (editing distance) between the MT output and its post-edited version. The rates can be adjusted accordingly.
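To make the idea concrete, here is a minimal sketch in Python of how such an analysis might be computed for one segment. It is an illustration only, not MemSource's actual implementation: the character-level Levenshtein distance, the normalization by the longer string, the function names, and every threshold and discount in RATE_BANDS are assumptions chosen for the example. Only the two end points come from the description above (an unchanged segment paid at roughly 10% of the normal rate, a completely rewritten one at the full rate).

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (ch_a != ch_b)
            current.append(min(previous[j] + 1,      # delete ch_a
                               current[j - 1] + 1,   # insert ch_b
                               substitution))        # replace or keep
        previous = current
    return previous[-1]


def similarity(mt_output: str, post_edited: str) -> float:
    """Similarity from 0.0 (fully rewritten) to 1.0 (accepted unchanged).

    Assumption: the distance is normalized by the longer of the two strings.
    """
    longest = max(len(mt_output), len(post_edited))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(mt_output, post_edited) / longest


# Hypothetical bands: (minimum similarity, fraction of the full rate paid).
# Only the 10% figure for an unchanged segment is taken from the article.
RATE_BANDS = [(1.00, 0.10), (0.85, 0.40), (0.75, 0.60), (0.50, 0.80)]


def rate_fraction(score: float) -> float:
    """Map a similarity score to the fraction of the full rate to pay."""
    for minimum, fraction in RATE_BANDS:
        if score >= minimum:
            return fraction
    return 1.0  # below every band: treated as a full translation


def segment_fee(mt_output: str, post_edited: str, full_rate_per_word: float) -> float:
    """Fee for one segment, counting words in the post-edited text."""
    words = len(post_edited.split())
    score = similarity(mt_output, post_edited)
    return words * full_rate_per_word * rate_fraction(score)
```

Calling segment_fee once per segment and summing the results would give the fee for a whole document. Because the score depends on the post-edited text, it can only be computed once the post-editor has finished the segment.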


The advantages of the post-editing analysis:
· Simple
· Transparent
· Measurable at segment-level
· Extension of the established TM discount scheme

There are also some disadvantages. Namely, the analysis can be run only after the post-editing has been carried out, which means that any discounts can be determined only after the translation job is completed. Another objection could be that the editing distance is a simplification of the actual effort of the post-editor. This is a valid point, and a more complex approach could be applied. However, our goal was to come up with a simple and efficient approach that could be easily implemented in today’s CAT workbenches and translation environments.

Interested to Know More and Experiment?

More details on the MemSource Post-editing Analysis, including a sample post-editing analysis, can be found on our wiki. If you are interested in sharing your experiences with MT post-editing initiatives, finding out more about our efforts in this space, signing up for a webinar, etc., write to

David Canek is the founder and CEO of MemSource Technologies, a software company providing cloud translation technology. David, a graduate in Translation and Comparative Studies, received his education at Charles University, Prague, Humboldt University in Berlin and the University of Vienna. His professional experience includes business development and product management roles in the software and translation industries. David is keen on pursuing innovative trends in the translation industry, such as machine translation post-editing and cloud-based translation technologies, and has presented on these topics at leading industry conferences, such as Localization World, Tekom, ATA and others.