
Tuesday, November 29, 2011

Wanted: A Fair and Simple Compensation Scheme for MT Post-Editing

As the subject of fair and equitable compensation for post-editors of MT is important to the ongoing momentum of MT, I would like to introduce some people who have examined this issue and have made an attempt (however imperfect) to develop a solution. The initial response to many such initiatives often seems to be criticism of how the approach fails. I am hoping that the dialogue on these ideas can rise above this, toward more constructive and pragmatic advice or feedback that helps this approach evolve to more widely accepted levels of accuracy. The MemSource approach measures the effort after the work is done. Used together with other initiatives that attempt to provide some measure of the post-editing task a priori, I think it could have great value in developing new compensation models that make sense to all the stakeholders in the professional translation world. It is important to develop new ways to measure MT quality and post-editing difficulty, as both will become increasingly common concerns in the professional translation world.

This is a guest post by David Canek, CEO of MemSource Technologies. I have not edited David’s article other than selecting some phrases that I felt were worth highlighting for a reader who skims the page.

======================================
  
Throughout 2011 MemSource, a provider of a cloud-based translation environment and CAT tool, has run a number of workshops exploring the impact of machine translation on the traditional translation workflow. We had many debates with translation buyers, LSPs, and translators on machine translation post-editing, and specifically on how it should be compensated. We shared our findings at Localization World 2011 in Barcelona and thought it might be interesting to share them here as well, on the eMpTy Pages blog.

Translation Buyers and MT

While the majority of translation buyers have yet to discover machine translation, there are many organizations whose progress with MT goes beyond the pilot phase. The innovators, among them many software companies, have successfully used machine translation to make the traditional translation process more efficient. One headache still remains: a simple and fair compensation scheme for machine translation post-editing. Today, a flat reduction of the “normal” translation rate is typically negotiated with the vendor, disregarding the actual effort the translator spends on post-editing a specific document, let alone a specific segment. This can be rather imprecise, even unfair, as MT quality can vary significantly from document to document, and of course from segment to segment.

Translators and MT

There is a myth that all translators dislike machine translation post-editing. In fact, many translators had adopted MT post-editing as their standard translation workflow long before anyone requested them to do so. They chose to use MT because it helped them increase their productivity. Then, some years later, they were approached by their LSP/client regarding MT. Perhaps it went like this:

Dear translator,
We have introduced this great new technology called machine translation. It will help you speed up your translation, and, by the way, we will cut your rates by 30%.
All the best...

Of course, no translator could be happy in the face of this news. The innovative translators, already using MT to speed up their translations, would not be happy because nothing would change for them except that their rates would be cut. The less innovative also had no reason to be happy: they had to adapt to a new translation method and their rates got cut, without any guarantee that the new translation workflow would actually speed up their translation process.

LSPs and MT

Language service providers, generally speaking, have not been quick to adopt machine translation. This may come as a surprise, as LSPs should be the ones most interested in cutting their costs through intelligent use of MT. However, LSPs face specific obstacles that make MT adoption far from simple. In contrast to translation buyers, LSPs have to cope with limited resources while tackling multiple language pairs and subject domains spanning all of their clients. Training a custom MT engine in this context is challenging. The available online MT services, such as Google Translate or Microsoft Translator, are perceived by many LSPs as inadequate, mainly because of “confidentiality” concerns. The growing minority of LSPs that have started using custom MT engines report mixed results but are generally quite optimistic about the output.

Getting the right MT technology in place is important but not enough. LSPs need to make sure there is ROI on the new technology. That means they need to modify their translation workflow to include machine translation and, above all, make sure the new workflow makes translating faster, i.e. cheaper. This means they will have to renegotiate rates with their translators. None of this is trivial, and if not done carefully it can cause more harm than good.

Fair Compensation for MT Post-editing

MT is an innovative technology that will eventually (though not equally across all language pairs and domains) make human translation faster, i.e. cheaper. It is important that all stakeholders benefit from this increased efficiency: Translation buyers, LSPs and translators.

Above all, compensation for MT post-editing should be fair. There are different ways to achieve this. Some translation buyers run regular productivity tests and, based on the results, apply a flat discount on translations supported by MT (I believe Autodesk has a fairly sophisticated approach to this). At MemSource we have tried to come up with a different, perhaps complementary, approach, which is based on the editing distance between the MT output and the post-edited translation. Indeed, quite simple. We call this the Post-editing Analysis. In fact, this approach is an extension of the traditional “TRADOS discount scheme”, which long ago became a standard method for analyzing translation memory matches and the related discounts in the translation industry.

Post-editing Analysis: How It Works

When a translation for a segment can be retrieved from translation memory (a 100% match), the translation rate for that segment is reduced, typically to just 10% of the normal rate. A similar approach can be applied to MT post-editing. If the MT output for a segment is approved by the post-editor as correct, we can say we have a 100% match, and the post-editing rate for that segment should be very moderate. If, on the other hand, the post-editing effort is heavy and the machine-translated output needs to be completely rewritten, the full translation rate should be paid. In the post-editing analysis there is, of course, an entire scale ranging from 0% to 100% when calculating the similarity (editing distance) between the MT output and its post-edited version. The rates can be adjusted accordingly.
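To make this concrete, here is a minimal sketch in Python of how such a segment-level analysis could work. It is an illustration only, not MemSource's actual implementation; the function names and the similarity bands and discounts are hypothetical.

    # Illustration only: compute a match score from the editing distance
    # between raw MT output and its post-edited version, then map it to a
    # payable per-word rate. The rate bands below are hypothetical examples.

    def edit_distance(a: str, b: str) -> int:
        """Levenshtein distance between two strings (character level)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                cost = 0 if ca == cb else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution
            prev = curr
        return prev[-1]

    def match_score(mt_output: str, post_edited: str) -> float:
        """Similarity from 0.0 (fully rewritten) to 1.0 (MT accepted as is)."""
        longest = max(len(mt_output), len(post_edited)) or 1
        return 1.0 - edit_distance(mt_output, post_edited) / longest

    def payable_rate(score: float, full_rate: float) -> float:
        """Map the match score to a rate, analogous to TM fuzzy-match bands."""
        if score >= 0.99:    # MT approved essentially unchanged: like a 100% match
            return 0.10 * full_rate
        if score >= 0.75:    # light post-editing
            return 0.50 * full_rate
        return full_rate     # heavy post-editing: full translation rate

    mt = "The cat sat on the mat."
    pe = "The cat was sitting on the mat."
    score = match_score(mt, pe)
    print(f"match {score:.0%}, rate {payable_rate(score, 0.12):.3f} per word")

A word-level distance could be used instead by tokenizing the segments before computing the distance, and in practice the analysis would be aggregated over all segments of a job.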


The advantages of the post-editing analysis:
· Simple
· Transparent
· Measurable at segment-level
· Extension of the established TM discount scheme

There are also some disadvantages. Namely, the analysis can be run only after the post-editing has been carried out, which means that any discounts can be determined only after the translation job is completed. Another objection could be that the editing distance is a simplification of the actual effort of the post-editor. Indeed, this is a valid point, and a more complex approach could be applied. However, our goal was to come up with a simple and efficient approach that could be easily implemented in today’s CAT workbenches and translation environments.

Interested to Know More and Experiment?

More details on the MemSource Post-editing Analysis, including a sample analysis, can be found on our wiki. If you would like to share your experiences with MT post-editing initiatives, find out more about our efforts in this space, or sign up for a webinar, write to labs@memsource.com.



David Canek is the founder and CEO of MemSource Technologies, a software company providing cloud translation technology. A graduate in Translation and Comparative Studies, David received his education at Charles University in Prague, Humboldt University in Berlin and the University of Vienna. His professional experience includes business development and product management roles in the software and translation industries. David is keen on pursuing innovative trends in the translation industry, such as machine translation post-editing and cloud-based translation technologies, and has presented on these topics at leading industry conferences such as Localization World, Tekom, ATA and others.

17 comments:

  1. Hi there

    first point: "translating faster, i.e. cheaper"

    mmmh...
    I thought (and many voices said so) that MT was developed to manage the otherwise unmanageable increasing volume of documents, i.e. to speed up the translation workflow, but now I see that it was rather born to make translation cheaper: the industry finally throws off its mask?
    ;-)))

    second point: "Interested to Know More and Experiment?
    Pay for a webinar!"

    what?
    If MemSource or anybody else is willing to pay ME for gladly testing possible PEMT payment schemes I'm available, otherwise no way!

    third point: "There is a myth that all translators dislike machine translation post-editing. In fact many translators have started MT post-editing as their standard translation workflow long before anyone requested them to do so."

    yes, I think so, and I think that the chorus of grievance when the GT app stopped being free reveals the guilty conscience of many peers ...

    anyway, all these considerations aside, I suspect that this grievance arose even when TM was first developed, or am I wrong?

  2. I applaud David's efforts and the intent behind this proposal. I also think he has started in the right place with the editing distance approach. However, at the risk of fulfilling the prophecy that the "initial response [is] criticism of how the approach fails", I would offer the following additional disadvantage. Clever (and avaricious) translators can easily increase the "editing distance" by rearranging word order and making other very minor "corrections" to the original. It seems that any truly "fair" approach needs to have some way of detecting or discouraging unnecessary changes, whether or not the motive is to enhance translator revenue.

    I am looking forward to the discussion.

  3. Claudio, I do not know why MT was developed :) but I do think it can sometimes make translating faster for a human translator. However, not always and not evenly across all language pairs/domains/types of content.

    So I think it would be useful to know when MT helps increase productivity and when it does not.

  4. Bob, this is a very valid point - one that has also been raised during some of the workshops we did on the topic. We have to see. But personally, I think it should not be a problem.

    Why?

    The compensation scheme for MT post-editing should be beneficial for all parties involved - including translators. In other words it should be set up in such a way that making extra changes to an otherwise OK machine translation output would simply not be worth the effort.

  5. @Claudio

    MT has always had the promise of doing two things:
    1) Making content multilingual that would otherwise never be translated
    2) Raising the productivity on repetitive localization projects.

    It has had much better success with 1), but, as my last post pointed out, it can definitely have a place in speeding up some TEP production processes. However, this requires much higher quality engines and is more difficult to do.

    Also GTT is still free for individuals - they have only stopped free access via the API.

  6. @Bob You are right that this approach could be gamed and, like many business transactions involving humans, it will need trust. But I think it is still an improvement on the arbitrary lower number that is often used today and is very often considered grossly unfair.

    There is still a great need for an assessment of the difficulty or scope of the work BEFORE you begin post-editing, and I hope that we see more there. MT developers are coming forward with confidence limits, and there are reasonably good ways to do this with aggregate quality estimates based on sampling. But we still have a long way to go, and fortunately it is getting more attention.

  7. Translators who accept editing jobs have the necessary skills and know well how to determine the effort required to do the job and put a fair and equitable price on it. This price depends strictly on the quality of the translation to be edited, be it human or machine, and on the client's requirements. Why do you think a new compensation scheme is necessary? Do you mean that the current compensation schemes that professional translators apply to editing jobs are not fair and equitable? Why?

    Cristóbal del Río

    Replies
    1. Agreed, and an example:

      My cover letter says:
      "Proofreading rate: from 30% to 100% of translation rate, depending on the quality of the translation."

      And I refuse MT Post-editing because the compensation rate is 'imposed' by the LSP.

  8. I agree that simply reducing the translator's existing full translation rate is unfair. I also agree that the concept of editing distance is fairer and theoretically more accurate. But it has imperfections, such as those listed above. What I immediately see is that such an editing distance involves measuring the quality of the MT output versus the quality of the final PE output. But as an industry, we still don't agree on what quality even is. And in recent years, MT itself has caused our industry to speak regularly about how quality can no longer be seen as one set, black-and-white threshold. Quality now is relative to the needs of the content and project. So how do we accurately measure quality at those two stages in the process to be able to calculate distance? I don't know. That is another place where consensus is needed if the math is to be acceptable.

  9. Diego Bartolome •

    It's interesting, Kirti! In my opinion, the similarity metric that could better handle the issue is one measured in "words" rather than in "characters".

  10. I wanted to comment on the role of LSPs and the way I see MT happening. These comments are based on my personal experience and I cannot extrapolate to other LSPs, but maybe others would like to comment so we can start an interesting discussion. In my personal experience, and contrary to what the article says, software developers ask the vendor (LSP) for flat discounts on MT segments, often claiming the advantages of the technology without any tests of the actual quality of the output. In our case, we have been the ones analyzing the output to try to establish accurate compensation for translators, and on many occasions applying a lower discount because the quality was simply not at that level.

    Also, many translators (not all of them) have been very reluctant to use MT, and we have trained large groups on how to use it properly. We have created post-editing guidelines (at our own cost) based on our output analysis (at our own cost) and offer feedback (at our own cost). As I have commented on this forum before, we have seen that post-editors and reviewers do make a lot of preferential changes, or not the appropriate changes (in favour of speed), and without proper instructions and guidelines they do not know how to deal with MT segments efficiently. So the LSP is often caught between a rock and a hard place (the customer asking for a harsh discount, and translators not willing to accept a discount or not having the right knowledge to work on MT).

    I think we should find ways of compensating for MT post-editing for both translators and LSPs, because the job is not only post-editing a segment; it involves other tasks that often go unseen, such as output analysis, guidelines, automatic post-editing, feedback, measurements, etc. The "a posteriori" approach does not seem balanced to me because, as I mentioned, we have noticed that post-editors and reviewers make a lot of preferential changes, although I do agree that it is a good way of checking those segments that have been discarded and that should be paid in full.

    My impression right now (and this might change) is that we should all work on having better analysis of the outputs (per language, per engine, per domain), and then establish either confidence levels or a discount in line with the quality of the output, similar to those for TM. Maybe we could combine both approaches and use the "a posteriori" analysis to check whether the discount applied was accurate, but using only what the post-editors did does not reflect the reality of MT post-editing as a whole, in my view.

  11. Patricia Bown • I would like to see/hear more compensation proposals from the performers of services in this conversation. Most busy translators/linguists/freelancers/humans have little time or interest in scamming a system, and even less so if they feel they are part of a fairly compensated, mutually beneficial arrangement. One can look and find examples, but it takes no effort at all to find freelance translators who in essence donate work to their customers in order to adhere to standards they have set for themselves, in an attempt to manage risk for themselves, and because they have a timeline that is relationship- rather than transaction-based. That they do this is both a credit and a detriment to themselves. Some LSPs do this too... not charging for all of the work performed and delivered, and for similar reasons. (CSA, maybe you know how much of the service provided in our industry doesn't get invoiced?) While things are still in flux, this is an opportunity for both service providers and especially service performers to say more about what it takes to provide the service and what feels fair in compensation.

  12. Bob Donaldson • @Patricia -- Good points on the potential scamishness of the typical translator ... not sure I am quite as optimistic as you on human nature and mendacity in general, however. That said, we absolutely need to hear more from actual translators on how best to create a fair compensation system. We are always hearing (and some of us saying) that MT can lead to more revenue potential for linguists/translators but until the translators see such a "route to riches" we will not gain much credibility.

  13. Radovan Pletka •

    It is not rocket science.
    In fact, it is easy.
    If I am a good translator and I make $50 per hour translating, I will do post-editing if I make the same money for starters, with the opportunity to make more when I get faster. Otherwise, why should I bother and lose money (smile).

    If this doesn't make financial sense for you, it is your problem, not mine.
    There are plenty of people who work for much less (smile)

  14. Pia F Bresnan •

    At e2f translations (Eng->French SLV), we've been performing MT post-editing tasks for our partners for some time now, and precisely because of the disconnect between client expectations and reality, we always request a sample of the output. This allows us to analyze the productivity, provide feedback to the client and agree on the discount before fully taking on the task. Apart from clients' own expectations or comparisons with other languages (productivity between languages cannot be compared due to variability in MT output), this approach seems to be working out so far.

  15. Victor Foster •

    When it comes to MT post-editing, I would say that the majority of the time translators provide more services than they are compensated for. It might be true that some translators do the bare minimum required, but that is actually what the client is paying for while asking for more, isn't it? Any client using MT and then sending it out for post-editing is really just looking for good-enough output, not necessarily top-quality translation work. Or, at least, that's all they pay for. I'd be interested in learning about ideas for making compensation commensurate with the actual services and effort involved.

  16. Thanks for posting, David, and for hosting, Kirti.

    I like your suggestion: It does seem to provide an accurate measure of the Post-Editor's effort, so the compensation sounds totally fair.

    One big disadvantage, though, is that the true costs will be known only after the fact. IMHO this is a big problem because, given the status quo, end customers and LSPs want to know in advance how much a job will cost. I suppose one could provide ballpark estimates to be adjusted once the job is completed, but I can anticipate a lot of resistance to changing the established cost/price-per-word model that the localization industry has been using for decades now.
