This is a short guest post by @translationguy, also known as Ken Clark.
These initial preamble comments in italics are mine.
Today, many LSPs and enterprises are working with MT, and there is enough evidence that MT works even when you don't really know what you are doing. Unfortunately, many agencies still try to do it themselves with Moses, and most of these DIY experiments either fail completely or produce systems that are not as good as the public systems from Microsoft and Google, which defeats the whole point of doing it. MT as a technology only provides business leverage if you have a superior MT system and have aligned processes to take advantage of it.
Ken differentiates between light and full post-editing, and I would like to add another dimension to this discussion. In my experience, full post-editing is done on smaller (in MT terms) projects, or when the translated information is critical to get right. Thus, in a knowledge base project, content related to security, privacy, and legal terms may be sent for full post-editing, while other content may get only a lighter post-edit. Also, on very large MT projects like those of the team at eBay, where hundreds of millions of words are involved, it is not possible to do a full post-edit on all the data, so a light post-edit is done, or perhaps nothing beyond the very specific linguistic work on high-frequency n-grams and important patterns that Silvio Picinini describes in this post. Unfortunately, it’s hard for translators and clients to agree on when “light” post-editing is done, so it is a headache to manage, as editors often cannot tell when to stop.
Thus, as agencies get involved with "real MT" projects, they will do corpus profiling work and focus their attention on critical patterns, as Juan Rowda has described in this post.
To me, real competence with MT in an agency or enterprise is demonstrated when there is some expertise with as many of the following core functions as possible:
- Understanding the Data - Corpus Analysis
- Focusing Linguistic Work on High-Frequency Patterns
- Working with Expert MT Systems Developers in a proactive way
- Understanding MT Output Quality
- Driving MT quality higher with specific linguistic feedback
- Managing Post-Editing Processes and Compensation
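As a rough illustration of the first two items, here is a minimal Python sketch of what profiling a corpus for high-frequency n-grams could look like. It is not the method used by eBay or any particular vendor: the function name `top_ngrams`, the simple regex tokenizer, and the sample segments are all assumptions made up for this example, and a real project would use a proper tokenizer for each language.

```python
import re
from collections import Counter


def top_ngrams(segments, n=3, k=10):
    """Return the k most frequent word n-grams across all segments.

    Hypothetical helper for illustration only: tokenization is a plain
    lowercase regex split, which is too crude for many languages.
    """
    counts = Counter()
    for segment in segments:
        tokens = re.findall(r"\w+", segment.lower())
        # Slide a window of n tokens over the segment and count each n-gram.
        counts.update(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts.most_common(k)


# Made-up sample data standing in for a large source corpus.
sample_segments = [
    "Free shipping on all orders",
    "Free shipping on selected items",
    "Get free shipping on all orders today",
]

top = top_ngrams(sample_segments, n=3, k=3)
for ngram, count in top:
    print(count, " ".join(ngram))
```

The n-grams at the top of such a list are the ones worth checking first, since getting them right (or wrong) in the MT output affects the largest number of segments.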
TAUS provides an excellent overview of the larger perspective in this post on best practices in MT.
As MT technology evolves, I think we will see that strategies that made great sense with phrase-based SMT may not always make sense with the new Neural MT technology. I am talking to SYSTRAN about the realities of the NMT paradigm and hope to produce a post on this soon.
-------------------------
Machine translation has improved by leaps and bounds. What was once considered machine-produced gibberish is increasingly giving human translators a run for their money, particularly for predictable texts like weather reports.
While machine translation (MT) is also more economical than human translation, it’s not a true alternative yet. In most cases, machine translation can’t be used as is. And that’s where the expertise of machine translation post-editors comes in. Machine translation post-editors are the human editors who work to improve the output of machine translation. They combine the MT output with their linguistic expertise to provide a better reading experience for human audiences.
Besides the cost savings, it is estimated that machine translation plus post-editing is 40% more efficient than human translation alone. But what exactly do machine translation post-editors do, and how do they do it?
Types of Machine Translation Post-editing
Machine translation post-editing comes in two flavors: light post-editing and full post-editing.
Light post-editing suggests a lighter touch, only asking the human editors to ensure that the MT output is accurate in meaning and understandable to the reading audience. However, this means that style is not taken into account, grammar and syntax may be awkward, and the text may sound as if it were produced by a computer. It’s the most economical option, but for reasons of quality, light post-editing is typically only used when a translation is needed urgently and/or for an organization’s internal purposes.
Full post-editing, on the other hand, calls for a higher level of involvement by the post-editor. (This makes it more expensive than light post-editing, but still less expensive than full human translation.) In addition to making sure that the MT output is accurate in meaning and understandable to the reading audience, full post-editing addresses the text’s grammar, syntax, and punctuation, ensuring they are correct and appropriate. The result is similar in quality to a human translation, although it may not yet match the style of a native-speaking translator. Full post-editing is typically used when a machine-translated text is intended to be published, or widely disseminated inside or outside an organization.
MT Post-editing Strategies
How do they do it? Let’s examine some of the things that post-editors watch out for.
Light post-editors use the machine translation output as much as possible. However, they take special care that information has not been inadvertently added or left out. They also edit anything they have identified as offensive or culturally unacceptable.
In addition to the above, full post-editors correct any grammatical and syntactical errors. They pay particular attention to terminology, making sure that the terms have been translated in the appropriate way (or left untranslated per the client’s wishes). They also ensure that the spelling and punctuation, as well as formatting, are correct.
Read more at http://www.responsivetranslation.com/blog/machine-translation-postediting/#r4ZiiLOHouYJ8E2O.99