Wednesday, December 18, 2013

Annual Review–Most Popular Posts of 2013


“Disruption is not something we set out to do. It is something that happens because of what we do,” stresses Brian Solis. Disruption changes human behavior (think: iPhone) and it’s a mixture of both ‘design-thinking and system-thinking’ to get there. So as an innovator, where do you begin if you don’t start with attempting disruption. To boil down Solis’ message into a word: ‘empathy.’

That’s right, empathy. Empathy drives the core of your vision as an innovator, or so it should says Solis.

Solis says that there are only two ways to change human behavior, by manipulating people, or by inspiring them. If you choose the former, good luck on your journey, but if you would prefer to attempt the latter with your innovative attempts, then you should start with empathy: the why of your product or company. That is how you will capture attention, and hold onto it, especially in the technologically, socially-driven world today.”

The excerpt above is from this post on The future of innovation is disruption (emphasis mine).

“The end of business as usual takes more than vision and innovation to survive digital Darwinism however. It requires a tectonic shift from product or industry focus to that of long-term consumer experiences. Businesses that don’t are forever caught in a perpetual cycle of competing for price and performance. It is in fact one of the reasons that Apple can command a handsome premium. The company delivers experiences that contribute to an overall lifestyle and ultimately style and self-expression. Think about the business model it takes to do so however. You can’t invent or invest in new experiences if your business is fixated on roadmaps and defending aging business models (SDL & LIOX?).”

This excerpt is from a fascinating article on the collapse of the Japanese consumer electronics industry and especially Sony, Panasonic and Sharp.

These quotes, I think, are particularly prescient for the professional translation business, which is changing quietly and dramatically as we speak. Technology, new production models, changing buyer requirements and open collaboration models are changing the business in both very subtle and obvious ways in an increasingly global world. While many feel discomfort, very few feel like they understand what is going on, or have a clear sense of what they could do to deal with these changes. (It is more than: “I have to use MT”.) This blog claims to focus on these broader issues even though much of what I cover focuses on the translation technology impacts of these changes. These broader supply and demand forces are the primary forces behind many of these technology changes and their adoption, and it makes sense to try and unravel this through discussion and closer examination.

Thus if we look at the big themes of the year (not just in this blog, but also at conferences and in the broader internet discussions) we see how the industry is evolving. Here are the most popular posts on this blog in 2013 (in order of popularity) around these key themes:

  • Post-editing compensation and practice
  • Clarifying new waves of misinformation on MT technology and practice put forth by alleged experts
  • Different views on the changes in the professional translation business
  1. Exploring Issues Related to Post-Editing MT Compensation: This article continues to get attention today even though it was written early in 2012, and it still shows up regularly in the top 3 posts virtually every week. The post has links to several interesting posts on post-editing and I think this is possibly one of the reasons why it continues to have long-term value, as it gathers different opinions and viewpoints in a useful and unbiased way. The popularity of this post suggests that this is an important issue to resolve in a fair and equitable way to enable broader MT adoption. All parties involved need to work together to establish trusted and equitable compensation, as this could be a key driver of or obstacle to broader MT deployment. It would be useful for translators especially to step forward and suggest ways to do this more efficiently and accurately. For example, this post by Jason Hall shows that simply equating MT output quality to TM matches may not make sense, and that leveraging MT is entirely different from leveraging TM. However, most observers still continue to miss the fact that MT output is the result of engineering efforts and can be managed to a great extent.
  2. Emerging Language Industry & Language Technology Trends: Much of what I said about the overall trends in the industry in this post still holds true for the coming year. I was surprised at how little progress was made in understanding MT and how the glib talking and over-promising just never seems to stop. MT is difficult to do well but increasingly easy to do badly, so I suspect we will see many unhappy translators who will be expected to clean up after incompetent MT practitioners for a pittance.
  3. Translation Pricing & PEMT Process Management: This post documents ELIA Munich conference sessions that describe the huge variance in pricing for translation services, which I found quite shocking and which was probably quite unsettling for many buyers of translation services. A discussion here also helped me to realize how tenuous many agency-to-customer relationships are and how easily they can be displaced by competitors. Finally, there is some good discussion on PEMT from actual practice. I think I saw clear signs that the days of the generic translation services agency are coming to an end and that specialists will rule in future. I predict that trust will be the most effective differentiator in the professional translation business and that this is earned by demonstrated competence and real expertise in specific subject areas.
  4. Dispelling MT Misconceptions: This was my response to an article in Multilingual magazine that I felt was filled with half-truths and gross generalizations that I believed were more a product of ignorance than malice. There is some spirited discussion in the comments as well, which I never censor unless they are clearly SPAM or personally malicious. These comments are helpful in getting opposing views aired as well.
  5. Understanding ROI with Machine Translation Technology: This post focuses on the issues that matter most for maximizing Return on Investment for MT. In a nutshell, these can be stated as: 1) Domain focus, 2) Ensure the MT quality is the highest possible, as “good” MT produces the highest productivity and the least translator backlash and discontent, and is harder to duplicate by instant DIY means, 3) Use these good MT systems a lot across multiple customers, which also means you start developing real expertise in selected domains.
  6. Translator Strategies For Dealing With PEMT: This was an attempt I made to provide some basic guidelines to help translators identify PEMT projects that are worth considering versus ones that are not. There is a very interesting discussion in the comments as well, and one that I think might be worth a close look for anybody who wants to get a better sense of the human factors involved in automation and MT deployment. This is an area that I think deserves much more attention from MT vendors and all practitioners who wish to create win-win scenarios.
  7. Understanding MT Customization: This was a post whose intent was to provide clear differentiation between an expert-managed system and a typical DIY system. It evoked strong reactions from several MT vendors and perhaps ended up as a kind of MT vendor brawl (definitely not my original intention) in the comments section. However, it is still useful if you want to see how different MT vendors approach the market. My bias/position is clear: most people who try DIY will produce sub-optimal results, and most DIYers don’t know how to do it themselves. MT is difficult even for experts, and if you cannot produce systems that are better than the public systems, why bother?
Solis says that innovating for the next ten years will be part problem-solving, part design-thinking. But there are four aspects you should apply when you set out to create something, in order they are:
    • Empathy (the why)
    • Context (the connected world in which you are building something)
    • Creativity (in your approach to problem-solving)
    • Logic (the rationality to test what you have created)

I think it is quite possible that the business of translation is moving towards new business models built on a kind of platform-based approach. Solis and others believe that creating this platform is the way disruption will come. The closest example that I have seen in this business is Smartling, but others like Gengo and Cloudwords also point to directions in which this might evolve. They all change how customers buy and how the work gets done.

In considering one of the finest examples of how change can be brought about by inspiration rather than manipulation, I think we must look at Nelson Mandela. I spent my childhood in Rhodesia under a government where institutionalized racism was the law of the land. I always felt as I grew older (12-13) that the future was going to be bloody and violent. How could it not be? I saw what looked to me like innocent African men being beaten to the ground by policemen, and learnt how to cope with tear gas in my bedroom as a basic childhood survival skill (cover your eyes with wet towels). I was thus astonished and amazed that one man could have so much influence when power shifted. Nelson Mandela, though far from being a perfect saintly man, is a shining example of how a man can face adversity and oppression and still be graceful, joyful and civilized. In a speech in India he said: “I could never reach the standard of morality, simplicity and love for the poor set by the Mahatma, while Gandhi was a human without weaknesses, I am a man of many weaknesses.”

Mandela dancing and at peace with the world

For those who are not familiar with the man here is a wonderful tribute on the Brain Pickings site that includes his inaugural address in full.

“The greatest glory in living lies not in never falling, but in rising every time we fall.”

An Indian mystic’s viewpoint on the man

I wish you all Happy Holidays, Merry Xmas and a joyful, healthy and prosperous New Year.

Monday, December 2, 2013

IOLAR– A PEMT Case Study - Moses Revisited

IOLAR, a Slovenian LSP, was interested in building a custom machine translation engine to translate technical engineering content from German to Slovenian. This language combination is difficult: it pairs a relatively complex source language (for MT) with a very difficult target language (for MT) that, like other Slavic languages, has a large number of inflected forms.

While translator productivity was important, the primary objectives were to ensure a high level of writing-style consistency and terminological accuracy. As there was no specific and directly related translation memory available to train the system, several hundred thousand segments were gathered from several somewhat related sources, in a much broader domain than technical engineering. This data was combined to form a single corpus that was used to train the engine. 

Earlier Attempts with Moses
Based on the widespread publicity around Moses and the increasing number of publicized Moses Case Studies, IOLAR decided to try and use Moses to accomplish their machine translation objectives. Part of the decision to deploy Moses in-house was based around concerns over data privacy. Sharing data with a Do-It-Yourself (DIY) Moses provider was a concern as many of these DIY providers are also translation agencies or are closely related to an LSP who may compete for the same business. 

IOLAR invested six months in an attempt to build a DIY Moses system of useable quality for this rather difficult language pair – German to Slovenian. A computational linguistics expert was hired and he spent three months building IOLAR’s own custom engines using DIY Moses technology. At the end of the six month period, IOLAR's Moses system was still producing unpredictable and unusable results. There were many problems with word order, terminology consistency, unknown words and incorrect inflected forms. Attempts made to understand and address the problems were unsuccessful. 

IOLAR compared the output from their Moses engine with Google Translate output and found that Google produced much better translation quality than their own system. However, neither IOLAR's Moses engine nor Google Translate provided quality and related productivity gains that would create any advantage for the business. Many segments needed to be completely retranslated when post-editing was attempted. "Since our initial internal efforts did not progress with the desired speed we turned to Asia Online to deal with the growing urgency being communicated by our clients," said Simon Bratina, IOLAR's Executive Technical Director.

Asia Online Custom Engine Training Plan
Asia Online addressed IOLAR's data security concerns with a contract that provides comprehensive protection of the data and ensures that IOLAR maintains all the appropriate rights to the data and that Asia Online can only use the data for the purposes of customizing IOLAR's engines. 

IOLAR provided the same translation memories that were used in their custom Moses engine for analysis and inclusion into the Asia Online custom engine, and worked with Language Studio™ Linguists to create a Customization Training Plan that addressed their specific goals. The plan identified issues and gaps in the training data and created a roadmap to address them. 

Language Studio™ Linguists are specialist linguists who have had comprehensive training in the creation of commercially viable, high quality custom engines. Commercially viable means that the output actually helps professional translation work get done more efficiently. The linguists, who possess very different skills from an NLP or computational linguistics academic, focus on fine tuning MT engine data and algorithms to minimize post-editing efforts. Language Studio™ Linguists use human cognition to determine which tools and automated processes will be applied to refine and create MT-related data to achieve the optimal results for a client, something that an automated process is not capable of today. A unique plan is developed for each custom engine, with a broad suite of data analysis and data manipulation tools used in conjunction with language and domain specific approaches to ensure optimal data preparation when building a custom engine. This differs considerably from the DIY model, where data is simply uploaded and immediately processed (by algorithms sometimes) without human analysis.

Four key issues were identified whose resolution was deemed likely to greatly increase translation quality, and steps were added to the plan to address them:

Issue: IOLAR’s translation memories were from multiple sources and included inconsistent terminology. This would result in inconsistent terminology in the translation output.
Solution: In addition to the standard data cleaning that is part of the Clean Data SMT model, Language Studio™ tools were used to normalize terminology so that when translating the terminology choices were limited to those preferred by IOLAR. 

Issue: The domains that the translation memories originated from, while related, were not a match to the desired target domain of technical engineering. This resulted in many technical terms being unknown and significantly lowering the quality of translations. In Statistical Machine Translation (SMT) an unknown term can have a very negative impact on translation fluency and overall translation quality. 
Solution: Language Studio™ Advanced Data Manufacturing tools were used to perform gap analysis, which identified several thousand unknown technical terms. Language Studio™ Advanced Data Manufacturing resolved the unknown terms, which were then validated by IOLAR’s linguists specialized in the domain.

Issue: The writing style of the translation memories was varied and not relevant to the target domain of technical engineering. Even if an understandable translation could be produced, it would be in the wrong context and style, and therefore needed a large amount of editing in order to deliver publication quality.
Solution: Language Studio™ Advanced Data Manufacturing tools were used to manufacture appropriate grammatical structures and contextual data in the correct writing style. This was driven by a deep analysis of the client’s translation memories, and automated manufacturing of data that would adapt the writing style to the client's requirements. 

Issue: As Slovenian is a heavily inflected language, one of the very common issues was that the correct term was being translated, but in the incorrect inflected form. In many cases, the correct inflected form was not in the translation memories provided by IOLAR.
Solution: Language Studio™ Advanced Data Manufacturing tools were used to manufacture appropriate inflected forms in the correct context. This data was used to ensure that the correct inflected form was available in the training data, thus reducing the number of incorrect inflections in the output.

Initial Results
In contrast to IOLAR’s Moses engine, the custom engine created with Language Studio™ was built quickly and without the need for specialized computational linguistics or NLP skills from IOLAR. This freed up IOLAR’s translators to work on more important tasks such as terminology refinement and validation. The resulting Version 1.0 engine was considerably better than IOLAR’s previous internal efforts with Moses and was also of higher quality than Google. While there was still plenty of room for improvement, this initial engine was useable for starting the pilot project.

Language Studio™ uses "Blind Test Sets" to measure initial quality using BLEU and other automated quality assessment metrics. Productivity metrics are used to validate the automated metrics. The initial Language Studio™ custom engine was 32 BLEU points better than Google Translate and 34 BLEU points better than Microsoft Translator. While BLEU is a useful indicator of quality, human productivity when post editing is a much better metric to indicate success, quality and value. 
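For readers who have not seen how a BLEU comparison on a blind test set works, here is a minimal sketch in Python. It is not the Language Studio™ measurement tooling; it is a simplified corpus-level BLEU that assumes one reference per segment, plain whitespace tokenization and no smoothing. The function names and the sample sentences are invented for illustration, and a real blind test set would consist of segments held out of the training data.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale (one reference per segment)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing in this sketch
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: output shorter than the reference is penalized.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Invented English segments, for illustration only (not data from the case study).
reference = ["tighten the valve before starting the pump"]
engine_a = ["tighten the valve before starting the pump"]
engine_b = ["tighten the valve before starting pump"]

score_a = corpus_bleu(engine_a, reference)
score_b = corpus_bleu(engine_b, reference)
print(f"Engine A: {score_a:.1f}  Engine B: {score_b:.1f}  "
      f"gap: {score_a - score_b:.1f} BLEU points")
```

A gap expressed in "BLEU points" is simply the difference between two such scores computed on the same test set. As noted above, the score is only an indicator and should be checked against measured post-editing productivity.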

Quality Improvement Plan
Many of the error causing issues in a custom engine are not visible until the engine has been trained and the output can be inspected. This is particularly true of more complex language combinations such as German to Slovenian. The first version of a Language Studio custom engine is called a Diagnostic Engine for this reason. Much like the Custom Training Plan, the Quality Improvement Plan is based on a deep understanding of the specific issues that have been determined when Language Studio™ Linguists reviewed the output and data. Using their extensive experience in customizing thousands of translation engines, Language Studio™ Linguists created a plan specific to IOLAR's custom engine that delivered the most rapid improvement with the least effort. 

In addition to the Quality Improvement Plan, Language Studio™ Linguists guided IOLAR through the post-editing process and showed them how initial post edited data could be fed back into the engine and used to quickly improve translation quality. Some training was also provided to IOLAR’s team on how best to leverage runtime customization features in Language Studio™ such as Runtime Glossaries and Post Translation Adjustments, which further improved quality and corrected some capitalization and formatting issues.

As the initial custom engine in its diagnostic release stage was good enough for the production usage test, the Quality Improvement Plan was able to incorporate valuable post editing feedback at this stage. While IOLAR was processing and post editing, Language Studio™ Linguists identified several improvement paths and manufactured additional data to improve grammar structures, word order and terminology consistency.

On receipt of the post edited data, Language Studio™ Linguists analyzed the edits and again manufactured additional data to reinforce them. These changes created an immediate 4 BLEU point increase that was validated by a noticeable increase in post editing productivity.

The IOLAR experience is an example of how a DIY approach might not work for professional production scale machine translation. During their “learning by doing” approach to DIY machine translation, IOLAR spent a lot of time trying to understand why their initial efforts were producing such unpredictable results, and found that for their language combinations the free online MT engines were easily outperforming their own Moses efforts.

The IOLAR example highlights an inherent issue with DIY machine translation, whether Moses based or from a commercial service – it implies that the user knows how to do-it-themselves. This case study demonstrates clearly that high quality machine translation requires considerably more effort, knowledge and skill than simply loading data into a system for training. Achieving a quality level that was useable for efficient post editing was clearly not the simple task that some at TAUS and third-party DIY proponents had conveyed.

“From a business perspective it was clear that outsourcing to an expert was a better strategy than a DIY struggle, and I would say that our investment in Asia Online’s Language Studio™ technology was one of the best technology investments that we have made. Some of the very technical segments were the same quality as human translation.”

– Simon Bratina, Executive Technical Director, IOLAR

While some DIY Moses efforts are successful, few DIY Moses users know how to address or even identify the cause of problems when they do occur, even if they have some knowledge or training in the core technological concepts. Moving beyond the initial problems in a DIY Moses custom engine is a significant challenge, even when expert NLP specialists or computational linguists are on staff. Skills in understanding data, not just algorithms and tools, are required to address the challenges in adapting, refining and creating data to address issues, either preemptively or as a remedy.

Without a deep understanding of the cause of problematic machine translation output and corrective strategies to remedy them, the only improvement path available for most DIY Moses users is to upload post edited machine translations or additional translation memories. As there is little or no understanding of the impact that the new data will have, often the issues are not resolved and in many cases new issues and problems are introduced. 

Language Studio™ Linguists provided IOLAR with a deep understanding of the issues and with efficient solutions to resolve critical problems affecting the quality of machine translation output. This ability to understand the data and error patterns has been gained through the creation of thousands of custom engines. Language Studio™ Linguists played a considerable part in taking this project from an unsuccessful beginning on DIY Moses to being a considerable success in Language Studio™.

The overall conclusions and results drawn from IOLAR's collaboration with Asia Online:
  • Working with an expert results in a much improved and significantly more efficient overall process.
  • It is safer for an LSP to work with a non-LSP for technology that is as strategic as good MT can be.
  • The long-term expertise and tools and capabilities like data manufacturing that Asia Online brought to bear on the process made it possible to reach high quality levels in just a few iterations.
  • IOLAR noticed that there was a clear improvement in the machine translation output quality after the first iteration (incremental training) and they were surprised to see that "some segments were the same quality as human translation."
  • The data manufacturing and refinement tasks performed by Language Studio™ Linguists and further refined by IOLAR's staff had greatly reduced the number of unknown words and incorrectly inflected forms, and delivered consistent terminology across translations.
  • IOLAR achieved their core objectives of ensuring a consistent writing style and broad terminological accuracy that the clients had stated were of critical importance.
  • IOLAR accomplished an improvement in the overall production efficiency.
  • IOLAR realizes that while Moses may work for some simple cases where there is plentiful data in the target domain and language pair, deep expertise is required to produce successful systems outside of this atypical scenario. Even on these simple cases, it is now understood that with refinement of data along paths recommended by specialists, an even better result is possible.
  • IOLAR is now making savings where it matters – in building the competences for efficient post-editing when machine translation is used. While Moses is technically “free”, there are significant costs in staffing, hardware and other resources. There is also considerable risk in deploying a Moses system, even when hiring experts with computational linguistics experience.
  • Even if IOLAR’s Moses system had delivered a quality that was better than Google, the savings when compared to their investment in Language Studio™ would have been marginal. It has become clear to IOLAR that the Total Cost of Ownership (TCO) of a Language Studio™ system is far more favorable than what was possible with DIY Moses solutions.
A significant success factor of the collaboration between Asia Online and IOLAR is that IOLAR better understands the specifics of machine pre-translated text for difficult language combinations. IOLAR has now developed growing expertise in post-editing requirements and the competences they demand, and will be able to approach customers that need such translations with confidence.

Friday, November 8, 2013

Translator Strategies For Dealing With PEMT

I recently had the opportunity to speak to a group of translators and interpreters about machine translation and how it increasingly impacts their work lives. Given that more and more agencies are using MT nowadays,  it is now much more likely that a translator might be approached to do post-editing work and thus my message to the translators at the event focused on how to assess these opportunities (or hazards) and maximize the benefit of any interaction.

Translators have much more power than they realize and I predict that they will eventually learn to separate the wheat from the chaff. Translation agencies will hopefully figure out that while MT technology will proliferate, the shortage of “good” translators will only intensify in a future where global companies want to translate 10X or more the volume of information they do today. 

We see many examples of MT use by agencies today but very few of these would qualify as skillful and appropriate and even fewer would be considered fair to the post-editors. It is my sense that MT technology will only offer long-term competitive advantage to those who use it with skill and real expertise and have skilled translators involved in the process. It is very easy to dump data into an instant MT portal and get some kind of an engine, but not so easy to get an engine that provides a long-term cost and efficiency production advantage.

If you are one of those translators who feel they will NEVER do post-editing work, or you have decided that you simply don’t want to do it because you have plenty of “regular” work, then this post will probably not be of any interest. I am one of those people who believe that MT will continue to gather momentum and that it is useful for translators to understand why, and to determine when to get involved or not. (And this is not just because I am involved with the sales and marketing of this technology.) It simply makes sense at a common sense level. The first thing to understand is that not all MT engines are equal and that free online MT is not the best example of professional use of this technology, even though it can be surprisingly good in some languages.

Since there is a great deal of variation in the specific MT output that translators are expected to post-edit,  I think it makes sense for a translator to understand each unique opportunity as it comes along, and determine whether it is worth his/her time and engagement. Some MT opportunities can pay better than standard translation work, if the word rate to MT quality ratios are properly determined, and thus I think it makes sense for translators to understand when this is actually the case. Early experiences with incompetent or unscrupulous MT practitioners have helped PEMT work develop a reputation for being mind-numbing work that is poorly compensated. This IMO is more a reflection of the quality of these early efforts than of the real possibilities of the technology when used with expertise. 


Thus I have come up with a simple checklist for a translator to evaluate a potential PEMT “opportunity” and decide whether to engage or not.

1) Compensation is linked to actual work effort

The MT output quality to word rate (financial compensation) relationship is a fundamental issue for translators. It is important to understand the “average” output quality of the MT output and then understand the effort required to fix it to target quality levels, and ensure that it is related to the compensation offered.

Since with PEMT we are generally talking about asking translators to accept a lower rate than they normally charge, it is important that there is a modicum of trust with the agency in question. This would allow a fair and reasonable rate to be established that matches the effort required to get the MT output to the required target levels. This subject is dealt with in some detail here and here. The better you understand your own personal productivity with the specific MT output you are dealing with, the more informed your decision will be. The specific effort level can be assessed quickly by doing a small test with a “representative sample” of 100 or so sentences. The throughput measurements you make can then be used to extrapolate and calculate an acceptable rate.

So if your normal throughput is 2,500 words/day (313 words/hour) and you find that with the test MT output you can expect to do 5,000 words a day, it would be reasonable to accept a rate that is 60% of your normal rate and even 50% might be fair if you feel the sample is very representative and you do not mind this type of work. (I would err on the higher side as the test is only as good and representative as your test sample.)
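The arithmetic behind this can be sketched in a few lines of Python. This is only an illustration of the reasoning above: the function names are my own, and the per-word rate of 0.10 is a made-up figure used solely to show the proportions.

```python
def break_even_rate_fraction(normal_words_per_day: float,
                             pemt_words_per_day: float) -> float:
    """Fraction of your normal per-word rate at which daily income is unchanged."""
    if normal_words_per_day <= 0 or pemt_words_per_day <= 0:
        raise ValueError("throughput must be positive")
    return normal_words_per_day / pemt_words_per_day


def daily_income(words_per_day: float, rate_per_word: float) -> float:
    """Income for one day at a given throughput and per-word rate."""
    return words_per_day * rate_per_word


normal_rate = 0.10  # hypothetical per-word rate, for illustration only
normal_income = daily_income(2500, normal_rate)
floor = break_even_rate_fraction(2500, 5000)  # throughput measured on the test sample

for fraction in (floor, 0.60):
    income = daily_income(5000, normal_rate * fraction)
    print(f"{fraction:.0%} of normal rate -> "
          f"{income / normal_income:.0%} of normal daily income")
```

With these numbers, 50% of the normal rate is the break-even point and 60% leaves a margin of about a fifth, which is the cushion you want if the test sample turns out to be less representative than you hoped.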

A critical skill to develop in these scenarios is the quick assessment of the MT output quality, so that you can determine your work throughput and thus your acceptable rate. Remember that small grammar and word order errors are much easier to correct than word salad or bad and inconsistent terminology, which requires research. The rapid assessment of the quality of the MT output should be an important part of determining whether a project is worth doing or not. Having a basic understanding of BLEU, Edit Distance and other methodologies is useful as this can expedite assessment of the PEMT opportunity. Asia Online offers free software to run BLEU and develop your own error classification based calculations.
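As a rough illustration of the Edit Distance idea, the sketch below counts how many word-level insertions, deletions and substitutions separate raw MT output from your post-edited version, and normalizes that by the length of the final text. This is a generic Levenshtein calculation written for this post, not the Asia Online software mentioned above, and the sample sentences are invented.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Word-level Levenshtein distance between MT output and the post-edited text."""
    a, b = mt_output.split(), post_edited.split()
    previous = list(range(len(b) + 1))  # distance from an empty prefix of a
    for i, word_a in enumerate(a, start=1):
        current = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            current.append(min(previous[j] + 1,          # delete word_a
                               current[j - 1] + 1,       # insert word_b
                               previous[j - 1] + cost))  # keep or substitute
        previous = current
    return previous[-1]


def edit_rate(mt_output: str, post_edited: str) -> float:
    """Edits per word of the final, post-edited text."""
    words = len(post_edited.split())
    return word_edit_distance(mt_output, post_edited) / words if words else 0.0


# Invented sentences, for illustration only.
post_edited = "tighten the valve before starting the pump"
light_fix = "tighten the valve before starting pump"
word_salad = "valve the before pump tighten start"

for label, mt in (("light fix", light_fix), ("word salad", word_salad)):
    print(f"{label}: {edit_rate(mt, post_edited):.2f} edits per final word")
```

Run over your 100 or so test sentences, an average like this gives a quick, if crude, signal of how much rework the output needs. It does not capture the research time that bad terminology demands, so treat it as a supplement to timing yourself rather than a replacement.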

Some things to be wary of include:
  • Agencies that establish an arbitrary lower word rate independent of language and MT output quality. This is a pretty good clue that they don’t know what they are doing and a sign that there will be dissatisfaction all around.
  • Agencies using DIY MT who don’t really understand what they are doing. Expect great inconsistency and variability in the output quality and usually lower overall quality which means a greater PEMT effort.
  • Agencies that have the same rate for tough languages like Japanese and easier languages like Spanish PEMT work. I would generally expect that the effort would be greater for tough languages and so they should be paid at a higher rate.
  • Agencies that give you MT output that is lower in quality than you could get on your own from Google or Microsoft. This is a sign that they do not understand what they are doing.
  • There are many agencies out there that have very little understanding of the complexities of MT and are only using it as a way to reduce costs. They will give you crappy output to edit and expect you to fix it for a fraction of a reasonable rate. Identify these agencies and let fellow translators know who they are. Avoid working with them.
  • Hourly rates may actually be better for some kinds of MT projects where the translator is expected to only do a partial correction. Research suggests that it is very hard to define how far a partial correction goes.
Examples of agencies that do it right and use objective and trusted measures to establish fair compensation include Advanced Language Translation and Omnilingua. It is worth understanding their process.

2) Trust and communication around technological uncertainty

I think that one of the main reasons MT has taken so long to gain momentum is the low levels of trust within the supply chain and unfortunate early experiences with MT where rates were lowered unfairly and translators were expected to bear the brunt of incompetent use of MT technology. The stakeholders all need to understand that the nature of MT requires a higher tolerance for “outcome uncertainty” than most are accustomed to. Though it is increasingly clear that domain focused systems in Romance languages are more likely to succeed with MT, it is often not clear a priori how good an MT engine will be, and investments in measurement need to be made to find out.

The stakeholders all need to understand this and work together and each make concessions and contributions to make this happen in a mutually beneficial way. This is of course easier said than done as somebody has to usually put some money down to begin this process. The reward is long-term production efficiency so hopefully enterprise buyers are willing to fund this, rather than go the fast and dirty MT route as some have been doing. Agencies that are new to MT and post-editing are those most likely to get it wrong and translators should seek out agencies that are sensitive to resolving the uncertainty in a fair way.

Some specific things that translators can watch for include:
  • The quality of the dialogue and rapport with the project managers at the agency.
  • Some agencies provide very clear examples of what they expect you to do with different kinds of errors. This is a good sign and helps focus the work in the most efficient way. Some like Hunnect develop an online training course for post-editors to help clarify this.
  • The agencies that are willing to work with translators to deal with this technological uncertainty are the ones to focus on. Again Scott Bass from ALT provides wise words on PEMT Best Practices and provides an example of what a win-win scenario looks like.

3) Ability to interact with and control MT technology

One of the common complaints about PEMT is about the drudgery of error correction work. This does suggest that not all translators want to do this kind of work or are well suited to it. Many translators are also seeking to provide feedback and steering advice to the MT system to reduce the drudgery. However, not many MT systems can properly use and leverage this type of feedback. Some like the Asia Online Language Studio are designed from the outset to utilize this type of feedback. We are seeing now that many translators do realize that MT can be an aid, much like TM, to get repetitive translation work done faster. MT offers “fuzzy matches” for each new segment that is translated through the system. Good MT systems will produce the equivalent of high quality fuzzy matches and will be much more consistent in output quality than what most of us experience with free online MT (or most DIY efforts) as shown in the second graphic above. Bad MT systems will be inconsistent, unpredictable, produce lower quality output and generally be unresponsive to any corrective feedback, especially when the practitioners are simply dumping data into an instant MT engine-making portal.
The following are some characteristics of superior MT platforms:
  • The ability to provide some initial error pattern feedback to reduce mind-numbing correction work.
  • Noticeable improvements in quality with relatively small amounts of corrective feedback.
  • The ability to control the MT output with terminology or repetitive error pattern corrections at run time in addition to the upfront overall training, as this can greatly enhance the speed of the post-editing work.
  • A defined process to take small amounts of corrective feedback to improve the engine BEFORE a production run to reduce the post-editing effort.
  • The ability to control the overall linguistic style of the translations to requirements.
This outlines some kinds of corrections that can be run at the time of running a translation through an existing Asia Online MT engine.

Examples of correcting problematic source text to make the post-editing task easier.
Example of using preferred terminology in the event that the original training chooses other terms.

If you have a good feeling about all three items in the list above, PEMT can be just another kind of translation task and can sometimes be one that offers greater financial reward.

There have been several studies of varying quality that examine how PEMT compares with regular translation approaches and we see mixed results and often experimental bias. I just saw this study on The Efficacy of Human Post-Editing for Language Translation from Stanford that attempts to measure this in as objective a manner as possible. I like that they also summarize many previous studies. Some may find fault with this one too because they use oDesk, even though these were translators who had passed a 40 question skill/competence test. IMO the study is perhaps more objective and rigorous than most I have seen from the localization community. I think the key findings are worth noting, and the study is worth a closer look by anybody interested in this issue. They ran a carefully monitored regular vs. PEMT comparison test for 3 languages (English to Arabic, French, and German) and found the following:
  • Most translators found the MT (Google Translate) useful and preferred it to not having a suggestion
  • PEMT reduces the time taken to get the task done
  • Across languages they found that the suggested translations improve final quality
  • Across languages, users provided the following ranking of basic parts of speech in order of decreasing translation difficulty: Adverb, Verb, Adjective, Other, Noun.
“Our results clarify the value of post-editing: it decreases time and, surprisingly, improves quality for each language pair. Our results strongly favor the presence of machine suggestions in terms of both translation time and final quality. If translators benefit from a barebones post-editing interface, then we suspect that more interaction between the UI and MT backend could produce additional benefits.”
I would love to hear what other translators may have to share about their PEMT experiences, both positive and negative, and any suggestions they might have to improve the process. I would like even more to hear what they think about an ideal post-editing environment or workbench and any recommendations they would have.

If you are interested in my slides from the MiTiN presentation you can find them here.

Wednesday, September 18, 2013

Understanding MT Customization

While we have reached a point in time where many more people realize that machine translation (MT) produces the best results when it is properly customized, what customization actually means is still not well understood. 

There is a significant difference between shallow customization and deep customization in terms of the impact on the MT system’s output quality. The quality of output in turn has a direct impact on the potential business leverage and return on investment. There are a growing number of MT vendors, but very few real MT developers in the market today, and deep expertise is the key differentiator that leads directly to better output and better productivity. It is important for anyone considering purchasing an MT solution to understand the difference between the two types of vendors.

Generally, MT developers have created either Rules Based Machine Translation (RBMT) or Statistical Machine Translation (SMT) systems with hands-on coding at the deepest levels of the core MT engine and its surrounding technologies. Thus they are likely to have insight into how and why an MT engine works the way it does. They are also more likely to be able to coax an engine to produce better quality output by applying the optimal corrective actions to improve on initial results.

In contrast, most Do-It-Yourself (DIY) MT vendors provide little, if any, real innovation and focus on simplifying and packaging a collection of open source tools into a web based offering. Their primary emphasis is on simplifying the interface to these open source tools and enabling a user to build a basic MT system with user data instantly. I would characterize this approach as a shallow customization. When real understanding of the engine technology and data is required, few have the skills needed to make this initial MT engine quality better on an ongoing basis, and even fewer are able to make it reach levels of quality that provide real competitive advantage.

When evaluating MT vendors, there are a few simple things that anyone considering purchasing an MT offering should understand:

Is your MT vendor a serious developer of MT technology or do they simply provide/package other third-party or open source MT technology?
There are only a very small number of companies developing commercial enterprise class MT. Most MT vendors are users or packagers of third-party technology. Many do not have the depth of understanding to do anything but the simplest and shallowest MT customization tasks. These vendors will often present themselves as experts and sometimes claim to be technology agnostic. Some Language Service Providers (LSPs) that have a few years’ experience using open source or third party RBMT systems are presenting themselves as MT experts. Be wary of any vendor that claims deep experience in multiple MT technologies. Advanced skills in any MT technology require long-term investment and long-term experience to get to any kind of distinctive expertise. Getting good results from any of these approaches requires very different skill-sets, and independent and unique expertise must be developed for each approach. The notion of a standard set of MT development skills that works anywhere and everywhere is a myth.

Any MT vendor that does not have a strong and experienced human skill and human steering component in the customization process will always deliver lower quality results. 

Does your MT vendor use a Clean Data SMT or Dirty Data SMT strategy?
The Clean Data SMT approach was pioneered by Asia Online in 2008. Most MT vendors do not have the technology or rigorous data analysis and data cleaning processes to deliver a Clean Data SMT approach, and so take the easier Dirty Data SMT approach. Clean Data SMT has many benefits, such as more rapid improvement from post-editing and the ability to manage and control terminology so that it is consistent. Dirty Data SMT by its very nature is unpredictable and inconsistent and therefore is difficult to manage and much slower to improve with corrective feedback.

Does your MT vendor claim that MT is easy?

3 Monkeys
Some MT vendors claim that MT is not complex. One DIY MT vendor even likens those who claim MT is complex to monkeys. The reality is that running an open source MT solution or using an “upload and pray” solution like that of many DIY MT vendors has become very easy.

Building an instant MT engine is not the same as delivering a production quality MT system that provides production efficiency. Indeed, a significant number of DIY custom MT engines deliver translation quality well below the quality of Google.
Delivering high quality MT requires skill, a deep understanding of the different approaches to MT and the inner workings of the technology, a deep understanding of the data used to engineer the customization process and a range of tools, skills and knowledge that permit optimization to deliver the highest possible quality. There will be unique requirements to each and every engine – after all, the point of customizing is to match the translation output to a particular customer writing style and audience. This can only be achieved with human cognition and guidance and cannot be fully automated. 

The impact of real expertise is clear. Asia Online customers can speak on the record of achieving productivity gains greater than 300%, while DIY MT vendors typically claim that they can deliver productivity gains of between 20% and 40%, if any at all.

Does your MT vendor give you control of the data and the process?
Many MT vendors today provide very limited control of core data elements and typically rely on a simple “upload and pray” web interface that promises instant results. They generally lack the ability to manage, control and normalize data used to customize an MT engine and generally do not have any data analysis and data manufacturing capabilities. A developer like Asia Online provides multiple levels of control during both the development and the translation process, which enables much better output quality and thus higher productivity.

What is expected of you as a user in order to customize an MT engine?
If the answer is nothing more than uploading your translation memories (TMs) then a red flag should already be raised. Machine translation can be very high quality when managed with expertise, but expecting good results without any knowledge investment and real expertise is not realistic. 

Just as in any high quality focused human translation project management, special tools, processes and expertise are required to get better results. 

Any custom MT technology that does not require your involvement in steering the customization process will deliver considerably lower quality output - often worse than anybody could do with Google or Bing. MT systems that produce good quality output require human steering, guidance and control. This is possible with today's technology, but does require more effort than just uploading some translation memories.

How much effort does it take and how quickly can the customized engine improve after the first version?
Dirty Data SMT systems offered by DIY MT vendors require significant amounts of new data to improve the system after an initial system is in place, usually around 20% to 25% of the total training data that the custom MT engine is built on. So if your engine has 3 million segments provided by your MT vendor and 200,000 segments provided by you, you will need at least 640,000 new segments (20% of the 3.2 million total) to see a noticeable improvement in quality. Getting this much additional data is usually beyond the reach of nearly all users of MT systems. As the customization approach is Dirty Data SMT, errors are very difficult to trace and correct. The standard means to correct issues is to add more data and hope that the problem is resolved.

Clean Data SMT systems such as Asia Online’s Language Studio™ can learn and improve with just a few thousand edits and every edit counts. Terminology is consistent, and there are tools to identify common problems ahead of time and means to automatically resolve them. Data manufacturing is also applied to amplify edits and corrective feedback and ensures they are applied to the engine in a broader set of contexts. The cause of errors can quickly be traced and the errors can be rectified using a number of problem analysis tools. The resulting improvement is rapid and noticeable even with a very small effort by a single person. 

Bottom Line: Creating a high quality custom MT engine requires deep expertise, control and broad experience, elements that are usually not present in the "upload and pray" approach provided in a DIY MT model. Developing high quality MT is complex and in 2013 still an expertise based affair. 

To simply upload a translation memory and expect high MT quality to come out is wishful thinking. A computer cannot automatically know your preferred terminology, vocabulary choices, writing style, target audience and purpose. Just like a human translation project, achieving quality requires effort, time, management and skill. 

Customizing an MT engine to produce “near-human” output quality levels is possible, and there are many proof points where 50% or more of the raw MT segments required no editing at all - i.e. they were “perfect”, with many of the remaining segments having minor issues that could be quickly edited. A fully customized MT engine built on the Clean Data SMT approach consistently delivers 150%-300% (sometimes even greater) productivity gains. The long-term ROI impact is clear relative to the meager productivity that instant MT approaches sometimes produce.
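Both of these figures are easy to measure on your own projects. The sketch below is only an illustration with invented function names and toy data. It assumes a “perfect” segment is one whose post-edited text is identical to the raw MT output, and that a productivity gain is the percentage increase in throughput over the pre-MT baseline (so doubling throughput is a 100% gain). Vendors may define these terms differently.

```python
def no_edit_share(segment_pairs):
    """Share of segments where the post-edited text equals the raw MT output.

    segment_pairs: iterable of (raw_mt, post_edited) string pairs.
    """
    pairs = list(segment_pairs)
    if not pairs:
        raise ValueError("at least one segment pair is required")
    untouched = sum(1 for raw_mt, final in pairs if raw_mt.strip() == final.strip())
    return untouched / len(pairs)


def productivity_gain_percent(baseline_words_per_day, mt_words_per_day):
    """Percentage increase in throughput relative to the pre-MT baseline."""
    if baseline_words_per_day <= 0:
        raise ValueError("baseline throughput must be positive")
    return (mt_words_per_day - baseline_words_per_day) / baseline_words_per_day * 100


# Toy data, for illustration only.
pairs = [
    ("Press the red button.", "Press the red button."),
    ("Close door the.", "Close the door."),
]
print(f"segments needing no edits: {no_edit_share(pairs):.0%}")
print(f"productivity gain: {productivity_gain_percent(2500, 5000):.0f}%")
```

Tracking both numbers per engine and per project makes it much easier to tell whether a new engine version is actually an improvement.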

MT in 2013 is still a complex affair that requires deep expertise and collaboration with experts if your intention is to build long-term business leverage through more efficient translation production processes. There is no advantage to a system that any of your competitors could create instantly and there is no value or business advantage to just dabbling with MT.

“When conceiving the idea of Moses, the primary goal was to foster research and advance the state of MT in academia by providing a de facto base from which to innovate from.

Currently the vast majority of interesting MT research and advancements still takes place in academia. Without open source toolkits such as Moses, all the exciting work would be done by the Google’s and Microsoft’s of the world, as is the case in related fields such as information retrieval or automatic speech recognition.
Philipp Koehn

As a platform for academic research, Moses provides a strong foundation. However, Moses was not intended to be a commercial MT offering. There are considerable amounts of additional functionality, beyond providing a web based user interface for Moses, that are not included in Moses that are essential in order to offer a strong and innovative commercial MT platform.”
Professor Philipp Koehn, University of Edinburgh, Chief Scientist, Asia Online

Addendum: This post triggered a strong reaction from Manuel Herranz at Pangeanic and I am including a response I made on his blog in case it does not make it through the approval process there.

My primary point in my blog posting is that expertise, long-term experience and a real understanding of how the technology works are necessary and critical to get the best results. Most DIY users do not have these characteristics and thus are very likely to get much lower quality results. Pointing this out, to my mind, is not equivalent to "bad mouthing competition"; I am simply comparing approaches and pointing out the value implications.

Also, while I claim that expertise does matter, I do not suggest that Asia Online is the only company with this expertise. There are several other MT experts including RbMT developers like Systran and a specialist like Tauyou in Spain.

I do believe that MT technology is complex enough that it does require specialization, and that developing real competence with MT is difficult enough that it is unlikely to be successfully done by a company whose primary business is being a translation agency. It is clear that you disagree. I am also pointing out that the value received by a customer is very likely to be lower for a DIY user. I can understand that you may have a different opinion to mine and assure you that my observations are not borne of virulence.

Historically we saw many LSPs develop their own TMS systems too, but most people in the industry would concede that the best TMS systems have come from companies that focus and specialize in the development of these tools, e.g. MemoQ, Memsource, Across, XTM etc. We have also seen the SDL acquisitions of software companies like Idiom, LW and Trados result in what most perceive as reduced customer responsiveness, quality and commitment to these products. Buying critical production infrastructure from a competitor generally does not make sense in any industry and thus we have seen the momentum slow down on all the SDL software acquisitions. IMO specialization matters and with technology this complex, one will get the best results using technology developed and managed by specialists for the foreseeable future.

Anyway, I wish you peace and health.


Wednesday, August 28, 2013

Understanding ROI with Machine Translation Technology

As the use of machine translation gathers momentum in the professional translation world, it is interesting to see that much of the essential economic rationale for effective deployment of this technology is still somewhat muddy and unclear. MT use on its own does not guarantee ROI. The MT system has to produce output that actually improves the production process to generate meaningful ROI. Poor quality MT damages goodwill and reputation and is also a waste of time, effort and money. There are many in the industry who continue to view machine translation (MT) with the same “project-oriented mentality” that is common with traditional translation work that sometimes involves the production use of translation memory (TM). However, MT is fundamentally different, and I think it requires a different mindset in design and technology deployment to maximize economic benefits. TM is often used as a way to reduce costs and often MT is seen as a new way to lower costs without any real understanding of its viability and value in the translation production process. This sub-optimal use of MT has resulted in translator resistance and results that are often less than impressive.

To elaborate, a project-orientation is something that makes sense and works well with cottage industry approaches that typify historical translation project work, and most CAT tools and translation memory technology fit well within this paradigm. This project approach is widely prevalent in the language service provider world, where teams and tools are assembled to get a translation job or project done as jobs come in. There is little specialization in terms of subject matter knowledge on the translated content and there are very few LSPs who develop any kind of domain specialization or in-depth subject matter expertise. Every translation job is seen as being of equal weight, and the general objective is to quickly assemble a team (translators and reviewers) to get source content converted to target content, using TM if available. The best service providers build a base of trusted translators, who they work with on a regular basis to ensure “quality” and institute processes to minimize errors and produce standardization in the production process. In this world it is very easy to replace an LSP (even large ones like SDL and Lionbridge) since their key value add is project management and, sometimes, translation management software. LSPs are thus always vulnerable to being switched out for a predatory competitor passing by.

MT on the other hand requires a different perspective. In most cases, the successful use of MT in professional translation work is the equivalent of building a production line. In general, one would (should) not build a production line for a single project unless one expected to do many similar kinds of projects. One should only build production lines for projects that one expects will have large volumes or where one expects repeat business on an ongoing basis. Production lines require specialization and it is unlikely that a single production line will do everything well. Having a well functioning production line will give the producer a cost advantage but good, efficient and effective production lines are never created instantly. They always require investment and refinement and uncommon expertise. The greater the efficiency of the production line, the greater the cost advantage, which can result in meaningful barriers to competition. MT done right can provide long-term cost and competitive advantage. While the “free” engines offer useful value in some languages, these systems are usually not considered of adequate quality to be useful in professional translation settings where the final deliverable is work that looks like it was done by human translators. Many of the MT systems in use today are developed by naïve LSPs who send very low quality output to post-editors and expect them to fix it for lower than standard rates. Thus the huge and justifiable hue and cry in the translator community about mind numbing post-editing work.

Thus, when we look at the current adoption of MT in the professional translation world we frequently see the following:
  • A focus on the initial outlay for MT experimentation that often leads to adoption of “free” and open source technology as the initial focus is only on start-up costs,
  • A rush to Moses and a large number of substandard MT systems that produce output lower in quality than the “free” MT from Google and Microsoft (all the other free engines are hardly worth the bother),
  • Some who claim “expertise” are merely building simple dictionaries for RbMT systems (that have been around for 50 years with little or no quality advancement in that period) or throwing data blindly into an instant Moses setup,
  • Very little general awareness of the deep expertise and experience required to tune, adjust and modify MT systems to meet business production requirements at a meaningful level of utility,
  • Some LSPs claim to provide MT services for other LSPs, which to me is the equivalent of Ford asking Kia to build cars for them. It makes very little sense to me why an LSP would go to SDL (or any other LSP) to get them to build MT systems for them. (A death wish? Reckless at least.) Those who do this might want to listen to this Zappa song. (Warning: some might consider this NSFW)
What is common to all of these early initiatives is that they all focus on a relatively low initial investment strategy, very rarely is there any deep expertise involved or required, and they essentially provide no real barrier to competition since these efforts can be easily duplicated or adopted by any competitor. Interestingly, the technology has gotten good enough that even naïve attempts can sometimes produce productivity improvements of 5 to 10 percent.

However, maximizing economic benefit from MT technology, I think, requires all or at least some of the following:
  • A clear understanding of your production efficiency before you use MT. It is difficult to understand the impact of MT on the translation production process if you do not have an understanding of the baseline efficiency.
  • A clear subject domain focus rather than the generic “anything and everything” focus that so many MT initiatives start off with, as this domain focus is a critical requirement for producing higher quality output,
  • Deep MT engine development expertise to ensure that the MT system output is of the highest possible quality. The quality of the output is directly related to productivity improvements in the production process, which are the essence of the economic benefit of using this technology. This also usually means that most do it yourself (DIY) efforts will not be your best foot forward and will actually result in lower ROI and higher total cost of ownership (TCO), and frequently in failure,
  • Repeat and regular use of “good” MT engines will result in greater ROI. This means that MT is not a great strategy for a single project that may never happen again unless it is of substantive volume. (This seems pretty obvious but you would be amazed how often this is overlooked.) The more work you do in a single domain, the greater the benefit and the greater the leverage and the more useful it is to have an efficient production line,
  • Manage expectations with customers, translators and editors, and ensure that compensation and benefits are equitably shared so that win-win scenarios are created.

Thus a simple formula to maximize ROI (from my somewhat biased perspective) would include the following:
  1. A clear subject domain focus that could have multiple sub-domains as shown in the example where an Automotive domain MT system could be further refined for different clients and specific types of content.
  2. Work with an expert to develop these engines as long-term experience and deep expertise really do matter and the knowledge required to do this well is not easily or quickly acquired.
  3. Focus on getting more clients in this domain and demonstrate the cost and timeliness advantage that a good MT system would provide. For the customer this would mean lower cost, faster turnaround at equal or better (yes better) overall quality. Building a large customer cohort will also enable a service provider to develop real subject matter expertise and provide value beyond the basic project management and translation workflow management.
  4. Continue to invest in refining the MT systems so that the engines produce continuously improving output. This will positively affect your future cost and turnaround times as it is much faster to process very high quality MT output.
  5. Expand the use of good quality MT engines to new types of content that would not get translated otherwise. Thus, in the automotive scenario it would be possible to translate all kinds of internal product discussion related documents and emails, competitor websites, trade journals and international market feedback and coverage. While this new type of content may not go through the same quality assurance and post-editing it could still be quite useful to monitor international markets and share more information with dealer and customer networks, even in a raw-MT format. There is much evidence from the information technology sector that current support content and background technical information is valuable in building customer happiness.
  6. Shared benefits so that new processes are more easily accepted.

A basic rule of thumb is that the efficiency of the MT system (the productivity impact) matters much more than initial start-up costs. 
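A back-of-the-envelope model shows why. The sketch below is an illustration only: the function names are invented, every number in it is hypothetical, and it assumes that production cost per word falls in proportion to the throughput improvement (a 100% productivity gain halves the cost per word).

```python
def cost_per_word(baseline_cost_per_word, productivity_gain_percent):
    """Production cost per word after a given throughput improvement."""
    speedup = 1 + productivity_gain_percent / 100
    return baseline_cost_per_word / speedup


def total_cost(setup_cost, words, baseline_cost_per_word, productivity_gain_percent):
    """Start-up cost plus production cost for a given word volume."""
    return setup_cost + words * cost_per_word(baseline_cost_per_word, productivity_gain_percent)


def break_even_words(cheap, expert, baseline_cost_per_word):
    """Word volume beyond which the higher-gain option is cheaper overall.

    cheap and expert are (setup_cost, productivity_gain_percent) tuples.
    Returns None if the expert option never catches up.
    """
    cheap_cpw = cost_per_word(baseline_cost_per_word, cheap[1])
    expert_cpw = cost_per_word(baseline_cost_per_word, expert[1])
    if expert_cpw >= cheap_cpw:
        return None
    return max(0.0, (expert[0] - cheap[0]) / (cheap_cpw - expert_cpw))


# Hypothetical figures, for illustration only.
diy = (2_000, 10)        # low start-up cost, 10% productivity gain
expert = (20_000, 150)   # higher start-up cost, 150% productivity gain
baseline = 0.10          # cost per word without MT

volume = break_even_words(diy, expert, baseline)
print(f"expert-built engine is cheaper beyond about {volume:,.0f} words")
for words in (100_000, 1_000_000, 5_000_000):
    print(words, round(total_cost(*diy, words, baseline)), round(total_cost(*expert, words, baseline)))
```

Under these assumptions the start-up cost is a one-time term while the efficiency gain applies to every word, so at production-line volumes the per-word term dominates the total.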

I am willing to bet that a language service provider who provides superior MT solutions will be considered a much more valuable partner to most corporate customers who are interested in sharing and monitoring business related content in international settings. While it is easy to duplicate the capability of most DIY MT systems, the quality of expert managed and developed systems is hard to match. These systems usually provide much greater productivity than DIY efforts, which reduces the TCO, and in most cases they provide a clear long-term barrier to competitive service providers. MT is still a very complex undertaking in 2013 and the systems that produce the best ROI will very likely require expert guidance and input. Remember that long-term advantages only come after you have built a distinctive advantage in your MT systems and thus the key to the highest long-term ROI is an MT system that produces the highest quality output possible.