Wednesday, May 14, 2014

Improving the MT Technology to Translator Dialogue

While we see that MT technology adoption continues to grow, hopefully because of clearly demonstrated benefits and measured production efficiencies, we still see that the dialogue between the technology developers / business sponsors and translators/post-editors is often strained, and communications can often be dysfunctional and sometimes even hostile.

While there is a growing volume of material on “how-to-use” the technology, much of it of questionable quality, there is still very little discussion about managing the human factors around successful use of the technology. The growth of instant, do-it-yourself (DIY) tools only unleashes more low quality MT output into the world, and there are translators who are often expected to edit (fix) very low quality MT output for a pittance. Getting good quality MT output requires real skill, expertise and preferably some considerable experience. The actual translator experience with “good MT” is not going to be so different from working with TM (though MT errors are quite different from TM errors) and is likely going to be very different from the negative experiences described in translator blogs.

The history of MT has indeed been filled with eMpTy promises beyond the real possibilities of the technology, and more recently we see lots of sub-par DIY systems built by mostly incompetent practitioners that do cause pain/fatigue/stress/frustration/anger to translators who engage or are somehow roped in to clean up the mess. In my eyes, however, this does not lead to the conclusion that the outlook for MT is bleak and hopeless.

Rather, it suggests that MT must be approached with care and expertise, not just in terms of basic system development mechanics but also in terms of managing human expectations and ensuring that risks and rewards are shared amongst the key stakeholders, and that transparency and equity should be guiding principles for MT projects in general.

I don't expect that MT will replace human translators, but I do expect that for a lot of business translations with largely repetitive content with a short shelf life, it will continue to make sense. Most of the corporate members of TAUS (who also pay for a lot of human translation work) are driven to deploy MT because they are indeed faced with more volume and content that is very valuable for a few months but with little value after that. The basic business urgency requires that they explore other approaches to getting material translated. They have often done this independently of their key translation agencies who were very slow to catch on to this need. Many translators do not seem to realize that much of the content that MT focuses on is material that would simply NOT get translated if MT were not available, and that MT can sometimes create new human translation opportunities. It is not always a zero sum game. Also, while some MT advocates can be over-zealous at times, I think very few are actually bent on deception and fraud as is sometimes claimed.

MT does bring about change in traditional work practices and can sometimes have adverse economic impact (especially when misused or incompetently used) on translators. In some ways MT technology is getting better, and in some “easy” language combinations even DIY initiatives can produce some kind of minimal production advantage. But really steering an MT system so that working with it is an experience professional translators want to repeat takes more skill than dumping data into an instant Moses system. Though the risk of running into incompetent MT practitioners is still high, we are seeing many more successful collaborations that show the potential and promise of this technology when it is properly used.

Much of the anger and even rage from the translator side is “passionately” stated in this blog post by Kevin Lossner. I will paraphrase some of his key objections and other points I have heard in the broader translator community, at the risk of getting it wrong. The issues seem to be:
  • Messages from industry gurus and from CSA & TAUS in particular about how the business of translation is changing and their vision of the impact of automation on translators,
  • Messages from MT vendors (me included) about the value and urgency and benefits of using MT,
  • The possible negative impact of MT on cognitive and professional skills of translators or just the general nature of post-editing work,
  • The link between the professional work effort and the compensation,
  • The degree of involvement in the development of MT systems,
  • Lack of education and training related to MT,
  • General professional respect,
  • The overall commoditization impact on translation work.
It is clear to most of us who have had successful MT implementations that post-editing is not suitable for everybody. There are translators out there who have developed very keen expertise in some domains and can translate at speeds and quality levels that would be hard for most MT systems to match. But there are also many translators who will benefit from a well-developed MT system in the same way that they may benefit from the use of translation memory and other CAT tools. When properly done, working with MT output is not so different from working with TM. The nature of the errors is different, but MT can also respond and improve as corrective feedback is processed.

We have already reached a point in time, where the reality is that we have more “rough” translation done by MT in a day than ALL humans do in a year. The free online MT engines are used about 250-500 million times a month, and while it may still be true that MT has not penetrated the professional translation world in a substantial way yet, MT is now commonly used by many French and Spanish translators going in and out of English, and probably many other language pairs too.  There are still some who question the veracity of the increasing volumes of information that companies must now translate to ensure global visibility for their products and services but many companies now understand that making more and more product related content multilingual is a key to international market success. 

The translator concerns listed above do, however, need attention, and should be addressed in some way by all those who wish to maximize the potential for successful MT initiatives. John Hagel has an interesting and somewhat bleak essay on The Dark Side of Technology, where he describes the combined impact of all the new digital technologies, which includes:
  • A world of mounting performance pressure,
  • An accelerating pace of change,
  • Increasing uncertainty,
  • Digital technologies are coming together into global technology infrastructures that straddle the globe and reach an ever expanding portion of the population. In economic terms, these infrastructures systematically and substantially reduce barriers to entry and barriers to movement on a global scale.
This is perhaps what is being felt both by individual translators and by translation agencies and thus we often see reactive behavior at both these levels. We see many adopt the zero sum game view of the world, and there is increasing short-sightedness and often a breakdown of trust.

While I do not have a definitive prescription for success in dealing with the human factors involved in an MT project, I think it is possible to outline some factors that I have observed from partners like Advanced Language Translation that constitute what I consider to be best practices.

It is important to understand that the better the MT system and its output are, the better the ROI and the translator/editor work experience. MT systems that can respond to the needs of professionals using them for real work are very different from ones where the users have no real control of what happens beyond putting some data in. So if I were to list some recommendations on how to approach these basic communication and trust issues, I think they would include the following:
  • Build the best MT system you can, which means it should never be done in a hurry and preferably developed by experts who can tune it and adjust it as needed in response to translator feedback.
  • Manage expectations of all key stakeholders, especially with regard to the evolutionary nature of MT system development. It is not as easy as 1-2-3 and requires expertise and patience.
  • Get MT systems up to an acceptable average quality level with the involvement of senior trusted translators before unleashing the system to a larger group of translators/editors.
  • Involve Project Managers and senior translators in MT system development with experts so that you can build organizational intelligence and skills on specific data cleaning, data preparation and system assessment.
  • Involve key translators in the rate setting process to establish fair and reasonable compensation rates that are trusted.
  • Don’t involve translators who are fundamentally opposed to MT technology. There are translators who do not benefit from MT because of very special and unique skill sets.
  • Provide specific examples of corrections for a variety of different types of output errors for post-editors to model.
  • Ensure that the nature of the task is understood and compensation issues are clear BEFORE setting production deadlines.
  • Focus on fixing high frequency error patterns with a small test team and test data set before general release.
  • Feed back error corrections and ask for general feedback from editors on an ongoing basis and incorporate as much of this into the system as possible. Monitor ongoing progress to ensure that the MT system remains consistent over the project and over time.
  • Retune and retrain the MT engine quickly and as frequently as possible.
  • Develop deeper system tuning skills over time as key team members begin to understand how the system responds to various kinds of feedback and corrective adjustments.
What more can be done to make post-editing MT work better understood and thus hopefully a less threatening or demeaning technology?  I see PEMT as a natural evolution of the business translation process. It is simply a new approach that enables new information to be translated, or a new way to do repetitive tasks but it can also be a means to build and develop strategic advantage. A guest post on the TAUS site has made a plea for translator education (not training), but I think it unlikely that the recommendations given there will solve the problems I have listed above. 

The most successful translators and LSPs all seem to be able to build “high trust professional networks”, and I suspect that this will be the way forward i.e. collaboration between Enterprises, MT developers, LSPs and translators who trust each other. Actually quite simple but not so common in the professional translation industry.

I feel compelled to re-use a quote I have used before because I think it fits very well in this current context.
“‘Disruption is not something we set out to do. It is something that happens because of what we do,’ stresses Brian Solis. Disruption changes human behavior (think: iPhone) and it’s a mixture of both ‘design-thinking and system-thinking’ to get there. So as an innovator, where do you begin if you don’t start with attempting disruption? To boil down Solis’ message into a word: ‘empathy.’ That’s right, empathy. Empathy drives the core of your vision as an innovator, or so it should, says Solis.
Solis says that there are only two ways to change human behavior, by manipulating people, or by inspiring them. If you choose the former, good luck on your journey, but if you would prefer to attempt the latter with your innovative attempts, then you should start with empathy: the why of your product or company. That is how you will capture attention, and hold onto it, especially in the technologically, socially-driven world today.”
The excerpt above is from this post on The future of innovation is disruption (emphasis mine).
“The end of business as usual takes more than vision and innovation to survive digital Darwinism however. It requires a tectonic shift from product or industry focus to that of long-term consumer (customer) experiences. Businesses that don’t are forever caught in a perpetual cycle of competing for price and performance. It is in fact one of the reasons that Apple can command a handsome premium. The company delivers experiences that contribute to an overall lifestyle and ultimately style and self-expression. Think about the business model it takes to do so however. You can’t invent or invest in new experiences if your business is fixated on roadmaps and defending aging business models (SDL & LIOX?).”
This excerpt is from a fascinating article on the collapse of the Japanese consumer electronics industry and especially Sony, Panasonic and Sharp.



The way forward in developing win-win scenarios and excellence in these challenging times is collaboration between trusted partners. Collaboration curves hold the potential to mobilize larger and more diverse groups of participants to innovate and create new value. In trusted relationships and networks critical knowledge flows happen more easily. Benefits and risks are shared more willingly and together participants are driven by a desire to learn and reach new levels of performance. In this context, zero sum relationships that focus on dividing a fixed pie of rewards evolve into positive sum relationships where participants are driven by the opportunity to expand the overall pie. When there is a real prospect of expanding rewards, we are much more likely to trust others than when everyone is focused on how to get a bigger share of a fixed pie. I also think that agencies that demonstrably regard translators as valued partners at an organizational level will likely lead the innovation and evolution of how business translation gets done. Hagel also says that a new narrative based on opportunity is needed.
Like any great narrative, it must be crafted.  “Craft” is an evocative term because it suggests that narratives are not just created on paper, but built through the actions that we begin to take as we start to see the opportunity ahead. Narratives emerge through action and interaction as we collectively begin to sense an opportunity and learn through action what it will take to achieve that opportunity.
No single person can be responsible for, or create, this collaboration, trust and opportunity narrative, and I look forward to seeing those who do help carve a path for all to learn from. Revolutions often happen from many small acts (balls) that are set into motion, rolling together in the same direction and gradually building momentum, and some revolutions happen slowly after some initial sputtering and misfiring.

36 comments:

  1. Well, that may be enough. But an overhaul of the system could be more. Linguistic dogma tells us that language is for communication and communication is assumed to be sharing, whereas it is not. Think of the constraints on communicating ANY content. And think of the fact that in the Triangle of Reference you have three elements to align, not just pairs based on set theory. Translating is the only human activity besides science that requires us to verify and check on reality in content. Abstract words take you to another level, metaphysics as opposed to physics that helps you stay with reality. A language where abstract words are used for the sake of brevity and not tested for making sense will never be suitable for MT. There are three factors that keep changing all the time without synchronisation: the world, or its chunks, the reference, the signs that various people assign to them as verbal symbols and the mind where you have the former to connected somehow. And that is just one triangle of one person. Back to the drawing board to stop wasting and fooling around us.

  2. This is a great blog post that absolutely nails the issues involved. We need to build trust and not impose new workflows. We need to take translators and post-editors on the journey with us.
    By Gillian Searl

  3. There is no dialogue because many professional translators think it is not anything serious that can be used in professional translation. The attitude towards MT might be similar to the one towards robots in law, medicine and politics.

  4. Kirty,
    Thank you for an interesting article.

    I have some things to say, but for the sake of brevity I will focus on one thing now. You wrote that working with MT is not different from working with TM, albeit the errors are different. I think that this is one issue that stands at the core of the disagreement between MT advocates and what I consider to be professional translators. TM by itself is useless. It is only as good as the information it contains. Use a bad TM and it will be useless. The type of TMs from some agencies that I used to look at in the past were beyond poor. From my experience, those who in the past claimed that a good TM and glossary are much more important than the human skills involved (i.e. translators are interchangeable as long as the technology is there to hold their hand), are pretty much the same ones who are now heading the MT-lobby. The terminology and the technology might have changed, but the gap remains the same. In my opinion the gap stems from different perspectives and different business models. Agencies and technology developers are running a business with mostly fixed costs (sometimes with little to no real experience and knowledge about the translation process), and as such take the business approach to cover those fixed costs, mainly turning the profession into a big data problem that needs the right technology as the solution. Conversely, professional expert translators have mostly sunk costs, they know the expertise involved, and they judge a project by the time and effort needed, not by a random statistical measurement of words and "similarities". For them, the words are not the raw material nor the product, they are just how the work is manifested. The process itself is cognitive, skillful, and demanding. Translation is not data. The only purpose of translation is to facilitate seamless communication, and communication is a key element in our daily personal and professional lives.
There is not a single item in this world that needs translation that doesn't have some value. Otherwise, MT and other technology would not have been even pursued in the first place. They were, because there is a business opportunity "in translation".

    As long as the technology developers and sponsors largely continue to treat translation as a big data problem that just needs the right algorithm implemented, and as long as most adopters continue to see translators as one big cohesive mass of cannon fodder, I suspect that no real dialog will develop with real professional translators (I know some who would really like to engage in a professional, respectful dialog and conversation) who know a thing or two about their work and how the world around us really works, because the gap is unbridgeable.

    And like I always say, the merits of technology are a separate discussion from its abuse. The topics you have touched upon are not strictly technology related, but focus more on its abuse. Most translators who "oppose" MT don't necessarily oppose the technology; they oppose its abuse by unscrupulous entities. What Kevin Lossner and others are saying is not intended to dismiss MT as a technology, it is just a reaction to the plethora of false claims, misuses, and unethical practices that are flooding the market.

  5. Shai, I think we are making the same point. Poor quality TM does not result in improved productivity but some agencies use TM merely as a means to reduce payment to translators. Just as getting TM to useful quality and functionality requires skill so does MT.

    We are in a phase where it is very easy to generate low quality MT which is often used to push rates down but there are still very few who operate MT with enough skill that translators find it actually useful.

    Just as translators have learned to distinguish useful TM from low-quality or useless TM, they can also learn to understand when to engage with MT (if it is responsive to feedback, can be steered by feedback etc.) and when not to, e.g. when a low rate is offered for editing and the MT quality is very poor. Low rates are not a problem if the MT output quality actually does enhance individual translator productivity: even though the rate is lower, translators may actually make more per day because productivity is enhanced.

    But every MT situation involves unknowns and thus trust is a key ingredient for success.

  6. "We have already reached a point in time, where the reality is that we have more “rough” translation done by MT in a day than ALL humans do in a year."
    - Would you mind providing a source for this claim? Thank you.

    Replies
    1. This article very plainly states this http://www.theatlantic.com/technology/archive/2012/04/google-now-translates-as-much-text-in-a-day-as-human-pros-can-in-a-year/256409/

      "In a given day we translate roughly as much text as you'd find in 1 million books. To put it another way: what all the professional human translators in the world produce in a year, our system translates in roughly a single day. By this estimate, most of the translation on the planet is now done by Google Translate." Franz Och, Google

      When you add Microsoft and other free engines on the web I think we can safely assume this volume is 50% or more higher. I have talked to people involved in Bing Translate at Microsoft and they had volumes in excess of 10M users per day even in 2011.

      I assume that these volumes are all higher today than they were a few years ago.

      Here is another reference:
      http://singularityhub.com/2011/06/23/kurzweil-speaks-on-the-future-of-computer-translation-video/

    2. I don't think that these numbers, even if true, hold much meaning.
      First, what is called MT is actually a Language-Mapping-Statistical-Algorithm that converts words and sentences between languages with varying degrees of success, and is largely based on past human translations. This is fundamentally different from what I consider to be the true nature of the translation work. Whenever I'm asked if Google has yet to drive me out of business, my answer is that Google [Translate] and I don't do the same type of work.

      Second, even if those numbers are true, what is their significance exactly? The amount of content that needs translation is not finite. Therefore, the amount of content that is being processed through MT doesn't necessarily come at the expense of the content translated by humans, and vice versa. In those articles there is also no mention of the type of content being translated and for what purposes (again the approach of the technology sponsors who weigh translation by the number of words as if the words are a mere raw material and product). There are so many factors that can contribute to these numbers: greed, ignorance, importance of the context, and availability to name a few. It is also never a surprise to learn that a free or cheap service gets more attention and traffic than a (true) premium one.
      What I find amusing in these kinds of statements is that the translation market is so segmented and fragmented that trying to claim a market-wide trend is not very relevant. In practice, each of us has a clear view of the market segments in which we operate, but too many think that their little sliver of the market is representative of it all.

      And lastly, this is yet another manifestation of the social engineering campaign, this time using the oh-so-old, yet effective, social proof concept. Talk about numbers and how many people are using it, and let the audience reach the inevitable conclusion that if so many people are using this, this is bound to be a great service and one might even be considered socially awkward not to use it.

    3. MT as you say is a data transformation (that can be reasonably accurate for many repetitive kinds of translation scenarios) but it is not the equivalent of professional human translation. I agree with you -- what Google does is different from what you do.

      What is the significance of the MT usage numbers exactly?
      That many people (other than your customers) are interested in translating information and get gist translation for low or no cost even if it is just a web page that they skim through. These users are random people on the web who would never pay a translator but if they have a serious need based on info they may uncover using MT, then some tiny portion of these millions of users may actually hire a professional to get it done properly. I agree that most MT is not competing with what human translators do. I am not saying that MT is a replacement for you. I think most people involved with MT are very clear on this. I also mention in my post that MT has not touched the professional translation world very deeply, right after I say millions of internet users do casually use MT according to reports from Google and Microsoft.

      The quality requirements in the professional translation world are different and thus more MT use will require greater expertise and more real collaboration with translators. This was the whole point of my post - I think we actually mostly agree on this.

      Global companies have to translate a lot of content that is "non traditional" and this is the primary role that MT is playing today. When carefully done it can also help with documentation translation in a similar way to that of good TM.

      Finally MT is a tool and is not a replacement for a professional translation -- I think most of us involved with the technology see this.

    4. "This article very plainly states this http://www.theatlantic.com/technology/archive/2012/04/google-now-translates-as-much-text-in-a-day-as-human-pros-can-in-a-year/256409/"

      - You and other advocates of MT don't forget to mention everywhere how a customized adjusted MT system, tailored to specification/project/language needs, is crucial on one hand - and then on the other you don't mind counting in whatever inadequate and not customized MT output from Google Translate/Bing?

      I think better consistency would make your statements more trustworthy than purposefully mixing apples and oranges.

    5. Thomas

      The contrast presented by these examples is given to make two different points.

      1) That for many people MT is useful as a substitute for real translation. Thus, while MT is imperfect we have evidence that many (millions) find it useful. Generic users on the internet are information consumers who have to deal with a language barrier. They are often the customers that global enterprises wish to communicate with. Their growing acceptance of MT suggests that MT has utility in general as a way to communicate with global customers, even though it is clear that a machine attempt at translation is rarely if ever as good as a human translation. The speed and ease of doing it have very high value for millions.
      2) To improve the odds of making MT useful to translators or professional translation purposes much more care needs to be taken to get MT output quality levels higher, so that professionals are not forced to discard the MT as useless. Generic MT may often have no value to translators since it is so far off the mark but customized MT can be useful. Microsoft has documented that millions of users provide feedback saying that they find raw MT from customized systems of technical knowledge base data useful. Most of these "translated" articles would never pass any kind of professional translation quality check.

      I think you will find that I have been consistent over time through various posts on this blog as 1) suggests MT in general is useful 2) Customization makes it useful for professional use.

    6. I was absent from the conversation for a while, sorry about it.
      I will comment briefly (also down the page).

      It is obvious that MT has a demand among private users browsing the web. I always claimed that the best use and application of MT is for content discovery purposes. However, when I commented about the significance of the number quoted by Google I meant two things:
      1) The amount of content being translated by MT does not necessarily come at the expense of the amount of content translated by professionals, and vice versa.

      2) The number alone is meaningless - even if you assume that it is true. What about some details about how it is broken down? What type of content was translated? How much duplicate content does it contain, and how much has been "forcefully" translated by Chrome and other implementations of MT? What about the satisfaction level of the users?

      Just throwing around numbers is a poor and transparent attempt to influence perception by using social proof.

      Also, maybe, just maybe, the recent change of heart of some - and few at this point - investors who are now backing out of their investment in commercial MT "solutions" (although they are moving to support some "human automation" platforms, which is just as silly if not more) signals that they are also starting to understand that the commercial value of MT is far less universally applicable than what they were led to believe.

    7. Shai

      There are unfortunately a huge number of bad "MT" implementations in the commercial arena as so many are looking for a short cut - instant results, lowered costs with little effort. That is a clear recipe for problems. Even basic issues like "Does it make sense in this context?" are skimmed over as you point out.

    8. My latest series of comments was not meant to argue against things that you said. I wrote them in the larger context of the topic of this post in an attempt to clarify just how deep the distrust goes and the major reasons it stems from. I believe that some of those who spread false claims don't have malicious intent, they just see and treat translation as a big data problem, failing to understand what the translation service is all about, the limitations and strength of the technology in this context, and are completely indifferent to the potential risks of irresponsibly using technology.

      Technology abusers and misguidance hurt the reputation of the technology just as bad translators hurt the reputation of the profession. One of the problems is that the discussion often lumps everything together into general categories, whereas in practice there are many professional/technical differences, different circumstances, and different needs.

    9. Shai

      Thank you for your comments. I think they are useful to anybody who will take the trouble to read through them, as you articulate many things that it would be useful for technology developers to listen to with more care.

      I think that when translators can learn to differentiate between the different kinds of "MT solutions" in the market, we will see much more successful collaboration and ignorant proponents will be challenged from a more informed perspective.

      My basic advice to translators -- understand the kind of MT you are dealing with first before you accept the rates and conditions. And in these early days, if there is no trust between the involved parties, you should expect that the PEMT experience will be very negative. Just as there are many kinds of translators, there are at least 50 shades of grey with MT, and I for one really want to understand how to communicate critical information about each specific MT engine to translators so that they can make informed decisions on whether they want to engage or not.

  7. Yes, trust is a key element in every human transaction, and I don't know how MT can expect to create even the illusion of trust in light of the lack of transparency that is backed up by false claims, a FUD campaign, and a social engineering effort to alter quality perceptions and expectations.

    A technology that tries to manipulate and frighten people into adopting it instead of being adopted on merit will never generate trust among professionals.

    The perception gap between how the technology developers and sponsors see the translation process, and their underlying motivation, and how professionals see it and know it to be true is just too big, and the last thing it creates is trust.

    Replies
    1. Shai - MT is just a tool and as such does not make any claims per se. People who use it or propose it do, and these people vary in their claims and strategies. While there are some out there who do use threat-based strategies, many of us don't. I don't think it is fair to say that all these people using or proposing it are making false claims and running FUD campaigns. The perception gap is large and I think it is worth trying to bridge it or reduce it. MT is not necessarily suitable for every professional translator, but there are some who do not see it as something to be feared IF the output quality is usable and if they are paid fairly for work performed to further improve this output. But I understand the history of MT use/advocates has not engendered trust. I hope that this can change through more constructive dialogue and more transparency, though clearly it will be difficult.

    2. I don't mean any disrespect, but please allow me to be blunt. Parts of the technology/agencies lobby have declared war on the translation profession; not you, not the technology itself, and not all - but large, or at least vocal, parts did. They don't do so because they have a true solution to better the world; they do so for short to medium-term financial gain. They have devised a semi-artificial problem and now offer a "solution" for it, a solution that only relatively few need in a commercial environment.

      There is no trust whatsoever between professionals and technology developers/advocates, and I don't see how it can change. The damage is already done. When you strike first, while showing contempt for the profession and service, and engage in unethical practices just because you are well funded and connected, there is no place left for discussion because there is no trust and sometimes not even basic respect.

      eMpTy Pages is, for example, the only MT-related blog/platform that I participate in without being censored; without being personally attacked by being called an ignorant self-serving Luddite or dinosaur with no clue about anything; without me, my expertise, and my profession being constantly insulted by people who are themselves the ignorant ones; or without getting replies along the lines of "MT is here to stay, deal with it or die". So, I don't even think that the technology lobby is really interested in any form of discussion. Personally, there are some people and outfits that I will never engage in any discussion with because I have no respect for them as people; I actually despise them.

    3. I suggest that there is no lobby as such -- but there are many making ignorant and reckless claims that hurt those of us who are trying to be more measured and careful with our claims. This thread between you and others on this post inspired me to write this post: http://kv-emptypages.blogspot.com/2014/05/monolithic-mt-or-50-shades-of-grey.html

      I think the best and most accepted use of Expert MT will come from collaboration with translators, and we all need to be careful not to dismiss the technology per se because there are a few or many who make ignorant and reckless claims. Professional use of MT is not as easy as 1-2-3 -- just like good human translation, it takes work, processes, and communication to happen -- and also shared objectives.

  8. Kirty, thanks for your admission that "post-editing is not suitable for everybody". To my mind, this is a first step towards clearing up the massive confusion in the terminology used. But I would go further:
    1. Translation and bi-lingual editing (a.k.a. PEMT or traditional "proofreading") are DIFFERENT jobs with DIFFERENT skill sets. Some agencies and most language technologists fail to understand this difference, and this is one of the causes of the frequent misunderstandings and aggravations in this dialogue.
    2. In some cases, the skill sets of "translation" and "editing/proofreading" overlap. Therefore, many translators are happy to accept proofreading and editing jobs as an extension of their work. But it is unhelpful to assume that this is always the case. At the very least, it wastes time - for example when I have to write back to an agency telling them that I don't offer proofreading, and they then have to look for someone else.
    Here, it would be helpful if you adjusted your own terminology. The people who handle PEMT jobs are not translators (at least not while they do this work), they are bi-lingual editors or proofreaders. In some jobs you would like them to have translation skills and qualifications to augment their editing work, but please stop speaking of PEMT as a normal task for translators.
    3. It is counter-productive to take someone with a highly developed skill set in one area (translation and an excellent writing style) and ask that person to ignore what they do best, and to repair an imperfectly written text instead.
    4. The literature on "PEMT" speaks of the different levels of quality needed in the edited text. For the basic level ("good enough" or similar) you don't need a qualified translator, you just need someone with a reasonable level of writing competence in the target language and a reasonable grasp of the source language. This could be someone who is not confident enough to handle "real" translation work.
    5. This grading system for levels of editing should be reflected in the job description and the remuneration. For a proper editing job to executive standards, including a thorough retranslation of problematic passages and stylistic revision of clumsy phrasing, you should probably expect to pay more than for a straight translation job. For a general once-over for plausibility to achieve a standard that is "just about good enough", the remuneration will presumably be lower.
    6. As you point out, translators usually use TM systems and sometimes even MT engines in their normal work. But the workflow involved, and the economic framework, are completely different from MT/PEMT. When I translate, I have my own reference resources, including my own TM and occasional reference to dictionaries, on-line MT resources etc., and I determine which resource I use on a sentence by sentence basis. In a PEMT system, a single automated production method is used for the whole text, and the editor then has to evaluate the automated translation as a whole.

    Replies
    1. Victor -- This is good feedback for all the technology developers and sponsors to hear. It is quite striking to see how the wrong terminology for the work performed, the skills required, and conflation of roles can create so much confusion. I think if we all did this better it could alleviate many of the misunderstandings.

      I think it is in the interest of anybody (bilingual editor) involved in doing PEMT work to clearly set guidelines on several of the parameters you mention, and just as you would reject translation jobs that are unreasonable in terms of the remuneration/effort ratio, I think translators can define some key parameters for MT-related work as well. Translators should refuse to work for low rates on low quality MT output -- as that is exactly how sponsors will be forced to develop better systems. What is acceptable is unique to each PEMT editor; some will have lower tolerances, and there are many gifted translators who will choose not to engage. But I am hoping that translators/editors learn how to judge MT system and output quality before they draw conclusions, as there are in fact win-win scenarios possible.

  9. Mr. Vashee,

    Kindly delete my previous comment as it included a wrong link. My comment below is the valid one.

    Thank you very much,

    Aurora Humarán

    Same as translators should not offer discounts based on the use of CAT tools, they should never post-edit for third parties.

    If I decide to invest in a Trados license, I am the one to profit from it, and will feed nobody's TM (but mine).

    If I decide to invest time or money on MpT, I am the one to profit from it, and will feed nobody's MpT (but mine).

    But this is better explained (confessed?) by CAPITA: "Machine translation technology is improving all the time and the translations are becoming more accurate and sophisticated. The amount of editing required of a human translator will gradually decrease and this approach to translating will become more and more cost-effective," or by yourself, Mr. Vashee, here: http://www.asiaonline.net/EN/Resources/Articles/IntroductionToYourMTDepartment.aspx

    Replies
    1. Aurora

      I am not sure how you can avoid giving TM back. When you send back a translation - are you not in fact sending back the target side of source material given to you, which is easily integrated and built into a TM by the client or LSP?

      The whole point of post-editing MT and recycling it is to reduce the number of dumb errors so that editors focus on real linguistic issues rather than correcting dumb computer errors. Computers can only get it correct if it is very similar to previous segments and generally there are always some linguistic errors that are very hard to overcome no matter how much feedback you provide. One view of this is that this re-cycling of PEMT is logical to reduce future mindless work, and another is that it is using your effort to take away your future work.

      You could play with Moses as it is easily available today and control it all if you so choose but you are likely to find that it is more difficult than you imagined and that steering these systems requires a lot of work and expertise that is not really translation related.

      As I said there will be some or many translators who will choose to stay away from MT, but there will also be some who will see that MT is just another tool to avoid doing repetitive, mechanical work. The best translators rarely need to work with MT because they handle translation problems that MT does not do well and they also work at very high efficiency levels. But a lot of business communication material is well suited for MT because it is rapidly changing, short shelf life stuff that loses value very quickly. It is not worth real in-depth translator attention because of this rapid loss of value to the final consumers of the material. But perhaps we just see the world differently.

      We are already seeing more and more data that suggests that corporate websites are becoming increasingly irrelevant and not trusted for providing accurate information. This will affect what material gets translated and also will affect how it will get translated if it is well understood by the corporation that nobody is reading manuals for example.

      Thank you for your comments.

    2. "...but you are likely to find that it is more difficult than you imagined and that steering these systems requires a lot of work and expertise that is not really translation related."

      The same is true about translation, and this is part of the problem. In my opinion, part of the effort of the technology lobby is aimed at shifting the type of expertise and experience required (and paid for) from those of "traditional" translators and into the IT world. If MT gets more traction in the commercial world, I believe that all the claims about cost-effective solutions, etc. will be thrown out the window and be replaced with claims that this service requires a lot of resources, technical expertise, etc.

      You can see similar trends happening already with the general movement to the cloud, and how past promises and claims are gradually being backtracked from.

    3. Exactly -- any technology that is complex requires skill, expertise and experience to get real value and utility.

      Again I go into this in some very specific detail in this post: http://kv-emptypages.blogspot.com/2014/05/monolithic-mt-or-50-shades-of-grey.html

  10. Oh dear, Kirty, you seem to be shooting yourself in the foot in your latest reply to Aurora. She makes a very valid point - that it does not make business sense for a translator to donate TMs or MT output to somebody else's database, because this merely reduces the translator's future earning potential while handing over the economic benefits to someone else free of charge.

    Your argument based on the reduction of dumb errors is the typical hot air from the MT industry that I and others have complained about for a long time.

    It is no better than the TAUS moral-blackmail-daylight-robbery argument, which roughly suggests that translators should donate their TM/MT material to a Big Brother Database for free, because this is for the common good of humanity as a whole, and if this accelerates an erosion in the translators' future earnings, then that is all the better because it means that the rest of humanity can get a better deal on translation services. It is a theory of business suicide by the few (= translators) to benefit the human race. And just incidentally, someone somewhere will then own the Big Brother Database and charge others (including the altruistic translator) for the pleasure (???) of using it.

    In the article you claimed to be arguing for the interest of the translators and bilingual editors, and your response to my comment even seemed encouraging. But now it seems that the tower of bricks which you are building with your hands is all too easily pushed over by other parts of your anatomy.

    Replies
    1. Victor,

      For any translation job that a translator does for a client, they have to send back target text at least. This makes it very straightforward for a client (agency or enterprise) to build TM whether the translator actually provides TM or not. The more efficient clients organize TM so that future work in the same general area has some possibility to leverage historical translation work. So I am not sure how you could prevent this, especially since most clients claim to own the TM.

      With PEMT work, the biggest complaint is often the kind of errors that need to be corrected. Editors do not enjoy mind-numbing work that is repetitive and mindless. The professional use of MT as far as I know is always attempting to get MT to a level that reduces this aspect of PEMT as much as possible.

      MT makes most sense when there are ever growing volumes of information or when there are new kinds of content that would not be translated without MT, that require very rapid turnaround. One of the things I point out in this post is that when you have a zero-sum game view of the world, MT is viewed as a threat. But if you see that this enables new kinds of content (which is growing in volume) to be translated then it is also an opportunity to get new kinds of work.

      There are many translators who want nothing to do with MT, but there are also others who see it as a new kind of tool that can aid productivity and provide competitive advantage and want to develop more competence. I still believe that the best MT systems are yet to come and will come from informed translator feedback and engagement. But as our conversation shows, we have much to cover in terms of learning to communicate better. In this case bringing about change and communicating is even harder because so many have said in the past that computers will replace humans. Some at Google still do.

  11. Kirty, I am also intrigued by the final point in your comment - the ditching of proper translation methods because "nobody is reading manuals".
    I agree that very few people read manuals systematically from beginning to end (I certainly don't). But when I refer to a manual, I want to find accurate instructions for dealing with a specific problem, and if manuals are written and translated according to your "don't care because nobody will read them anyway" principle, then this is a lost hope (except that by an accident of biography, I happen to be a native speaker of the one language that is more often given a reasonable translation).
    Incidentally, the one type of content which is even less read than manuals is the legal mumbo-jumbo in the "terms and conditions". But here, I have not yet seen a serious argument for the approach "machine translate it and hope for the best". Perhaps you would like to integrate this into your "nobody reads it anyway" theory.

    Replies
    1. Victor
      I am not suggesting that manuals have no importance - I am only saying that their relative importance has changed (i.e. reduced), and I have gone into some detail on this in previous posts. Global enterprises generally tend to pay great attention to anything that involves security and legal risk, even though nobody may read it. Legal agreements also tend to have a much longer shelf life, so getting them perfect is much more important. Even Microsoft, which uses raw MT extensively for knowledge base content, uses HT for security and legalese. But these companies are also finding that information that they have less control over, e.g. in social media, is having a huge impact on their traditional business communication models. This is something that the translation industry or even the marketing departments of global enterprises do not control but have to adapt to. Many information technology companies are finding that users are producing better support material than they are and that other users often prefer these materials to corporate product material. I do realize and agree that documentation remains important for those who refer to it when they need it. Also, most countries require that some documentation always be provided, but we do see a shift away from static paper content to much more dynamic content on the web that changes based on use patterns and customer feedback.

  12. Mr. Vashee,

    I will take some time to reply, as I will reply with... an article I started drafting in reply to your comments. It's a good opportunity to share with you why MpT is obviously, not so obviously and subtly negative for translators (should you not know).

    And of course "we just see the world differently," or I would have invited you to head a chapter in IAPTI and/or you would have offered me a job with AsiaOnline. We represent different sides of a fact. That does not make us enemies, only two human beings with conflicting interests.

    By the way, I still "owe" you a reply re MpT offering better results than real translation. I did not forget, but I will reply here before that.

    Aurora

    Replies
    1. My reference to MT producing “better” output than humans was related to some very specific use cases that I think are worth clarifying and detailing here, as I am not saying that MT in general produces better translation than humans.

      We have been involved in several projects with high volumes (500K to 1M+ words) in very technical and terminology-rich domains (automotive, technical engineering, and data center information technology and infrastructure). We have received very specific and very carefully measured quality feedback from end-clients who remarked that the MT-based work was more consistent and terminologically accurate than historical TEP production of the same type of project. This was across several language pairs, including English to Spanish, Chinese, French and Slovenian amongst others. We have noticed that several different clients have remarked without solicitation that a recent large translation project seemed to have “noticeably higher quality” and asked what had changed in the production process. Terminological consistency can be challenging to maintain over large projects where many translators are involved and subject matter expertise may vary by individual, but this is an area that MT excels in – ensuring that terminology is consistent and normalized across a large translation project. Many enterprises dealing with technically complex products will value this kind of consistency over the grammatical and linguistic style superiority that humans are more likely to produce.

    2. This is my last post for now.

      First, judging the quality of translation by terminology consistency is a weird choice of metric. Second, comparing the quality of large projects produced by MT and by a bunch of probably poor human translators is very misleading. I have an idea about how these large projects are managed by the agencies who undertake them, and the results just reflect it.

      Having a 1 million word project translated "overnight" is not a translation problem, it is a managerial problem. There are limitations to anything, even to what machines and technology can do, so one should plan a project accordingly. If you come up with unreasonable demands and budget for a translation project, it is not a translation issue that can be resolved by applying the right technology, it is a managerial problem of those who treat translation as an afterthought. This is why in one of my above comments I called the "language barrier" a semi-artificial problem.

      There are already orders of magnitude more low quality translations being produced every day than good ones. This is not an indication that the world demands low translation quality, nor a translation problem. It is a human problem.

      The short-term financial benefits for the technology lobby are what they are and somewhat speculative. The long-term costs of poor translation, however, are very real. The damages of the increased use of poor language over the past 15-20 years are only now starting to become evident and their long-term costs are yet to be determined.

  13. Knowledge technology got stuck at processing of keywords. And as long as scientists fail to define meaning in a natural way, there will be no significant progress in MT.

    Just to help scientists:

    Intelligence is the ability:
    • to group what belongs together;
    • to separate what doesn't belong together;
    • to leave out what is no longer relevant;
    • to learn from mistakes;
    • to plan future actions;
    • to foresee the consequences that the planned actions will have.

    Semantics / meaning is a subset of intelligence, defined by the first three abilities: grouping, separating and omitting knowledge. And grammar provides us clues how to group knowledge that belongs together, how to separate knowledge when it doesn't belong together, and to know when knowledge is no longer relevant.

    But because current techniques have been developed without foundation, we have to redo most research done in the past 60 years. I've already made a start. It is open source: http://mafait.org .

    By Menno Mafait

  14. If you have a long list of countries and language names to translate, will you really want to translate it manually, looking up names in the dictionary? I for one would hate to lose my time like this when I can have it translated by MT and then put through post-editing in a fraction of the time it would take to do manually.
