Tuesday, November 9, 2010

The Machine Translation Community Building Bridges with Translators

The Association for Machine Translation in the Americas (AMTA) recently held its annual conference in close proximity to the ATA conference in an attempt to build bridges and foster a growing dialogue between these two communities. When I entered the world of MT (I have always preferred the term automated translation) I had the good fortune to work with Laurie Gerber at Language Weaver, who encouraged engagement with translators. While her voice was not heard there, she has always stayed true to this vision, and she was instrumental in influencing me to also reach out to the world of professional translators as a core business strategy. She has long been a clear voice encouraging the broad MT community to reach out to translators, and she was visible in Denver last week making sure that ATA guests were engaged and making all the right connections, or just having a good time.

It is clear to me that the path to better quality MT, the kind that really does fulfill the promise of sharing information and knowledge more freely in the world, can only come from a close, cooperative and collaborative relationship with professional translators.

The conference began with a keynote from Nicholas Hartmann, who is the current ATA President and also a past technical marketing translator. He gave what I thought was an articulate, considered and clear perspective of the translator vis-à-vis translation technology and MT, while pointing to some directions for real collaboration in future. I thought it would be valuable to restate what I heard, as there were several key messages for the MT community. A published paper version of his speech is also available on the AMTA website (but it is really hard to get to these resources, as the unique URL is not easily displayed).
The ATA has 11,000 members, of whom 70% are freelancers, and Nick had carefully prepared to be their voice, expressing their concerns and needs in this forum. (Here is the Twitter stream.) He stated that the bad blood with translators was originally created by the historical overstatement of MT capabilities in the 60’s, when MT was expected to replace translators: FAHQT, or fully automatic high-quality translation (Have you noticed that this sounds a lot like f**&ked?). He noted that many translators do in fact use some form of translation technology today, even though they find the future vision of being post-editors at “burger flipping wages” abhorrent. He gave some examples of human translations that went beyond the literal, to show how only a human could make the non-literal interpretations needed to correctly translate some example phrases. The examples showed that even a “perfect” literal translation can be nonsense at times, and he asked if the future of MT is as T.S. Eliot says:
“That is not what I meant at all. That is not it, at all”
Some additional points he made included:
  • Translators have a very different view of quality which is linked to their code of ethics to render the source material accurately
  • MT makes sense where something is better than nothing
  • MT is really only “a probability distribution over strings of letters and sounds” (especially SMT) (part of a quote from Martin Kay in which he specifically cautioned the MT community NOT to consider language so simplistically)
  • Translators want it to be acknowledged that their work is critical to feeding and improving SMT systems with HT corpus
  • MT should be matched to task and purpose and could be unfortunate in the hands of the wrong people e.g. the infamous Welsh street sign
  • SMT is often built using flawed TM, so one can hardly be surprised at some of the results; in this case past performance will be a predictor of future performance
  • MT must be edited and checked to avoid serious errors
  • Post-editing has come to mean cleaning up really bad quality MT output at very low wages even when everybody doing it understands that they could have done it faster without the MT
  • The data pollution of some SMT systems perpetuates and is difficult if not impossible to remove
He then went on to answer the following question: so what do translators want and need?
“We want to work together constructively. We want technology that we can use. Machine assisted translation does make sense to us but we do not want tools that make our jobs harder.” Translators want to have a hand in making the tools but want the dialogue to be realistic. They also do not want a role of PEMT drudgery, and he asked for technology that assists translators to be more productive. He ended his talk saying that he had enjoyed meeting many in the MT community and he hoped that the dialogue would continue as, “We are all in the same business.”

I did not stay long enough to really get the reaction of the MT crowd, as I rushed off to tekom in Germany that same evening. Of course there was one somewhat hostile question immediately, but I think many MT users want to know how to work with translators and, as you will see from my previous blog entries, I am a big believer in this rapprochement.

Nick was followed by Jost Zetzsche, who continued on the theme of building bridges and improved communication. He pointed out how translators have a self-perception of being bridge builders, language lovers, artists and cultural intermediaries, in contrast to the techie, computer science self-image that many MT practitioners have. Clearly cultural and communication problems can arise from this. He was self-critical and admitted that translators need to learn more about MT technology and not resist it like they did with TM, but also pointed out some foolish statements made by MT proponents that any average translator would see as unfortunate or stupid. One example was Jaap van der Meer’s statement about letting a thousand MT systems bloom. Some of you may realize that this is really close to something that Mao Zedong said to flush out dissidents and eventually execute them. The other screamer he listed (without reference to the source, other than it being a major MT vendor executive) was, “It’s quite a magical technology when you see it (MT) work” by Mark Tapling of SDL (he says this with a smile at the end of the video clip). (Dude, it’s a data transformation!!! Really ?!? I wonder if Tapling thinks that spreadsheets adding numbers up and PowerPoint slide transitions are also magical? Perhaps from a 19th century frame of mind this is all pretty magical.) Jost contrasted this to a Twitter conversation he had with @kvashee ;-) about features that would make MT more useful to translators. (I assure you I did not instigate this comparison.)

Some things that he asked for: Give us (translators) challenging tasks, we want to participate in “making it better”. He stated that he wanted to see that his corrections had immediate and direct impact on the system. (One of the biggest complaints that translators have about MT is that the systems make the same error over and over again.) He asked that MT vendors talk to translators in “real” language and “admit what your tools can and cannot do”. He ended on a positive note by saying that we as humans tend to demonize the unknown (HIC SVNT DRACONES!) and invited the audience to enter into each other's terra incognita, and put our myths behind us.

While this event was a good and constructive start, I hope that the dialogue between translators and MT developers continues beyond this conference and produces real innovation and collaboration. One of the first subjects that needs elucidation and better definition is “post-editing”. It was clear in several very instructive presentations in the “Commercial User Track” that the concept of post-editing needs development and clarification. There were several very good presentations, and we see many successful MT implementations being discussed on a regular basis now. Check out and type #AMTA2010 to see a cool Twitter summary of the conference. Chris Wendt of Microsoft and I were also voted onto the AMTA board, representing commercial users. I thank all those who voted for me and hope to help drive our common interests and agenda.

There was an interesting demo of several post-editing tools that are currently available in the market. Lingotek showed their translator workbench, and PAHO showed their Word macro-based post-editor, which is perhaps the longest running and most widely used direct post-editing tool around. GTS showed a promising looking community management and basic editing environment for WordPress blogs. These examples all suggest that the tools to make post-editing more interesting are going to continue to evolve, and that while these are wonderful examples, the best is yet to come. We also see that community and collaboration are increasingly intertwined with post-editing, and this connection to MT is likely to develop further as new kinds of people are drawn into the translation process. AMTA will make much of the content from this conference available on their website and I am sure many will find it useful if they can actually find it. (They need a serious update to their website.)

Unfortunately, I had to rush off to the tekom conference in Germany at the end of the first day and missed the rest of the conference, but I kept in touch via the GTS blog since the Twitter stream died pretty quickly after I left. I noticed Jost mentioned in his blog that Laurie had accomplished what she had set out to do years ago, i.e. bring MT developers and translators closer together. However, while Laurie may be happy at this initial accomplishment, I would suggest that she stay around a while longer and make sure that we all set sail together with the wind at our backs. The journey together has just begun and we have many miles to go before we sleep.

And this was tekom – a blur of meetings and some wonderful dinner conversations.