Wednesday, August 25, 2010

Translation Technology & Innovation: Where Can You Learn More?

I was recently trapped in the LA County Criminal Court juror holding room for a day, waiting to do my civic duty, albeit reluctantly. As I was waiting I had a wonderful Twitter storm chat with @rinaneeman and @renatobeninatto and others about innovation in general and about innovative LSPs in particular. It was pretty intense and we covered a lot of ground given the format. We talked about the change in the overall business model (translation-as-a-utility), automation, new and more efficient ways of doing translation work, and much more. This is the best I can recapture (click on "show conversation"), and I am not sure how to see the thread in a nice chronological stream with all the people involved. If you know how to do this, please let me know. One of the questions that came up in this discussion was: where can one learn about these new technologies and processes (MT being just one of them) that facilitate innovation and allow one to address new translation problems?

I had believed that there is very little formal training around, and then Renato reminded us that regional associations play an important role in providing training. The next ELIA conference in Dublin in particular has a very strong focus on innovation and translation automation technology in addition to the traditional localization themes. I have found these smaller regional shows to be more effective in providing useful training, and they allow a much deeper dive into the reasons why this makes sense. The ELIA event has singled out MT and affiliated technologies as worthy of serious attention in direct response to member requests. I think this is wonderful, and not only because I have a prominent role at this event: I will be presenting a keynote on broad changes impacting the overall world of translation, as well as doing a detailed training session on how to get started with MT technology for those who really want to get down and dirty. It is also a sign that this technology can take the next step, with technology developers and translation practitioners working together. I am a big believer in dialog, and I think this event is an example of an honest attempt to build this dialog.

In the keynote session I will look at how 2 billion+ internet users, community and crowdsourcing initiatives, translation technology, ever improving free MT, new attitudes to open collaboration and data sharing are impacting the professional translation world. I will explore how the shift to the project-less, translation-as-utility world will require new skills and new services from language service providers, explore and comment on emerging innovation and also point to the ever increasing market potential that becomes available to industry innovators who have competence with and understand the new dynamics.

ELIA Bridge
I will also run a training session that will go over MT technology in some detail, provide basic background on the technology fundamentals, and point to what I think are the keys to being successful with MT. I will try to make this as practical and useful as possible, answering questions about RbMT vs. SMT, MT engine customization strategies, MT quality assessment and its relationship to post-editing effort, understanding data, the skills required for different tasks, etc. I believe that innovative LSPs will be the driving force behind creating really amazing MT systems in future, and I will focus on the skills that I think will be most critical to enabling this kind of success. I will also explore new business opportunities that MT can enable to get you out of the software and documentation localization market. Hopefully this session will be highly interactive, and I am open to communication about what participants might most want to focus on and understand. The session is on Monday October 11th, so please feel free to communicate with me on this before then.
Translation Production Line
As we set up translation production lines to handle 10X or 100X more content in the future, we will need to link key processes together. Processes focused on information quality, along with integrated and efficient post-editing, will also be necessary to build efficiency. MT alone is not enough to solve the problems we face in the future, and I think it will also be critical to learn how to clean up and “improve” source content before any kind of translation attempt. Frans Wijma will also provide guidance on Simplified Technical English, which will give attendees some insight into the IQ, controlled language and source simplification issues. This is something that will be increasingly valuable to learn and do in future.

Those who stay in the MT track will also get to hear Sharon O’Brien talking about post-editing MT. She will answer all the following questions: How does post-editing of Machine Translation output differ from revision or QA activities in the localization domain? Are translators the best post-editors? Do they need specific experience and training? What guidelines should be given to post-editors? What productivity enhancements can be reasonably expected? Why do translators seem to dislike this task? I saw her speak at LRC (one of the best conferences I attended last year) and she has great insight and advice to offer on this subject.

And if that weren't enough to make you sign right up, there are also some great sessions on sales strategies for LSPs from none other than Renato, as well as on localization basics and next generation localization research from CSA, CNGL and the Gilbane Group. And all at a fraction of the cost of larger conferences. Check out the ELIA site for more details.

I hope to see you there and for those of you who don’t know, I am easily persuaded onto the karaoke floor. No alcohol required but unfortunately this is not because I necessarily sing so well. I went to a Jesuit (Boys only) School in India and had a teacher of mixed Indian/Portuguese (Goanese) extraction who used to exhort:

“Sing with gusto boys! Don’t worry about the notes, you will find them.” 

This is advice I have taken to heart, as my karaoke friends from the IMTT Cordoba 2009 event will also tell you. In spite of having nothing more than a laptop with tiny speakers to provide musical backing, we sang with gusto till dawn and indeed we did eventually find the notes. ;-)


  1. Professional organizations are made up of LSPs and/or freelancers who are both scared of MT and of technology in general. The translation industry is so conservative as to make a redneck seem a revolutionary in comparison.
    You cited STE. There are people in this industry who fight controlled languages for being a latch on creativity, for being obsolete, and so on and so forth, and who are against plain language in general. They have no interest in the user, and they are so vain in their closed-mindedness that they are the ones who are actually scary.
    I have never seen a technology-focused conference in this industry with standing room only. Organizers usually struggle to the end with registrations insufficient to cover expenses.
    Why are blogs, fora and Facebook groups rallying against MT (and even TMs) so crowded with passionate, angry "professionals"?

  2. Kirti, I wish my music teacher in elementary school had been half as wise :-) Maybe I should have gone to Catholic school after all.

    As you know, I don't think much of the whole bag of "you can all look forward to a future of post-editing" nonsense, and I think the average technical writer has about as much chance of controlling his language as Bill Clinton does of controlling the zipper on his jeans.

    Nonetheless, predictions of the future have a way of coming true, but on a timeline we don't anticipate and with success as well targeted as wishes granted by a crafty djinn.

    The success or failure of "firehose MT" in managing the heavy flow of bulk information on the Internet and elsewhere interests me only as a matter of curiosity, not for professional practice. I am, however, interested in what the future might hold for MT-like tools that can be applied by individuals with severe restrictions due to confidentiality, resources, etc. These would have to integrate well with environments I do use for my professional work. I've had a little taste of such things in various translation environment tools, but the state of the art at the individual level is really a joke. A lot of this content simply must not be brought into networked environments in any way, which excludes most of the options commonly discussed.

  3. Sounds like an interesting session. Are there going to be videos or recaps of it for those who can't attend?

  4. Shall we come prepared for a late night in Dublin?

  5. @Luigi I agree that "controlled language" can be mind numbing in itself. Personally I believe that "Simplifying" the source material is to simply just make it readable and consumable by humans. A lot of technical material could use this help. And you are right the technology has failed to deliver enough value to create the standing room only situation. Hopefully step by step we make enough progress and add enough value to reduce the anger in the blogs.

    @Kevin Actually it was High School in my case. The Jesuits in India are great educators.
    You are right that the technology has not reached the individual translator in a meaningful way yet. This is why I believe a dialog with professional translators is critical to get the right tools out there. TM has some acceptance but many don't even find that of value. Tools have to leverage the individual translator before we can say we have arrived. Data-rich tools like Linguee are an example of next generation tools that may be helpful. You are precisely the kind of person that the technology has to make sense to. We do have a vision but the progress is slow and hard.

    Also the future has to be a combination of many modes of translation including TEP, HT, Custom MT + PE, Custom MT and the free online MT we see all over the web. They all have a place, and the best MT systems will be steered and driven by expert linguistic feedback. The audience and the content will determine how it will be done (based on balancing time, money and required quality.) My point is that technology just gives you more options to get this done.

  6. @Kirti: "Personally I believe that "Simplifying" the source material is to simply just make it readable and consumable by humans. A lot of technical material could use this help."

    Are you seriously holding your breath expecting this to happen?
    My source texts often contain words which even Google has never heard of, but the authors almost always believe that they are standard technical terms in the field. Translating such freak German into standard German would be an expensive ***translation*** job in its own right (even if you find someone who can actually handle it).
    My faith in the feasibility of controlled language is similar to Kevin's (although I would select different imagery to illustrate it).