
Thursday, September 16, 2010

Asia Online In The Conference Season

Asia Online principals will be very active in the conference season this autumn as we go and share our vision about best practices in using machine translation and our ideas on the continuing evolution of this technology. Please don’t hesitate to introduce yourself if we don’t recognize you. 
Dion Wiggins will be speaking at the LRC conference in Limerick, Ireland in a special free workshop that will be held just before the actual conference. This workshop will explore a number of elements of machine translation. Dion is a great speaker and I am sure this will be an entertaining and thought-provoking three hours. He will also be on a panel during the conference.
Section 1: The future of Machine Translation – What MT means to Enterprise and Language Service Providers
This session will explore key trends in the MT industry and address many misconceptions about machine translation. We will explore a variety of MT concepts and technologies and provide a core background on MT, past, present and future. We will investigate a variety of attitudes towards MT, and concepts relating to translation overall that often get blurred when it comes to MT. We will look at a variety of models for LSPs and enterprises to use MT, and we will also explore how LSPs and enterprises can monetize MT and integrate it into their business. We will explore a mystical word called quality and examine what it is and what it means to various organizations in a variety of situations.
Section 2: Asia Online Language Studio™ Translation Platform.
This session will expand on the first, with a live demo of Asia Online Language Studio™ Pro desktop tools and Language Studio™ Enterprise translation platform. We will explore best practices in customizing a translation engine and go through, one at a time, the key steps that Asia Online performs in order to deliver a high quality translation engine. We will look at the creation of training data, aligning text from multiple documents in different languages, and cleaning the data to ensure only high quality data is used to build the engine. Finally we will look at how to improve an engine’s quality – a process that is unique to Asia Online and shows the true impact of clean data and how the normal process of editing can rapidly improve the overall quality.
Attendance is free but people must register by emailing lrc@ul.ie
Kirti Vashee (that would be me) will be doing a detailed session on How to Get Started with MT at the ELIA Networking Days conference. I will also be presenting a keynote on the growing impact of MT on the professional translation world and what this might mean. There is also other great content on MT at this same conference presented by others. And thanks to @ParaicS I will also spend a day with CNGL researchers to share and exchange ideas on translation technology and improving collaboration in localization. Topics to be covered in the detailed session include:
  • MT Technology Overview – RbMT, SMT and Hybrids
  • Detailed SMT technology overview
  • Skills required to succeed with MT
  • Rapid Quality Assessment Of MT Output
  • Post-editing Practices & Pricing Approaches
  • New Revenue Opportunities Created by MT
  • Getting Started – Key Steps & Considerations
We are also doing two presentations together with Moravia. One is at the TAUS Annual User Conference, on Machine Translation in the Imperfect World, where we discuss how SMT engines can be developed when bilingual (TM) resources are scarce. We will be doing an expanded version of this session at the AMTA conference, where we discuss how strategies differ for different data availability scenarios. We are going to skip Localization World as there is very little focus on MT.
In mid to late October we will also participate in a road show on the East Coast (Boston, NYC, DC) with our partners Milengo, Acrolinx, Clay Tablet and Lingoport where we will describe end-to-end solutions for global enterprises in a “high personal interaction” seminar setting. Watch for announcements coming soon.
In early November we will also participate in the tekom tcworld conference, where I will speak together with Across Systems on the integration of MT and crowdsourcing into Enterprise translation processes. I will also present how MT and post-editing can be used to leverage technical customer support efforts and make much more information available in self-service environments to increase customer satisfaction and build customer loyalty.
Finally, I will also be involved in a GALA Webinar on December 16th which will be an abbreviated and probably updated version of the Dublin presentations. The links and info will be up as soon as I send in a description to GALA. (Sorry Amy) And I have just realized that I will also need to find a way to participate in a Proz virtual conference on October 13th while I am in Dublin. Maybe I could get a bunch of the CNGL people involved with this too, if the time zone differences permit?

Thursday, September 9, 2010

Emerging Technology for Community Translation

I have been talking about the changes going on in the translation industry for a while, with MT being a key technology capability needed to manage the increasing volume of content. I suspect that MT will become more important and quite possibly supersede translation memory (TM) tools in many applications. However, I also believe that the best MT systems will be closely connected to translator feedback, which of course requires some degree of integration with workflow and editing applications.

I have felt that other translation tools are also likely to change to accommodate these new market requirements. TMS tools were originally developed and optimized for relatively static content, and have generally existed in isolation in localization departments. Over time they have started to get connected to content management systems (CMS) to increase production efficiencies. Very recently we have started seeing some of these tools becoming more collaborative and reaching out to translators to connect them into the content creation and translation management infrastructure. Automation becomes much more imperative as the volume of content that is translated grows. CSA has recently suggested that LSPs without a technology and automation strategy will become an endangered species. While there are a few new initiatives out there that help with these "new" problems, one of the more exciting ones, I think, is Lingotek.

I had an opportunity to talk to Rob Vandenberg at Lingotek about his collaborative translation platform, or CTP as they call it. This is a web and cloud based platform that requires no local software to be installed. To my mind, this is a next-generation tool designed from the outset to incorporate and leverage TM, terminology, various MT systems and both internal and community feedback on translations in one common management framework. Lingotek is to some extent a TMS, but much more community focused and content stream focused than any of the traditional TMS tools. Also, like many of the best products out there, it evolved out of a close collaboration with a customer, Adobe in this case, which by the way also uses a traditional TMS product quite intensively.

This differs from traditional TMS in that it “embeds translation tools within the content view” and is also designed to allow many different user groups (customers, partners, resellers, community and professional translators) to all work together on the same content.

Lingotek is focused on Community Content (like Dell IdeaStorm) and facilitates the translation of this content by community, MT or professional translation, and provides linguistic assets and a translator workbench to anybody involved in this translation effort. I have stated before that I think that conversations with real customers and partners are the future of building international markets, so I would expect that Lingotek and others like them will become much more important than the tools that focus on the traditional SDL (software and documentation localization) market. The graphic below shows what the translator workbench screen looks like and how a user can override an existing translation.
[Screenshot: Lingotek translator workbench]
Lingotek is connecting to a growing set of community content creation & collaboration tools like Jive, Drupal, SharePoint and Oracle UMC, and will soon connect to Alfresco, Telligent and Salesforce. They plan to continue to expand the supported set of “content containers”, mostly driven by the platforms used by their customers.


So content can be categorized as it flows by community administrators, who decide how particular content needs to be translated based on its relative value. For example, in the following hypothetical table, 1 is the least important content and 10 is very important content that can only be handled by professional translation processes.
Content Value Index    Description
1-3                    Blog comments
4-6                    Blog entries
7-8                    Marketing material, Basic documentation
9-10                   GUI, Critical documentation, Product marketing materials

So typically an administrator would route content to one of the following three translation processes, based on an assessment of its value and the required linguistic quality of the translation:

  1. Assign to professional translation
  2. Assign to community translation, or MT + community post-editing
  3. Process through customized MT
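The routing decision above could be sketched roughly as follows. This is only an illustration and not how Lingotek actually implements it: the names `Route` and `route_content` are made up for the example, and the cut-off values are assumptions, since the hypothetical table only says that 1 is the least important content and 10 needs professional translation.

```python
from enum import Enum


class Route(Enum):
    """The three translation processes listed above."""
    PROFESSIONAL = "professional translation"
    COMMUNITY = "community, or MT + community post-editing"
    CUSTOM_MT = "customized MT"


# Assumed cut-offs for illustration only; an administrator would set
# these based on their own assessment of value and required quality.
PROFESSIONAL_MIN = 9
COMMUNITY_MIN = 4


def route_content(value_index: int) -> Route:
    """Pick a translation process for a content value index from 1 to 10."""
    if not 1 <= value_index <= 10:
        raise ValueError("content value index must be between 1 and 10")
    if value_index >= PROFESSIONAL_MIN:
        return Route.PROFESSIONAL
    if value_index >= COMMUNITY_MIN:
        return Route.COMMUNITY
    return Route.CUSTOM_MT


if __name__ == "__main__":
    for label, value in [("Blog comment", 2), ("Blog entry", 5), ("GUI strings", 10)]:
        print(f"{label} (value {value}) -> {route_content(value).value}")
```

With these assumed cut-offs, blog comments would go straight through customized MT, blog entries would go to the community, and GUI strings would go to professional translators.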
The Adobe project in China is an example of how this can work. Adobe user groups in China are creating content in addition to assisting in the translation of Adobe corporate content. Then, in addition to seeing this new and ever expanding translated content on Adobe sites, user groups and partners are also allowed to place this content on their own sites. This flow of content is generally helpful to anybody who is using these Adobe products. This is collaboration at work and I think this is a very cool form of customer/partner engagement. This is an example of a global company talking directly to customers and partners in local markets and making them part of the translation and content creation process. I think that other global enterprises will follow because it simply makes sense. Communities make sense because customers trust voices in the community, and anything that facilitates and expands engagement with local customers can only benefit the global enterprise. This is the stuff that builds customer and brand loyalty.

While I am often mistaken for an SMT evangelist, I am most excited by new models of man-machine collaboration that free information and knowledge and make them pervasive across languages. I am excited by the problem that Lingotek is attempting to solve, as I think it is the most exciting place to be in the global-business-driven translation market:
  • Making Community content multilingual
  • Making dynamic and continuously updated content streams multilingual
  • Allowing both internal and external professional translation resources to work together with community and crowd (think customer, not mob) volunteers
  • Creating tight linkages to traditional translation tools (MT, TM, Terminology) and providing tools for any capable volunteer to participate
I would not be surprised to see community-focused initiatives become more important than traditional localization in terms of value to the final customers. The only other tool I am aware of that comes close to what Lingotek is doing is the new version of Across, where they have added many MT hookups and a crowdsourcing module. I think this need to manage the content flow, triage translation jobs and set up flows where both internal and external resources collaborate will grow in importance as the content continues to grow in volume. Here is an investor take on Lingotek and this is a slide presentation on the collaborative translation platform.

If anybody is interested in hearing the actual conversation we had, it is available here but be warned it is 50 minutes or so and was not intended to be entertaining.

I am planning on writing about crowdsourcing soon as I think it is a much misunderstood phenomenon that will become commonplace in the enterprise translation market. While I don’t claim to be an expert on this – I think anybody with an open mind can see it is inevitable and necessary.