Thursday, June 24, 2010

Going Beyond Transcreation and Finding High Value Translation Work

As I have recently been considering the downward price pressures and increasing commoditization of traditional professional translation work, I thought that it would be useful to consider what high value translation work looks like. The key question in determining value, I think, is whether it is related to how you translate or what you translate.

It is becoming popular to say that something called “transcreation” is what value-added translation is, and many LSPs are seeking to position themselves as transcreation firms rather than translation firms. (I would have chosen “culturally sensitive translation.”)
Common Sense Advisory defines “transcreation” as a process by which new content is developed or adapted for a given target audience instead of merely translating existing material. It may include copywriting, image selection, font changes, and other transformations that tailor the message to the recipient.
Sounds suspiciously like localization, doesn’t it?

I think the word is unfortunate, as it conveys nothing about what is involved, and even people who have spent many years in professional translation often have no idea what is meant by it. Even worse, most potential customers for this type of service will have no idea what the word means. It is like calling a laptop a chipboard input/output mechanism. (As if “localization” was not bad enough.) Quite honestly, I can’t see very much difference between what this word suggests and what most people understand and mean when they use the word localization. You can see some banter about this in the MT meets Transcreation blog entry if you read the comments between Gordon and me. Generally, when real people, customers and friends (not industry insiders) have to ask what a word means, I think you marginalize yourself, and most would agree that obfuscation is not a great sales and marketing strategy. (Why are so many people who are involved with professional translation so averse to using the word translation in describing what they do? Could it be related to how they treat translators?)

LSPs involved with transcreation are generally focused on advertising and marketing communications messages which often have a higher profile than the SDL (software and documentation localization) that most in professional translation are involved with. Since marketing often has higher status than documentation/packaging/localization within most companies, it is often felt that this is higher value work. But is it really?

We see that the world of marketing is undergoing a transformation and what used to be considered critical corporate messaging is increasingly viewed as “corporate-speak” and is not trusted by the end-customers who matter the most. Jeremiah Owyang wrote a prescient blog entry three years ago where he predicted this shift. Several of his readers felt that the essay was important enough to translate, and it is now available in 11 languages. (Fan translation!)

Many others have added to these initial observations and Simon Mainwaring also has an interesting article on The death of corporate websites. The basic thesis of his essay is:
In the not too distant future static corporate websites will be replaced by their social equivalents.
This will happen because more and more consumers are engaged in daily conversations, often involving brands, across multiple applications, platforms and networks, wholly independent of these sites.
As these conversations become increasingly independent of these sites, falling traffic will render them ineffective in their current form. Instead, the online presence of each brand will necessarily expand out into the social space to stay in touch with their audience.
As a result, the online presence of a brand will increasingly become the sum of its social exchanges across the web and not the website that many currently call home.
Of course not everybody agrees, and some say this is more true for B2C than for B2B, but the need to change the website approach is clear to most. Corporations still need websites and they still need advertising, but they also need to understand which information has the greatest value and is trusted, and learn how to create it or connect to it. They need to understand what is necessary to keep these websites relevant. Owyang points out that moving from the website of yesteryear to one that is seamlessly integrated with relevant social networks is an evolutionary process, and he provides a path and road map.

This issue of the relevance of corporate websites matters, because increasingly, customers are making decisions about products long before they get to the corporate website. It makes more and more sense to follow these conversations in social networks as often this is where the highest value content will be. Valuable content is linked to influencers and dynamic conversations that naturally evolve in online social spaces. This content influences purchasing behavior and helps to form brand impressions and build brand loyalty. It is unwise to ignore it, as this is where brands, market dominance and leadership positions are increasingly being built. Remember that the whole point of localization or transcreation is to enhance and drive international business initiatives.

So while one definition of higher value is the extent of transformation during the translation process, I think the more important driver is the value of the content per se. My view of some of the emerging high value content:
  • Conversations that are trusted by potential customers at various stages of the purchase process (e.g. Amazon, C-Net, Orbitz, Travelocity etc.)
  • Conversations and content that help build customer loyalty (this could include support and reseller community content as well)
  • Articulate and unfiltered opinions and reviews on the customer experience (Amazon, Orbitz, Expedia etc.)
  • Leading Bloggers who influence and help form brand impressions
  • Content that is co-created with customers that often facilitates comparison with competitors
  • Content that encourages collaboration with customers and key partners (e.g. Dell IdeaStorm)
So what is necessary to be a translation partner to an enterprise that is focused on this higher value content? These are new and still emerging customer requirements and to some extent really quite undefined, but here is a list that could provide some initial definition and possibly leverage:
  • Understanding of social network platforms (Facebook, Twitter, Blogs, etc.)
  • An understanding of the linguistic characteristics of the new “high-value” content
  • An understanding of MT and other automation tools to enable rapid low-cost solutions to be delivered to meet changing needs
  • Strong MT customization and post-editing skills so that high-value content can be quickly translated and good translation solutions developed 
  • The ability to rapidly turn around the translation of high value content to impact active ongoing conversations
  • Rapid linguistic quality assessment skills to drive automated tools
  • Growing foundation of linguistic assets that go beyond traditional localization content TM and glossaries
I see that leading edge practitioners in professional translation will increasingly help customers solve linguistic problems around making high-value content multilingual as rapidly and cost-effectively as possible. The skills needed go beyond what most localization projects require, as we head into a world where the amount of content/traffic on the web is expected to double every 2 years.

So what do you think? Does this make sense, or would you also prefer to focus on the transcreation of “really honest advertising” like Gordon, who gives you his view here?

Tuesday, June 15, 2010

Machine Translation Themes at Localization World Berlin 2010

I spent last week in Berlin at what is considered the premier conference event in the localization industry and I also attended two other events that were focused on translation automation. About 500 people were in attendance, more if you count the people at the Translingual event.

It is clear that MT has become a much more central issue, and of great interest to the Localization World attendees, as there were several MT sessions and they were well attended. It was also clear that many have started to experiment with the technology, or at least to explore it seriously, and that people really do want to understand how and when to use it effectively.

I thought it would be useful to review and summarize the sessions. I was involved in some of them, so to some extent this is self-promotion, but I will try to make it useful and keep the dialogue going.

MT Pricing – Buyers, Sellers, Developers
This session had several short presentations from the panelists and a detailed introduction from Josef van Genabith of the CNGL on how MT cost/benefit/value could be viewed. The session aimed to provide some insight on how to price MT and post-editing, and to better explain the value that MT could deliver to the various stakeholders. While some perspective was provided on these issues, I think it failed to answer the question that many in the audience had: What rate should I charge for post-editing MT output? The problem with this question is that the answer depends on the quality of the MT system, the skill levels of the editors and probably the volume of the total project. There is not a single clear formula, and MT does require customization effort and quality assessment before production use for maximum benefit. I think this subject will be an area worth further exploration as there is a relationship between the quality of the MT system and the cost/benefit and thus the value of the system. (I think the session was “uneven” as one of the panelists put it.)
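To make that dependency a little more concrete, here is a minimal sketch in Python of one way a post-editing rate could be reasoned about. Nothing like this was presented in the session: the function name, the formula and every number are my own illustrative assumptions. The idea is simply that a better engine lets an editor process more words per hour, which lowers the per-word rate at which the editor earns the same hourly income, and that the remaining saving is then split between buyer and editor. Project volume is deliberately left out.

```python
def post_editing_rate(full_rate, translate_wph, post_edit_wph, editor_share=0.5):
    """Hypothetical per-word post-editing rate (illustrative only).

    full_rate     -- per-word rate for translating from scratch
    translate_wph -- words per hour the editor translates from scratch
    post_edit_wph -- words per hour the same editor post-edits with this engine
                     (a stand-in for MT quality and editor skill combined)
    editor_share  -- fraction of the saving kept by the editor (assumed 0..1)
    """
    if post_edit_wph <= translate_wph:
        # The engine gives no speed-up, so there is no saving to share.
        return full_rate

    hourly_income = full_rate * translate_wph
    # Per-word rate at which the editor's hourly income stays the same.
    break_even_rate = hourly_income / post_edit_wph
    saving_per_word = full_rate - break_even_rate
    return break_even_rate + editor_share * saving_per_word


if __name__ == "__main__":
    # Made-up example: 0.20 per word, 400 words/hour from scratch,
    # 800 words/hour when post-editing a well customized engine.
    print(round(post_editing_rate(0.20, 400, 800), 4))
```

The point of the sketch is not the particular numbers but that the rate cannot be set until somebody has measured how fast editors actually work with a given engine, which is why quality assessment has to come before pricing.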

The feedback I got on the session was probably 40% positive, 60% negative and many said the discussion was derailed when one of the panelists said that MT should be “free” since it only costs the electricity to run. This ignores the real effort and skill required to get a good MT engine in place and this kind of specious argument could logically extend to saying all software should be free, since it too only requires electricity and a computer to run.

MT in the Real World — Successes, Challenges and Insight from Teams of Customers and Providers
I felt this was one of the best MT related sessions (even though I had doubts about the format beforehand) as it provided both the customer and the vendor perspective on the same situation and described the following issues in three different “real world” situations:
  • The rationale for MT use
  • What were the challenges during the startup phase?
  • What were the costs and metrics used to measure success?
  • Descriptions of the ramp-up experience and potential expansion and scalability outlook
  • Brief overview of the results in terms of MT quality and business benefits
  • Some description of things that went wrong and what could have been done to avoid this
  • Lessons learned and recommendations for best practices
In this session I made a comment, in an attempt at humor, about my customers’ unsuccessful evaluation of a ProMT Russian engine for patent translations. (It was not my intent to impugn this product, as many will tell you that it is a fine Russian MT engine.) I also wanted to point out how critical a major terminology effort is in a patent MT application, as I was aware of the 400,000 term effort made for the Japanese engine that we had built. Anyway, I thought I should clear this up as many seemed to interpret my comment as a deliberate and intended slam on the competition. It was not.

Optimizing Content for Machine Translation
This was another MT session that I think had very high quality content, though much more technically complex and perhaps more demanding of the audience. The slides for this session are worth a look as they are dense and packed with information.
Karen Combe of PTC outlined several kinds of typical TM practices that can be problematic for SMT. She gave specific examples of the kinds of issues I highlighted earlier. This helped to highlight that TM in its natural state may not be quite ready to pour into an SMT training engine.

I found the contrast between Kerstin Bier and Olga Beregovaya in getting data ready for the MT engine very instructive. It showed how fundamentally different the SMT approach is from the RbMT approach, with very specific and concrete examples of the data preparation strategy. It was interesting to see that some of the tags and TM metadata were very useful to an RbMT engine even though they could be a problem for SMT.
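For readers who want a feel for what "getting TM ready for SMT" can involve, here is a minimal sketch in Python. It is not the pipeline any of the presenters described; the tag patterns and the function name are hypothetical, and real TM cleaning involves far more than this. It only shows three common-sense steps: remove inline tags and placeholders, normalize whitespace, and drop empty or duplicate segment pairs.

```python
import re

# Hypothetical inline markup styles: XML-like tags (<b>, </b>) and
# numbered placeholders ({1}). Real TM exports vary by tool.
TAG_PATTERN = re.compile(r"<[^>]+>|\{\d+\}")


def strip_markup(text):
    """Remove inline tags and collapse runs of whitespace."""
    # Tags are deleted outright; this assumes they wrap words rather than
    # separate them, which will not hold for every TM.
    return " ".join(TAG_PATTERN.sub("", text).split())


def clean_tm_for_smt(segment_pairs):
    """Return cleaned, de-duplicated (source, target) pairs."""
    cleaned = []
    seen = set()
    for source, target in segment_pairs:
        pair = (strip_markup(source), strip_markup(target))
        if not pair[0] or not pair[1]:
            continue  # one side was only markup or empty
        if pair in seen:
            continue  # exact duplicate after cleaning
        seen.add(pair)
        cleaned.append(pair)
    return cleaned


if __name__ == "__main__":
    tm = [
        ("Click <b>OK</b>.", "Klicken Sie auf <b>OK</b>."),
        ("Click OK.", "Klicken Sie auf OK."),
        ("{1}", "{1}"),
    ]
    for source, target in clean_tm_for_smt(tm):
        print(source, "|", target)
```

Note that this is exactly the kind of step that would throw away information an RbMT engine might find useful, which is the contrast the two presentations brought out.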

Melissa Biggs of Sun (Oracle) and Jessica Roland of EMC provided some examples of unsuccessful and successful uses of “controlled language” technology. A great session, filled with useful information.

TAUS Data Association Update
While this session was not strictly focused on MT, MT was mentioned often. A large part of the session focused on why the panelists had joined the TDA, but there was also some useful information:
  • TDA currently has 2.5B words of TM and hopes to double this in the next year
  • TDA currently has 70 members and is trying to make it easier for smaller members to join
  • They are trying to get more open source tools available to help members process the data more easily
  • They have annual operating costs of about $500,000
The three main uses of TDA data so far have been:
  1. Monitor terminology use and practices across an industry
  2. TM Leveraging
  3. Provide larger mass of training corpus for SMT (But use with care after cleaning and normalization)

Translingual Europe 2010: International Conference on Advanced Translation Technology was held on the day before and had much about MT initiatives across the world and especially in the EU. Dave Grunwald has provided a good summary of this event in his blog.

The EU and the DFKI are working hard to further the state of MT and related language technology and I think this conference could become a source of interesting initiatives. The moderator of my group of presenters threw us all off balance by suddenly announcing that the presentation time would be one-third shorter than we had assumed to that point, so some of the presentations looked really hurried.

The presentation by Microsoft on their effort to develop the Haitian Creole system and the user presentations by the EC, EPO and Symantec were very interesting. It is also good to hear that the EC will be funding more research to advance the state of the technology and that they are targeting small companies in particular, as Kimmo Rossi stated in his opening presentation.

I hope they make Translingual an annual affair, loosely linked to Localization World, as it has great promise of becoming a major MT and language technology event, especially if it becomes a two day event with one day focusing on policy issues and the second day focusing on practice. I also had the good fortune to celebrate Hans Uszkoreit's birthday earlier in the weekend at the Wasserwerk facility, which was also the scene of many interesting discussions on broader language technology issues. The open forum approach brought forward many interesting ideas at the "Berlin Theme Tank" proceedings and Hans was clearly a driving force behind the discussions.