Monday, August 1, 2011

Translation Crowdsourcing

The phenomenon of a crowd (or community) stepping forward and doing real translation work, often for no direct financial compensation, is something that troubles many in the professional translation world, mostly because they see this activity as work being taken away from legitimate professionals or as a ploy to reduce prices. While in some cases their fears may actually be justified, in the most successful uses of this approach I think it is clear that this is not true.

As I have said before, the growing momentum in the volume of content demands new production models. This momentum, which exists both in the corporate world and in the world at large, simply cannot be addressed by using ONLY traditional professional translation production models. New needs require new approaches. For those who insist that the data deluge is a fiction, the rest of this post is probably irrelevant.

Another key driving force behind new crowdsourcing initiatives is the need to engage and interact with users in new markets. In new markets, having active conversations with locals is key to building brand awareness and really learning about local needs and behavior. This is frequently a more important driver than cost containment, contrary to what many in the industry think.

In some cases, were it not for crowd-based localization efforts, the work would simply not be done, as it is not economically feasible to undertake for “lesser” languages the same efforts (and expenses) that are made for FIGS/CJK. Thus crowdsourcing is sometimes emerging as a means to get “lesser” languages done.

If we look at some of the most successful examples of crowdsourced translation in practice, we can see that they have many, if not all, of the following elements in common.

A Crowd/Community That Is Invested:

·         TED Open Translation Project – Volunteer translators are often inspired by the content and wish to share it with their friends and countrymen. June Cohen has said that the volunteer translators, because of their passion for the subject and often their subject matter expertise, in general do better quality work than many of the paid professionals who initially did a few translations to seed the project. This effort has now enabled over 20,000 translations into 80+ languages of really challenging material. Many professionals also volunteer because they believe in the high general value of the content.

·         Facebook – The translators are users who wish to build and expand the friend community in their particular language group. This effort has enabled Facebook to grow rapidly in international markets and achieve very rapid coverage across 60+ languages. Had they used traditional means to do this, it might have taken them years to get to the same point. Critics also often miss the point that engaging real users in the translation task encourages rapid growth of the user base, as “user translators” bring friends into their network.

·         Microsoft – The contributors are MVPs (top accredited reseller partners) who wish to make technical support knowledge about Microsoft products more easily and widely available in their markets. Their efforts are rewarded by lower support costs and also by an increase in product sales as more and more users look for self-service knowledge base information. Microsoft has been a trailblazer in making large amounts of knowledge base content available via MT, and is now adding crowd-based editing to raise the quality of the translated information. Thus the most used and vital information tends to get the most attention, which benefits all users.

·         Asia Online – Student users provide corrective feedback to keep improving the translation quality of Wikipedia and other knowledge content that is initially translated by highly customized MT engines and paid translators. The students themselves will be the primary beneficiaries of this content, and their efforts will enable them to access high-quality educational information. The volume of this information will likely increase a thousandfold.

·         Yeeyan – The site has 150,000 registered users, who collectively translate 50 to 100 news articles every day from English to Chinese. Since its inception in 2006, the site has grown into a key gateway for Chinese speakers who want to follow international news. It has been so successful that it has attracted the attention of major news sources like The Guardian and ReadWriteWeb. Yeeyan is focused on addressing the problem of ghettoization of information by language through community collaboration, where members both identify interesting content and help to translate it.

·         Adobe – This is a much more carefully managed effort designed to engage influential users, partners and customers to help provide relevant information for the broad Adobe user community in China.

·         Twitter – The translation center asks Twitter users, all volunteers, to help translate Twitter's interface into various languages. Once the basic support pages are translated, a select group of the "most active" translators are invited to work with Twitter to "maintain localized versions of the service." Twitter boasts that its translation center has 200,000 translators, and that the localization process for Dutch and Indonesian took just one month from the first call for involvement to its announcement. The availability of its interface in multiple foreign languages is bound to increase its popularity and effectiveness not only for online marketing but also for social and political activism.

Software Infrastructure That Facilitates Contribution & Participation:

In all of the cases above, the companies involved in crowdsourced translation initiatives need to invest in software that enables tasks to be parceled out, evolves as tasks change, enables efficient administration, maintains quality, gathers feedback, and builds self-sustaining ecosystems. The tools developed by dotSUB, Lingotek, Yeeyan and Asia Online are all unique collaboration and translation workflow management tools that enable these kinds of initiatives. They make little or no use of industry standard tools like Trados and TMS because of the highly proprietary, rigid and archaic nature of these tools. These new-generation tools are much more open and are designed to evolve with technical and process advances on the internet today. It is quite possible that these community efforts could produce tools that supersede many of the tools in use today, as these new tools focus on collaboration and sharing assets to enhance the efficiency of the collaborative translation process.

The Importance of Engagement and Higher Purpose:

It is interesting to note that translation is not the primary business of any of the companies listed in the examples above. In every case the goal and intent is to make more information available faster. Even for many of the corporations that are exploring crowdsourcing, the rationale is often more about customer engagement than cost savings. It is also important to note that none of these initiatives could even be attempted without the use of automation and large-scale community support, and that they are enabling initiatives that would not be possible otherwise. This is also true for Facebook, which still had to use professionals to translate legalese that its community was not interested in translating. The role of communities is likely to increase in the future as more of the world comes online.

As we move forward, we will see much more video and other rich content come online, and it is already clear that the old approaches will not enable us to make this new content multilingual in an effective time frame. Crowdsourcing and automated translation will be necessary tools for any organization that seeks to communicate across the globe. As Clay Shirky has pointed out, the ‘cognitive surplus’ of the online population is a force that can be harnessed under the right circumstances and for the right purposes. It is likely that the professional translation world is going to see significant disruption in the coming years, as innovators figure out how to build sustainable models around community engagement, technology and organizational mission. However, as we have already seen, there is much that the crowd has no interest in doing, and we should expect that this is not likely to change.

Crowdsourcing is here to stay. It is a new mode of production that enables organizations to undertake high-volume projects, engage with users and partners more deeply, and participate in multilingual social networks where so many branding impressions are being formed. Managing crowdsourcing is also a major opportunity for savvy LSPs who have processes in place to recruit and manage the collaboration of dispersed volunteers and contributors.