Monday, June 20, 2016

The Larger Context Translation Market

This is another post inspired by my recent visit to Rio and dialogue at the ABRATES conference, where translators were eager to engage with MT in a meaningful way, and asked many questions about where the most interesting translation challenges were. In several conversations about “the translation market” I had a very clear sense of how there really needs to be a larger perspective on what this means, as the most interesting opportunities with MT tend to lie outside what is generally understood as the translation industry. 

For most of the people who attend “translation industry” and localization conferences, the most trusted description of the industry is the market that Common Sense Advisory (CSA) describes as the Language Services Market: a market where translation agencies provide translation, localization and interpreting services to buyers for a fee. CSA estimates this market at $38.16 billion in 2015, with Lionbridge proudly claiming to be a “perennial list-topper” and the largest language service provider (LSP). Their PR piece provides a clear description of what the CSA market definition covers. This link lists the Top 20 LSPs globally, measured by total revenue. Here is another view of the CSA market summary that shows the geographical concentration of the industry.
*The Language Services Market 2015, Common Sense Advisory Research, June 2015
This CSA-sourced graphic from the Lionbridge blog describes the fastest-growing segments in the language services market. Clearly, translation, software and website globalization are at the top of the growth list for this type of paid translation service.

The following graphic looks more closely at the focus of the LSP translation industry: the kind of content it handles, and the tools and skills most relevant to translating this type of content. Thus, project management is the core business function, and the most important tools are translation management systems (TMS) and translation memory (TM).

While some LSPs do use MT, it is generally not a mission-critical tool. MT is used if the LSP is able to pull together an MT system that improves productivity and reduces costs, and its adoption is largely reactive, i.e. often because the client insisted. But it is important to understand that the focus is still on the same kind of content shown above. The translation industry has had mixed success with MT, and maybe a few systems do become integral to the overall translation production process. But most do-it-yourself LSP MT initiatives fail, or wallow in a kind of confused and isolated geekdom with mediocre results. MT systems that consistently produce excellent output are the hardest to develop, so it is somewhat ironic that those who understand the least about how the technology works try to build the most capable and efficient MT systems. The MT experience that Jost and other translators describe in blogs, variously labeled PEMT, MpT, MT+PE etc., is presented as the great evil by IAPTI, or as a horrid commoditization of the work of translation by many others. Most often these translators are working with MT systems at arm's length, and have no ability to steer or guide the MT system development to make it more useful. Hopefully, the notion of Moses-as-instant-magic is now widely understood as a limited-success strategy, and the more savvy enterprises and LSPs leave it to experts, who also struggle to meet these goals of consistent, high-quality output. Good MT systems will always take time, expertise and articulate linguistic feedback to develop.

However, I think the new Adaptive Dynamically Learning MT that Lilt is producing has a very bright future with smart LSPs, and provides a platform to transform the MT experience into a much more predictable and worthwhile endeavor and will also allow translators to be much more engaged and involved in steering the MT system.

The Larger Context for Value Added Translation


Though the “translation industry” MT experience is mixed, I would argue that MT has been responsible for driving revenue or definable economic value in a variety of non-traditional scenarios, on a scale that dwarfs “the translation industry” as defined above. Interestingly, the most successful implementations of MT are done by global enterprises who often still work with LSPs for the static, structured content, but for the higher-value, unstructured, dynamic content are largely choosing to do it themselves (sometimes with some expert help) or to build internal teams to address what they see as a long-term and strategic need to enhance international business initiatives. It is useful to consider some case studies of these strategic MT initiatives by enterprises to understand this better.
The graphic above shows the volume of words that Google translates every single day with their MT systems, as reported at Google I/O in April 2016. To put this in context, I saw a Lionbridge presentation a few years ago where the CFO said they translated just over a billion words that year (2009). SDL, probably the most MT-savvy LSP and the most active with MT, recently claimed they are doing 20B words per month through MT. So it is quite possible that Google alone translates more words in a week than the whole “translation industry” does in a year. When one considers that perhaps over 90% of Google’s revenue (~$78B/year) is generated from advertising linked to keywords, it is quite possible that Google derives tens of billions of dollars from their MT technology initiative! It also gives them very specific intelligence on what matters to people across the world and what cross-language content is the most sought after. The economic value of this knowledge is significant, and hidden in the advertising revenue they report. This knowledge of what matters across the globe is something the “translation industry” and SEO experts would love to have.

Microsoft is another example of high value derived from MT. Their initial entry into MT was, like Google's, related to search, but additionally they had a massive knowledge base in English for their software products that was difficult for their substantial global customer base to access efficiently. Thus, while Microsoft probably spent hundreds of millions on “translation industry” services for static content, this only covered a tiny fraction of what they needed to translate. Given that Microsoft gets as much as 70% of their revenue from non-English speaking countries, translation of all kinds of product-related content is important. Making more technical support and customer care content rapidly multilingual was an imperative for executives who cared about the customer experience; it also generated huge savings in support costs and dramatically improved the user experience for the non-English speaking customer. The software industry measures the value of self-service content by something called deflection cost. So, if they can deflect a call to the support center by using MT to make more knowledge base content available in more languages, they can save possibly as much as $100M+ per day, given the size of their user base and the actual volumes of knowledge base access. Add maybe another 50B words/day that their Bing MT does for the random internet user, and we have another stream of economic value coming from search words that generate advertising revenue across the globe. Their recent Skype STS initiative will also likely yield great benefit and new ways to monetize their translation technology expertise.
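To put the volumes quoted so far on one scale, here is a rough back-of-envelope sketch in Python. It uses only the figures cited in this post (Lionbridge at just over a billion words in 2009, SDL at 20B words per month through MT, and my guess of maybe 50B words per day for Bing MT) and assumes a 365-day year and constant rates. The labels and the `weekly_volume` helper are illustrative names of my own, and none of these numbers are audited data.

```python
# Back-of-envelope comparison of the word volumes quoted in this post.
# Every figure is a rough, self-reported or estimated number, not audited data.

DAYS_PER_YEAR = 365
MONTHS_PER_YEAR = 12
BILLION = 1_000_000_000

# Annualized word counts (labels are illustrative, not official metrics).
words_per_year = {
    # "Just over a billion" words, so 1B is used here as a lower bound.
    "Lionbridge 2009 (as recalled from a presentation)": 1 * BILLION,
    "SDL via MT (20B words/month)": 20 * BILLION * MONTHS_PER_YEAR,
    "Bing MT public use (maybe 50B words/day)": 50 * BILLION * DAYS_PER_YEAR,
}


def weekly_volume(words_per_day: int) -> int:
    """Words handled in one week at a constant daily rate."""
    return words_per_day * 7


for label, total in words_per_year.items():
    print(f"{label}: {total / BILLION:,.0f}B words/year")

# Compare one week of the estimated Bing volume with a full year of SDL MT.
bing_week = weekly_volume(50 * BILLION)
sdl_year = words_per_year["SDL via MT (20B words/month)"]
print(f"One week of Bing MT vs. a year of SDL MT: {bing_week / sdl_year:.2f}x")
```

Even on these rough inputs, a single week of public MT use at the estimated Bing rate exceeds a full year of MT volume at the most MT-active LSP, which is the kind of gap the rest of this post is about.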

When you consider that both Intel and Adobe also use the Microsoft Hub MT to translate knowledge base support content, the deflected cost savings impact is easily worth hundreds of millions of dollars a day. This is not even considering the many other IT companies doing this on their own using other MT technology, e.g. Symantec. The “translation industry” has a very small footprint in this kind of translation activity, which is now often considered mission-critical and probably involves several billion words per month.

The online eCommerce market is another example of economic value generated by competent MT efforts that is off the books of the “translation industry”. eBay decided some years ago that emerging economies were a huge opportunity worth strategic attention. So they acquired MT technology and built a competent MT team that had a strong collaborative linguistic component. Based on presentations they made at the AMTA 2014 conference, it was clear that their initial efforts to make more Russian content available had a huge growth impact in the Russian market. It would be safe to say that the value of the impact is probably in the hundreds of millions of dollars of new revenue, from all the new markets that they have been addressing using MT. It is also really worth taking a look at what is involved in doing this. It takes focus on solving new kinds of translation problems, and on making sure the translation problems you solve do enhance the value of your MT efforts. This last link shows the special issues related just to Brazilian Portuguese. We should note that most competent MT efforts of any scale move carefully, one language at a time, rather than trying to do 20 or 30 in a single go. We also see that Amazon acquired Safaba in 2015 and possibly has similar plans to make catalogue content multilingual to drive bigger volumes of international business. Alibaba and Baidu also have eCommerce-focused MT efforts well underway, but fewer details are available. The net economic value of all these types of MT initiatives is probably in excess of $20B per year, by my very rough estimates.

Recently Facebook surprised the world by announcing that they have a substantial MT effort underway, after using the Microsoft Bing MT technology for several years. When asked why they did this, Alan Packer said that scale is one reason Facebook has invested in its own MT technology. The other reason is adaptability: they wanted technology that was optimized for their very specific needs. Facebook is now serving 2 billion text translations per day. The problem they had with Bing, they claimed, was that it was built to translate properly written website text and did not do well with the slang, metaphor and idiom typical of Facebook comments. Packer described Facebook language as “extremely informal. It’s full of slang, it’s very regional.” He said it is also laden with metaphors and idiomatic expressions, and is riddled with misspellings (most of them intentional). Additionally, as in the rest of the world, there is a marked difference in the way different age groups communicate on Facebook. They know that already 50% of Facebook users regularly use auto translation. This user group will only grow as more people come online. Packer says that access to the translation product leads users to “have more friends, more friends of friends, and get exposed to more concepts and cultures.” The more people across the world that Facebook users can connect with, the longer they’ll spend on the social network, and the more revenue-earning ads they’ll see. The economic value of this is probably several billion dollars a year. Emerging social networks that are global will need to address the same problem.

So what we see is that a select few companies are generating more economic value from solving very specialized and much more challenging translation problems than the whole gross revenue output of the “translation industry”. Solving large-scale translation problems using MT is apparently a very high-value proposition, and none of the global enterprises mentioned above considered going to the “translation industry” to help solve these really challenging and complex translation problems. This is probably because it is very clear to any strategic observer at these companies that most LSPs lack the vision, skill, interest and competence to solve these types of translation problems. We can perhaps even generalize the core requirements for a larger set of global enterprises, as shown below. There is a real mismatch in terms of skills and focus between the broader translation needs of global enterprises and the service focus of the “translation industry”. While the static content will likely remain important as a mandated requirement, it is not, in my opinion, where long-term corporate value is built, either for the enterprise buyer or for the LSP.
To illustrate this further, let us consider the investor sentiment on value that can be gleaned from stock market data. While this might be a stretch of logic to some, I think we can fairly assume that investors value solutions to certain kinds of translation problems more than others. Facebook has seen huge growth in mobile ad revenues, and it seems that they have recently been taking ad share away from Google, so these Market Value/Sales numbers reflect very active market trends. The investor sentiment is that Facebook is probably better poised to gain $$$s from the next wave of internet adoption than any of the other companies listed in the chart, as they climb beyond 2B users, very few of whom speak English or French or German. As one analyst says: "Advertising budgets are moving towards Facebook, and it seems to be a winner in the online advertising world with measurable results." Contrast this with the investor sentiment for large LSPs. Surely it has something to do with long-term promise and potential. To me this suggests that investors in general view the LSP focus as lower in value, but understand that translation can produce huge leverage in the right hands.

So to those wonderful translators at ABRATES who asked me what kinds of MT projects to get involved with, I would say the following:
  • Focus on companies who are solving interesting translation problems. They will have the most rewarding work and it might involve stepping out of the translation industry.
  • Stay away from LSPs who sell the Moses Mirage, as this is likely to be the worst PEMT experience. MT systems that don't want translator feedback at a pattern level are not likely to be a professionally satisfying experience.
  • Work with people (LSP/Enterprise) who allow and want you (translator/editor) to provide feedback and interact with the MT development process.
  • Learn about Machine Learning and AI in other domains, and develop skills in Regex, Corpus Analysis & Corpus Editing, and Pattern Identification to be considered valuable.
  • Explore Adaptive Dynamic Learning MT like Lilt (maybe others will appear soon) to understand how MT can work with you and for you while you wait for the right opportunity. This is truly a paradigm shift that is worth at least some experimentation to see how the translator desktop could evolve.
  • Ease up on the need to have everything on the desktop. The future of Machine Intelligence solutions will require big data and big computing, so the best and most sophisticated tools will by definition only be available in the cloud. Lilt is a first generation example of this, others are coming. The cloud makes sense and allows new, more effective ways to solve old problems and is not a bad thing.
And for those who think the four companies above are the exception rather than the rule, it is worth noting that we are just beginning with what is possible to do with Machine Learning and Artificial Intelligence. Neural MT is just beginning and could drive a whole new wave of higher quality and more adaptive MT. The Machine Intelligence market is still nascent and we will very likely see big data + big computing + smart algorithms come together, to solve problems that we thought were beyond the scope of computers just last year.


  1. Glad to read you again...
    The problem with MT is that many customers expect vendors can help them use MT or use it on their behalf. Unfortunately, most LSP have not the necessary know-how and competences to help their customers and either claim, and pretend, to have them - and fail - or that MT is of no use or unfit for those customers asking for it.
    This is due to a widespread attitude that is the legacy of an obsolete academic approach and is based on a wrong belief in the importance of translation, per se, rather than with respect to the customer's business.
    LSP generally lack of strategic vision extending outside the realm of translation.
    I just published a short post mostly using the same approach and content (

    1. Luigi

      I will be writing very regularly for the next few months and have some subjects in mind.

      If you would like to cross post your big data post here as well I would be happy to do so.

      Happy Summer.


  2. Great and useful post - thank you guy! This is just what I needed.

  3. Wonderful article! Thank you so much!

  4. In case you missed it the traditional market has grown and I am sure LIOX will publicize their ranking again. But this again just emphasizes that the high value translation market is OUTSIDE the traditional market.

    Global market for "language services and technology" will be GT US$40 billion in 2016 via @CSA_Research #t10n

  5. This is an update to Facebook's earnings and it clearly illustrates why they have the most leveraged valuation -- their earnings are driven by Mobile and Video and they are having difficulty finding ad space to sell

  6. They know that already 50% of Facebook users regularly use auto translation. - There is no way this figure is correct.

  7. (to elaborate on my earlier comment): apparently according to Facebook, over 50% of users SEE auto-translated content. That does not mean they UNDERSTAND or APPROVE OF or even CHOOSE auto-translated content. And we also know that this 'content' is largely ads.