
Thursday, September 24, 2020

NiuTrans: An Emerging Enterprise MT Provider from China

This post highlights a Chinese MT vendor that I suspect is not well known in the US or Europe currently, but that I expect will become better known over the coming years. While the US giants (FAAMG) still dominate the MT landscape around the world today, I think it is increasingly possible that other players from around the world, especially from China, may become much more recognized in the future.

One indicator that has historically been reliable for forecasting emerging economic power is the volume of patent filings in a country. This was true for Japan and Germany, where voluminous patent activity preceded the economic rise of those countries, and more recently this predictor has also aligned with the rise of S. Korea and China as economic powerhouses. However, the sheer volume of filings is not necessarily a lead indicator of true innovation, and some experts say that the volume of patents filed and granted abroad is a better indicator of innovation and patent quality. But today we see emerging giants from Asia in consumer electronics, automobiles, eCommerce, and internet services, and nobody questions the innovation momentum building in Asia today.


Artificial Intelligence (AI) is heralded by many as a key driver of wealth creation for the next 50 years. Building momentum with AI requires a combination of access to large volumes of "good" data, computing resources, and deep expertise in machine learning, NLP, and other closely related technologies. Today, the US and China look poised to be the dominant players in the wider application of AI and machine learning-based technologies, with a few others close behind. Here too, deep knowledge and clout are indicated by the volume of influential papers published and referenced by the global community. A recent analysis by the Allen Institute for Artificial Intelligence in Seattle, Washington found that China has steadily increased its share of authorship of the top 10% most-cited papers. The researchers found that America’s share of the most-cited 10 percent of papers declined from a high of 47 percent in 1982 to a low of 29 percent in 2018, while China’s share has been “rising steeply,” reaching a high of 26.5 percent last year. Though the US still has significant advantages in the relative supply of expert manpower and its dominance in the manufacture of AI semiconductor chips, this too is slowly changing, even though most experts expect the US to maintain leadership for other reasons.

Credit: Allen Institute for Artificial Intelligence

These trends also impact the translation industry, and they change the relative benefit and economic value of different languages. The global market is slowly shifting from a FIGS-centric view of the world to one where both the most important source languages (ZH, KO, HI) and target languages are changing. The fastest-growing economies today are in Africa and Asia and are not likely to be well served by a FIGS-centric view, though it appears that English will remain a critical world language for knowledge sharing for at least another 25 years. These changes create an opportunity for agile and skillful Asian technology entrepreneurs like NiuTrans who are much more tuned in to this rapidly evolving world. I have noted that some of the most capable new MT initiatives I have seen in the last few years were based in China. India has lagged far behind with MT, even though the need there is much greater, because of the myth that English matters more, and possibly because of the lack of governmental support and sponsorship of NLP research.


The Chinese MT Market: A Quick Overview

I recently sat down with Chungliang Zhang from NiuTrans, an emerging enterprise MT vendor in China, to discuss the Chinese MT market and his company’s own MT offerings. He pointed out that China is the second-largest global economy today, and it is now increasingly commonplace for both Chinese individuals and enterprises to have active global interactions. The economic momentum naturally drives the demand for automated translation services.

Some examples, he pointed out:

In 2019, China’s outbound tourist traffic totaled 155M people, up 3.3% from the previous year. This massive volume of traveler traffic results in a concomitant demand for language translation. Chungliang pointed out that this travel momentum significantly drives the need for voice translation devices in the consumer market, like those produced by Sogou, iFlyTek, and others, which have been very much in demand in the last few years.

There is also a growing interest by Chinese enterprises, both state-owned and privately owned, in building and expanding their business presence in global markets. For example, Alibaba, China’s largest eCommerce company, is listed on the NYSE and has established an international B2B portal (Alibaba.com) where 20 million enterprises gather and work to “Buy Global, Sell Global.” Currently, the Alibaba MT team builds the largest eCommerce MT systems globally, reaching volumes of 1.79 billion translation calls per day, a larger transaction volume than either Google or Amazon.

“All in all, as we can see it, there is a clear trend that MT is increasingly being used in more and more industries, such as language service industries, intellectual property services, pharmaceutical industries, and information analysis services.”

While it is clear that consumers and individuals worldwide are regularly using MT, the primary enterprise users of MT in China are government agencies and internet-based businesses like eCommerce. This need for translation is now expanding to more enterprises who seek to increase their international business presence and realize that MT can enable and accelerate these initiatives.

The Chinese MT technology leaders in terms of volume and regular user base are the internet services giants (such as Baidu, Tencent, Alibaba, Sogou, Netease) or the AI tech giants (such as iFlyTek). Google Translate and Microsoft Bing Translator are also popular in China since they are free, but they do not hold a large share of total usage when the focus is strictly on MT technology.

When asked to comment on the characteristics and changes in the Chinese MT market, Chungliang said:

“In our understanding, Sogou and iFlytek's primary business focus is the B2C market, and thus both of them develop consumer hardware like personal voice translators. Sogou was recently (July 29, 2020) purchased by Tencent (a major social media player), so we don’t know what will happen next. iFlytek is famous for its Speech-To-Speech technology capabilities. Thus it is natural for them to develop MT, to get the two technologies integrated and grab a larger share of the market.

As for the other important MT players in China, Alibaba MT mainly serves its own globally focused eCommerce business, and Tencent Translate focuses on meeting the translation needs of its users in social networking scenarios. Like Google Translate, Baidu Translate is a portal to attract individual users who might need translation during a search; it also serves to expand Baidu’s influence as a whole. Netease Youdao, meanwhile, focuses on the education industry, and the Youdao team integrates the Youdao online dictionary, direct MT, and human translation.

What are the main languages that people/customers translate? As far as we know, the most translated language is English, Japanese is second, followed by Arabic, Korean, Thai, Russian, German, and Spanish. Of course, this is all directly to and from Chinese.”


NiuTrans Focus: The Enterprise

The NiuTrans team learned very early in their operational history, during their startup phase, that their business survival was linked to providing MT services for the enterprise rather than for individual users and consumers. The market for individuals is dominated by offerings like Google Translate and Baidu Translate that offer virtually free services. In contrast, NiuTrans is focused on meeting enterprise demands for MT, which often means deploying on-premise MT engines and developing custom engines. These enterprises tend to be concentrated around Intellectual Property and Patent services, Pharmaceuticals, Vehicle Manufacturing, IT, Education, and AI companies. For example, NiuTrans builds customized patent-domain MT engines for the China Patent Information Center (CNPAT), a branch of the China National Intellectual Property Administration and a large-scale patent information service based in Beijing.

CNPAT has the largest collections of multilingual parallel data for patents, and serves ongoing and substantial demand for patent-related MT in various use scenarios such as patent application filing and examination, patent-related transactions, and patent-based lawsuits. Given the scale of the client’s needs, NiuTrans sends an R&D team on-site to work with CNPAT’s technical team on data processing and data cleaning. This data is then used in the NiuTrans.NMT training module to develop patent-domain NMT engines on CNPAT’s on-premise servers. The on-site team also develops custom MT APIs on demand to fit into CNPAT’s current workflow and customer servicing needs.


Besides powering and enabling the specialized translation needs of services like CNPAT, NiuTrans also provides back-end MT services for industry leaders, including iFlyTek (also an early investor in NiuTrans), JD.com (the No. 2 eCommerce business in China), Tencent (the largest social networking company in China), Xiaomi (a leading smart-device OEM in China), and Kingsoft (a leader in office software in China).

NiuTrans has an online cloud API that also attracts 100,000+ small and medium enterprises interested in expanding their international operations and business presence. The pricing for these smaller users is based on the volume of characters they translate and is much lower than Google Translate and Baidu Translate prices.

NiuTrans’ Online Cloud User Locations

You can visit the NiuTrans Translate portal at https://niutrans.com

NiuTrans writes and maintains its own NMT codebase for NiuTrans.NMT rather than using open-source options, and claims quality performance comparable to, if not better than, its competitors. Their comparative performance at the WMT19 evaluations suggests that they actually do better than most of their competitors. They are not dependent on TensorFlow, PyTorch, or OpenNMT to build their systems. Today, NiuTrans is a key MT technology provider, especially for enterprises in China.

NiuTrans.NMT is a lightweight and efficient Transformer-based neural machine translation system. Its main features are:

  • Few dependencies. It is implemented with pure C++, and all dependencies are optional.
  • Fast decoding. It supports various decoding acceleration strategies, such as batch pruning and dynamic batch size.
  • Advanced NMT models, such as Deep Transformer.
  • Flexible running modes. The system can be run on various systems and devices (Linux vs. Windows, CPUs vs. GPUs, FP32 vs. FP16, etc.).
  • Framework agnostic. It supports various models trained with other tools, e.g., Fairseq models.
  • The code is simple and friendly to beginners.

When I probed into why NiuTrans had chosen to develop their own NMT technology rather than use the widely accepted open-source solutions, I was provided with a history of the company and its evolution through various approaches to developing MT technology.

The NiuTrans team originated in the NLP Lab at Northeastern University, China (NEUNLP Lab), a machine translation research leader in the Chinese academic world going as far back as 1980. Like many elsewhere in the world, the team initially studied rule-based MT from 1980 to 2005. In 2006, Professor Jingbo Zhu (the current Chairman of NiuTrans) returned from a year-long visit to ISI-USC and decided to switch to statistical MT research, working together with Tong Xiao, who was a fresh graduate student at the time and is now the CEO of NiuTrans. They made rapid strides in SMT research, releasing the first version of the NiuTrans.SMT open source in 2011. At that time, Chinese academia primarily used Moses to conduct MT-related research and develop MT engines. The development of the NiuTrans.SMT open source proved that Chinese engineers could do as well as, or even better than, Moses, and also helped to showcase the strength and competence of the NiuTrans team. Thus, in 2012, confident in their MT technology and armed with a dream to connect the world with MT, the NiuTrans team decided to form an MT company, converting 30+ years of MT research into MT software for industrial use.

Given their origins in academia, they kept a close watch on MT research and breakthroughs worldwide and noticed in 2014 that there was a growing base of research being done with neural network-based deep learning models. Therefore, the NiuTrans team started studying deep learning technologies in 2015 and released its first version of NiuTrans.NMT in December 2016, just three months after Google announced the release of its first NMT engines.

NiuTrans prefers to avoid using open-source MT platforms like TensorFlow, PyTorch, or OpenNMT, as they have developed deep competence in MT technology gathered over 40 years of engagement. The leadership believes there are specific advantages to building the whole technology stack for MT and intends to continue with this basic development strategy. As an example, Chunliang pointed me to the release of NiuTensor, their own deep learning tool (https://github.com/NiuTrans/NiuTensor), and the NiuTrans.NMT open source (https://github.com/NiuTrans/NiuTrans.NMT). They are confident that they can keep pace with continuous improvements in open source with support from the NEUNLP Lab, which has eight permanent staff and 40+ Ph.D./MS students focusing on MT issues of relevance and interest for their overall mission. This group also allows NiuTrans to stay abreast of the research being done elsewhere in the world.

NiuTrans understands that a critical requirement for an enterprise user is to adapt and customize the MT system to enterprise-specific terminology or use. Thus, it provides both a user terminology module to introduce user terminology into the MT system and a user translation memory module to introduce the users’ sentence pairs to tune the MT system. Another more sophisticated solution is incremental training. They incorporate user data to modify the NiuTrans model parameters to get the MT model better adjusted to user data features.
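The user terminology idea can be illustrated with a simple pre- and post-processing sketch. This is a generic illustration of the common placeholder technique, not NiuTrans' actual implementation; the function names and glossary entries are invented for the example:

```python
import re

def apply_glossary(source, glossary):
    """Replace glossary source terms with numbered placeholders so the
    MT engine passes them through untouched (longest-match handling and
    morphology are omitted in this sketch)."""
    mapping = {}
    for i, (term, target) in enumerate(glossary.items()):
        token = f"__TERM{i}__"
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(source):
            source = pattern.sub(token, source)
            mapping[token] = target
    return source, mapping

def restore_glossary(translation, mapping):
    """Swap placeholders in the MT output for the user's target terms."""
    for token, target in mapping.items():
        translation = translation.replace(token, target)
    return translation

glossary = {"neural machine translation": "NMT", "decoder": "décodeur"}
masked, mapping = apply_glossary(
    "The neural machine translation decoder is fast.", glossary)
# masked is "The __TERM0__ __TERM1__ is fast."
mt_output = masked  # stand-in for the actual MT engine call
print(restore_glossary(mt_output, mapping))  # The NMT décodeur is fast.
```

The same placeholder trick underlies many commercial terminology modules; translation memory injection and incremental training operate on the model itself rather than on the input text.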

NiuTrans also gathers post-editing feedback on critical language pairs like ZH <> EN and ZH <> JP on an ongoing basis, then analyzes error patterns to drive continuing engine performance improvements.


Quality Improvement, Data Security, and Deployment

NiuTrans evaluates MT system performance using BLEU together with a human evaluation technique that ranks systems relative to one another. They prefer not to use the widely used 5-point scale that assigns an absolute value to a translation. Thus, if they were comparing NiuTrans, Google, and DeepL, they would combine BLEU with humans ranking the three systems' output on the same blind test set.
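To make the ranking idea concrete, here is a minimal, self-contained sketch (not NiuTrans' actual tooling) that scores each candidate system with a simplified BLEU-style metric and ranks the systems on a shared test segment. Real BLEU is computed at corpus level with smoothing; this single-segment version only illustrates the mechanics:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(hypothesis, reference, max_n=4):
    """Geometric mean of clipped n-gram precisions times a brevity
    penalty, for a single segment (a sketch, not full corpus BLEU)."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = Counter(ngrams(hyp, n)), Counter(ngrams(ref, n))
        overlap = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)      # floor avoids log(0)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * geo_mean

reference = "the cat sat on the mat"
systems = {
    "system_a": "the cat sat on the mat",
    "system_b": "a cat was sitting on the mat",
}
ranking = sorted(systems, key=lambda s: simple_bleu(systems[s], reference),
                 reverse=True)
print(ranking)  # ['system_a', 'system_b']
```

In the relative-ranking scheme described above, the human judges produce an ordering like `ranking` directly, without ever assigning an absolute 1-to-5 score.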

NiuTrans also has an ongoing program to improve its MT engines continually. They do this in three different ways:

  1. Firstly, as the company has a strong research team that is continually experimenting and evaluating new research, the impact of this research is continuously tested to determine if it can be incorporated into the existing model framework. This kind of significant technical innovation is added into the model two or three times a year.
  2. Secondly, customer feedback, ongoing error analysis, or specialized human evaluation feedback also trigger regular updates to the most important MT systems (e.g. ZH<>EN) at least once a month.
  3. Thirdly, engines will be updated as new data is discovered, gathered, or provided by new clients. High-quality training data is always sought after and considered valuable to drive ongoing MT system improvements.

NiuTrans has performed well in comparative evaluations of their MT systems against other academic and large online MT solutions. Here is a summary of the results from WMT19. They report that their performance in WMT20 is also excellent, but final results have not yet been published.

NiuTrans training data comes mainly from two sources: data crawling and data purchase from reliable vendors.

NiuTrans uses crawlers to collect the parallel texts from the websites that do not prohibit or prevent this, e.g., some Chinese government agencies’ websites that often provide data in several languages. They also buy parallel sentences (TM) and dictionaries from specific data provider companies, who might require signing an agreement, specifying that the data provider retains the intellectual property rights of the data.

NiuTrans gets the bulk of its revenue from data-security-conscious customers who deploy their MT systems on-premise. However, NiuTrans also offers an Open Cloud (https://niutrans.com), allowing customers to access an online API and avoid installing the infrastructure needed for on-premise systems. The Open Cloud is a more cost-effective option for smaller SME companies, and NiuTrans has seen rapid adoption of this new deployment option in specific market segments.

International customers, especially the larger ones, much prefer to deploy their NiuTrans MT systems on-premise. For those international customers who cannot afford on-premise systems, the NiuTrans Open Cloud solution is an option. This system is deployed on the Alibaba Cloud, which is governed by Chinese internet security laws that require user data to be kept for six months before deletion. The company plans to build another cloud service on the Amazon Cloud for international customers who have data security concerns. This new capability will allow users to encrypt their data locally and transfer it securely to the Amazon Cloud; NiuTrans will then decrypt the source data on its servers, translate it, and delete all the user data and the corresponding translation results once the source data has been translated.


NiuTrans currently has 100+ employees, directed by Dr. Jingbo Zhu and Dr. Tong Xiao, two leading MT scientists in China. Shenyang is home to the company’s headquarters and R&D team. Technical support and services are currently available in Beijing, Shanghai, Hangzhou, Chengdu, and Shenzhen, and the company is now exploring entering the Japanese market with the assistance of partners in Tokyo and Osaka. While NiuTrans is not a well-known name in the US/EU translation industry today, I suspect that it will become an increasingly well-known provider of enterprise MT technology in the future.


Monday, January 14, 2019

A Vision for Blockchain in the Translation Industry

Happy New Year

This is yet another post on Blockchain, a guest post by Robert Etches, who presents his vision of what Blockchain might become in the translation industry. A vision, by definition, implies a series of possibilities, in this case quite revolutionary possibilities, but does not necessarily provide all the details of how and what. Of course, the way ahead is full of challenges and obstacles that are much more visible to us than the promised land, but I think it is wise to keep an open mind and watch the evolution even if we are not fully engaged, committed, or in agreement. Sometimes it is simply better to wait and see than to come to any final conclusions.

It is much easier to be dismissive and skeptical of upstart claims of fundamental change than to allow for a slim but real possibility that some new phenomenon could indeed be revolutionary. I wrote previously about CEO shortsightedness and what I called Roryisms. Here is a classic one from IBM that shows how they completely missed the boat because of their hubris and old style thinking.
Gerstner is, however, credited with turning around a flailing mainframe business. The cost of missing the boat can be significant, and we need only look at relative stock price and market value improvements over time (as this is how CEO performance is generally measured) to understand how truly clueless our boy Lou and his lieutenants at IBM were when they said this. The culture created by such a mindset can last decades, as the evidence shows. Culture is one of a company’s most powerful assets right until it isn’t: the same underlying assumptions that permit an organization to scale massively constrain the ability of that same organization to change direction. More distressingly, culture prevents organizations from even knowing they need to do so.

IBM’s chairman minimized how Amazon might transform retail and internet sales all the way back in 1999.
“Amazon.com is a very interesting retail concept, but wait till you see what Wal-Mart is gearing up to do,” said IBM Chairman Louis V. Gerstner Jr. in 1999. Mr. Gerstner noted that last year IBM’s Internet sales were five times greater than Amazon’s. Mr. Gerstner boasted that IBM “is already generating more revenue, and certainly more profit, than all of the top Internet companies combined.”

AMZN Stock Price Appreciation of 36,921% versus IBM’s 211% over 20 years


 IBM is the flat red line in the chart above. IBM looks just as bad against Microsoft, Google, Apple, Oracle, and many others who had actual innovation.


January 11, 2019

Amazon Market Value: $802 Billion (7.3X higher)
IBM Market Value: $110 Billion

I bring attention to this because I also saw this week that IBM filed more patents than any other company in the US in 2018; Samsung was second. In fact, IBM has been the top patent filer in the US every year from 1996 to 2018. BTW, they are leaders in blockchain patents as well. However, when was the last time that ANYBODY associated IBM with innovation or technology leadership? 1980? Maybe they just have some great patent filing lawyers who understand the PTO bureaucracy and know how to get their filings pushed through. In fact, some in the AI community have felt that IBM Watson was a joke and that the effort did not warrant serious credibility and respect. Oren Etzioni said this: “IBM Watson is the Donald Trump of the AI industry—outlandish claims that aren’t backed by credible data.” Trump is now a synonym for undeserved self-congratulation, fraud, and buffoonery, a symbol for marketing with false facts. IBM is also credited with refining and using something called FUD (fear, uncertainty, and doubt) as a deliberate sales and marketing misinformation tactic to keep customers from using better, more innovative but lesser-known products. We should not expect IBM to produce any breakthrough innovation in the emerging AI-first, machine-learning-everywhere world we see today, and most expect the company will be further marginalized in spite of all the patent filings.

Some of you may know that IBM filed the original patents for Statistical Machine Translation, but it took Language Weaver (SDL), Google and Microsoft to really make it come to life in a useful way. IBM researchers were also largely responsible for conceiving of the BLEU score to measure MT output quality that was quite useful for SMT. However, the world has changed and BLEU is not useful with NMT. I plan to write more this year on how BLEU and all its offshoots are inadequate and often misleading in providing an accurate sense of the quality of any Neural MT system.

It is important to be realistic without denying the promise as we have seen the infamous CEOs do. Change can take time and sometimes it needs much more infrastructure than we initially imagine. McKinsey (smart people who also have an Enron and mortgage securitization promoter legacy) have also just published an opinion on this undelivered potential, which can be summarized as:
 "Conceptually, blockchain has the potential to revolutionize business processes in industries from banking and insurance to shipping and healthcare. Still, the technology has not yet seen a significant application at scale, and it faces structural challenges, including resolving the innovator’s dilemma. Some industries are already downgrading their expectations (vendors have a role to play there), and we expect further “doses of realism” as experimentation continues." 
While I do indeed have serious doubts about the deployment of blockchain in the translation industry anytime soon, I do feel that if it happens it will be driven by dreamers, rather than by process-crippled NIH pragmatists like Lou Gerstner and Rory. These men missed the obvious because they were so sure they knew all there was to know, and because they were stuck in the old way of doing things. While there is much about blockchain that is messy and convoluted, these are early days yet and the best is yet to come.

Another dreamer, Chris Dixon has an even greater vision on Blockchain when he recently said:
The idea that an internet service could have an associated coin or token may be a novel concept, but the blockchain and cryptocurrencies can do for cloud-based services what open source did for software. It took twenty years for open source software to supplant proprietary software, and it could take just as long for open services to supplant proprietary services. But the benefits of such a shift will be immense. Instead of placing our trust in corporations, we can place our trust in community-owned and -operated software, transforming the internet’s governing principle from “don’t be evil” back to “can’t be evil.”

========

2018 was a kick-off year for language blockchain enthusiasts. At least five projects were launched[1], there was genuine interest expressed by the industry media, and two webinars and one conference provided a stage for discussion on the subject[2]. Then it all went very quiet. So, what’s happened since? And where are we today?

Subscribers to Slator’s latest megatrends[3] can read that it’s same same in the language game for 2019: NMT, M&A, CAT, TMS, unit rates … how we love those acronyms!

On the world stage, people could only shake their heads in disbelief at the meteoric rise in the value of cryptocurrencies in 2017. However, in 2018 those same people relished a healthy dish of schadenfreude as exchange rates plummeted and the old order was restored, with dollars (Trump), roubles (Putin), and pound sterling (Brexit) back in vogue.

In other words, for the language industry and indeed for the world at large, “better the devil(s) we know” appears to be the order of the day.

There is nothing surprising in this. Despite all the “out of your comfort zone” pep talks by those Cassandras of Change[4], the language industry continues to respect the status quo, grow and make money[5]. Why alter a winning formula? And certainly, why even consider introducing a business model that expects translators to work for tokens?! Hello, crazy people!!!

But maybe, just maybe, there was method in Hamlet’s madness[6] and Apple was right when they praised the crazy ones[7]?

Let’s take a closer look at the wonderful world of blockchain and token economics, and how they are going to change industry after industry … including the language industry.

 

Pinning down the goal posts

Because they keep moving! Don’t take my word for it. Here’s what those respected people at Gartner wrote in their blockchain-based transformation report[8] in March 2018:

Summary


While blockchain holds long-term promise in transforming business and society, there is little evidence in short-term reality.

 

Opportunities and Challenges

  • Blockchain technologies offer new ways to exchange value, represent digital assets and implement trust mechanisms, but successful enterprise production examples remain rare.
  • Technology leaders are intrigued by the capabilities of blockchain, but they are unclear exactly where business value can be achieved in the enterprise context.
  • Most enterprise blockchain experiments are an attempt to improve today's business process, but in most of those cases, blockchain is no better than proven enterprise technologies. These centralized renovations distract enterprises from other innovative possibilities offered by blockchain.
And now here’s a second overview, also from Gartner, this time their blockchain spectrum report[9] from October 2018:

 

Opportunities and Challenges

  • Blockchain technologies offer capabilities that range from incremental improvements to operational models to radical alterations to business models.
  • The impact of blockchain’s trust mechanisms and interaction paradigms extends beyond today’s business and will affect the economy, society and governance.
  • Many interpretations of blockchain today suffer from an incomplete understanding of its capabilities or assume a narrow scope.
The seven-month leap from little evidence in short-term reality to will affect the economy, society and governance is akin to a rocket-propelled trip across the Grand Canyon! Little wonder that traditional businesses don’t know where to even start looking into this phenomenon, never mind taking on a new business model that basically requires emptying the building of 90% of its hardware, software and, more importantly, people.

But!

Why does Deloitte have 250 people working in their distributed ledger laboratories? Because when immutable distributed ledgers become a reality they will put 300,000 people out of work at the big four accountancy companies[10].

Why are at least 26 central banks looking into blockchain? Because there’s a good chance that private banks[11] will be superfluous in 10-15 years’ time and we will all have accounts with central banks.

Or there will be no banks at all …

Let’s just take a second look at that Gartner statement:

The impact of blockchain’s trust mechanisms and interaction paradigms extends beyond today’s business and will affect the economy, society and governance.

Other than basically saying blockchain will change “everything”, the sentence mentions two factors that are core to blockchain: trust and interaction.

Trust. What inspires me about blockchain is its transparency. A central tenet of blockchain is its truth gene. In a world in which even the most reliable sources of information are labeled as fake, blockchain’s traceability – its judgment in stone as to who did what, when and for whom – makes it a beacon of light.

Just think if we could utilize this capability to solve the endless quality issue? What if the client always knew who has translated what – and could even set up selection criteria based upon irrefutable proof of quality from previous assignments? It is no surprise to learn that many blockchain projects are focusing on supply chain management.
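The "judgment in stone" property is easy to demonstrate. Below is a toy hash-chained ledger (a sketch only, not a production blockchain; it omits consensus, signatures, and distribution) that records who translated what, and shows that any retroactive edit breaks the chain:

```python
import hashlib
import json

def make_block(record, prev_hash):
    """A block stores a translation record plus the hash of the previous
    block, so each block commits to the entire history before it."""
    body = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return {"record": record, "prev": prev_hash,
            "hash": hashlib.sha256(body.encode()).hexdigest()}

def verify_chain(chain):
    """Recompute every hash; any edited record invalidates the chain."""
    prev = "genesis"
    for block in chain:
        body = json.dumps({"record": block["record"], "prev": prev},
                          sort_keys=True)
        if (block["prev"] != prev
                or hashlib.sha256(body.encode()).hexdigest() != block["hash"]):
            return False
        prev = block["hash"]
    return True

chain, prev = [], "genesis"
for record in [
    {"translator": "alice", "doc": "patent-042", "pair": "ZH>EN"},
    {"translator": "bob", "doc": "manual-007", "pair": "EN>JA"},
]:
    block = make_block(record, prev)
    chain.append(block)
    prev = block["hash"]

print(verify_chain(chain))                     # True: chain intact
chain[0]["record"]["translator"] = "mallory"   # retroactive edit
print(verify_chain(chain))                     # False: tampering detected
```

This is exactly the traceability that would let a client verify "who translated what, when" from irrefutable records, which is why supply chain management is such a natural fit for the technology.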

Interaction is all about peer-to-peer transactions through Ethereum smart contracts. It’s not just the central banks that will be removing the middlemen. Unequivocal trust opens the door to interact with anyone, anywhere. To a global community. These people of course speak and write in one or more of approximately 6,900 languages, so there’s a market for providing the ability for these “anyones” to speak to each other in any language. What a business opportunity! And what a wonderful world it would be!

Cryptocurrencies and blockchain: peas and carrots


You’ve gotta love Forrest Gump – especially now we know Jenny grew up to become Claire Underwood[12] 😊


Just as Jenny and Forrest went together like peas and carrots, so do tokens and blockchain.


Unfortunately, this is where many jump off the train. One thing is accepting the relevance of some weird ledger technology that is tipped to become the new infrastructure for the Internet, another is trading in hard-won Venezuelan dollars for some sort of virtual mumbo jumbo!
  1. All fiat currencies are a matter of trust. None is backed by anything more than our trust in a national government behaving responsibly. In 2019 that is quite a scary thought – choose your own example of lemming-like politicians.
  2. All currencies (fiat or crypto) are worth what the market believes them to be worth. In ancient times a fancy shell from some far-off palmy beach was highly valued in the cold Viking north. Today not so. At its inception, bitcoin was worth nothing. Zero. Zip. Today[13] 1 BTC = €3,508.74. Because people say so.
Today, there is absolutely no reason why a currency cannot be minted by, well, anyone. There is indeed a school of thought that believes there will be thousands of cryptocurrencies in the not-too-distant future. If we look at our own industry, we have long claimed that translation memories and termbases have a value. Why can that value not be measured in a currency unique to us, with an intrinsic value that we all respect and which is not subject to the whims of short-term political aspirations? Why can’t linguistic assets be priced in a Language Coin?

Much has already been written about the concept of a token economy, though little better than the following:

An effective token strategy is one where the exchange of a token within a particular economy impacts human economic behavior by aligning user incentives with those of the wider Community.[14]

Think about this in the context of the language industry. What if the creation and QA of linguistic assets were tied to their own token? What if you – a private company, a linguist, an LSP, an NGO, an international organization – were paid in this token for your data, and the value of this data grew and grew, year on year, as it was shared and leveraged as part of a larger whole – the Language Community[15]? What if linguists were judged by their peers and their reputations were set in stone? What if everyone were free to charge whatever hourly fees they chose, and word rates and CAT discounts were a relic of the past?

This is why blockchain feeds the token economy and why the token needs blockchain. Peas and carrots!

To end with the words of another Cassandra – and a trendy one at that – Max Tegmark:
If you hear a scenario about the world in 2050 and it sounds like science fiction, it is probably wrong; but if you hear a scenario about the world in 2050 and it does not sound like science fiction, it is certainly wrong.[16]

The pace of change will continue to accelerate exponentially, and I believe blockchain will be one of the main drivers.

Within 10-15 years, there will be household (corporate) names and technologies that do not exist, or have only just started, today. And by 2050 the entire finance, food, and transport sectors (to name the obvious) will be ‘blockchained’ beyond recognition.

At Exfluency, we see multilingual communication as an obvious area where a token economy and blockchain will also come out on top; I’m sure that other entrepreneurs in a myriad of other sectors are coming to similar conclusions. It’s going to be exciting!

Robert Etches
CEO, Exfluency
January 2019


[1] Exfluency, LIC, OHT, TAIA, TranslateMe
[2] TAUS Vancouver, and GALA and TAUS webinars
[3] https://slator.com/features/reader-polls-pay-by-hour-deepl-conferences-and-2019-megatrends/
[4] Britta Aagaard & Robert Etches, This changes everything, GALA Sevilla 2015; Katerina Pastra, Krzysztof Zdanowski, Yannis Evangelou & Robert Etches, Innovation workshop, NTIF 2017; Peggy Peng & Robert Etches, The Blockchain Conversation, TAUS 2018; Jochen Hummel, Sunsetting CAT, NTIF 2018.
[5] I am painfully aware that not everyone in the food chain is making money …
[6] Hamlet, II.ii.202-203
[7] https://www.youtube.com/watch?v=8rwsuXHA7RA
[8] https://www.gartner.com/doc/3869696/blockchainbased-transformation-gartner-trend-insight

[9] https://www.gartner.com/doc/3891399/blockchain-technology-spectrum-gartner-theme
[10] P.221 The Truth Machine, by Paul Vigna and Michael J. Casey
[11] Ibid. pp163-167

[12] https://en.wikipedia.org/wiki/Robin_Wright
[13] 9 January 2019
[14] P.69 The Truth Machine
[15] See Aagaard & Etches This changes everything for the sociological and economic importance of communities, the circular society, and the sharing society.
[16] Life 3.0 by Max Tegmark


=================================================================



Robert Etches
CEO, Exfluency

A dynamic actor in the language industry for 30 years, Robert Etches has worked with every aspect of it, achieving notable success as CIO at TextMinded from 2012 to 2017. Always active in the community, Robert was a co-founder of the Word Management Group, the TextMinded Group, and the Nordic CAT Group. He served four years on the board of GALA (two as chairman) and four on the board of LT-Innovate. In a fast-changing world, Robert believes there has never been a greater need to implement innovative initiatives. This has naturally led to his involvement in his most innovative challenge to date: As CEO of Exfluency, he is responsible for combining blockchain and language technology to create a linguistic ledger capable of generating new opportunities for freelancers, LSPs, corporations, NGOs and intergovernmental institutions alike.

Tuesday, January 31, 2017

The Driving Forces Behind MT Technology

This is a modified version of a post that was originally published on Caroline Alberoni's blog.


-------------------------------


Machine translation (MT) today is as pervasive and ubiquitous as mobile phone technology. While some translators still feel threatened by the technology, or feel the need to disparage it for its less-than-perfect translations, it is useful to understand why it is so widely used. At its annual developer conference in April 2016, Google announced that it translates over 140 billion words a day across 100 languages. Baidu Translate supports 27 languages, a number that keeps growing, and processes around 100 million requests every day. Most of this use comes from casual internet users who may be interested in translating a news story or some simple phrases. However, there is a growing impact on the professional translation business as well.

If you add the translation volumes of Microsoft, Baidu, Yandex and other MT providers, we can safely say that more than 500 billion words are translated by computers every day. This probably represents more than 95% – and perhaps as much as 99% – of all the translation done on the planet each day.
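As a back-of-the-envelope check on those percentages (taking the 500-billion-words-a-day figure above as given), the implied daily volume of human translation follows directly:

```python
def implied_human_words(mt_words_per_day, mt_share):
    """If MT accounts for mt_share of all translation, return the human remainder."""
    return mt_words_per_day * (1 - mt_share) / mt_share

mt = 500e9  # words per day, per the estimate above
print(f"{implied_human_words(mt, 0.95) / 1e9:.1f} billion human words/day")  # 26.3
print(f"{implied_human_words(mt, 0.99) / 1e9:.1f} billion human words/day")  # 5.1
```

A 95% MT share would imply over 26 billion human-translated words a day, which seems implausibly high, so the true share is probably much closer to the 99% end of the range.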


As Peter Brantley at Berkeley states in a personal blog:
"Mass machine translation is not a translation of a work, per se, but it is rather, a liberation of the constraints of language in the discovery of knowledge."



The need for translation of business content and other kinds of high-value information on the internet continues to grow, but the increasing use of MT also causes changes that affect translators and agencies alike. The most interesting translation work is increasingly moving beyond the focus of traditional translation work, and is likely to move even further beyond it in the future. Thus, the most lucrative and interesting new business translation opportunities – at eBay, for example – may require very different kinds of skills and competence, while still drawing on traditional translation and linguistic competence. Translators and linguists today are often required to be “word corpus analysts”, and are increasingly involved in projects to steer MT technology to produce better results.
The professional use of MT is increasingly valid for all of the following:
  • Highly repetitive content where productivity gains with MT can dramatically exceed what is possible with just using TM alone
  • Content that would just not get translated otherwise
  • Content that cannot afford human translation
  • High-value content that is changing every hour and every day
  • Knowledge content that facilitates and enhances the global spread of critical knowledge
  • Content that is created to enhance and accelerate communication with global customers who prefer a self-service model
  • Content that does not need to be perfect but just approximately understandable
The forces that drive the increasing use of MT in the world are largely beyond the control of the professional “translation industry.” They continue to build unabated and can be briefly listed as follows:
  • The Explosion of Content Creation: The sheer volume of content that global enterprises, entertainment agencies, educational establishments, governmental agencies and any international commercial venture need to translate continues to grow by the minute. The amount of digital information increases tenfold every five years! In fact, it can even be said that we live in an age where more information is being created annually than has existed in the 500 years prior.
  • The Changing Content Value Equation: While historically corporate marketing communications enjoyed a great degree of control over messaging, today most consumers distrust this kind of messaging and would rather trust the shared opinions of fellow consumers. Business content increasingly has a very short shelf-life, and thus traditional (slow and expensive) TEP (translate-edit-proof) approaches are increasingly questioned for information that may have little or no value after six months. In fact, the fastest-growing type of content is user-generated content (UGC), found in blogs, Facebook, YouTube, Twitter and community forums. IDC estimates that 70% of the content on the web is UGC, and much of it is very pertinent and useful to enterprises. This content now influences consumer behavior all over the world and is often referred to as word-of-mouth marketing (WOMM). Consumer reviews are often more trusted than “corporate marketing-speak”, and even than “expert” reviews, which are often funded by the same corporations. We have all experienced Amazon, travel sites, C-Net and other user rating sites, and it is useful for both global consumers and global enterprises to make this content multilingual. Given the volume and sheer rate of creation of this type of content, MT has to be part of the translation solution.
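The tenfold-every-five-years figure above translates into a steep compound annual rate, which a one-line check makes concrete:

```python
# Tenfold growth over five years implies an annual multiplier of 10^(1/5).
annual_growth = 10 ** (1 / 5)
print(f"{(annual_growth - 1) * 100:.1f}% per year")  # 58.5% per year
```

In other words, digital information growing 10x every five years means it is growing by well over half again every single year.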

How much data is generated every minute? 

A case in point: the world’s largest travel review platform, TripAdvisor, receives 315+ million unique visitors a month, many of whom contribute reviews. The combined weight of these reviews is considerable and influences consumers’ final purchase decisions to a very great extent. Having translations available in multiple languages greatly enhances the likelihood of a global consumer executing a transaction on the site.

Another very descriptive example by Juan Rowda at eBay:

"There are currently more than 800 million listings on eBay (over 1 Billion as of this writing). Considering that each listing has around 300 words, how long do you think it would take any given number of linguists to translate these listings? Did I mention that some of the listings may only be online for a day or a week and that the inventory changes continuously?

So, don’t even pull out your calculator. The answer is simple – human translation is not viable. However, if you really want to know, we estimate it would take 1,000 translators 5 years to translate only the 60 million listings eligible for Russia! For (these) listings, machine translation is clearly a much better fit in this scenario. "
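The arithmetic behind that estimate is worth a quick sanity check. The sketch below uses only the figures from the quote, plus an assumed 250 working days per year:

```python
listings = 60_000_000         # eligible listings for Russia, per the quote
words_per_listing = 300       # per the quote
translators = 1_000
years = 5
working_days_per_year = 250   # assumption, not from the quote

total_words = listings * words_per_listing
translator_days = translators * years * working_days_per_year
implied_daily_output = total_words / translator_days

print(f"{total_words / 1e9:.0f} billion words")        # 18 billion words
print(f"{implied_daily_output:,.0f} words/day each")   # 14,400 words/day each
```

Even eBay’s five-year scenario quietly assumes each translator sustains roughly 14,400 words a day – several times a typical professional pace of perhaps 2,000-3,000 words a day – which only strengthens the point that human translation is not viable at this scale.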
 
  • Short Product Life and Development Cycles: The product life cycles in electronics, fashion, and many other consumer products get shorter all the time, so rapid, “good enough” product descriptions are increasingly considered sufficient for business requirements. The historical translation quality assurance cycles practiced in the 80’s and 90’s are not viable today as they simply could not keep pace with the rate of new product introduction.
  • Continuously Increasing Volume & Managed Cost Pressures: Enterprises are under continuous pressure to translate more content with the same budgets, and thus they seek out translation agencies who understand how to do this with rapid turnaround. Competent use of MT is a critical element of redefining the cost-time-volume equation for translating ever growing volumes of relevant business content especially given the extremely transient nature of a lot of this information.
  • Changing Internet User Base: As more of the developing world comes online it becomes imperative for these new users to have MT to be able to get some basic understanding of existing web content, especially knowledge content. The need is clear not only to global eCommerce sites but also to many local government agencies around the world who need to provide basic health and justice information and services to a growing population of immigrants who may not speak the dominant local language.
  • Widespread Acceptance of Free Generic Machine Translation: The universal availability and acceptance of “free MT” on the internet have raised the acceptance of MT in executive management circles too. This also drives the momentum for large new types of projects that would never have been considered in the TEP translation world. The fact that 500+ billion words a day are being translated by MT is a clear indication that it delivers some value to hundreds of millions of internet users. As MT quality continues to improve, albeit slowly, it puts further downward pressure on the price of translation work. For many languages, MT has also become an aid for translators, functioning as a dictionary or a terminology and phrase lookup system.
Thus it is safe to presume that MT is going to be a fact of life for many professional translators in the 21st century. What new skills, then, will a translator need in order to understand this technology and be considered a valued partner in a world where MT deployment and “opportunities” continue to abound?

MT has already proven itself in professional use scenarios with many Romance languages, but we are still at a transition point in many other language combinations, and the MT experience can often be less than satisfying for translators working in those languages, especially with translation agencies that are not technically competent with MT.

The New Skills in Demand

At a high level, the skills that matter in the professional use of MT – skills we can expect to grow in value to global enterprises and agencies involved in large MT projects – are as follows.
  • Understand the different kinds of MT systems that you would interface with. Translators that understand the different kinds of MT are likely to be much more marketable.
  • Understand the specific output quality of the MT engines that you are working with, and provide articulate linguistic feedback on that output. Being able to provide articulate feedback on error patterns is perhaps one of the most sought-after skills in professional MT deployment today. This ability to assess the quality of MT output also benefits a freelancer who is trying to decide whether to work on a PEMT project or not.
  • Develop skills with new kinds of tools that are valuable for corpus-level tasks and manipulations. MT projects are likely to involve much larger volumes of data, so data preparation and global pattern-modification skills become much more useful and valuable.
  • Develop skills in providing pattern-level feedback, and in rapid error-pattern identification and correction. Being able to devise rapidly implementable test and evaluation routines that are useful and effective is an urgent market requirement. This paper summarizes the specific linguistic issues with Brazilian Portuguese and gives an idea of what this actually means.
  • Develop a corpus view that involves linguistic steering rather than segment-level corrections. This is a fundamental change of mental perspective and a mandatory requirement for successful professional involvement with MT. Understanding the competence of the translation agencies you engage with is also key, as it is VERY easy to mismanage an MT project, and most translation agencies that attempt to build MT engines on their own are quite likely to be incompetent.

What can a translator do?


  1. Learn and educate yourself on the variants of MT.
  2. Experiment with major public engines from Google, Systran, and Bing and with specialist tools like Lilt, SDL Adaptive MT and SmartCAT that allow easy interaction with MT.
  3. Understand how to rapidly assess MT output quality BEFORE you engage in any MT project.
  4. Don’t work with incompetent translation agencies who know little or nothing about MT but only seek to reduce rates with crappy do-it-yourself engines.
  5. Experiment with corpus management tools.


While it is quite possible that MT will never be good enough for the translation of literary work and poetry, where linguistic finesse and deep semantic insight are essential, it is clear in 2017 that MT has a definite role in making much more information multilingual in the global enterprise and in international communication generally. MT technology has evolved over the years and is now beginning to use a new development methodology based on neural networks similar to those formed in human brains. Early results of this Neural Machine Translation are clearly better than the previous technology. We are in a period of inflated expectations of what is possible, but there is reason for optimism, and I expect MT to become even more universal and widely used in the years to come.

Wednesday, February 29, 2012

Highlights from Recent Coverage on MT Related Subjects

This is a summary of what I think are some interesting recent articles on the web on subjects relating to MT.

The Big Wave, an Italian initiative that focuses on the changes happening in language technology, has released details and proceedings from its conference held in Rome in the summer of 2011. There are many interesting papers related to MT, controlled language and collaborative translation issues. These papers provide a balance of practitioner, academic and user perspectives on these subjects and are worth a close examination.
Some highlights include:

Linguistic resources and MT trends for the Italian language by Isabella Chiari discusses the implications of various kinds of data and their value for building data-driven MT systems, and provides some specifics for EN <> IT systems. The paper is a great overview of the kinds of data that can be used, and also offers insight on what data to use, and where, with summary implications. It also makes a great case (without meaning to) for the inevitability of corpus-driven approaches in MT, by providing the theoretical rationale and pointing to the rising momentum of the data-driven approach.

Productivity and quality in MT post-editing – by Ana Guerberof provides specific evidence of the productivity advantage of MT over TM and new segments in a translation workflow.

“In this context, it seems logical to think that if prices, quality and times are already established for TMs according to different level of fuzzy matches then we just need to compare MT segments with TM segments, rather than comparing MT to human translation. “

This study also helps to establish that, in reality, MT is just a new kind of TM fuzzy match. Even though the test involved only a small number of translators and a small amount of work, care was taken to ensure the translators saw a blind mixture of MT, TM and new segments, and their productivity in processing these different segment types was then carefully measured.
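The measurement design is easy to sketch in code: log each segment's origin (MT, TM or new) along with its word count and editing time, then compare throughput across origins. The numbers below are invented for illustration and are not the study's data:

```python
from collections import defaultdict

# (origin, word_count, editing_seconds) – invented sample data, one row per segment
log = [
    ("MT", 12, 40), ("TM", 10, 45), ("New", 11, 70),
    ("MT", 15, 50), ("TM", 14, 60), ("New", 9, 55),
]

words = defaultdict(int)
seconds = defaultdict(int)
for origin, wc, secs in log:
    words[origin] += wc
    seconds[origin] += secs

# Aggregate throughput per segment origin, in words per hour
for origin in ("MT", "TM", "New"):
    rate = words[origin] / seconds[origin] * 3600
    print(f"{origin}: {rate:,.0f} words/hour")
```

With the invented numbers above, MT segments come out fastest, then TM, then new segments, which is the same direction of result the study reports.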

 

The results show that MT segments yielded higher productivity than TM or new segments. (We are certain these results would have been more pronounced with an Asia Online customized system.) Interestingly, the study also shows that weaker translators seem to benefit more from MT and TM than the “best” translators. There are some interesting observations in the error analysis, which showed that TM produced the greatest number of final errors.

 

I would hypothesize that a test with more translators and a bigger set of test data would be useful, as the results would establish the benefits of customized MT much more clearly. It might even be useful to include “bad” or free MT, to show how differently translators react to a segment that looks like an 85% match versus one that looks obviously like raw free MT or the instant customization (50% TM match) that some use today.

 

 Why Machine Translation Matters: Trends & Best Practices 

This article summarizes the forces driving the increasing use of MT which can be summarized as:

External Forces in the World at Large :-

  • The digital data explosion and its impact on new content that begs to be to translated quickly

  • The global thirst for knowledge and information

  • The growing online population that does not speak English or FIGS but represents a major commercial opportunity for global enterprises

Internal Forces affecting Global Enterprises :-

  • The growing importance of customer conversations and user generated content which affects purchase decisions and impacts customer loyalty

  • The growing importance of open collaboration in B2C relationships

  • The Rise of Asia and BRICI which requires huge amounts of new content in new languages

These forces together amount to a shift towards more dynamic content, and increase the need to handle streaming flows of information – something that simply cannot be done without more automation and MT.

 

MT: the new ‘lingua franca’ is a fascinating perspective from Nicholas Ostler, a historian of world languages, on how MT is enabling linguistic diversity on the Internet.

“Between 2000 and 2009, Arabic on the internet grew twentyfold, Chinese x20, Portuguese x9, Spanish X7 and French x6, while content in English ‘only’ tripled. Proportionally, then, English is declining in importance relatively quickly. “The main story of growth on the Internet … is of linguistic diversity, not concentration.”
Ostler sees a key role for MT in this new environment. Just as the print revolution changed the ‘ground rules of communication’ in 16th century Europe, he expects that language and translation technology will revolutionize global communications tomorrow, removing the need for a ‘single lingua franca for all who wish to participate directly in the main international conversation.’

Translation errors or nuances in both humans and computers can naturally have an important impact. But there is no point in dismissing MT by judging it by some presumed norm of ‘perfect’ human translation. MT is a revolutionary tool that can help the world communicate better. TAUS will be welcoming Nicholas Ostler as a speaker at the upcoming TAUS European Summit on May 31 – June 1 in Paris.

When Machine Translation Usefulness Is Higher Than Quality:  

This article provides some interesting feedback for those who insist that MT only has value when it approaches human quality – and that since MT rarely reaches human quality, it has very limited value. In this study, English news was translated into FIGS by MT, but users were always given access to the English source. The study measures the usefulness of the MT against assessed translation quality, as shown below; interestingly, MT is considered useful even when the quality falls short of excellence. Since this study was performed some time ago, we would assume that the usefulness curve has continued to shift upwards, driven by improving MT quality, whatever some translators may think of that quality.

[Figure: rated usefulness of the MT versus rated translation quality]

The graph shows that although the machine translation quality was evaluated as far from perfect, the translation’s usefulness was rated higher than its quality. However, this applies only when translation quality is above a certain threshold. Bad or poor-quality machine translations are naturally deemed useless.

 

This result confirms what many MT proponents have themselves experienced. Pure MT can be rough – often obscure, frequently humorous – but it can be useful. If one really has little facility in the source language, pure MT translations, however clumsy, can be a boon to understanding and, by extension, to productivity.

 

The graph below illustrates the breakdown of responses to the question, “How would you rate the overall quality of the newsletter translation?” by language group. Note that the Germans felt the quality was more lacking, possibly because the MT output was poorer or possibly because they had higher expectations. It is in fact well known in the MT community that German <> English MT is more difficult than English <> Romance-language MT.

[Figure: overall quality ratings of the newsletter translation, by language group]

When we segment answers to the question, “How would you rate the usefulness of the newsletter translation?” by the respondents’ English ability, we see an even stronger vote in favor of MT from the two lower groups. Users with poorer (self-assessed) English ability found the MT much more useful. In fact, even many respondents with “Good” English ability found the MT very useful or essential.
[Figure: usefulness ratings of the newsletter translation, segmented by respondents’ English ability]


There have also been some interesting discussions on LinkedIn that cover the dialogue and tension between translators and MT advocates, and also expose some of the hyperbole that some MT enthusiasts are prone to. While the discussion meanders between translators’ emotions about plans to “eliminate” them and the less-than-scrupulous business practices of some MT vendors, it is an interesting thread. In their rush to get on the technology bandwagon, some LSPs may overlook the privacy and data security issues that they inadvertently agree to when they use instant Moses and DIY kits. So caveat emptor.


In Machine Translation in the European Union, Renato provides some summary coverage from a recent conference on the ever-expanding use of MT in the European Union’s internal administration.

 

Interview with Translator David Bellos: To say that author and award-winning translator David Bellos knows a thing or two about translation would be an understatement. With over 40 years of experience, he has achieved international recognition for his work as a translator and biographer and has an impressive list of acclaimed publications to his name.

Some interesting excerpts from the interview:

“What I expect is that machines will allow the demand for translation to carry on growing, and for translation to become an ever more integral part of the world we live in.

However, since there are almost 49 million translation directions between all the languages in the world and there is never going to be a 49-million-fold community of translators, machines might well be a useful adjunct to actual translation for many of the under served directions that exist.
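Bellos's 49 million figure works out as the number of ordered source-to-target pairs among the world's roughly 7,000 living languages:

```python
n_languages = 7_000  # rough count of living languages
directions = n_languages * (n_languages - 1)  # ordered source→target pairs
print(f"{directions:,} translation directions")  # 48,993,000 translation directions
```

Since each pair needs its own community of translators (and EN→FR is a different direction from FR→EN), the combinatorics alone make the case for machines covering the underserved directions.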

Tuesday, March 8, 2011

The Changing Face of Localization (Professional Translation)

I was at a party recently where somebody asked a Language Service Provider what they did in their professional work. It was amazing to witness how completely mystified the questioner was by the response, which included the word “localization” several times. It took several minutes of conversation before the person (who admittedly was a little slow) gathered that it involved translation for business purposes in some way. In my view, “localization” is not a great word for getting the general world out there engaged and interested in what you do, or even just understanding it. Looking at the localization entry in Wikipedia explains the confusion felt by the average guy on the street; the word has different meanings in translation, psychology, medicine, physics, mathematics and, more recently, even in location-based services like FourSquare and Facebook Places.

Does it really matter? (I think it does, especially in interactions with people outside the translation industry, i.e. the real world.) I have always felt that it is important to be able to communicate what you do quickly and easily in casual social settings. Good casual social interaction can often lead to useful professional references and interactions, but only if people actually understand what you do. It matters even more when you, as an industry, are trying to increase your visibility to the world out there. I think the word localization may have made sense when the focus was only “software and documentation localization” (SDL), but this view of what we do is increasingly being questioned in terms of its overall value in generating and facilitating international business. (BTW, the word “transcreation” is, to my mind, even worse in terms of obfuscation and classic HUYA-ness.)

Ironically, I had a brief Twitter exchange with Ultan O’Broin (aka @localization) discussing this shift.

(Unfortunately the service I used to show the conversation is now defunct. And since Twitter makes it so hard to get old conversations it is pretty hard to retrieve those snippets.) 


We were basically discussing data interchange standards in the translation industry (TMX, XLIFF), and Ultan said something that I thought was very insightful about the old SDL view of the business: “people don't get ‘structure’. Obsessed with formatting, still”. This helps to explain the relatively low status of localization professionals in most global enterprises. The prevailing view is that localizers handle the translation production of carefully formatted material that goes into product packaging, plus some pro-corporate, self-congratulating, mostly irrelevant content on the corporate web site. Thus, it is not surprising that localization professionals have a kind of secretarial status in most internationally focused business groups: they provide basic services and assistance to international business initiatives. As Ultan said, they have an administrative-assistant view rather than a system-administrator view of the information flows related to international business initiatives.

As I have stated in previous posts the world is changing, and to stay relevant we need to also change what we do, how we do it and why we do it. At the executive level of global enterprises, it is increasingly becoming clear that customer decision making processes have changed, largely due to open and free access to more information. This information is increasingly created outside the global enterprise and is not easily controlled by stakeholders within the global enterprise. In many industries global customer conversations are MORE influential in driving customer behavior (and corporate sales) than corporate content. To be relevant, we need a new mindset that looks at the flow from information creation (internal and external) to information consumption and has an honest and real focus on the final customer. Real conversations with real users matter more than corporate content and some are beginning to realize this. Value needs to be defined by how useful a customer finds the information, not by how many translation and formatting errors there are in a user manual that few are likely to read. Ultan is at the leading edge of this new focus in an area called User Experience (UX) and thus we should all be listening to what he and others like him have to say.

Here is a more detailed overview on these broad changes from my viewpoint at a recent Localization ;-) Technology Roundtable seminar in Palo Alto:


An interesting aside: I was informed by an SDL Plc marketing representative that I would not be welcome at their recent SDL Innovate event in Palo Alto because of “my position at Asia Online”, though they did admit that “we will look forward to seeing you at future industry events.” To be honest, I did apply as Kirti Vashee, CEO of Maya Acoustics (which I truly am involved with), but unfortunately the expert marketing-department sleuths there tracked me down as the author of this blog. (Hmm, could it have been my name?) Or perhaps it was because I think that associating SDL with innovation is oxymoronic, or because I represent competition that is feared and formidable. Apparently Renato and people from TermWiki were also denied admission into the compound.

Interestingly, the keynotes brought forth new data supporting many of the themes I have been writing about over the last year (I think so, anyway): openness, customer focus and collaboration, standards, and information flow. Maybe I am biased, but do you really see these as themes that resonate and receive meaningful commitment at SDL?

Some highlights (gathered from Twitter, thank you @scottabel) from Toby Bell of the Gartner Group:
  • People are becoming brands
  • More stuff is uploaded to YouTube in 60 days than all television networks combined have created in 60 years (Yes indeed, UGC is for real)
  • Everything is interactive. If you're not polling or offering live interactive contact with customers, you're missing out on the engagement opportunity
  • Manufacturers and retailers now allow customers to create documentation, and interestingly this content is often better than what their own employees create
  • You must tune the experience (with the "right" content and information about your products and services) to the user's goals
  • Corporate leadership still views web content management as a publishing function. It's not. It's really about customer experience.
Some highlights from Marcia Metz of EMC on Information Liquidity:
  • Information is an asset that can be turned into revenue
  • We can't keep pace with the growth in the volume of information, and speed and efficiency are becoming more critical to business success
  • We are working to provide information as a service. Content creators and consumers should be able to collaborate for best results
  • Information liquidity requires a comprehensive platform that is standards-based and organized, and that manages the full information life cycle from co-creation to consumption
There was also another interesting Twitter-based discussion that focused on the dubious value of TMS systems. While I do understand that translation projects have historically been messy, and that some level of automation is required to make things more efficient, I have many doubts about the "solution" that many have chosen. Most of my doubts are about relative value, not absolute value. Why are there so many TMS systems? Why do they all have such small installed bases? Why does every LSP and corporate localization department think that its translation project management process is so unique that it can only be properly automated by creating a new TMS system? Could this not be better accomplished by using more mainstream (= installed base of hundreds or thousands) collaboration and database tools?

Jaap van der Meer of TAUS stated at the final standards summit of the now-defunct LISA: "GMS/TMS will disappear over time, in favor of plug-ins to other systems." Adam Blau surprised me at our technology roundtable meeting in Palo Alto when he said that milengo does not use or believe in TMS systems. The reason: too much investment for too little return, and a reduction in overall flexibility. You can see him say it in his own words here. He also provides some interesting observations on the best-of-breed tools that they use, and on the issues related to developing a technology-agnostic strategy in his talk.

Also, I think the future of translation will see more deployments of collaborative communities, or crowdsourcing. The Monterey Institute of International Studies (MIIS) recently announced that it is deploying Lingotek's hosted Collaborative Translation Platform. As we head into more 10-million and 100-million+ word translation initiatives, this kind of collaboration-facilitating infrastructure becomes more and more necessary. It is interesting that few, if any, universities ever adopted translation memory and TMS tools into their curricula in the past. Tools that enable and facilitate open collaboration and integrate easily into mainstream content management software infrastructure will become ever more important. The people who lead the charge in these new initiatives are often not from the localization community, and they seem to understand that data must flow freely for the tools to be useful.


We are at a point where we can recognize that professional translation efforts focused on customer conversations are actually impacting overall international business success. Quite possibly we are at a point where what we do (enable and facilitate global customer conversations) is seen as driving customer satisfaction, customer loyalty, and thus international revenues across the globe.

And if there is somebody out there who could educate me about personalization, I would love to learn more, as I think that personalization and mobile will also grow in importance as drivers for building international business. The coming shift to mobile is hopefully obvious to everybody: there are already 4X as many mobile (cell phone) users in the world as there are online users, and we should expect that much information consumption will shift to these devices as they get more powerful. This does not mean that PCs will go away, but rather that the conversation will become more mobile and free-form.

So if the future is about more free-flowing conversations with customers and much more dynamic internal and external content, we should be thinking about new ways to describe what we do. I think our new description will likely include terms like professional translation, collaboration, global customer engagement, and effective use of translation technology. While traditional localization work is unlikely to disappear, I think the best is yet to come, and some of us will be lucky enough to be involved with world-changing initiatives.


For a cool response to "what do you do?", see what that other localization expert had to say: "We are building technology that facilitates serendipity." --Dennis Crowley, Foursquare co-founder, as quoted by the Los Angeles Times. Personally, I like the sound of: "We develop technology that enables global enterprises to talk to their customers across the world, and we also help to address and alleviate information poverty in Southeast Asia."