Tuesday, December 25, 2018

The Global eCommerce Opportunity Enabled by MT

The holiday season around the world today is often characterized by special holiday shopping events like Black Friday and Cyber Monday. These special promotional events generate peak shopping activity and are now increasingly becoming global events. They are also increasingly becoming a digital and online commerce phenomenon. This is especially true in the B2C markets but is also now often true in the B2B markets.

The numbers are in, and the most recent data is quite telling. The biggest U.S. retail shopping holiday of the year – from Thanksgiving Day to Black Friday to Cyber Monday, plus Small Business Saturday and Super Sunday – generated $24.2 billion in online revenues. And that figure is far below Alibaba’s 11.11 Global Shopping Festival, which in 2018 reached $30.8 billion – in just 24 hours.

When we look at the penetration of eCommerce in the U.S. retail market, we see that, as disruptive as it has been, it is still only around 10% of the total American retail market. According to Andreessen Horowitz, this digital transformation has just begun, and it will continue to gather momentum and spread to other sectors over the coming years.

The Buyer’s Experience Affects eCommerce Success

Success in online business is increasingly driven by careful and continued attention to providing a good overall customer experience throughout the buyer journey. Customers want relevant information to guide their purchase decisions and allow them to be as independent as possible after they buy a product. This means sellers now need to provide much more content than they traditionally have.

Much of the customer journey today involves a buyer interacting independently with content related to the product of interest. Digital leaders recognize that understanding the customer and providing content that really matters to them is a prerequisite for digital transformation success.

B2B Buying Today is Omnichannel

In a study of B2B digital buying behavior presented at a recent Gartner conference, Brent Adamson pointed out that “Customers spend much more time doing research online – 27% of the overall purchase evaluation and research [time]. Independent online learning represents the single largest category of time-spend across the entire purchase journey.”

The proportion of time that the 750 surveyed customers making a large B2B purchase spent working directly with salespeople – both in person and online – was just 17% of their total purchase research and evaluation time. This fraction is diluted further when the total salesperson contact time is spread across the three or more vendors typically involved in a B2B purchase evaluation: split evenly among three vendors, each would get less than 6% of the buyer’s time.

The research made evident that a huge portion of a seller’s contact with customers happens through digital content, rather than in person. This means that any B2B supplier without a coherent digital marketing strategy specifically designed to help buyers through the buyer journey will fall rapidly behind those that have one.

The study also found that the start of in-person contact does not mean the end of online contact. Even after engaging suppliers’ sales reps in direct in-person conversations, customers continue their digital buying journey, making use of human and digital buying channels simultaneously.

Relevant Local Content Drives Online Engagement

Today it is very clear to digitally-savvy executives that providing content relevant to the buyer journey really matters and is a key factor in enabling digital online success. A separate research study by Forrester uncovered the following key findings:
  • Product information is more important to the customer experience than any other type of information, including sales and marketing content.
  • 82% of companies agree that content plays a critical role in achieving top-level business objectives.
  • Companies lack the global tools and processes critical to delivering a continuous customer journey but are increasingly beginning to realize the importance of this.
  • Many companies today struggle to handle the growing scale and pace of content demands.
A digital online platform does enable an enterprise to establish a global presence very quickly. However, research suggests that local-language content is critically needed to drive successful international business outcomes. The global customer requires all the same content that a US customer would in their own buying journey.

Machine Translation Facilitates Multilingual Content Creation

This requirement for providing so much multilingual content presents a significant translation challenge for any enterprise that seeks to build momentum in new international markets. To address this challenge, eCommerce giants like eBay, Amazon, and Alibaba have become some of the largest users of machine translation in the world today. There is simply too much content that needs to be multilingual to do this with traditional localization methods.

However, even with MT, the translation challenge is significant and requires deep expertise and competence to address. The skills needed to do this in an efficient and cost-effective manner are not easily acquired, and many B2B sellers are beginning to realize that they do not have these skills in-house and could not effectively develop them in a timely manner.

Expansion Opportunities in Foreign Markets

The projected future growth of eCommerce activity across the world suggests that the opportunity in non-English speaking markets is substantial, and any enterprise with aspirations to lead – or even participate – in the global market will need to make huge volumes of relevant content available to support their customers in these markets.

When we look at eCommerce penetration across the globe, we see that the U.S. is in the middle of the pack in terms of broad implementation. The leaders are the APAC countries, with China and South Korea having particularly strong momentum as shown below. You can see more details about the global eCommerce landscape in the SDL MT in eCommerce eBook.

The chart below, also from Andreessen Horowitz, shows the shift in global spending power and suggests the need for an increasing focus on APAC and other regions outside of the US and Europe. The recent evidence of the power of eCommerce in China shows that these trends are already real today and are gathering momentum.

The Shifting Global Market Opportunity

To participate successfully in this new global opportunity, digital leaders must expand their online digital footprint and offer substantial amounts of relevant content in the target market language in order to provide an optimal local B2C and B2B buyer journey. 

As Andreessen Horowitz points out, the digital disruption caused by eCommerce has only just begun, and the data suggests that the market opportunity is substantially greater for those who have a global perspective. SDL's MT in eCommerce eBook provides further details on how a digitally-savvy enterprise can handle the new global eCommerce content requirements in order to partake in the $40 trillion global eCommerce opportunity.

This is a slightly updated version of a post that has already been published on the SDL site.

Happy Holidays to all.  

May your holiday season be blessed and peaceful.  

Click here to find the SDL eBook on MT and eCommerce.

Thursday, November 15, 2018

The Growing Momentum of Machine Translation in Life Sciences


This first post in an ongoing series takes a closer look at the emerging use and acceptance of machine translation (MT) in the Life Sciences industry. We take a look at the expanding role MT is likely to have in the industry over the coming years and explore some key use cases and applications.

The Life Sciences industry, like every other industry today, feels the impact of the explosion of content and of the driving forces that compel the industry to use MT and machine learning (ML). The growth is caused by:
  • The volume of multilingual research impacting drug development
  • The increasing volume of multilingual external consumer data now available (or needed), which influence drug discovery, disease identification, global clinical research, and global disease outbreak monitoring
Consumers share information in many ways, across a variety of digital platforms. Life Sciences companies increasingly need to monitor these platforms to stay abreast of trends, impressions, and problems related to their products.

It is useful to consider some of the salient points behind this growing momentum.

MT use has exploded

The content that needs translation today is varied, continuous, real-time and always flowing in ever greater volumes. We can only expect this will continue and increase.

The use of global public MT portals is in the region of an estimated 800 billion words a day. This is astounding to some in the localization industry who account for less than 1% of this, and it suggests that MT is now a regular part of digital life.

Everyone, both consumers and employees in global enterprises, uses it all the time. This use of public MT portals also involves many global enterprise workers, who may compromise data security and productivity by using these portals. However, the need for instant, always-available translation services is so urgent that some employees will take the risk.

Some large global enterprises recognize both the data security risks entailed by this uncontrolled use and the widespread need for controlled and integrated MT in their digital infrastructure. In response, they have deployed internal solutions to meet this need in a more controlled manner.

Why Life Sciences has not used MT historically

There are several reasons why Life Sciences has not used MT, including quality requirements, lags in technical adoptions, global need and non-optimized MT capabilities.

The Life Sciences industry needs high-quality, accurate translations, because human lives could be at stake if a translation is inaccurate. This has created a mindset in which quality depends on, and must be verified by, subject-matter experts. The industry saw little benefit from using MT since it was so hard to control and optimize. Depending on the kind of errors, failures can have catastrophic consequences, which led to a general “not good enough for us” attitude within the industry. Occasional breaking news about MT mishaps did not help.

The Life Sciences industry is not typically an early adopter of new technologies. Historically, Life Sciences organizations have focused on technology and innovation in targeted areas, but that is changing as the need to innovate in multiple areas to stay competitive keeps increasing. It is no longer a nice-to-have; it is a must-have. At the same time, technologies like machine translation have evolved and improved significantly over the last few years, which has changed how MT is viewed. Machine Translation is now seen as a viable and effective solution to address certain global content challenges.

There’s a concern about risk management/mitigation. Life Sciences organizations have been concerned about the risk involved in leveraging machine translation due to the data security aspects as well as the ability to handle their industry-specific terminology requirements. Generic MT solutions like Google do not provide adequate data security and tailoring controls for the specific needs of an enterprise. Once something is translated using Google Translate it is potentially available in the public domain. Data privacy and security are top priorities for Life Sciences companies, so an Enterprise MT solution that provides the benefits of MT technology with the necessary security elements is essential. Additionally, there were many use cases where the enterprise needed to have the MT capabilities deployed in private IT environments and carefully integrated with key business applications and workflows.

But compelling events are forcing change

The massive increase in the volume of content in general and high volumes of multilingual content from worldwide digitally connected and active patients and consumers are key drivers for the enterprise adoption of MT across the industry.

In the Life Sciences industry, an exponential increase in internal scientific data (particularly in genomics and proteomics data) has triggered global research. This research has led to new ways to develop drugs, knowledge about disease pathways and manifestation, and to the development of tailored treatments for individual patients. Keeping abreast of potentially breakthrough research, much of which may be in local languages, has become a competitive imperative.

Source: Arcondis
 The huge increase in patient-related data such as the data from central laboratories, prescriptions, claims, EHRs and Health Information Exchanges (HIEs) provides an immense opportunity to analyze and gain insights across the entire value chain, such as:
  • Drug Discovery: Analyzing and spotting additional indications for a drug, disease pathways, and biomarkers
  • Clinical Trials: Optimizing clinical trials through better selection of investigators and sites, and defining better inclusion and exclusion criteria
  • Wearables: Wearable technologies generate a significant amount of data to monitor patients, such as tracking key parameters and therapy compliance
  • Aggregated data: The ability to aggregate data from multiple reporting sources has also increased the volume and flow of such data.

The Impact of Social Media

Signals related to problems and adverse effects may appear in any language, anywhere in the world. The need to monitor and understand this varied data grows in importance as information today can spread globally in hours. Safety concerns can have serious implications for patient health and for a company’s financial health and reputation. These concerns need to be monitored to avoid derailing a drug that may be on track to become an international success.

Another important use for machine translation is in the social media and post-marketing area. Life Sciences organizations can leverage MT technology to compile large amounts of data from multiple languages. Monitoring sentiment across all language groups allows Life Sciences organizations to track market-specific issues and sentiment and to explain trends. It also helps them develop marketing and communication strategies to handle dissatisfaction and avoid crises, or to build further momentum on positive sentiment.

Applying MT to Epidemic Outbreak Predictions

ML and AI technologies are also applied to monitor and predict epidemic outbreaks around the world, based on satellite data, historical information on the web, real-time social media updates, and other sources. For example, malaria outbreak predictions take into account temperature, average monthly rainfall, the total number of positive cases, and other data points.

Increasingly the aggregated data that makes this possible is multilingual and voluminous and requires MT to enable more rapid responses. Indeed, such monitoring would be impossible without machine translation.

MT Quality Advancements and Neural MT

MT quality has improved dramatically in recent years, driven by the recent wave of research advances in machine learning, increasing volumes of relevant data to train these systems, and improvements in computing power needed to do this.

This combination of resources and events is a key driver of the progress that we see today. The increasing success of deep learning and neural nets, in particular, has created great excitement as successful use cases emerge in many industries, and it also benefits a whole class of Natural Language Processing (NLP) applications including MT.

SDL is a pioneer in data-driven machine translation and pioneered the commercial deployment of Statistical Machine Translation (SMT) in the early 2000s. The research team at SDL has published hundreds of peer-reviewed research papers and has over 45 MT-related patents to its credit. While SMT was an improvement over previous rules-based MT systems, the early promise plateaued, and improvements in SMT were slow and small after the initial breakthroughs.

Neural MT changed this and provided a sudden and substantial boost to MT capabilities. Most in the industry consider NMT a revolution in machine learning rather than evolutionary progress in MT. The significant improvements only represent the first wave of improvement, as NMT is still in its nascence.

At SDL our first generation NMT systems improved 27% on average over previous SMT systems. In some languages, the improvement was as much as 100%, based on the automatic metrics used to measure improvement. The second generation of our NMT systems shows an additional 25% improvement over the first generation. This is remarkable in a scientific endeavor that typically sees 5% a year in improvement at most. It is reasonable to expect continued improvements as the research intensity in the NMT field continues and as we at SDL continue to refine and hone our NMT strategy.
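To put the two generational figures together, here is a small arithmetic sketch. It assumes, and this is only an assumption since the post does not say so, that both percentages are relative gains on the same automatic metric and therefore compound multiplicatively. The helper name `compound_gain` is made up for illustration.

```python
def compound_gain(*relative_gains):
    """Combine successive relative gains (0.27 means +27%) into one overall gain."""
    total = 1.0
    for gain in relative_gains:
        total *= 1.0 + gain
    return total - 1.0


# Figures quoted in the post; treating them as compounding is an assumption.
gen1_over_smt = 0.27
gen2_over_gen1 = 0.25

gen2_over_smt = compound_gain(gen1_over_smt, gen2_over_gen1)
print(f"Second-generation NMT over SMT: about {gen2_over_smt:.0%}")
```

Under that assumption, the second-generation systems would sit roughly 59% above the SMT baseline on average, rather than the 52% that simply adding the two figures would suggest.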

Much of the enthusiasm for Neural MT is driven by the fluency and naturalness of its output, and by its ability to produce a large number of sentences that read as if a human had written them. Human evaluators often consider the early Neural MT output to be clearly better, even though established MT evaluation metrics such as the BLEU score may show only nominal improvements, or none at all.
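One way to see why an n-gram overlap metric can understate fluency gains is a toy version of the idea behind BLEU. The sketch below is a simplified, sentence-level approximation (clipped unigram and bigram precision, geometric mean, brevity penalty), not the official BLEU implementation, which is normally computed over a whole corpus with up to 4-grams and often multiple references. The function name `simple_bleu` and the example sentences are invented for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU-style score between 0 and 1 for one sentence pair."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped overlap: a candidate n-gram is credited at most as often
        # as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    brevity_penalty = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the patient should take the medicine twice a day"
literal = "the patient should take the medicine two times a day"
fluent = "patients need to take this medication twice daily"

literal_score = simple_bleu(literal, reference)
fluent_score = simple_bleu(fluent, reference)
print(literal_score > fluent_score)
```

In this toy setup the word-for-word candidate scores far higher than the fluent paraphrase, simply because it shares more word sequences with the single reference. A human reader could reasonably rate both as acceptable, which is the kind of gap between metric and human judgment described above.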

The Neural MT revolution has revived the MT industry again with a big leap forward in output quality and has astonished naysayers with the output fluency and quality improvements in “tough” languages like Japanese, German and Russian.

A Breakthrough in Russian MT

An example of SDL’s MT competence was demonstrated recently, when the SDL research team announced a breakthrough with Russian MT, where its new Neural MT system outperformed all industry standards, setting a benchmark for Russian to English machine translation, with 95% of the system’s output labeled as equivalent to human translation quality by professional Russian-English translators.

Additionally, SDL’s broad experience in language translation services and enterprise globalization best practices has also enabled it to provide effective MT solutions for many enterprise use cases, ranging from eDiscovery, localization productivity improvements, and global customer service and support to broad global communication and collaboration use cases that make global enterprises more agile and responsive to improving CX across the globe.

Availability of Enterprise MT Solutions

While the use of MT across public portals is huge, there are several reasons why these generic public systems are not suitable for the enterprise. These include a lack of control on critical terminology, lack of data security, lack of integration with enterprise IT infrastructure and lack of deployment flexibility. MT needs to have the following core capabilities to make sense to an enterprise:
  • The ability to be tuned and optimized for enterprise content and subject domain.
  • The ability to provide assured data security and privacy.
  • The integration into enterprise infrastructure that creates, ingests, processes, reviews, analyzes, and generates multilingual data.
  • The ability to deploy MT in a variety of required settings including on-premises, private cloud or a shared tenant cloud.
  • The availability of expert services to facilitate tailoring requirements and use case optimization.

Life Sciences Perspective

What is clear today is that the Life Sciences industry can gain business advantage and leverage from the expeditious and informed use of MT. It is worth reviewing this technology to understand this impact.

MT can transform unstructured data, such as free-text clinical notes or transcribed voice-of-the-customer calls, into structured data to provide insights that can improve the health and well-being of patient populations.

As self-service penetrates the Life Sciences industry, the growing volume of new data from around the world can:
  • Drive better health outcomes and advance the discovery and commercialization of new drugs
  • Improve large-scale population screening to identify trends and at-risk patients.
MT and text mining together will enable the enterprise to process multilingual Real World Data (RWD) and generate Real World Evidence (RWE) to inform all phases of pharmaceutical drug development, commercialization, and drug use in healthcare settings.

Regulatory bodies like the FDA could also utilize additional data related to drug approval trials by expanding to more holistic data during the product approval process – for example, they can also review multilingual internal data from international reports, and multilingual external data from social media that MT can make available for analysis. This could enable much faster processing of drug approvals as more data would be available to support and provide needed background on new drug approval requests.

As the Royal Society states:
“The benefits of machine learning [and MT] in the pharmaceutical sector are potentially significant, from day-to-day operational efficiencies to significant improvements in human health and welfare arising from improving drug discovery or personalising medicine.”

This is a post that was originally published on the SDL website in two parts which are combined here in a single long post. This post also reflects the expertise of my colleague Matthias Heyn, VP of Life Sciences Solutions at SDL. 

Sunday, October 28, 2018

What's Cooking? Fundamental Questions about Blockchain in the Translation Industry

This is a guest post by Luigi on his further thoughts on blockchain in the localization industry. He asks some fundamental questions that should give readers a good reality check on the blockchain stuff you might see at a conference or read in an industry journal. He also points to an almost-new technology that might really matter for this industry NOW, i.e. interactive virtual assistants (IVAs). The momentum on this is building as we speak, and for the most part the industry is being swept aside from any relevance to it, as so few are even aware of it. This is a new and better way to serve digital customers, a way to improve the overall digital experience, a way to more efficiently serve the right content to the right customer at the right time. This is where CX meets DX and where competitive advantage can be built for digital transformation strategies. But everywhere I turn, I see naysayers. Localization people tend to look for volume and efficiency, and very few look for value.

Neural MT has reached a point where possibly even gorillas could build some kind of (probably crappy) NMT system. There are 10 or more open source toolkits to choose from. To do NMT (or SMT) well, and to deploy systems successfully at industrial scale, has ALWAYS been difficult, requiring deep competence and deep knowledge of the technology and the data. Yes, the data that you learn from. It really really really does matter. The value here will come from those who have built thousands of systems and have something called insight, which is only acquired after this base exploration work is done. Just like playing a musical instrument even half-way well, it takes time and practice.

To add value to IVAs also means you have to understand content, value, and relevance to the customer at least at some superficial level. I am learning a lot more about content at SDL, and it is very exciting to be at a point in the DX chain where you can influence and shape the overall experience in a way that truly adds value. In an industry that is so focused on translating content that for the most part, only a few customers value, it is exciting to be at the point further up the river where decisions are being made about what customers really need, why, and how it should be provided. Content creation and content architecture in relation to digital journeys are where the highest value decisions are made today it seems. That is where you as a business partner become more relevant and more valuable. It is the point in a B2B relationship where what matters is competence, expertise, and experience, not just price and on-time delivery.

I do not mean to dismiss or disparage blockchain, but for its use in this industry, I think the discussion on the value and benefit needs to rise to a greater level of clarity. In recent news, I saw: Five technologies on the Gartner Hype Cycle for Digital Government Technology, 2018. And guess who No. 1 is? #Blockchain “Approach blockchain with a healthy dose of skepticism,” say the folks at Gartner, and unless I have really solid inside information, I tend to take them seriously. They expect it will be at least five to ten years until the technology matures and begins to deliver benefits.

But I am still listening, and waiting to hear a really clear rationale for it (in the translation business) as I still do sense it can be revolutionary, when properly deployed.

For contrast, here is a graphic I saw on Reddit (click here for high resolution image) that provided many plausible examples of use cases where blockchain does or could create value.


Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke

In a recent article, Eleni Vasilaki, Professor of Computational Neuroscience at the University of Sheffield, reminded readers that humans tend to be afraid of what they don’t understand. According to Vasilaki, some technological achievements surpass expectations and human performance to the point that they look unreal and seem surrounded by an eerie halo of mystery.

A common mistake is to consider AI applications in isolation and to fear that humans will be replaced. Singularity is near, but nearness is relative. Vasilaki points out that AI is task-oriented, while humans are versatile by nature. Human versatility comes from an understanding of the world, and this, in turn, is developed over years. No AI seems likely to achieve this understanding anytime soon. People seem to overlook how much the success of today’s AI might be due to the huge amount of data and computational power now available.

Technology panacea

First Man has brought back memories of the debates around the utility of the space program prior to the launch of the Apollo 11 mission to the Moon in 1969. In a paper prepared for IAF’s meeting in Stuttgart in 1952, Wernher von Braun wrote: “When we are asked the purpose of our striving to fly to the moon and to the planets, we might as well answer with Maxwell’s immortal counter question when he was asked the purpose of his research on electrical induction: «What is the purpose of a newborn baby?»” Today, few seem to pay attention to the fact that the impressive technological development of recent years owes almost everything to the space program.

A by-product of the mission to the Moon was the belief that any technological achievement is possible and at hand, and this might be one of the reasons for the cyclical emergence of new technological hypes. As Isabella Massardo reminds us, in the last decade, speech-to-speech technology has been constantly hyped, while machine translation has reached the plateau of productivity. Blockchain, together with cryptocurrencies or on its own, has also been hyped for a few years now. In 2017, blockchain was already on the verge of disillusionment. In 2018, blockchain (now for data security) is still hyped. Not surprisingly, among the emerging and rapidly accelerating technologies listed for active monitoring as disruptive innovations expected to profoundly affect how businesses deal with their workforce, customers and partners, none is directly related to translation.

Indeed, democratized AI might make digital twins closer than blockchain, as hundreds of millions of things are estimated to have digital twins within five years. Actually, according to Gartner, blockchain “has the potential to increase resilience, reliability, transparency, and trust in centralized systems.” The keyword here is “centralized systems,” while it is now pretty clear that the magic word to sell blockchain is “decentralization”.

Unfortunately, the decentralization of business models and processes is definitely not straightforward for most businesses. As a matter of fact, many are still trying to understand what blockchain is and how it works and, more importantly, how it can be utilized for mission-critical applications. Not surprisingly, Gartner anticipates that through 2018, 85% of projects with “blockchain” in their titles will deliver business value without actually using a blockchain. Also according to Gartner, “blockchain might one day redefine economies and industries via the programmable economy and use of smart contracts, but for now, the technology is immature.”

A matter of transparency

Even technology enthusiasts would do well to be cautious about the prospective use of blockchain in translation. Perhaps translation blockchain enthusiasts could answer a few questions and help clarify:
  1. How is blockchain supposed to solve the perennial problem of interoperability?
  2. How is blockchain supposed to help have more professional translators to match demand?
  3. How is blockchain supposed to open up existing language platforms?
  4. How is blockchain supposed to guarantee security, confidentiality, and privacy?
  5. How is blockchain supposed to cut translation prices further?
  6. How is blockchain supposed to make translation quality quantifiable?
  7. Is the network for translation blockchain open?
  8. How is mining implemented, through PoW or PoS?
  9. Mining for cryptocurrencies requires huge investments; this is why it is rewarded with cryptocurrencies, which are negotiable. Are “tokens” negotiable too?
  10. Given the investment in tokens required, how can users be guaranteed against a lack of transparency and a possible crash?
Contrary to what has been happening in situations where the introduction and implementation of blockchain is advocated, or has been taking place, no one in the translation industry has been asking any of these questions, at least publicly or out loud, and obviously, no answer has been given or anticipated so far.
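For readers unfamiliar with the terms in questions 8 and 9, PoW stands for proof of work and PoS for proof of stake. The following is a minimal, generic sketch of the proof-of-work idea only: a miner searches for a number (a nonce) that makes the hash of a block start with a required number of zeros. It does not describe how any translation blockchain actually works; the function names, the block contents, and the difficulty value are all illustrative assumptions.

```python
import hashlib


def block_digest(block_data, nonce):
    """SHA-256 hex digest of the block contents combined with a nonce."""
    return hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()


def mine(block_data, difficulty=3):
    """Try nonces in order until the digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not block_digest(block_data, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(block_data, nonce, difficulty=3):
    """Checking a claimed nonce takes a single hash, unlike finding one."""
    return block_digest(block_data, nonce).startswith("0" * difficulty)


# Hypothetical block contents, for illustration only.
block = "translator-a pays reviewer-b 10 tokens"
nonce = mine(block)
print(nonce, verify(block, nonce))
```

Each additional zero required multiplies the expected number of attempts by 16, which is why mining at real-world difficulty levels demands the kind of investment that question 9 refers to, while verification stays cheap.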


Presenting interoperability as a dilemma as late as 2018 means that the translation industry is far from maturity. Since its inception, the translation industry has been proclaimed to be on the edge of a massive change in how it receives and translates content. Changes have actually happened over the years, coming almost exclusively from outsiders. Major translation buyers have been imposing their own solutions to their own problems on their suppliers who, in cascade, have imposed these solutions on their own vendors. The fragmentation of the industry has effectively prevented the birth of any real industry standards, further encouraging this intrusiveness. Translation industry players have always been so obsessed with the risk of compromising their own little garden that they have rejected, if not hindered where possible, any real standardization effort. Major players have been trying, in turn, to take advantage of any standardization initiatives, even those that they themselves advocate, to enforce their own models and maintain what they see, often wrongly, as a competitive advantage.

This attitude is in blatant contrast with any new methodologies, but it has the reassuring effect of keeping players in a sort of comfort zone, allowing them to prevent any “resource dispersion” and contain any losses due to the inefficiencies ensuing from their immobility. This is also why the processes of most LSPs are optimized for small projects, and why organic growth and a critical mass are so hard to achieve. Unfortunately, process efficiency comes from design, and technical interoperability is effective only when technology matches processes, not vice versa.

A leap of faith

Everyone working in the translation industry knows the problems permeating it. Listing them is barely a starting point towards any solution whatsoever.

How is “tracing a user’s history” supposed to be “increasing trust for the translator’s ability and capability?” How is the tracking of digital assets supposed to benefit their creators when blockchain can in no way guarantee ownership? A ledger is used to record transactions, not to certify the ownership of the assets in each transaction.

Therefore, Kirti Vashee’s doubts here are well expressed: “Everybody involved in blockchain seems to be trying to raise money. The dot-com boom and bust also had, to some extent similar characteristics, with promises of transformation and very little proof that anything that was clearly better than existing solutions. I feel the problem description of the LIC initiative is clear in this overview, but I am still unclear on what exactly is the solution. I would like to see examples of a few or many transactions executed through this blockchain to see how it is different and better before, I cast any final judgment.”

A relationship-based industry

The translation industry is an intricate intertwinement of relationships between the businesses, players, publishers, analysts, and consultants governing its economy. In this context, the difference is made by who you know. For this reason, ignoring who Renato Beninatto is tantamount to a lèse-majesté offense and it is not exactly clever for someone in a prominent position to ignore him or, even worse, pretend to ignore him, as Lionbridge’s CEO, John Fennelly reportedly did at LocWorld 38 in Seattle, even though or especially if he comes from another industry and a different experience.
The intertwinement of relationships that characterizes the industry has resulted in exclusive clubs that have their meetings at industry events. Each area of the industry has its own club, and each club has its governance. Occasionally, members of different clubs from different areas mingle, but generally, clubs remain distinct. Some clubs are larger or more powerful than others, and their governance may be likened to a mafia, as a young and overly ambitious would-be analyst and consultant named it. He also did whatever it took to join it, and he made it.

As long as you are a member of one of these clubs and share its spirit and its policy, you can be sure that any initiative you take will not be hindered, far from it. No one will ever challenge you or even ask you any embarrassing questions.

Openness and negotiability

For this very reason, though, the questions on the openness of the blockchain network and the negotiability of tokens are fundamental. Blockchain may have the potential to increase resilience, reliability, transparency, and trust in centralized systems, but the most powerful promise of blockchain is about decentralization. Being extremely clear on the openness of the blockchain network and on the associated protocols is paramount.

Clarifying the negotiability of “tokens” is equally crucial. Indeed, more and more often, “investment” is the other word accompanying cryptocurrencies, even though, in principle, they are not supposed to generate returns; after all, it’s just software. But they are also used to purchase goods having a counter value in fiat money and are then negotiable. Bitcoin, for example, can be converted into cash using a Bitcoin ATM or a Bitcoin debit card or via an online service. Joining a token-based translation blockchain network would require an initial investment in tokens, whether through a barter exchange for data or in fiat money. If tokens are distributed by a centralized entity, this entity would most probably be asking people to purchase tokens. Even though any new users that join the network won’t fund older users, the founders are guaranteed to end up being the richest ones, as in a typical Ponzi scheme: The more people join, the more the founders will earn. And this is the only way they can make money, and from nothing, as the only asset of the founders is the network. Their net worth would be in fiat currency while the members of the network would not be able to cash in their tokens after having bestowed their data assets on the network, and if the network crashes they might be left with nothing.

Finally, with merger or acquisition accounting for growth at 3 of the top 5 fastest growing LSPs for 2018 it is hard to believe that these will join the blockchain network anytime soon. And, by the way, there has always been only one man in black.

Beyond baloney

The comparison with the automotive industry and the car is definitely out of scale, but it is true that translators too use only a fraction of the many features available in any translation software tool. Also, the automobile is now a general purpose technology and the only possible comparison might be with the smartphone.

Yet, although “augmented translation” is just diverting marketing crap, if democratized AI will make any sense, it will help redefine the value of linguists rather than taking jobs away from them.
Arthur Clarke’s famous quote above explains why technology is outpacing our ability to comprehend what we can do with it. The next new thing in the translation industry will very soon be conversational agents and virtual assistants rather than blockchain.

Virtual assistants, aka chatbots or bots, already are or are going to be the bridge between technical documentation teams and customer support and power most customer service interactions. Indeed, technical support is the most common type of chatbot content, and bots are said to be the new FAQ.
Technically speaking, there are two kinds of virtual agents:
  • One kind is scripted. It can respond only to questions that it was programmed to understand.
  • Another uses AI, so it can understand what the customer is telling it, and its knowledge grows the more it interacts with people.
The issue, today, is how to prepare, organize and structure content so that chatbots can use it.
Translation industry players, from each side of the fence, have learned to reuse content, while CMS systems are still underused, especially for single-sourcing. The next challenge for content producers is to extrapolate answers to customer questions from a unified set of content modules delivered across channels, rather than creating new batches of (largely duplicated) content or recreating content by copying and pasting existing content from their CMS into a form that chatbots can use.

More technical authors accustomed to single-sourcing through a CMS will be needed. Will they be translators accustomed to leveraging past translations using TMs?

In fact, Microsoft has already issued a new chapter of its style guide devoted to writing for chatbots.

Chatbots have four main components:
  1. Entities
    The “things” users are talking about with a chatbot; they can be inherited from taxonomy nodes in a CMS.
  2. Intents
    The goal of a user’s interaction with a chatbot; it can be mapped as content elements in a CMS and be defined as primary and alternate questions.
  3. Utterances
    The (unique) questions or commands a user asks a chatbot.
  4. Responses
    The answers the chatbot returns to utterances; they can be defined in a CMS.
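To make the four components concrete, here is a minimal sketch of a scripted bot (the first kind described above) in Python. Everything in it is an illustrative assumption: the `Intent` class, the `reply` and `normalize` helpers, and the sample questions and answers are hypothetical and are not taken from any CMS or chatbot framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Intent:
    """One user goal, with the utterances that trigger it and its response."""
    name: str
    utterances: List[str]            # primary and alternate questions
    response: str                    # the answer returned to the user
    entities: List[str] = field(default_factory=list)  # the "things" involved


# Hypothetical content modules, as they might be exported from a CMS.
INTENTS = [
    Intent(
        name="reset_password",
        utterances=["how do i reset my password", "i forgot my password"],
        response="Open Settings, choose Account, then select Reset password.",
        entities=["password"],
    ),
    Intent(
        name="export_glossary",
        utterances=["how do i export a glossary", "export glossary"],
        response="Open the glossary, then choose Export from the File menu.",
        entities=["glossary"],
    ),
]

FALLBACK = "Sorry, I do not have an answer for that yet."


def normalize(text: str) -> str:
    """Lowercase, trim surrounding punctuation, and collapse whitespace."""
    return " ".join(text.lower().strip(" ?!.").split())


def reply(utterance: str) -> str:
    """Return the response of the first intent whose utterances match exactly."""
    cleaned = normalize(utterance)
    for intent in INTENTS:
        if cleaned in intent.utterances:
            return intent.response
    return FALLBACK


print(reply("How do I reset my password?"))
print(reply("What is the weather like?"))
```

A scripted bot like this answers only the questions it was given, which is why the way content is chunked and labeled matters so much: each response is one reusable content module.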
The key authoring skill of the near future will be breaking existing content into smaller, modular chunks within CMSs, to achieve COPE (Create Once Publish Everywhere), the new holy grail.

And if dealing with Conversational UI, the new challenge will be writing dialogues. This will require the skills of a UX writer and a creative writer. Ready Player One?



Luigi Muzii has been in the "translation business" or "the industry" since 1982 and, through his firm, has been a business consultant in the translation and localization industry since 2002. He focuses on helping customers choose and implement best-suited technologies and redesign their business processes for the greatest effectiveness of translation and localization-related work.

This link provides access to his other blog posts.

Monday, October 22, 2018

How Blockchain will Revolutionize the Language Services Industry: The LIC Solution

This is a guest post by Dr. Peggy Peng, CEO and Founder of the LIC Foundation, describing her vision for blockchain in the translation industry and providing an initial overview of the blockchain initiative that she is leading. I saw her present the overall vision of LIC in some detail at the TAUS conference, and I thought it would be interesting to hear from a proponent of the technology who believed enough in the technology to fund it herself.

From those who are enthusiastic about blockchain, I hear the refrain that it is a way to build a trusted network and reduce the control of oligarchies which rule almost every high-transaction-volume industry in the world today. Thus we could eliminate very low-value middlemen in a system, e.g. the need for lawyers and title insurance agencies in a real estate transaction. However, this means that no central authority exists or develops in this new world, and the system is truly independent of controlling forces. Yet I repeatedly see systems that try to utilize a blockchain but retain some form of centralized control and are thus ruining the most attractive feature of the technology by doing so.

This is still a technology that has players who use lots of smoke and mirrors, from much of what I have seen outside of the translation industry, and so we should tread with care. Everybody involved in blockchain seems to be trying to raise money. The dot-com boom and bust also had, to some extent similar characteristics, with promises of transformation and very little proof that anything that was clearly better than existing solutions. I feel the problem description of the LIC initiative is clear in this overview but I am still unclear on what exactly is the solution. I would like to see examples of a few or many transactions executed through this blockchain to see how it is different and better before, I cast any final judgment. While most failed in the dot-com boom-bust cycle, there were some great successes, and so I expect this will be similar: the initial signal-to-noise ratio is very low, but for those who look carefully there is value. But I am already at a point where I feel that it will include more substance than a description of an ICO and a distributed ledger. I think it should present clearly discernible value to interested parties. I think the key will be a true collaboration network which is mutually beneficial to all the key stakeholders in some mostly incorruptible structure that will be mostly immune to future domination by monopolistic forces. These may or may not involve tokens or ICOs, and quite likely will have some kind of distributed ledger, I expect.

But honestly, I am still looking for a real example that makes sense at the common sense level and does not require faith in a crypto future and is not filled with techno-jargon that obfuscates and distracts from fundamental questions. I have seen that many who talk about "using AI" today have a very vague and nebulous definition of what this means. This lack of definition of the specific use case is a very clear clue of cluelessness. The most successful applications of machine learning (AI) are around very clearly defined problems and data. This is necessary for successful outcomes. I expect that the first examples of blockchain will come from use cases that remove marginal intermediaries like in the real estate scenario I described above. The most successful examples of blockchain reported today are focused on very narrow and specific challenges where the benefits can be clearly explained to those concerned without requiring you to go to the Blockchain School for Morons.

As I mentioned in my last post, which was a skeptical view of the role blockchain may have in the industry, I am hoping to post more varied opinions on this subject. There are already some interesting comments on the first post which support the skepticism of the overall post. I hope that other proponents of blockchain will also join the dialogue.

And, I wish to make it clear that if the LIC Foundation solves the problems we have tolerated for decades in a way that is clear to all who engage in the system, I wish them the greatest success. But it will probably not be necessary as it will be adopted because it makes sense.


2017 has been the Year of the Blockchain but the language service industry has largely been spared the onslaught of blockchain startups claiming to shake up the industry. Even the advances in AI and machine learning have so far not been able to replace the need for translators.

A lot of criticism of blockchain centers around the crypto-currency variants whose value seems suspect and speculative in a somewhat non-rational way.  

However, having spent five years with the top management of Transn, China’s largest translation company with 30,000 translators, I understand that blockchain presents a historic opportunity to solve some of this industry’s long-standing problems.  

The Problems in the Language Service Industry

Here are some of the problems that have long permeated the industry: customers cannot get language services anytime, anywhere, from any device. They also have no way of determining the quality of the translated material.

Meanwhile, translators lack the network to access jobs themselves and are therefore dependent on translation companies to provide jobs. Translators also do not have a means of ranking their capabilities so that they can be screened for jobs that match their level of competency. In many cases, they are grouped together with low-level translators to do the same low-level work and are paid the same remuneration as lesser skilled translators. 
This fragmented nature of the industry creates a bigger problem for the players. The lack of shared knowledge between competing translation companies means there is a dearth of data to be mined. As a result, the big data of the industry cannot be mined. 

This is where blockchain offers a solution.


What is Blockchain?

A blockchain is a decentralized, incorruptible digital ledger that records any type of transactions and allows information to be distributed without being duplicated.

Imagine if your team is working on the same Google Docs sheet and everyone is updating the sheet at different times from different places. All the changes are tracked and updated without needing to create duplicate versions of the sheet.

Blockchain works in the same way. Data is not stored in one single location which means data is not centralized. Instead, data is hosted on millions of computers simultaneously so that there is no one party in control of all the information and all parties can make changes to the asset without creating duplicate versions of it.

Our current centralized model means users congregate digitally on centralized platforms (such as Airbnb, eBay, and Facebook) to use a service and conduct transactions. We have to create a username and password to log in to a digital service provider, store all our information there and hope nobody finds out our password. This system is highly vulnerable to attackers. If your account is hacked, all your information is exposed. The blockchain may store data across its network, but the data is encrypted. Blockchain technology offers an almost hack-proof way to store information.

Also, in our current centralized model, all our data is owned by the service providers that store our data. We have seen many cases of abuse where service providers use our data to sell tailored solutions to advertisers and prying governments. Even if they do not sell our data, they may unintentionally expose our data to nefarious parties. In the decentralized model where data is stored across our network of computers, we own our own data.

With the Internet currently, any digital asset can be copied and illegally distributed, which created many IP problems for content creators. Blockchain technology is the backbone for a new kind of Internet, where digital information can be distributed but not copied.

How Blockchain Solves the Language Service Industry Problems

The nature of blockchain’s encrypted, decentralized model means that stored data is permanent and cannot be tampered with. The blockchain records every single transaction that the user makes. This creates a more credible way of tracing a user’s history, therefore increasing trust for the translator’s ability and capability.
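The tamper-evidence idea behind this claim can be illustrated with a minimal sketch in Python. It is only a toy hash-linked ledger held in one process, not the LIC blockchain or any real blockchain implementation: there is no network, no consensus, and no encryption, and the function names (`block_hash`, `append_block`, `is_valid`) and the translator and job identifiers are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the hash of the previous block."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a new record, linking it to the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and link; any later alteration breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, {"translator": "T-001", "job": "J-100", "words": 1200})
append_block(ledger, {"translator": "T-001", "job": "J-101", "words": 800})
print(is_valid(ledger))  # True

ledger[0]["data"]["words"] = 12000  # alter an old record after the fact
print(is_valid(ledger))  # False
```

Note that the check only detects records changed after they were written; it says nothing about whether a record was accurate when it was first entered.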

The permanency and the incorruptibility of the blockchain also offer another benefit. Any asset created and modified on the blockchain can be traced back to the parties who added a modification to the asset. This is useful in tracking and identifying the original and co-creators of the digital assets. This is also groundbreaking for translators as they will be able to record their digital assets and then be fairly compensated for future use of their assets.


How Does Blockchain Serve Its Community

LIC Foundation’s blockchain only serves as the underlying infrastructure to power all kinds of activities. It is similar to a power grid. Energy is processed through the power distribution network to supply electricity to appliances for the end user. Similarly, LIC Foundation’s blockchain will be the infrastructure where AI and human translators power the blockchain network to supply solutions to the end user.

Another way of looking at it is if you view the Internet as the current infrastructure on which all websites and apps are built, then blockchain is the future infrastructure. The blockchain is widely expected to be the standard infrastructure in the coming years, powering most digital businesses.
 In 1994 when the World Wide Web Consortium was formed and Netscape was the go-to web browser, only several thousand websites existed worldwide. We are at that point with blockchain, where several thousand blockchain startups have been founded in the past year.

And just as the Internet paved the way for developers to create all kinds of web apps and mobile apps, so too on the LIC blockchain, developers will be able to create new DAPPs (decentralized applications) to serve the needs of the language service industry. Apps can be anything from translation marketplaces to crowdsourced movie subtitling. This massive crowdsourcing will allow people to read websites in their language, watch videos dubbed in their language and have audio translated in real time.

The LIC Ecosystem

Some of these kinds of service are already available on the Internet but none allows the contributors to be recognized and rewarded for their effort and because they are mostly centralized, no one web service has seen wide-scale adoption.

Because blockchain can reach the globe, over time, I expect LIC’s public chain to be one of the biggest digital ecosystems for the language services industry that will power the information exchange globally.

About Dr. Peggy Peng : 

Dr. Peggy Peng holds a Ph.D. in Education from Huazhong University of Science & Technology. She was a C-level executive at Transn, China’s biggest translation company, for five years. Prior to that, she was the Academic Director of the Singapore Chinese Chamber of Commerce & Industry and the Deputy Director of Nanyang Technological University. She is a Council Member of the Singapore Blockchain Technology Foundation.

About LIC Foundation : 

LIC Foundation is a non-profit organisation formed to create the world’s first public blockchain for the language service industry. LIC Foundation’s blockchain development is led by AI expert, Dr. Li Qinghua, who holds a Masters in Artificial Intelligence from the National Taiwan University and a Ph.D. in Artificial Intelligence from Tsinghua University. Dr. Li developed the world’s first palmtop share trading system in 1999 and the first online gaming platform and messenger software that can support more than 10,000 users in China. LIC Foundation was incepted in mid-2018 in Singapore and plans to launch its blockchain mid-2019.

LIC Foundation’s advisory board : 

Zhiyong Sun: Executive Director of Blockchain & Digital Currency Laboratory and Adjunct Professor at the centre for Law & Economics at China University of Political Science & Law
Hingbing Zhu: Chairman of Singapore Blockchain Technology Foundation
ChenDan Feng: President of Singapore Translation Association
Henry He: Founder and CEO of Transn and Vice President of the Translators Association of China
Christopher Djaouani: Executive Director at Donnelley Language Solutions, now part of the world’s third largest translation service provider: SDL Plc.

Sunday, October 14, 2018

Looking at Blockchain in the Translation Industry

I recently attended the TAUS Annual Conference, where "current language technology and collaboration" is the focus. And indeed it has historically been the best place to talk about technology in the "language industry". It was very clear that in addition to MT, edit distance and error classification on MT system output are also REALLY important to this community. Blockchain alternatives were presented as the most revolutionary new technology at this event, since the buzz on NMT has subsided a bit, but I have always wondered if it is really possible to be truly revolutionary if your primary focal point is "localization".

I say all this not out of any disregard, but somewhat triggered by Chris Wendt (Microsoft) and I sharing thoughts on our motivations on staying with MT after all these years, and our shared angst about when we as a community would (if ever) start talking about "real" MT applications. I am quite sure that we were both glad to hear LinkedIn and Dell very clearly state that the value of the MT content to the customer, and better engagement of global populations (enabled by MT) were much more important than any kind of quality score, and as a rule, more content faster would always be better than better translations delivered way too late in these days of digital transformation.

As long as I have been in "the industry" there has always been a discussion about taking localization to the next level. To make it more respectable. To be considered with more regard by outsiders and be seen more often in the "mainstream press".  I am not sure that this is really possible if the industry focus does not change, and move beyond cost efficiency and translation quality concerns. The fundamental challenge the industry has is mostly because localization happens after-the-fact i.e. after the marketing, and product development people have decided everything that is really worth deciding to drive a market revolution, and/or make a digital transformation happen.

My sense is that it is increasingly all about content and content is more than words that we translate. It is about relevance and value and transformation. Content is where localization, marketing, and product development can meet. Content is where customers meet the enterprise. Content is the magic key that possibly lets localization people into the C-suite. And digital experience is where finance and customer service and support also join the party. From my vantage point the only company that fundamentally understands this in "the industry" is the "new SDL". I am quite possibly biased since they deposit money into my bank account on a regular basis, but I like to think that I can make this statement on purely objective facts as well. It is much more important to understand what you should translate and why you are doing it, than simply translate efficiently with fewer errors with MT systems that produce very low edit distances. Indeed it is probably most important to understand what is needed to get the original content right in the first place as that is the fundamental driver of digital transformation. Understand relevance and value. Revolutions tend to be more about what matters, and why it matters, than about how we should do what we must do. Being content focused enables you to get much closer to the source, to the what and the why.

However, in this age of fake news even content is under fire. We are surrounded by "fake news" and fake videos and fake pictures. How do we tell what is true and what is not? What about blockchain? The idea of an immutable ledger stored in the cloud, tracing the origin of all content to its source, definitely sounds appealing. Users could compare versions of videos or images to check for modifications, and watermarks would serve as a badge of quality.  But here, too, the question is whether this can be applied to text-based content, where the intent to deceive leaves fewer technical traces.

There are now some who wish to bring specialized blockchain implementations to localization (translation) with verified translators, translation memory and payment mechanisms and raise the level of trust and fairness. I am hoping to publish a series of posts on this subject that show various perspectives on this issue and technology. I cannot say I really understand the blockchain potential here at this point, and this post and others that follow are part of my effort to learn and share.

Gabor Ugray has written a post on blockchain as having far-reaching consequences for compensation, compliance, workflows, and tools in the industry, and has a much more optimistic viewpoint than presented by Luigi in the bulk of this post, that I also recommend to readers.

Luigi Muzii is a commentator who is often considered acerbic and "negative" by many in the industry. But I like to listen to his words generally, since he also tends to cut through the bullshit and get to core issues.  He is not enthusiastic about the impact of blockchain on the translation business. This guest post describes why. His summary conclusion:

Blockchain is no change, it may possibly be an improvement, but it will keep us doing things the way they have been done so far, in a [slightly] different shape. 

If we consider the history of MT in the localization industry, his current conclusions do indeed make sense and seem very reasonable. In this industry, MT is about error classification, edit distances, quality measurement, comparing MT system scores, and custom engines. It is almost never about understanding global customers, listening more closely around the globe, better global communication and collaboration on a daily basis, or rapidly scaling and ramping up international business. Outsiders have for the most part led those kinds of truly transformational MT-driven initiatives. We are often defined by the kinds of measures we use to define our success. Consider what you have accomplished by getting a low edit distance score across 30 MT systems vs. say increasing Russian traffic by 800% and Russian online transactions by 25% by translating 50 million product listings into Russian. Let's also say that this increases sales by $150 million. We can also safely bet that the edit distance on these billions of words is quite terrible and very high. (Yes. I understand that this is a very lop-sided contrast.)
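For readers who have not met the measure, here is a minimal sketch of edit distance in Python, assuming the common Levenshtein definition: the smallest number of insertions, deletions, and substitutions needed to turn one sequence into another. The function name `edit_distance` and the sample sentences are illustrative only; tools used in the industry may count words or characters and may normalize the score differently.

```python
def edit_distance(source, target):
    """Levenshtein distance between two sequences (strings or word lists)."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j items of target.
    previous = list(range(len(target) + 1))
    for i, s_item in enumerate(source, start=1):
        current = [i]  # turning i source items into nothing takes i deletions
        for j, t_item in enumerate(target, start=1):
            cost = 0 if s_item == t_item else 1
            current.append(
                min(
                    previous[j] + 1,         # delete s_item
                    current[j - 1] + 1,      # insert t_item
                    previous[j - 1] + cost,  # keep or substitute
                )
            )
        previous = current
    return previous[-1]


# Word-level distance between raw MT output and its post-edited version.
mt_output = "the cat sat in the mat".split()
post_edited = "the cat sat on the mat".split()
print(edit_distance(mt_output, post_edited))  # 1
```

A low score only says that a post-editor changed little; it says nothing about whether the content was worth translating in the first place.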

So here is a toast to lower edit distance scores on all your MT systems, and to error classification systems with at least 56 dimensions. 😏

And thank you to Eric Vogt for educating the TAUS  community on what a taus actually looks like, as shown below. As somebody who seriously plays a closely related instrument, I appreciate people knowing this. Robert Etches also won some points in my eyes for stating the seminal and enduring influence that a book about the massacre at Wounded Knee had on his sense of injustice as a young man.


During a panel discussion at the first Hackers Conference in 1984, the influential publisher, editor, and writer Stewart Brand was recorded telling Steve Wozniak, “On the one hand information wants to be expensive, because it’s so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time.”

Over the years, Brand’s nuanced articulation of this modern conundrum has been truncated to “Information wants to be free,” thus distorting the original concept. Indeed, in this way, Brand’s originally value-neutral statement can be used ambiguously to argue the benefits of propertied information, of free and open information, or of both.

The truth is that people should be able to access information freely and that information should be independent, transparent, and honest. Unfortunately, the pages of the mainstream press, especially the economic ones, are the least “free” of all, filled as they are with the successes of companies possibly filling expensive advertising spaces in the same media, nearly without a critical or at least skeptical comment. Later, it may turn out that those very same companies have been in crisis for some time. This same shameful habit can be found in trade media, especially the translation industry media, most often docilely hosting promotional articles. Nothing next to them, however, indicates their promotional nature. This might not be a problem of “freedom of information” but it is certainly one of “quality of information”.

Recently, Joseph E. Stiglitz, a recipient of the Nobel Memorial Prize in Economic Sciences in 2001, has warned against the risk of a short-sighted outlook. True, John Maynard Keynes said, “long run is a misleading guide to current affairs. In the long run we are all dead,” but any “sugar high” is going to vanish when the same unsolved problems fire back. And a major one is exactly the lack of transparency.

In God Bless You, Mr. Rosewater, Kurt Vonnegut recalls the free enterprise system as having “the sink-or-swim justice of Caesar Augustus built into it,” and hence the need for “a nation of swimmers, with the sinkers quietly disposing of themselves.”

The “translation industry” is aggressively competing on prices and volumes, so boasting of growth in revenues and volumes, but not in profits and compensation, and of even greater expectations for the next year is not really a smart prospect. In fact, the larger translation companies may be collecting revenues in the short term in many ways, and a short-sighted outlook is the fastest lane to the grave of the whole industry. On the other hand, despite the paeans sung to the alleged smartness of the so-called champions of this industry, greed seems to be what has been driving them more than an entrepreneurial and innovative spirit. Simply put, can anyone name the Elon Musk of the translation industry? And, to be totally clear, leveraging public funding to build a platform and exploit cheap labor may be cunning, but it is no entrepreneurship.

Unfortunately, these “champions” have managed to make a business trend prevail that makes the industry and its products and services irrelevant, so not even learning how to swim with sharks could be enough. The whole industry keeps chanting the same old litany, with the same people telling the same old stories in the same old places to the same people who are still surprisingly eager to listen.
On the other hand, there is also the same old vox clamantis in deserto, while the landscape keeps changing.

Blockchain again

Take blockchain for example. Blockchain is largely considered still an immature technology. The market is less than embryonic with no clear recipe for implementation and very few unstructured experimental solutions. Despite many publications, no clear and undisputed strategic evaluation of blockchain has emerged yet and many companies are reconsidering their investments. However, the hype has infected the translation industry through a breath of wind that has traveled the seas and industry events and media.

As many experts, analysts, and observers have noted, blockchain is not as efficient as traditional databases: it is much hungrier for energy, processing power, and even storage. Also, a blockchain is only as good as the information that is put in it. In other words, the data in a blockchain is not checked in any way.

The translation industry and the translation profession positively need to be modernized and to solve perennial problems, but blockchain can hardly be seen as the ultimate technology solution or used as a banner in this respect. People who do so are just cynically using blockchain purely for its potential reputational value, to draw some attention, prove how innovative they are, and eventually attract investments. They may even start some kind of implementation, though it is more likely a proof of concept from which they probably know they will not benefit in any way.

As a matter of fact, if and when practical applications prove possible, the only benefit those very people are likely to draw from their significant investments will be a highly volatile lead, a competitive advantage that does not pay.

At the moment, there seem to be six different categories of business applications addressing two major needs. It remains to be seen where industry players might find their applications: Storing reference data, with a view to ownership? Smart contracts? This might be a major application, but it is not among those that are being figured out at the moment. What about payment? It is the most interesting application, but it is not being figured out either.

Blockchain might be used to verify the identity of the person(s) with writing privilege, but as long as no control exists over the information being written, and the information itself remains unchecked, this feature will prove useless as it would always be possible to write fraudulent data to the blockchain. Therefore, a mechanism should be devised to stamp the digital signature of the legitimate owner of each digital item as a unique code that stays with it all the way through the supply chain.

Intellectual property (IP) is another issue here. As long as no mechanism is available to unmistakably identify the owner of the content, investments in blockchain for content transactions would be highly capricious. And an open blockchain to store this information and help to integrate different organizations and systems still seems a long way off, given the absence of standards and of any ongoing attempts to identify one.

Blockchain could support a validated register of qualified practitioners with proven experience in a specific field, whose credentials would be validated to allow customers to quickly find a qualified workforce. However, one of the many things that can never be provided is “trust” in transactions.

Will there ever be a sticker on an item that is valid and complete enough? What about the corporate and personal integrity of the people behind the processes in place? After all, Bitcoin’s success is mostly due to anonymity and its use by criminals.

Blockchain is not going to be disruptive to the translation industry. Once mature, it will allow certain things to be done better and more efficiently, but it will not do to the translation industry what digital photography did to Kodak.

More with less


What does transparent, honest and independent information have to do with blockchain? And with the translation industry? As for blockchain, the trade press, even more than the business or the mainstream press, should help debunk hype rather than fuel the frenzy on which it thrives by helping, more or less explicitly, the club of the usual suspects. Also, with their output, those industry sources that should be taken as authoritative end up looking generally unreliable, and this creates a general climate of distrust. As a result, the mainstream media do not turn to these sources to gather the information to process for their audiences. Ultimately, the translation industry is being devalued even more than it already is.

It comes as no surprise that the “do-more-with-less” mantra has been unquestioningly borrowed by the translation industry in the past few years, together with much other baloney such as agile, lights-out, augmented, etc., when the many marketing geniuses crowding the industry could have invented something better than a flabby ‘transcreation.’ Of course, they should have done their homework first, and this would have spared them the embarrassment of using wrong quotes and attributing them to the wrong people.

But no. Most industry players still believe the same old little fairy tale they have been told for years that the more companies expand globally, the more they need to pay attention to local language expectations in the new markets they are trying to enter, the more they need to pay close attention to the linguistic, cultural, and even socio-economic nuances of these markets, and that this makes translation a major part of a company’s global growth strategy.


Beyond futility


The greatest damage to the industry as a whole has been the explosion of true ‘religious wars’ pervading the industry, with the abandonment of any objective, accurate and unemotional approach to problems.

This prevents the understanding of some otherwise simple and obvious phenomena. Globalization, for example, has been underway for almost three decades now, while the growth of international trade started soon after the end of WWII. So, why is the industry still waiting for global companies “to pay close attention to the linguistic, cultural, and even socio-economic nuances” of international markets?

Also, content growth is exponential, while revenue growth in the translation industry is linear, with a slope below 10°. Why, then, are revenues still such an important measure? How does translation demand correlate with content growth? What is the correlation between 99.99% of content being machine translated and the supposedly growing revenues of the industry’s major players?

Wait! Making effective translation a major part of a company’s global growth strategy is “a daunting task that is near impossible without technological leverage and momentum.” Now everything’s clear: this is what happens when you try to bullshit the bullshitter. And they buy it.

This is a fairy tale, and it is where the gig economy comes into play. In fact, it is being peddled at industry events and in the media by the very same CEOs who are not used to doing their homework and would misquote Henry Ford. Maybe they don’t even know that he invented neither the assembly line, nor interchangeable parts, nor the automobile. Anyhow…

These very same people have no credible answer when they are asked why the translation industry has been familiar with the gig economy since its inception, and yet still talks about innovation. And all laugh? Not really. As Marion McGovern, author of the otherwise forgettable Thriving in the Gig Economy, points out, “The advent of the digital platform world has altered the talent supply chain.” And rather than pushing out traditional staffing agencies, “digital platforms are becoming sourcing engines for them. Big companies may use both the staffing firm and then for urgent or unforeseen projects, turn to the platforms for options.”

Beyond blockchain and baloney, the translation industry is stuck with obsolete models. Companies in other industries have been measuring performance for years as a way of adding value to their organizations; LSPs, no matter the size, are still stuck at error-catching quality management.


Content flood


Given the inability of industry players to keep up with customer expectations for technological and process innovation, bringing translation into the pipeline has had the effect of further commoditizing it. And commoditization will continue, with expectations going up and costs going further down.

The case of Amazon’s Chinese-branch employees leaking data for bribes is emblematic of the risks underpaid workforces might be emboldened to take. And with the production of business translation being more and more automated by ever better neural machine translation engines, things might get a little dicey even in the translation industry.

Many of the larger translation buyers have been developing, managing and delivering multilingual content in different formats and on different devices for years now. The next step will be fitting everything together into a single workflow. Emerging technology will finally fully enable Sir Tim Berners-Lee’s dream of a Semantic Web. In fact, for more than a decade, a significant fraction of Web domains has been generated from structured databases. Today, many Web domains contain Semantic Web markup.

New models are being devised for content enrichment or augmentation (i.e. integrating different content aspects and simple resources with semantics and knowledge). This mainly consists of metadata processing to exploit information and allow end users to navigate via semantic annotations. Content enrichment is a very expensive task typically performed for valuable content. Today, it can be done automatically by combining the analytics of multiple data sources. In document enrichment pipelines, each document is enriched, analyzed and/or linked with additional data to improve navigation and filtering for further analyses.

Also, content generators are already available that take existing content and rewrite and shuffle it around to create new content, while many companies are working on Natural Language Generation, an AI sub-discipline that converts data into text and is used in customer service and for generating reports and market summaries. It is also being investigated for creating content for websites or blogs from a variety of sources, including answers to questions on social media and forums.

With text analytics to understand the structure of sentences, as well as their meaning and intention, and NLP to process unstructured data, full-blown computer-only-generated content will soon be a reality.

When these technologies are fully implemented, the Semantic Web will lead to a further upsurge in content production.


Translation redefined


This will inevitably lead to redefining the nature of translation and the role of linguists to leverage the value in enriched (intelligent) content. It’s time for applied linguists (e.g. translators) to re-think their role in the language industry. Tim Berners-Lee’s idea was a bit futuristic (if not actually visionary) when he launched it two decades ago.

The near future will see ‘applied linguists’ mostly employed as post-processors. Machine translation will do all the jobs, even in creative tasks, and those who are still called translators today will have to confirm or, at worst, polish automatic output for cultural appropriateness. Some will re-engineer and re-organize content, more or less as digital indexers and curators, and others will clean and polish data to feed machines. Creativity will no longer exist by definition; it will depend on each person’s ability to exploit his/her knowledge and skills.

So, it is really a good time for a change. Blockchain is no change; it may possibly be an improvement, but it will keep us doing things the way they have been done so far, in a different shape. You might want to reclaim the things you believe have been taken away from you, but this will never happen. You can stick to obsolete models and expect to keep the same old stances, advance the same old claims, and work in the same old way you have been used to, but this won’t take you anywhere. You can only try to get new ones, and now is the time to find a way to get them.

So, is the translation industry attracting more investments than in its burgeoning heyday? And where is the money being made? In the meantime, you had better row and learn to swim.

Luigi Muzii has been in the "translation business" or "the industry" since 1982 and has been a business consultant in the translation and localization industry since 2002, through his firm. He focuses on helping customers choose and implement the best-suited technologies and redesign their business processes for the greatest effectiveness of translation and localization-related work.

This link provides access to his other blog posts.