Tuesday, June 20, 2017

My Journey into "Neural Land"

This is a post, mostly written by Terence Lewis, about his experience with open source NMT, which was just published in the 275th edition of The Tool Box Journal. Since I have been chatting with Terence off and on over the years about various MT-related issues, I thought his experience might be interesting to some in the primary reader base of this blog.

At the moment, there is a lot of hype driving NMT in the public eye, and while there is no doubt that NMT represents definite and real progress in the MT field, it is important to temper the hype with as many actual data points about the reality as possible. There are also a lot of pretty shallow and superficial "Isn't-NMT-cool?" or "My oh my, it looks like human translation!?@!&?$" type stories in circulation, so when you see one of substance (like this one by Terence) it is always refreshing, interesting and also illuminating. (For me, anyway.)

I will admit that I am more than slightly skeptical about DIY NMT, as, from my vantage point, I see that NMT is really difficult even for MT companies to explore. Not because it is so difficult conceptually (which it is), but because to "really" explore the possibilities of NMT deeply requires real investment, large amounts of data, and computing scale that only the largest players can afford. I am one of those who thinks that you need to do it a thousand times or more, in many different variations, before "understanding" happens. While quickly running some data through an OpenNMT platform can work for some, probably more often than Moses would, I still maintain that one needs knowledge and skill, and more ability than just being able to operate an open source platform, for this technology, or any MT capability, to really build long-term business leverage. And that only comes with understanding and experience that builds increasing expertise. This is quite in contrast to the advice given at the recent MemoQFest conference. My view is that it makes great sense for LSPs to invest time and money building expertise in corpus/data analysis and in understanding how data and algorithms interact, but very little sense for them to spend time on understanding how SMT or NMT operates at the nuts-and-bolts mathematics and programming level. That is best left to experts who do it all the time, as the theory, math and programs will change and evolve continually, and need steady and ongoing attention to achieve excellence. There are literally hundreds of papers being published every month, many of which should trigger follow-up and additional research by a team that really wants to understand NMT. Thus, I see Terence as an exception that proves the rule, rather than proof that anyone with a few computers can build NMT models. As you can see from his background, he has long-term and relatively deep experience with MT.
His story here is also an inside view of what goes on in the early part of the NMT journey for any MT practitioner.

The emphasis in the post below is all mine.


"A little more than a year ago, in the 260th edition of the Tool Box Journal, I published an article about Terence Lewis, a Dutch-into-English translator and autodidact who took it upon himself to see what machine translation could do for him beyond the generic possibilities out there. He taught himself the necessary programming from scratch, once for rules-based machine translation, again when statistical machine translation became en vogue, and, you guessed it, once again for neural machine translation. I have been and still am impressed with his achievement, so I asked him to give us a retelling of that last leg of his journey."  - Jost Zetzsche

It all started with a phone call from Bill. "B***dy hell, Terence," he shouted, "have you been on Google Translate recently?" He was, of course, referring to Google's much-publicized shift from phrase-based statistical machine translation to neural machine translation, which got under way late last autumn. Bill, an inveterate mocker of lousy machine translation, had popped a piece of German into Google Translate and, to his amazement, found little to mock in the output. German, it seems, was the first language pair for which Google introduced neural machine translation. I put down the phone, clicked my way over to Google Translate and pasted in a piece of German. To say that what I saw changed my life would be a naïve and overdramatic reaction to what was essentially a somewhat more fluent arrangement of the correctly translated words in my test paragraph than I would have expected in a machine translation. But these were early days, I told myself, and things could only get better.
Around that time the PR and marketing people at Google, Microsoft and Systran had gone into top gear and put out ambitious claims for neural machine translation and the future of translation. Systran's website claimed NMT could "produce a translation overachieving the current state of the art and better than a non-native speaker". Even in a scientific paper the Google NMT team wrote that "additional experiments suggest the quality of the resulting translation system gets closer to that of average human translators", while a Microsoft blogger wrote that "neural networks better capture the context of full sentences before translating them, providing much higher quality and more human-sounding output".

Even allowing for the hype factor, I could not doubt the evidence of my own eyes. Being a translator who taught himself to code I was proud of my rule-based Dutch-English MT system, which subsequently became a hybrid system incorporating some of the approaches of phrase-based statistical machine translation. However, I sensed -- and I say "sensed" because I had no foundation of knowledge then -- that neural machine translation had the potential to become a significant breakthrough in MT. I decided to "go neural" and dropped everything else I was doing.
What is this neural machine translation all about? According to Wikipedia, "Neural machine translation (NMT) is an approach to machine translation that uses a large neural network". So, what's a neural network? In simple terms, a neural network is a system of hardware and software patterned after the operation of neurons in the human brain. Typically, a neural network is initially trained, or fed large amounts of data. Training consists of providing input and telling the network what the output should be. In machine translation, the input is the source data and the expected output is the parallel target data. The network tries to predict the output (i.e. translate the input) and keeps adjusting its parameters (weights) until it gets a result that matches the target data. Of course, it's all far more complex than that, but that's the idea.
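The error-driven adjustment described above can be sketched in a few lines of Python. This is a deliberately toy illustration, not NMT: a "model" with a single weight repeatedly nudges that weight until its output matches the target. Real translation models have hundreds of millions of weights and far richer structure, but the principle of predicting, measuring the error, and adjusting is the same.

```python
# Toy sketch of error-driven training: one trainable weight, adjusted by
# gradient descent until the model's predictions match the targets.

def train(pairs, lr=0.01, steps=1000):
    w = 0.0  # the single trainable parameter ("weight")
    for _ in range(steps):
        for x, target in pairs:
            prediction = w * x           # forward pass: predict the output
            error = prediction - target  # how far off are we?
            w -= lr * error * x          # adjust the weight to shrink the error
    return w

# Training data implicitly encodes the rule output = 3 * input;
# the model "learns" this rule purely from examples.
pairs = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
w = train(pairs)
print(round(w, 3))  # converges close to 3.0
```

An NMT system does the same thing at scale: the "inputs" are source sentences, the "targets" are their human translations, and the weights are adjusted until the predicted translations match the training data as closely as possible.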
Not knowing anything about NMT, I joined the OpenNMT group which is led by the Harvard NLP Group and Systran. According to the OpenNMT website, OpenNMT is an industrial-strength, open-source (MIT) neural machine translation system utilizing the Torch/PyTorch mathematical toolkit. The last two words are key here -- in essence, NMT is math. OpenNMT is written in both the Lua and Python programming languages, but the scripts that make up the toolkit, which are typically 50-100 lines long, are in essence connectors to Torch where all the mathematical magic really happens. Another NMT toolkit is Nematus, developed in Python by Rico Sennrich et al., and this is based on the Theano mathematical framework.

If you're thinking of delving into NMT and don't have any Linux skills, get them first. It's theoretically possible to run OpenNMT on Windows either directly or through a virtual machine, but most of the tutorials you'll need to get up and running simply assume you're running Ubuntu 14.04, and nobody will want to give you a lesson in basic Linux. While in theory you can train on any machine, in practice, for all but trivially small data sets, you will need a GPU (Graphics Processing Unit) that supports CUDA if you want training to finish in a reasonable amount of time. For medium-size models you will need at least a 4GB GPU; for full-size state-of-the-art models, 8-12GB is recommended. My first neural MT training from the sample data (around 250,000 sentences) provided on the OpenNMT website took 8 hours on an Intel Xeon X3470, S1156, 2.93 GHz Quad Core with 32GB RAM. The helpful people on the OpenNMT forum recommended that I install a GPU if I wanted to process large volumes of data. I installed the Nvidia GTX 1070 with 8GB of onboard RAM. This enabled me to train a model from 3.2 million sentences in 25 hours.
However, I'm getting ahead of myself here. Setting up and running an NMT experiment/operation is -- on the surface -- a simple process involving three steps: preprocessing (data preparation in the form of cleaning and tokenization), training and translation (referred to as inference or prediction by academics!). Those who have tried their hand at Moses will be familiar with the need for parallel source and target data containing one sentence per line with tokens separated by a space.  In the OpenNMT toolkit, the preprocessing step generates a dictionary of source vocabulary to index mappings, a dictionary of target vocabulary to index mappings and a serialized Torch file -- a data package containing vocabulary, training and validation data. Internally the system will use the indices, not the words themselves. The goal of any machine translation practitioner is to design a model that successfully converts a sequence of words in a source language into a sequence of words in a target language. There is no shortage of views on what that "success" actually is. Whatever it is, the success of the inference or prediction (read "translation") will depend on the knowledge and skill deployed in the training process. The training is where the clever stuff happens, and the task of training a neural machine translation engine is in some ways no different from the task of training a statistical machine translation system. We have to give the system the knowledge to infer the probability of a target sentence E, given the source sentence F (the letters "F" and "E" being conventionally used to refer to source and target respectively in the field of machine translation). The way in which we give the neural machine translation system that knowledge is what differs.
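The vocabulary-to-index mappings described above can be sketched as follows. This is a minimal illustration of the idea only -- OpenNMT's actual preprocess script also enforces vocabulary-size limits, adds special tokens such as sentence markers, and serializes everything into the Torch data package -- and the Dutch sample sentences and function names here are my own, purely for demonstration.

```python
# Minimal sketch of NMT preprocessing: build a word-to-index dictionary from
# tokenized data, then convert sentences into the index sequences the model
# actually sees. Index 0 is reserved for unknown words.

def build_vocab(sentences):
    vocab = {"<unk>": 0}
    for sent in sentences:
        for token in sent.split():
            if token not in vocab:
                vocab[token] = len(vocab)  # next free index
    return vocab

def to_indices(sentence, vocab):
    # unseen tokens map to the <unk> index
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence.split()]

src = ["het paard van mijn vader", "de hond van mijn vader"]
src_vocab = build_vocab(src)
print(to_indices("het paard van mijn vader", src_vocab))  # [1, 2, 3, 4, 5]
print(to_indices("de kat van mijn vader", src_vocab))     # "kat" unseen -> 0
```

As the article notes, the model works entirely on these indices, never on the words themselves -- which is also why words absent from the training vocabulary become a problem later on.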
Confession time -- having failed math at school, I've never found anything but the simplest of equations easy reading. When I worked my way through Philipp Koehn's excellent "Statistical Machine Translation" I skipped the most complex equations. Papers on neural machine translation are crammed with such equations. So, instead of spending weeks staring at what could just as well have been hieroglyphs, I took the plunge and set about training my first neural MT engine -- they say the best way to learn is by doing! This was accomplished by typing "th train.lua -data data/demo-train.t7 -save_model demo-model". This command applied the training script to the prepared source and target data (saved in the file "demo-train.t7") with the aim of generating my model (or engine). It looks simple, doesn't it? But under the hood, a lot of sophisticated mathematical operations get under way. These come down to learning by trial and error. As already mentioned, we give our neural network a batch of source sentences from the training data, and these are related word by word to words in the target data. The system keeps adjusting various parameters (weights) assigned to the words in the source sentence until it can correctly predict the corresponding target sentence. This is how it learns.
My first model was a Dutch-English engine, which was appropriate, as I had spent the previous 15 years building and refining a rule-based machine translation system for that language pair. I was delighted to see that the model had by itself learned basic rules of Dutch grammar and word re-ordering rules which had taken me very many hours of coding. It knew when to translate the Dutch word "snel" as "quick" or as "quickly" -- something that my rule-based system could still get wrong in a busy sentence of some length. "Het paard dat door mijn vader is gekocht" is rendered as "The horse bought by my father" and not "The horse which was bought by my father," reflecting an editorial change in the direction of greater fluency. Another rule the system had usefully learned was to generate the English genitive form so that "het paard van mijn vader" is translated as "my father's horse" and not "the horse of my father," although it did fail on "De hond van de vriend van mijn vader" which came out as "My father's dog" instead of "My father's friend's dog", so I assume some more refined training is needed there.
These initial experiments involved a corpus of some 5 million segments drawn from Europarl, a proprietary TM, the JRC-Acquis, movie subtitles, Wikipedia extracts, various TED talks and Ubuntu user texts. Training the Dutch-English engine took around 5 days. I used the same corpus to train an English-Dutch engine. Again, the neural network did not have any difficulty with the re-ordering of words to comply with the different word order rules in Dutch. The sentence "I want to develop the new system by taking the best parts of the old system and improving them" became "Ik wil het nieuwe systeem ontwikkelen door de beste delen van het oude systeem te nemen en deze te verbeteren". Those who read any Germanic language will notice that the verbal form "taking" has moved seven words to the right and is now preceded by the particle "te". This is a rule which the system has learned from the data.

So far, so good. But -- and there are a few buts -- neural machine translation does seem to have some problems which perhaps were not first and foremost in the minds of the academics who developed the first NMT models. The biggest is how to handle OOVs (Out of Vocabulary Words, words not seen during training) which can prove to be numerous if you try to use an engine trained on generalist material to translate even semi-specialist texts. In rule-based MT you can simply add the unknown words either to the general dictionary or to some kind of user dictionary but in NMT you can't add to the source and target vocabularies once the model has been built -- the source and target tokens are the building blocks of the mathematical model.
Various approaches have been tried to handle OOVs in statistical machine translation. In neural machine translation, the current best practice seems to be to split words into subword units or, as a last resort, to use a backoff dictionary which is not part of the model. For translations out of Dutch I have introduced my own Word Splitter module which I had applied in my old rule-based system. Applied to the input prior to submission to the NMT engine, this ensures that compound nouns not seen in the training data will usually be broken down into smaller units so that, for example, the unseen "fabriekstoezichthouder" will break down into fabriek|s|toezichthouder and be correctly translated as "factory supervisor". With translations out of English, I have found that compound numbers like "twenty-three" are not getting translated even though these are listed in the backoff dictionary. This isn't just an issue with the engines I have built. Try asking Systran's Pure Neural Machine Translation demonstrator to translate "Two hundred and forty-three thousand fine young men" into any of its range of languages and you'll see some strange results -- in fact, only the English-French engine gets it right! The reason is that individual numerical entities are not seen enough (or not seen at all) in the training data, and something that's so easy for a rule-based system becomes an embarrassing challenge. These issues are being discussed in the OpenNMT forum (and I guess in other NMT forums as well) as researchers become aware of the problems that arise once you try to apply successful research projects to real-world translation practice. I've joined others in making suggestions to solve this challenge and I'm sure the eventual solution will be more than a workaround. Combining or fusing statistical machine translation and neural machine translation has already been the subject of several research papers.
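The compound-splitting idea described above can be sketched with a greedy recursive splitter: before translation, try to decompose an unseen Dutch compound into known vocabulary words, allowing the linking "s" that Dutch inserts between compound parts. To be clear, this is my own illustrative sketch under stated assumptions -- the tiny vocabulary and the function name are invented for the example, and Terence's actual Word Splitter module (and subword approaches such as BPE) are considerably more sophisticated.

```python
# Sketch of compound splitting for OOV handling: decompose an unseen compound
# into vocabulary words, permitting a linking "s" between parts.

VOCAB = {"fabriek", "toezichthouder", "hond", "vriend", "vader"}  # toy vocab

def split_compound(word, vocab=VOCAB):
    if word in vocab:
        return [word]
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if head not in vocab:
            continue
        rest = split_compound(tail, vocab)      # try a direct split
        if rest:
            return [head] + rest
        if tail.startswith("s"):                # try treating "s" as a linker
            rest = split_compound(tail[1:], vocab)
            if rest:
                return [head, "s"] + rest
    return None  # no decomposition found

print(split_compound("fabriekstoezichthouder"))
# -> ['fabriek', 's', 'toezichthouder']
```

Each resulting unit has a chance of being in the model's vocabulary, so "fabriekstoezichthouder" can be translated as "factory supervisor" even though the full compound was never seen in training.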

Has it all been worth it? Well, customers who use the services provided by my translation servers have (without knowing it) been receiving the output of neural machine translation for the past month and to date nobody has complained about a decline in quality! I have learned something about the strengths and weaknesses of NMT, and some of the latter definitely present a challenge from the viewpoint of implementation in the translation industry -- a translation engine that can't handle numbers properly would be utterly useless in some fields. I have built trial MT engines for Malay-English, Turkish-English, and Lithuanian-English from a variety of bilingual resources. The Malay-English engine was built entirely from the OPUS collection of movie subtitles -- some of its translations have been amazingly good and others hilarious. I have conducted systematic tests and demonstrated to myself that the neural network can learn and its inferences involve more than merely retrieving strings contained in the training data. I'll stick with NMT.
Are my NMT engines accessible to the wider world? Yes, a client allowing the translation of TMX files and plain text can be downloaded from our website, and my colleague Jon Olds has just informed me that plug-ins to connect memoQ and Trados Studio (2015 & 2017) to our Dutch-English/English-Dutch NMT servers will be ready by the end of this month. Engines for other language pairs can be built to order with a cloud or on-premises solution.


Terence Lewis, MITI, entered the world of translation as a young brother in an Italian religious order, when he was entrusted with the task of translating some of the founder's speeches into English. His religious studies also called for a knowledge of Latin, Greek and Hebrew. After some years in South Africa and Brazil, he severed his ties with the Catholic Church and returned to the UK where he worked as a translator, lexicographer (Harrap's English-Brazilian Portuguese Business Dictionary) and playwright. As an external translator for Unesco he translated texts ranging from Mongolian cultural legislation to a book by a minor French existentialist. At the age of 50 he taught himself to program and wrote a rule-based Dutch-English machine translation application which has been used to translate documentation for some of the largest engineering projects in Dutch history. For the past 15 years he has devoted himself to the study and development of translation technology. He recently set up MyDutchPal Ltd to handle the commercial aspects of his software development. He is one of the authors of 101 Things a Translator Needs to Know (ISBN 978-91-637-5411-1).

Tuesday, June 13, 2017

From Reasoning to Storytelling - The Future of the Translation Industry

This is a guest post by a frequent contributor to eMpTy Pages, Luigi Muzii, on the future of translation, written in what some may say is an irreverent tone. I like to hear his opinions because he has a way of cutting through the bullshit and getting to core issues quickly. Getting to core issues matters, because it helps to get to the right questions, and the right questions can sometimes lead to meaningful and impactful answers. And isn't that what evolution and progress are really all about?

For those participants who add very little value in any business production chain, the future is always foreboding and threatening, because there is often a sense that a reckoning is at hand. It is easier to automate low-value work than it is to automate high-value work. Technology is often seen as a demon, but for those who learn to use it and leverage it, it is also a means to increase their own value-adding possibilities in a known process and to raise their standing in a professional setting. While translating literature may be an art, most business translation work, I think, is business production chain work. These are words translated to help you sell stuff, or help customers better use stuff that has been sold to them, or, increasingly, to understand what customers who bought your stuff think about the user and customer experience.

While it is important to ask big questions like “What is going to happen in the future?", it is also important that some questions be held in the mind -- that is, in a state of attention -- for an extended time, to really allow a deeper investigation and inquiry to happen. While I am an advocate of technology where it makes sense, I also have a deep respect for what makes us truly human and special, beyond our work, and I am still skeptical that technology can ever fully model deeply human experiences. Language is just the tip of the iceberg of what is hard to model; most business content, though, is easily modeled. This is perhaps why I play music, and mostly why I play improvised music, as it can lead to flow states that are truly uplifting, liberating, and often mysteriously irreproducible yet truly meaningful, and thus deeply human. Many human activities are thus somewhat impossible to model and replicate with technology. However, the singularity and VR/AR crowd are focused on trying to change even this.

I recently came upon a quote I really liked:

‘Wisdom is a love affair with questions, knowledge a love affair with answers … we are so attracted by knowledge that we have lost concern for wisdom.’

Subsequently, I found this little snippet of audio (spoken by J Krishnamurti), probably recorded 35+ years ago, particularly intriguing and insightful, even though the AI/VR technology revolution images were only added to the audio this year. He was an observer who asked big questions mindfully, and pointed out things that are happening with technology today which were already apparent 50 years ago to his keen observation. Krishnamurti has been a seminal influence on me, and thus I add this video, though it is somewhat tangential to the main theme of this post. It asks a big question and then stays with the question, refusing to rush to an answer. Asking questions to which you don't know the answer triggers a flow of attention and observation, and if one does not rush too quickly to an answer, one might just find that things in the bigger picture begin to get clearer and clearer -- mostly from the steady flow of attention that we sometimes are able to bring to especially big questions.

The emphasis and call-outs in Luigi's post below are all mine.


Storytelling is the new black in written communication. It entertains rather than informs, aiming at influencing the reader’s perception rather than being thought-provoking. Storytelling replaces reasoning with a narrative.

The boundary between storytelling and fabulation is blurred. This phenomenon does not affect only politics; it comes from marketing and advertising.

Sometimes, with their fabrications, knowingly or not, storytellers rationalize the arguments of the many laudatores temporis acti who crowd this industry, and not only this industry.

Translators have been having the blues for at least 35 years. Judging from how many people in the translation community have been struggling so far, it seems that they really “crave attention.”

The “wrenching change” the industry is undergoing is not new and is not due to the Internet nor to machine translation or technology at large.

For at least the last 35 years, that is, well before the spread of the Internet, translators have been complaining about rates, the dominance of intermediaries, working conditions, and the trifling influence of their role in society despite the importance of translation in many daily tasks.

And yet, through the '80s and the '90s, many could make a more than decent living from translation, raise children, buy a house, and enjoy summer holidays every year.

Neither globalization nor technology has made the competition any fiercer. The major technological innovation of the last three decades has been translation memories, which are still being explained to the general public and only cursorily taught in translation courses.

Many—if not most—freelancers who thrived on translation during the '80s and the '90s, as well as most of those who extol the wonders of the improbable Eldorado of the so-called premium market, made no use of translation memories. Why improbable? Because these people seem to forget that they have been in business for at least two decades, that they started their career in prosperous times, and that they are English native speakers, possibly expats, and/or working in highly sought-after language combinations. Also, these people strictly avoid providing any sound proof of the existence of this premium market—actually a segment—or samples of their work for premium customers, or their income statements.

Anyway, premium customers do exist, of course, but they are fewer and harder to reach than imagined. They can also prove erratic, the marketing effort required to win one can be draining, and the attention needed to keep one can be more than intense.

Therefore, it seems odd, to say the least, that an otherwise watchful observer of this industry can so naively take the storytelling of these people seriously and offer it in his column for a magazine of international importance.

The translation industry has always been a truly open market, with no entry barriers, in which all economic actors can trade without any external constraint. Information asymmetry is its bogey. Even where local regulations apply, typically concerning sworn translation and court interpreting, these are no guarantee of remuneration and working conditions. In other words, the enormous downward pressure on prices does not come from a broadening of the offering. On the contrary, in the world there are more lawyers and journalists than translators; there are even more doctors than translators, and, in keeping with the countless articles that daily describe the devastating impact that artificial intelligence will have in the immediate future, they are equally at risk.

True, not everyone seems to share the same catastrophic predictions; there are also those who include the translator profession among the seven professions that won’t have to fear the future.

A few weeks ago, an article in the MIT Technology Review presented a survey reporting a 50% chance of AI outperforming humans in all tasks in 45 years and of automating all human jobs in 120 years. Specifically, researchers predict AI will outperform humans in translating languages by 2024, writing high-school essays by 2026, writing a bestselling book by 2049, and working as a surgeon by 2053.

As a matter of fact, in the last decade, machine translation has improved the perception of the importance of translation, thus exposing translators to higher demands. At the same time, the quality of the MT output has been steadily improving, and now it is quite impressive. This is what most translators should be afraid of: that expectations of professional translators will keep increasing.

Unfortunately, dinosaurs can be found rather easily in translation, mostly among those who should be most “excited about technology and the possibilities of scale it offers.”

Are translators to blame for not being passionate about technology? Of course not. And not just because many of them really are almost obsessive about the tools of the trade. What so many outsiders much too often choose to overlook is that translators, still today, are generally being taught to consider themselves artists, mostly by people who have never been confronted with the harshness of the translation market in their lives.

Furthermore, with very few exceptions, translation companies are generally started and run by translators who generally lack business administration basics. On the other hand, the largest translation businesses, which are typically run according to best-in-class business administration practices, certainly do not shine in terms of technological innovation, unless it serves to maintain appreciable profit margins. And this is usually done by compressing translator fees and by using low-priced technologies.

So, why would translators be having trouble thriving or even surviving? Does it really have to do with technology? Has disruption come and gone in the translation industry and we didn’t even notice?

Technology has profoundly altered the industry, but the stone guest ("the elephant in the room") in the translation industry is the translation process and the business model, both as obsolete as translation education programs. Even translation industry ‘leaders’ seem more interested in reassuring those working in the industry than in driving a real change. Morozov’s “orgy of amelioration” has been affecting translation too, without any disruptive or even substantial innovation coming from inside.

The most relevant innovations have involved translation delivery models, not processing or business models.

So, it is not surprising that, given the premise, an outsider may think that “literary translation is under no threat.” It may also be true, as long as it is tied to the publishing industry. However, since it represents no more than 5% of the overall volume of translations, it is generally quite hard to find anyone making a living from literary translation alone. And this is definitely not about technology.

Furthermore, from a strictly academic point of view, it is true that “a good translator may need to rethink a text, re-wording important pieces, breaking up or merging sentences, and so on”.

Unfortunately, in real life, a professional translator who tries to make a living out of “commercial translation” most of the time does not, to be generous, have the time. Of course, in this respect, namely for the sake of productivity, any translation technology is of the greatest help to a translator willing to exploit it.

Unfortunately, even here, nonsense is on the agenda. Machine translation, shared data, and post-editing, for example, are no “dirty little secret.” Getting suggestions from machine translation, editing them, and using the results is a good way to exploit translation technology. And it is possibly incorrect to say that “everybody is doing it, but no one wants to talk about it.” This charge can easily be leveled at those intermediaries seeking to exploit as much as possible the information asymmetry typical of the industry, and among them there are no freelancers.

Transparency is another victim of marketing and storytelling, maybe the first victim, and a typical product of deceptive marketing tactics based on a lack of transparency is ‘transcreation.’ This is an empty, all-solving word forged to scam buyers. Every translator should ‘transcreate,’ by default, to make two cultures meet. It is a service invented by marketing people to create a false differentiation and recover the losses due to pressure on the prices of basic services. It is definitely not “another market.” It is just another one of those “new trends and buzz words” that “pop up every now and then, only to be forgotten among the presentations at the usual conferences or the blog posts of self-defined experts.”

On the other hand, it is no coincidence that another narrative is emerging around the alleged analogies with the pre-Internet advertising industry. And it is no coincidence that many are eager to buy into it. But have you ever noticed how many online services are still advertised on media that were given up as doomed?

In the same way, a survey of a statistically insignificant sample of volunteer respondents is certainly not the best way to gain any insight. But it is definitely good storytelling.

In the end, being sanguine about translation technology while seeking protection from being overwhelmed by it did not help produce antibodies against the unexpected virus of a technological shift.

And no storyteller can cover the uncertainty of the future of translation with his narrative. Certainly not for the next five years.


Luigi Muzii has been in the "translation business" since 1982 and has been a business consultant since 2002, in the translation and localization industry through his firm. He focuses on helping customers choose and implement best-suited technologies and redesign their business processes for the greatest effectiveness of translation and localization related work.

This link provides access to his other blog posts.

Thursday, June 8, 2017

Translation Quality -- WannaCry?

For as long as I have been engaged with the professional translation industry, I have seen that great confusion and ambiguity exist around the concept of "translation quality". This is a services industry where nobody has been able to coherently define "quality" in a way that makes sense to a new buyer and potential customer of translation services. It is also, unfortunately, the basis of many of the differentiation claims made by translation agencies in competitive situations. Thus, is it any surprise that many buyers of translation services are mystified and confused about what this really means?

To this day it is my sense that the best objective measures of "translation quality", imperfect and flawed though they may be, come from the machine translation community. The computational linguistics community has very clear definitions of adequacy and fluency that can be reduced to a number, with the tidy order that mathematics provides.
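To make "reduced to a number" concrete: automatic MT metrics such as BLEU score a candidate translation by its n-gram overlap with a human reference. As a minimal illustration only (this toy sketch computes clipped unigram precision, one ingredient of BLEU, which in full also uses higher-order n-grams and a brevity penalty):

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference,
    with counts clipped to the reference (in the spirit of BLEU-1)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(n, ref[tok]) for tok, n in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

# 5 of the 6 candidate tokens are supported by the reference
print(unigram_precision("the cat sat on the mat",
                        "the cat is on the mat"))  # 0.8333...
```

Crude as it is, a score like this is unambiguous and repeatable, which is exactly the property the translation industry's prose definitions of quality lack.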

The translation industry, however, is reduced to confusing discussions where, ironically, the words and terms used in the descriptions are themselves ambiguous and open to multiple interpretations. It is really hard to just say, "We produce translations that are accurate, fluent and natural," since we have seen that these words mean different things to different people. To add to the confusion, discussions of translation output quality are often conflated with translation process issues. I maintain that the most articulate and generally useful discussion on this issue comes from the MT and NLP communities.

I feel compelled to provide something on this subject below that might be useful to a few, but I acknowledge that this remains an unresolved issue that undermines the perceived value of the primary product this industry produces.

Here are the basic criteria that a Translation Service Provider offering a quality service should fulfill:

a) Translation

  • Correct transfer of information from the source text to the target text.
  • Appropriate choice of terminology, vocabulary, idiom, and register in the target language.
  • Appropriate use of grammar, spelling, punctuation, and syntax, as well as the accurate transfer of dates, names, figures, etc. in the target language.
  • Appropriate style for the purpose of the text.

 b) Work process

  • Certification in accordance with national and/or international quality standards.

Gábor Ugray provides an interesting perspective on "Translation Quality" below and again raises some fundamental questions about the value of newfangled quality assessment tools when we have yet to clarify why we do what we do. He also provides very thoughtful guidance on the way forward and suggests some things that IMO might actually improve the quality of the translation product.
Quality definitions based on error counts and the like are possibly useful to the dying bulk market, as Gabor points out; "real quality", he says, comes from clarifying intent, understanding the target audience, long-term communication and writing experience, and from new in situ and in-process tools that enhance the translator's work and knowledge-gained-via-execution. Humans learn and improve by looking carefully at their mistakes (how, why, where), not by keeping really accurate counts of the errors they make.

We desperately need new tools that go beyond the TM and MT paradigm as we know it today, built from a real understanding of what is useful and valuable to a translator or an evolving translation process. Fortunately, Gabor is in a place where he might get some to listen to these new ideas, and even try new implementations that actually produce higher quality.
The emphasis and callouts in his post below are almost all mine.


An idiosyncratic mix of human and machine translation might be the key to tracking down the notorious ransomware, WannaCry. What does the incident tell us about the translating profession’s prospects? A post on – translation quality.


Quality matters, and it doesn’t

Flashpoint’s stunning linguistic analysis[1] of the WannaCry malware was easily the most intriguing piece of news I read last week (and we do live in interesting times). This one detail by itself blows my mind: WannaCry’s ransom notice was dutifully localized into no less [2] than 28 languages. When even the rogues are with us on the #L10n bandwagon, what other proof do you need that we live in a globalized age?

But it gets more exciting. A close look at those texts reveals that only the two Chinese versions and the English text were authored by a human; the other 25 are all machine translations. A typo in the Chinese suggests that a Pinyin input method was used. Substituting 帮组 bāngzǔ for 帮助 bāngzhù is indicative of a Chinese speaker hailing from a southern topolect. Other vocabulary choices support the same theory. The English, in turn, “appears to be written by someone with a strong command of English, [but] a glaring grammatical error in the note suggests the speaker is non-native or perhaps poorly educated.” According to Language Log[3], the error is “But you have not so enough time.”
I find all this revealing for two reasons. One, language matters. With a bit of luck (for us, not the hackers), a typo and an ungrammatical sentence may ultimately deliver a life sentence for the shareholders of this particular venture. Two, language matters only so much. In these criminals’ cost-benefit analysis, free MT was exactly the amount of investment those 25 languages deserved.

This is the entire translating profession’s current existential narrative in a nutshell. One, translation is a high-value and high-stakes affair that decides lawsuits; it’s the difference between lost business and market success. Two, translation is a commodity, and bulk-market translators will be replaced by MT real soon. Intriguingly, the WannaCry story seems to support both of these contradictory statements.

Did the industry sidestep the real question?

I remember how 5 to 10 years ago panel discussions about translation quality were the most amusing parts of conferences. Quality was a hot topic and hotly debated. My subjective takeaway from those discussions was that (a) everyone feels strongly about quality, and (b) there’s no consensus on what quality is. It was the combination of these two circumstances that gave rise to memorable, and often intense, debates.

Fast-forward to 2017, and the industry seems to have moved on from this debate, perhaps admitting through its silence that there’s no clear answer.

Or is there? The heated debates may be over, but quality assessment software seems to be all the rage. There’s TAUS’s DQF initiative[4]. Its four cornerstones are (1) content profiling and knowledge base; (2) tools; (3) a quality dashboard; (4) an API. CSA’s Arle Lommel just wrote [5] about three new QA tools on the block: ContentQuo, LexiQA, and TQAuditor. Trados Studio has TQA, and memoQ has LQA, both built-in modules for quality assessment.

I have a bad feeling about this. Could it be that the industry simply forgot that it never really answered the two key questions, What is quality? and How do you achieve it? Are we diving headlong into building tools that record, measure, aggregate, compile into scorecards and visualize in dashboards, without knowing exactly what and why?

A personal affair with translation quality

I recently released a pet project, a collaborative website for a German-speaking audience. It has a mix of content that’s partly software UI, partly long-form, highly domain-specific text. I authored all of it in English and produced a rough German translation that a professional translator friend reviewed meticulously. We went over dozens of choices ranging from formal versus informal address to just the right degree of vagueness where vagueness is needed, versus compulsive correctness where that is called for.

How would my rough translation have fared in a formal evaluation? I can see the right kind of red flags raised for my typos and lapses in grammar, for sure. But I cannot for the life of me imagine how the two-way intellectual exchange that made up the bulk of our work could be quantified. It’s not a question of correct vs. incorrect. The effort was all about clarifying intent, understanding the target audience, and making micro-decisions at every step of the way in order to achieve my goals through the medium of language.

Lessons from software development

The quality evaluation of translations has a close equivalent in software development.

CAT tools have automatic QA that spots typos, incorrect numbers, deviations from terminology, wrong punctuation and the like. Software development tools have on-the-fly syntax checkers, compiler errors, code style checkers, and static code analyzers. If that’s gobbledygook for you: they are tools that spot what’s obviously wrong, in the same mechanical fashion that QA checkers in CAT tools spot trivial mistakes.
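One of those mechanical checks is easy to picture: numbers in the source segment should reappear in the target, regardless of language. A minimal sketch of such a check (a toy illustration, not the implementation any particular CAT tool uses):

```python
import re

def number_mismatch(source: str, target: str) -> bool:
    """Flag a segment pair whose numeric tokens differ -- the kind of
    purely mechanical check a CAT tool's QA module performs."""
    def nums(text: str) -> list[str]:
        # Capture integers and simple decimals like 3.5 or 3,5
        return sorted(re.findall(r"\d+(?:[.,]\d+)?", text))
    return nums(source) != nums(target)

# "20 minutes" became "25 Minuten": flagged
print(number_mismatch("Heat to 180 degrees for 20 minutes",
                      "Auf 180 Grad erhitzen, 25 Minuten"))  # True
```

Like a syntax checker, this catches what is obviously wrong and nothing more; it says nothing about whether the sentence reads well.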

With the latest surge of quality tools, CAT tools now have quality metrics based on input from human evaluators. Software developers have testers, bug tracking systems and code reviews that do the same.

But that’s where the similarities end. Let me let you in on a secret. No company anywhere evaluates or incentivizes developers through scorecards that show how many bugs each developer produced.

Some did try, 20+ years ago. They promptly changed their mind or went out of business.[6]

Ugly crashes notwithstanding, the software industry as a whole has made incredible progress. It is now able to produce more and better applications than ever before. Just compare the experience of Gmail or your iPhone to, well, anything you had on your PC in the early 2000s.

The secret lies in better tooling, empowering people, and in methodologies that create tight feedback loops.

Tooling, empowerment, feedback

In software, better tooling means development environments that understand your code incredibly well, give you automatic suggestions, allow you to quickly make changes that affect hundreds of files, and to instantly test those changes in a simulated environment.

No matter how you define quality, in intellectual work, it improves if people improve. People, in turn, improve through making mistakes and learning from them. That is why empowerment is key. In a command-and-control culture, there’s no room for initiative; no room for mistakes; and consequently, no room for improvement.

But learning only happens through meaningful feedback. That is a key ingredient of methodologies like agile. The aim is to work in short iterations; roll out results; observe the outcome; adjust course. Rinse and repeat.

Takeaways for the translation industry

How do these lessons translate (no pun intended) to the translation industry, and how can technology be a part of that?

The split. It’s a bit of an elephant in the room that the so-called bulk translation market is struggling. Kevin Hendzel wrote about this in very dramatic terms in a recent post[7]. There is definitely a large amount of content where clients are bound to decide, after a short cost-benefit analysis, that MT makes the most sense. Depending on the circumstances it may be generic MT or the more expensive specialized flavor, but it will definitely not be human translators. Remember, even the WannaCry hackers made that choice for 25 languages.

But there is, and will always be, a massive and expanding market for high-quality human translation. Even from a purely technological angle, it’s easy to see why MT systems don’t translate from scratch. They extrapolate from existing human translations, and those need to come from somewhere.

My bad feeling. I am concerned that the recent quality assessment tools make the mistake of addressing the fading bulk market. If that’s the case, the mistake is obvious: no investment will yield a return if the underlying market disappears.
Source: TAUS Quality Dashboard [link]

Why do I think that is the case? Because the market that will remain is the high-quality, high-value market, and I don’t see how the sort of charts shown in the image above will make anyone a better translator.

Let’s return to the problems with my own rough translation. There are the trivial errors of grammar, spelling and the like. Those are basically all caught by a good automatic QA checker, and if I want to avoid them, my best bet is a German writing course and a bit of thoroughness. That would take me to an acceptable bulk translator level.

As for the more subtle issues – well, there is only one proven way to improve there. That way involves translating thousands of words every week, for 5 to 10 years on end, and having intense human-to-human discussions about those translations. With that kind of close reading and collaboration, progress doesn’t come down to picking error types from a pre-defined list.

Feedback loops. Reviewer-to-translator feedback would be the equivalent of code reviews in software development, and frankly, that is only part of the picture. That process takes you closer to software that is beautifully crafted on the inside, but it doesn’t take you closer to software that solves the right problems in the right way for its end users. To achieve that, you need user studies, frequent releases and a stable process that channels user feedback into product design and development.

Imagine a scenario where a translation’s end users can send feedback, which is delivered directly to the person who created that translation. I’ll let you in on one more secret: this is already happening. For instance, companies that localize MMO (massively multiplayer online) games receive such feedback in the form of bug reports. They assign those straight to translators, who react to them in a real-time collaborative translation environment like memoQ server. Changes are rolled out on a daily basis, creating a really tight and truly agile feedback loop.

Technology that empowers and facilitates. For me, the scenario I just described is also about empowering people. If, as a translator, you receive direct feedback from a real human, say a gamer who is your translation’s recipient, you can see the purpose of your work and feel ownership. It’s the agile equivalent of naming the translator of a work of literature.

If we put metrics before competence, I see a world where the average competence of translators stagnates. Instead of an upward quality trend throughout the ecosystem, all you have is a fluctuation, where freelancers are data points that show up on this client’s quality dashboard today, and a different client’s tomorrow, moving in endless circles.

I disagree with Kevin Hendzel on one point: technology definitely is an important factor that will continue to shape the industry. But it can only contribute to the high-value segment if it sees its role in empowerment, in connecting people (from translators to end users), in facilitating communication, and in establishing tight and actionable feedback loops. The only measure of translation quality that everyone agrees on, after all, is fitness for purpose.


[1] Attribution of the WannaCry ransomware to Chinese speakers. Jon Condra, John Costello, Sherman Chu
[2] Fewer, for the pedants.
[3] Linguistic Analysis of WannaCry Ransomware Messages Suggests Chinese-Speaking Authors. Victor Mair
[4] DQF: Quality benchmark for our industry. TAUS
[5] Translation Quality Tools Heat Up: Three New Entrants Hope to Disrupt the Industry. Arle Lommel, Common Sense Advisory blog.
[6] Incentive Pay Considered Harmful. Joel On Software, April 3, 2000
[7] Creative Destruction Engulfs the Translation Industry: Move Upmarket Now or Risk Becoming Obsolete. Kevin Hendzel, Word Prisms blog.

Gábor Ugray is co-founder of Kilgray, creators of the memoQ collaborative translation environment and TMS. He is now Kilgray’s Head of Innovation, and when he’s not busy building MVPs, he blogs and tweets as @twilliability.

Saturday, June 3, 2017

Creating a Unified Experience - Meet the Content Fabric!

Recently I have had reason to look closely at how content flows, user experience and the e-Commerce market are related, and I noticed some things that I thought were worth pointing out and highlighting in a blog post. These connections were triggered by a post I saw on Content Fabric by Anna Schlegel.

The demands of the digital landscape and the 21st-century digital audience are challenging and, some might say, unforgiving for a modern global enterprise. Generally, no internet user is searching the web hoping to find a corporate advertisement and hear about how great your products are in standard corporate- and marketing-speak product overviews. Most of the time, the random user on the web is not searching for you or for your company. More than likely, they are searching for an answer to a question. If you can provide a useful answer, they may spend more time and look more closely at your website, social presence, and other content. If you can help them understand, and educate them on the general subject domain, not just your product-related subject matter, they may even begin to trust you and your communications; and if you can provide a good customer experience after they buy your product, they may even advocate using your products.

"Retail guys are going to go out of business and ecommerce will become the place everyone buys. You are not going to have a choice," he says. "We're still pre-death of retail, and we're already seeing a huge wave of growth. The best in class are going to get better and better. We view this as a long-term opportunity."
- Marc Andreessen, 2013

The evidence from retail store closings supports the statement above. Companies that provide mediocre retail experiences are certainly endangered. Source: CB Insights

According to research from Gartner, more than 90% of organizations don't have a formal content strategy in place to ensure the content they produce is easy to find and access, and consistent across different customer touch points. As a result, the customer journey is riddled with inconsistent experiences and is often frustrating or confusing. Content Marketing Institute reported that 76 percent of B2C organizations use content marketing, but only 38 percent said they do so effectively. And 57 percent of B2C marketers aren't sure what successful content marketing even looks like. 

For an e-commerce site, having the right content is a critical necessity, since the website is the store. Good content marketing strategies have three clear benefits: 1) they help generate leads, 2) they help educate the customer, and 3) they help build relationships. With helpful, quality content, you can enhance the customer experience, and this enhanced experience is essential for building strong relationships.

There is a relationship between content strategy, e-commerce, and MT: much of this new content that enhances the customer experience is constantly changing, and there is great value in making it multilingual to enable engagement with a broader global customer base. The e-commerce revolution goes much deeper than Amazon, eBay, and Alibaba. Even the retail dominance of giants like Procter & Gamble is being challenged by e-commerce startups, as the chart from CB Insights shows.

This is a long preamble on why content matters and why it must flow, and an introduction to a guest post by Anna Schlegel (Sr. Director, Globalization & Information Engineering, NetApp), who I think produces great content on best practices in content strategy 😊. Those who think that this requirement is only a B2C issue should take a look at the latest "Internet Trends" presentation by Mary Meeker of Kleiner Perkins, who points out that B2B customers also increasingly expect the same kind of buyer journey, packed with the right kind of information when needed to educate and enhance the customer experience. What is the digital "buyer journey"? The online self-service movement from product discovery to purchase decision to product delivery to after-sales service. Anna is evidence of this already being a requirement in the B2B world.


The interest in customer experience presents an opportunity for enterprise content strategists. Ultimately, the challenge is in execution – once you raise awareness of the importance of content synchronization, you are expected to deliver on your promises. You must figure out how to deliver information that fits smoothly into the entire customer experience.

 You need a customer experience that does not reproduce your organization’s internal structure. Customers need relevant, usable, and timely information – they don’t care that the video was developed by tech support and the how-to document by tech pubs. When customers search your website, they want all relevant results, not just documents from a specific department. Furthermore, they assume you will use consistent terminology and will provide content in their preferred language. To meet these expectations, you need a unified content strategy.

 At NetApp, the Information Engineering team uses the term Content Fabric to describe this approach. In the Content Fabric, customers get seamless delivery of the right information based on their needs. Multiple departments are involved in creating and delivering content. The processes are complex, but customers only see the end result. The Content Fabric aims to deliver information for each customer at the point of need.

Weaving A Content Fabric 

To deliver a content fabric, you need the following key features:
  • Search across all content sources
  • Content in appropriate format(s)
  • Content in appropriate languages
Each of these requirements expands into significant implementation challenges. To provide search across all content sources, for example, you have to solve the following issues:
  • Provide search across more than a single deliverable (such as a PDF file)
  • Provide search across all deliverables, from one department for one product, from all sources for one product, and from all sources for all products
  • Align product classification schemes, terminology, and content localization across the organization
 Several teams typically share responsibility for content development and delivery. Each group has a different perspective on information, a different tool set, and a different set of expectations from their content creators. But somehow, you have to ensure that their content works in the big picture.

Unifying Content Organizations Is Important

Delivering a seamless Content Fabric means that different organizations must deliver compatible information. There are two main options to achieve this goal:
  • Consolidate the content creators in a single organization
  • Ensure that diverse content creators use the same content standards
Consolidation makes sense for similar roles. For example, most organizations put the entire technical communication function in a single team. Technical support and marketing have important content to contribute, but their functions and priorities differ from those of tech comm. In this case, the sensible approach is to share content infrastructure, including the following:
  • Terminology and Style Guides. All content creators must use agreed-upon terminology to refer to the same thing. Everyone uses the same corporate style guide.
  • Taxonomy. The classification system for content is shared across the organization. For example, the organization defines a set of categories, such as product name, release number, and content type, that labels products and information.
  • Translation. Unified delivery extends across all supported languages.
  • Content Structure. A reference document always has the same structure, no matter who created it. Similarly, knowledge base articles always have the same structure across the organization.
  • Content Formatting. All company content looks related, and all content of a particular type matches.
  • Search. All website content is searchable through a single search interface.
  • Connected Content Development Systems. Move all content creators into a single content development environment, or at a minimum, loosely connect multiple systems to produce a consistent end result.
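The shared taxonomy in the list above amounts to a common set of labels that every content source applies, so that a single search index can group deliverables from different teams. A minimal sketch of such a record (the field names here are hypothetical illustrations, not NetApp's actual schema):

```python
from dataclasses import dataclass

# Hypothetical shared taxonomy labels; a real organization would define
# controlled vocabularies for each field in its own taxonomy.
@dataclass(frozen=True)
class ContentLabel:
    product_name: str    # product from the shared product list
    release_number: str  # release the content applies to
    content_type: str    # e.g. "reference", "how-to", "kb-article"
    language: str        # language of this deliverable

# The same labels applied by different teams let one search interface
# group a tech-pubs manual and a support KB article for one product.
manual = ContentLabel("WidgetStore", "9.2", "reference", "en")
kb_article = ContentLabel("WidgetStore", "9.2", "kb-article", "de")
print(manual.product_name == kb_article.product_name)  # True
```

The point is not the data structure itself but the agreement: once every department tags content with the same fields and values, unified search and unified translation workflows become possible.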


The Value of the Content Fabric

Why should organizations consider a Content Fabric like NetApp’s? After all, it’s challenging to have consistency in a single department, let alone half a dozen groups across a far-flung organization.
The value of the Content Fabric is twofold. First, you improve the customer experience. Instead of repeatedly transferring customers from one group to another, you provide the customer with a consistent, high-quality experience, all in a single location. Second, you improve the overall content development process, with less content redundancy and a single set of content development standards. In manufacturing terms, you are reducing waste and improving your quality control.

First Steps Toward Your Own Content Fabric

To begin the move toward your own Content Fabric, start with some basic questions:
  • What information do you need to deliver, and where?
  • How is that information created, and by whom?
  • What standards are needed?
Once you understand the current environment, you can look at improving two major areas:
  • Content Development. Ensure that all content developers have the tools, technologies, and support they need to produce the right information.
  • Content Delivery. Ensure that information is presented to the customer in a unified interface.
With the Content Fabric, your customers will have a seamless experience, no matter how they access information. Start weaving your content together today!

Anna Schlegel, Sr. Director, Globalization and Information Engineering, NetApp