Thursday, May 18, 2017

Terminology Usage in Machine Translation

This is a guest post by Christian Eisold from berns language consulting. For anyone who has been using MT seriously, it is clear that implementing correct terminology is a high-value effort, since terminology is problematic even in a purely human translation effort. In fact, I believe there is evidence suggesting that MT (when properly done) produces output with much more consistent terminology. In most expert discussions on tuning MT systems for superior performance, there is a very clear understanding that better output starts with work focused on ensuring terminological consistency and accuracy. How to enforce terminological consistency and accuracy has long been known to RBMT practitioners and is also well understood by expert SMT practitioners. Christian provides an overview below of how this is done across different use scenarios and vendors.

He also points to some of the challenges of implementing correct terminology in NMT models, where the system controls are just beginning to be understood. Patents are a domain with a great need for terminology work, and also one with a huge vocabulary, which is supposedly a weakness of NMT. However, given that the WIPO is using NMT for their Japanese and Chinese patent translations, and seeing the many options available in the SYSTRAN PNMT framework, I think we may have already reached a point where this is less of an issue when you work with experts. NMT has a significant amount of research driving ongoing improvements, so we should expect it to continue improving over the next few years.

As is the case with all the new data-driven approaches, we should understand that investments in raising data quality, i.e. building terminological consistency and rich, full-featured termbases, will have a long-term productivity yield and allow long-term leverage. Tools and processes that allow this to happen are often more important in terms of the impact they might have than whether you use RBMT, SMT or NMT.

noun: terminology; plural noun: terminologies
  1. the body of terms used with a particular technical application in a subject of study, theory, profession, etc.

    "the terminology of semiotics"

    synonyms: phraseology, terms, expressions, words, language, lexicon, parlance, vocabulary, wording, nomenclature
    informal: lingo, -speak, -ese

    "medical terminology"


Currently, the world of machine translation is being reshaped by deep learning techniques, which optimize neural networks over the sentence pairs in a training set for a given language pair. Neural MT (NMT) has made its way from a few publications in 2014 to the current state, where several professional MT services use NMT in increasingly sophisticated ways. The method, regarded as a revelation in the field of MT by developers and users alike, holds many advantages over classical statistical MT. But besides improvements like increased fluency, which is obviously a huge step in the direction of human-like translation output, it is well known that NMT has its disadvantages when it comes to the usage of terminology. The efforts made to overcome these obstacles show the crucial role that terminology plays in MT customization. At berns language consulting (blc) we focus on terminology management as a basis for language quality optimization processes on a daily basis. We sense a growing interest in MT integration across various fields of business, so we took a look at current MT systems and the state of terminology integration in these systems.

Domain adaptation and terminology

Terminology usage is closely tied to the field of domain adaptation, which is a central concept of engine customization in MT. Since a domain is a special subject area identified by a distinctive set of concepts and the terms and namings that express them, terminology is the key to adaptation techniques in the existing MT paradigms. The effort that has to be undertaken to use terminology successfully differs greatly from paradigm to paradigm. Terminology integration can take place at different steps of the MT process and is applied at different stages of MT training.

Terminology in Rule-Based MT

In rule-based MT (RBMT), specific terminology is handled within separate dictionaries. In applications for private end users, the user can decide to use one or several dictionaries at once. If just one dictionary is used, the term translation depends on the number of target candidates in the respective entry. For example, the target candidates for the German word ‘Fehler’ could be ‘mistake’ or ‘error’ when a general dictionary is used. If the project switches to a specialized software dictionary, the translation for ‘Fehler’ would most likely be ‘bug’ (or ‘error’, if the user wants that as an additional translation). If more than one dictionary is used, they can be ranked, so that when the same source entry is present in several of them, the translation comes from the higher-ranked dictionary. Domains can be covered by a specific set of terms gathered from other available dictionaries and by adding one's own entries to existing dictionaries. For these entries to function properly with the linguistic rules, it is necessary to choose from a variety of morpho-grammatical features that define the word. While most of the features for nouns can be chosen with a little knowledge of case morphology and semantic distinctions, verb entries can be hard to get right with little knowledge of verbal semantics.

While RBMT can be sufficient for translating in broad domains like medicine, its disadvantage lies in its inability to adapt appropriately to new or fast-changing domains. When it comes to highly sophisticated systems, tuning the system to a new domain requires trained users and an enormous amount of time. Methods that facilitate fast domain adaptation in RBMT are terminology extraction and the import of existing termbases. While these procedures help with the fast integration of new terminology, for the system to produce sufficient output quality it is necessary to check new entries for correctness.

Terminology in SMT

Basically, there are two ways of integrating terminology in SMT: 1) prior to translation and 2) at runtime. Integration before translation is the preferred way and can again be done in two ways: 1) implicitly, by using terminology in the writing process of the texts to be used for training, and 2) explicitly, by adding terms to the training texts after the editing process.

Implicit integration of terminology is the standard case: it is simply a more or less controlled byproduct of the editing process before texts are qualified for MT usage. Because SMT systems rely on distributions of source words to target words, the source and target words need to be used in a consistent manner. In translation projects with multiple authors, the degree of term consistency depends strongly on the use of tools that can check terminology, namely authoring tools linked to a termbase. Because the editing process normally follows the train of thought in non-technical text types, there is no control over the word forms used in a text. To take full advantage of the terminology at hand, the editing process would need a constant analysis of word-form distribution as a basis for completing the inflectional paradigm of every term in the text. Clearly, such a process would break the natural editing workflow. To achieve compliance with a so-called controlled language in the source texts, authors in highly technical domains can rely on authoring tools together with termbases and an authoring memory. Furthermore, contexts would have to be constructed to embed missing word forms of the terms in use. In order to fill up missing forms, it's necessary to analyze all the word forms used in the text, which can be done by applying lemmatizers to the text. In the translation step, target terms can be integrated with the help of CAT tools, which is the standard way of doing translations nowadays.
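As a rough illustration of such a word-form analysis, here is a minimal sketch that counts which inflected forms of each termbase lemma actually occur in a text. The lemma table and terms are toy stand-ins for a real lemmatizer and termbase, chosen only for illustration:

```python
from collections import defaultdict

# Toy lemma table standing in for a real lemmatizer (illustrative data only).
LEMMAS = {
    "errors": "error", "error": "error",
    "bugs": "bug", "bug": "bug",
    "mistakes": "mistake", "mistake": "mistake",
}

def term_form_distribution(text, termbase_lemmas):
    """Count which inflected forms of each termbase lemma occur in a text."""
    counts = defaultdict(lambda: defaultdict(int))
    for token in text.lower().split():
        form = token.strip(".,;")
        lemma = LEMMAS.get(form)
        if lemma in termbase_lemmas:
            counts[lemma][form] += 1
    return {lemma: dict(forms) for lemma, forms in counts.items()}

dist = term_form_distribution(
    "The bug caused two errors. Fixing bugs avoids mistakes.",
    {"bug", "error"},
)
# dist -> {"bug": {"bug": 1, "bugs": 1}, "error": {"errors": 1}}
```

Gaps in such a distribution (paradigm cells with zero counts) are exactly the missing forms for which contexts would have to be constructed.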

Explicit integration, on the other hand, is done by transferring termbase entries into the aligned training texts. This method simply consists of appending terms at the end of the respective file for each language involved in the engine training. Duplication of entries leads to higher probabilities for a given pair of terms, but to be sure about the effects of adding terms to the corpus, one would have to analyze the term distributions already in the corpus, as they will interfere with the newly added terms. The selection of a term translation also depends on the probability distributions in the language model: if the term in question never occurred in the language model's training data, it will get a very low probability. Another problem with this approach concerns inflection. As termbase entries are nominative singular forms, just transferring them to the corpus may only be sufficient for translation into target languages that are not highly inflected. To cover more than the nominative case, word forms have to be added to the corpus until all cells of their inflectional paradigms are filled. Because translation hypotheses (candidate sentences) are based on counts of words in context, for the system to know which word form to choose for a given context, terms must be added in as many contexts as possible. This is actually done by some MT providers to customize engines for a closed domain.
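The appending step itself is mechanically simple. A minimal sketch, assuming aligned one-sentence-per-line training files; the file names, the German-English term pair and the `copies` duplication factor are all illustrative assumptions, not any vendor's actual procedure:

```python
import os
import tempfile

def append_terms(src_path, tgt_path, term_pairs, copies=3):
    # Append each termbase pair `copies` times to the aligned training files;
    # duplicating a pair raises its counts, and hence its probability,
    # in the phrase tables extracted during training.
    with open(src_path, "a", encoding="utf-8") as src, \
         open(tgt_path, "a", encoding="utf-8") as tgt:
        for source_term, target_term in term_pairs:
            for _ in range(copies):
                src.write(source_term + "\n")
                tgt.write(target_term + "\n")

# Demo on throwaway files (hypothetical German->English training data):
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "train.de")
tgt_file = os.path.join(workdir, "train.en")
append_terms(src_file, tgt_file, [("Fehler", "bug")], copies=5)
```

As the text notes, the right duplication factor depends on how often competing translations already occur in the corpus, so in practice this would follow a term-distribution analysis rather than a fixed constant.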

Whereas terminology integration in the way just described becomes part of the statistical model of the resulting engine, runtime integration of terminology is done by using existing engines together with different termbases. This approach allows for a quick change of target domains without a complete retraining of the engine. In the Moses system, which in most cases will be the system in mind when SMT is discussed, the runtime approach can be based on the option to read XML-annotated source texts. In order to force term translations based on a termbase, one first has to identify inflected word forms in the source texts that correspond to termbase entries. Preferably this step is backed by a lemmatizer, which derives base forms from inflected forms. Again, it has to be kept in mind that when translating into highly inflected languages, additional morphological processing has to be done.
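A sketch of such source-side annotation, in the spirit of the XML markup Moses can consume via its -xml-input option. The `<term>` tag name, the one-entry termbase and the toy lemmatizer are assumptions for illustration; a real pipeline would also need to select a correctly inflected target form rather than the termbase's base form:

```python
def annotate_terms(sentence, termbase, lemmatize):
    # Wrap each source token whose lemma has a termbase entry in XML markup,
    # forcing the termbase translation for that span at decoding time.
    out = []
    for token in sentence.split():
        target = termbase.get(lemmatize(token))
        if target is not None:
            out.append('<term translation="%s">%s</term>' % (target, token))
        else:
            out.append(token)
    return " ".join(out)

# Toy stand-ins for a real lemmatizer and termbase (illustrative only):
lemmas = {"Fehlers": "Fehler", "Fehler": "Fehler"}
annotated = annotate_terms(
    "Wegen des Fehlers stürzt das Programm ab",
    {"Fehler": "bug"},
    lambda w: lemmas.get(w, w),
)
# 'Fehlers' is recognized via its lemma and wrapped with translation="bug"
```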

There are more aspects to domain adaptation in SMT than I discussed here, for example, the combination of training texts with different language models in the training step or the merging of phrase tables of different engines etc. Methods differ from vendor to vendor and depend strongly on the architecture that has proven to be the best for the respective MT service.

Terminology Resources

Terminology doesn’t come from nothing – it grows, lives and changes with time and with the domain it applies to. So, when we speak of domain adaptation there is always the adaptation of terminology. There is no domain without a specific terminology, because domains are, in fact, terminological collections (concepts linked to specific namings within a domain) embedded in a distinctive writing style. To meet the requirements of constantly growing and changing namings in everyday domains (think of medicine or politics), MT engines have to be able to use resources containing up-to-date terminology. While some MT providers view the task of resource gathering as the customer’s contribution to the process of engine customization, others provide a pool of terminology the customer can choose from. Still others do extensive work to ensure the optimal integration of terminology in the training data by checking it for term consistency and completeness. One step in the direction of real-time terminology integration was made by the TaaS project, which came into being as a collaboration of Tilde, Kilgray, the Cologne University of Applied Sciences, the University of Sheffield and TAUS. The cloud-based service offers a few handy tools, e.g. for term extraction and bilingual term alignment, which MT services can use to dynamically integrate inflected word forms for terms unknown to the engine in use. To do so, TaaS is linked to online terminology repositories like IATE, the EuroTermBank and TAUS. At berns language consulting, we spoke with leading providers of MT services, and most of them agree that up-to-date terminology integration via online services will become more and more important in flexible MT translation workflows.

Quality improvements – what can MT users do?

Summing up term integration methods: what can we do to improve terminology usage and translation quality with MT systems? Apart from the technical aspects, which are best left to the respective tool vendors, the MT user can do a lot to optimize MT quality. Terminology management is the first building block in this process. Term extraction methods can help to build a critical mass of terms from scratch when no databases exist yet. As soon as a termbase is created, it should be used together with authoring tools and CAT tools to produce high-quality texts with consistent terminology. Existing texts that do not meet the requirements of consistent terminology should be checked for term variants as a basis for normalization, which aims at using one and the same term for one concept throughout a text. Another way to improve term translations is to constantly feed post-edits of MT translations back into the MT system, which could happen collaboratively through a number of reviewers or by using a CAT tool together with MT plug-ins. The process of terminology optimization in translation workflows depends strongly on the systems and interfaces in use, so solutions may differ from customer to customer. As systems constantly improve – especially in the field of NMT – we’re looking forward to all the changes and opportunities this brings to translation environments and to translation itself.
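The variant check that precedes normalization can be sketched very simply: group surface forms that collapse to the same key and flag any group with more than one spelling. The normalization rule here (lowercasing, dropping hyphens) is a deliberately simplistic stand-in for whatever rules fit the domain:

```python
from collections import defaultdict

def find_term_variants(tokens):
    """Group surface variants that normalize to the same key, flagging
    candidates for normalization (one term per concept)."""
    groups = defaultdict(set)
    for tok in tokens:
        key = tok.lower().replace("-", "")
        groups[key].add(tok)
    # Only keys with competing spellings need a normalization decision.
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}

variants = find_term_variants(
    ["termbase", "term-base", "Termbase", "engine", "engine"]
)
# variants -> {"termbase": ["Termbase", "term-base", "termbase"]}
```

Each flagged group then becomes a termbase decision: pick one preferred form and mark the rest as deprecated variants.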


Christian Eisold has a background in computational linguistics and has been a consultant at berns language consulting since 2016.

At blc he supports customers with translation processes, which involves terminology management, language quality assurance and MT evaluation.

Monday, May 8, 2017

Artificial Intelligence in the Language Industry: We’re Asking the Wrong Questions

This is an interesting guest post by Gábor Ugray on the potential of AI in the translation business. We hear something about artificial intelligence almost every day now and are continually told that it will change our lives. AI is indeed helping to solve complex problems that even a year ago were virtually unthinkable. Mostly, these are problems where big data and massive computing can come together to produce new kinds of efficiencies and even production solutions. However, there are dangers and risks too, and it is wise to be aware of some of the basic driving forces that underlie these problems. As we have seen with self-driving cars, sometimes things don't quite work as you would expect. These mishaps and unintended results can happen when we barely understand what and how the computer "understands". Machine learning is not perfect learning, and much of what is learned through deep neural nets, in particular, is kind of mysterious, to put it nicely.

We have seen that many in the translation industry have more often misused MT, or used it abusively to bully translators into accepting lower rates and demeaning work, than used it where it actually makes sense. We are just beginning to enter a stage of more informed and appropriate use of MT; in the very recent past, however, many translators were bloodied. Is AI the new monster we will use to terrorize the translator, or is it a potential work assistant that actually enhances and improves the translation work process? This will depend on us and what we do, and it is good to see Gábor's perspective on this, as he is one of the architects of how this might unfold.

Gábor warns us about some key issues related to AI and points us towards asking the right questions to guide enduring and positive change and deployment. We should understand the following:
  • AI is almost completely dependent on training data and we know data is often suspect.
  • Improperly used, there is a risk of inadvertent or deliberate dehumanization of work as in early PEMT use.
  • Neural networks are closed systems. The computer learns something from a data set in a way that is intelligent but incomprehensible and obscure to the human eye and mind. But Google claims to be able to visualize the produced data, as described in the zero-shot translation post, where they say:
Within a single group, we see a sentence with the same meaning but from three different languages. This means the network must be encoding something about the semantics of the sentence rather than simply memorizing phrase-to-phrase translations. We interpret this as a sign of existence of an interlingua in the network. 
Is this artificial intelligence, or is this just another over-the-top claim of "magical" scientific success? If we cannot yet define intelligence for humans, how can we even begin to do so for machines? AI is, more often than not, little more than an optimized data-driven task system, which can be very impressive, but can we really say this is intelligence? A few are quite wary about this whole AI trend. Here is some discussion on shit really going down, driven by AI which has gone awry.

So hopefully here is a question that makes sense to Gábor: "What needs to happen to make AI-based technology trustworthy and useful in the 'language industry'?"

I do basically believe that technology wisely used can indeed improve the human condition but we are surrounded by examples of how things can go wrong without some forethought and these questions that Gábor points to indeed are worth asking. For those who want to dig deep into the big picture on AI, I recommend this article, though I have some reservations about the second part.

As the BBC said recently: Machines still have a long way to go before they learn like humans do – and that’s a potential danger to privacy, safety, and more.


I was honored when Kirti asked me if I would contribute to eMpTy Pages about TMS and intelligent data technologies. I’ve been thinking about this for nearly two months now until I finally realized what’s been holding me back. I find it difficult to attach to most of the ongoing discourse about AI, and that’s because I believe the wrong questions are being asked.

Those questions usually revolve around: What part of life can I disrupt through AI? How can my business benefit from AI? Or, if you prefer the fear angle: Will my company be disrupted out of existence if I don’t jump on the AI train in time? Will my job be made obsolete by thinking machines?

My concern is different. But I won’t tell you until the end of this post.

It’s only as good as your data

I found Kirti’s remark in his recent intro very insightful: “Machine learning” is a fancy way of saying “finding patterns in data.”

That resonates with the way I think about MT, whether it’s the statistical or neural flavor. In simple terms, MT is a tool to extrapolate from an existing corpus to get leverage for new content. If you think about it, that’s what translation memory does, too, but it stops at fuzzy matches, concordance searches, and some amount of sub-segment leverage.

Statistical MT goes far beyond that, but at a higher cost: it needs more data and more computation. Neural MT ups the ante yet again: it needs another order of magnitude more computational power and data. The concept has been around for decades; the “deep learning” aka “neural network” explosion of the past few years has one simple reason. It took until now for both the data and the computational capacity to become available and affordable.

The key point is, AI is machine-supported pattern extraction from large bodies of data, and that data has to come from somewhere. Language data comes in the form of human-authored and human-translated content. No MT system learns a language. They process text to extract patterns that were put in there by humans.

And data, when you meet it out in the wild, is always dirty. It’s inconsistent, in the wrong format, polluted with stuff you don’t want in there. Just think of text from a pair of aligned PDFs, with the page numbers interrupting in all the wrong places, OCR errors, extra line breaks, bona fide typos and the rest.

So, even on this elementary level, your system is only as good as your data, not to mention the quality of the translation itself. And this is not specific to the translation industry: the job reality of every data scientist is 95% gathering, cleaning, pruning, formatting and otherwise massaging data before the work can even begin.

Do we have the scale?

AI, MT and machine learning are often used synonymously with automation, but in reality, they are far from that. As Kirti explained in another intro, in order to get results with MT you need technical expertise, structure, and processes beyond technology per se. All of these involve human effort and skills, and pretty expensive skills too.

So the question is: at what point does an LSP or an enterprise get a positive return on such an investment? How much content must first be produced by humans; what is the cost of training the MT system; what is the benefit per million words (financial, time or otherwise)? How many million words must be processed before you’re in the black?

No matter how I look at it, this is an expensive, high-value service. It doesn't scale without humans the way software does.

Does the translation industry have the same economy of scale that a global retailer or a global ad broker disguised as a search engine does? Clearly, a number of technology providers, Kilgray among them, are thriving in this market. But I also think it’s delusional to expect the kind of hockey-stick disruption that is the stuff Silicon Valley startup dreams are made of.

Let’s talk about the weather

I have been focusing mostly on MT, but that’s misleading. I do think there are many other ways machine learning will contribute to how we work in the translation industry. Most of these ways are as-yet uncharted, which I think is a consequence of the industry’s market constraints.

I’ll zoom out from our industry now. I checked how many results Google finds if I search for a few similar phrases.

-- AI in weather forecasting: 1.29M
-- AI in language processing: 14.2M
-- AI in police: 69.7M

Of the three, I’d say without hesitation that weather forecasting yields itself best to advanced AI. Huge amounts of data: check. Clear feedback on success: check. Much room for improvement: check. And yet, going by what’s written on the Internets, that’s not what society thinks.

There is a near-universal view that technology is somehow neutral and objective, which I think is blatantly false. Technology is the product of a social and economic context, and it is hugely influenced by society’s shared beliefs, mythologies, fears, and desires.

Choose your mythology wisely

I am an odd one: in addition to AdBlock and Privacy Badger, my browser deletes all history when I close it, which is multiple times a day. At first, I just noticed the cookie messages that kept returning. Then I started getting warning emails every time I logged in to Twitter or Google. Finally, my password manager screwed me completely, requesting renewed email verification every time I launched the browser.

These are all well-meaning security measures, with sophisticated fraud detection algorithms in the back. But they work on the assumption that you leave a rich data trail. It is by cutting that trail that you realize how pervasive the big data net already is around you in your digital life. For a different angle on the same issue, read Quinn Norton’s poignant Love in the Time of Cryptography.
Others have written about the way machine learning perpetuates biases that are encoded, often in subtle ways, in their training datasets. In a world where AI in police outscores the weather and language, that’s a scary prospect.

With all of this I mean to say one thing. Machine learning, data mining, AI – whatever you want to call it, in conjunction with today’s abundance of raw digital data, this technology has the potential to be dehumanizing in an unprecedented way. I’m not talking conveyor-belt slavery or machines-will-take-my-job anxiety. This is more subtle, but also more far-reaching and insidious.

And we, as engineers and innovators, have an outsized influence on how today’s nascent data-driven technologies will impact the world. The choice of mythology is ours.

UX is the lowest-hanging fruit

After this talk about machine learning and big data on a massive scale, let’s head back to planet Earth. To my view of a translator’s workstation, to be quite precise.

Compared to even a few years ago, there is a marvelous wealth of specialized information available online. There are pages of search results just for terminology sites. There are massive online translation memories to search. There are online dictionaries and very active discussion boards.
Without the need to name names, the user experience I get from 99% of these tools is between cringe-worthy and offending. (Linguee being one notable exception to this.)

Here is one reason why I have a hard time enthusing about cutting-edge AI solutions for the language industry. Almost everywhere you look, there are low-hanging fruits in terms of user experience, and you don’t need clusters of 10-kilowatt GPUs to pluck them. I think it’s misguided to go messianic until we get the simple things right.

Two corollaries here. One, I myself am guilty as charged. Kilgray software is no exception. We pride ourselves that our products are way better than the industry average, but they, too, still have a ways to go. Rest assured, we are working on it.

Two, user experience also happens in the context of market constraints. All of the dismal sites I just checked operate on one of two models: ad revenues, or no revenues. I have bad news for you. These models make you the product, not the customer. This is not specific to the translation industry. The world at large has yet to figure out a non-pathological way to monetize online content.

Value in human relationships

I’ve been talking to a lot of folks recently whose job is to make complex translation projects happen on time and in good quality. Now it may be that my sample is skewed, but I saw one clear pattern emerging in these conversations.

I wasn’t told about standardized workflows. I didn’t hear about machine learning to pick the best vendor from a global pool of X hundred thousand translators. I didn’t perceive the wish to shave another few percent off the price by enhanced TM leverage.

The focus, invariably, was human relationships. How do I build a long-term working relationship based on trust with my vendors? How do I do the same with my own clients? How do I formulate the value that I add as a professional, which is not churning through 10% more words per day, but enabling millions of additional revenue from a market that was hitherto not accessible?

Those are not yet the questions I’m asking about AI, but they are closing in on my point. In a narrow sense, I see technology as an enabler: a way to reduce the drudge so humans have more time left for the meaningful stuff that only humans can do.

Fewer clicks to get a project on track? Great. More relevant information at the fingertips of translators and project managers? Awesome. Less time wasted labeling and organizing data, finding the right resources, finding the right person to answer your questions? Absolutely. Finding the right problems to work on, where your effort has the greatest impact? Prima.

AI has its place in that toolset. But let’s not forget to get the basics right, like building software with empathy for the end user.

The right question

Whether or not AI will be part of our lives is not a question. Humans have a very elastic brain, and whatever invention you give us, we will figure out a use for it and even improve on it.

I argued that technology is not a Platonic thing of its own, but the product of a specific social and economic context. I also argued that if you instrumentalize big data and machine learning within the wrong mythology, it has a disturbing potential to dehumanize.

But these are not inescapable forces of nature. The mythology we write for AI is a matter of choice, and the responsibility lies with us, engineers and innovators.

The right question is: 
How do I use AI responsibly? 
Is empathy at the center of my own engineering work?

No touchy-feely idealism here; let’s talk enlightened self-interest.

As a technology provider, I can create products with the potential to dehumanize work and encroach on privacy. That may give me a short-term advantage in a race to the bottom, but it will not lead to a sustainable market for my company. Or I can create products that help my customers differentiate themselves through stronger relationships, less drudge, and added value to their clients. Because I’m convinced that these customers are the ones who will be successful in the long run, I am betting on building technology for them.

That means engaging with customers (then engaging some more) to learn what problems they face every day, instead of worrying about the AI train. If the solution involves AI, great. But more likely it’ll be something almost embarrassingly simple.


Gábor Ugray is co-founder of Kilgray, creators of the memoQ collaborative translation environment and TMS. He is now Kilgray’s Head of Innovation, and when he’s not busy building MVPs, he blogs at and tweets as @twilliability.

Thursday, May 4, 2017

"Specializing" Neural Machine Translation in SYSTRAN

We see that Neural MT continues to build momentum, and already most people agree that generic NMT engines outperform generic phrase-based SMT engines. In fact, in recent DFKI research analysis, generic NMT even outperforms many domain-tuned phrase-based SMT systems. Both Google and Microsoft now have an NMT foundation for many of their most actively used MT languages. However, in the professional use of MT, where the MT engines are very carefully tuned and modified for a specific business purpose, PB-SMT is still the preferred model for now. It has taken many years, but today many practitioners understand the SMT technology better, and some even know how to use the various control levers available to tune an MT engine for their needs. Most customized PB-SMT systems today build a series of models in addition to the basic translation-memory-derived translation model to address various aspects of the automated translation process. Thus, some may also add a language model to improve target language fluency, and a re-ordering model to handle word reordering issues. A source sentence can then pass through several of these sub-models, each addressing a different linguistic challenge, before a final target language translation is delivered. In May 2017, it may be fair to say most production MT systems in the enterprise context are either PB-SMT or RBMT systems.
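The combination of sub-models described above is conventionally a log-linear model: each sub-model contributes a log-score for a candidate translation, and tuned weights trade the models off against each other. A minimal sketch, with made-up feature scores and weights purely for illustration:

```python
import math

def loglinear_score(features, weights):
    """Log-linear scoring as used in phrase-based SMT decoders: each
    sub-model contributes a log-score, scaled by its tuned weight."""
    return sum(weights[name] * score for name, score in features.items())

# Hypothetical feature scores for one candidate translation:
candidate = {
    "translation_model": math.log(0.4),   # phrase-pair probabilities
    "language_model":    math.log(0.05),  # target-side fluency
    "reordering_model":  math.log(0.7),   # distortion / reordering cost
}
weights = {"translation_model": 1.0, "language_model": 0.8, "reordering_model": 0.5}
score = loglinear_score(candidate, weights)
# The decoder keeps the candidate with the highest combined score.
```

Tuning (e.g. with MERT in Moses) searches for the weights that maximize translation quality on a held-out set, which is part of why customization takes expertise and time.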

One of the primary criticisms of NMT today (in the professional translation world) is that NMT is very difficult to tune and adapt for the specific needs of business customers who want greater precision and accuracy in their MT system output. The difficulty, it is generally believed, is based on three things:
  1. The amount of time taken to tune and customize an NMT system,
  2. The mysteriousness of the hidden layers that make it difficult to determine what might fix specific kinds of error patterns,
  3. The sheer amount/cost of computing resources needed in addition to the actual wall time.

However, as I have described before, SYSTRAN is perhaps uniquely qualified to address this issue as they are the only MT technology developer that has deep expertise in all of the following approaches: RBMT, PB-SMT, Hybrid RBMT+SMT and now Neural MT. Deep expertise means they have developed production-ready systems using all these different methodologies. Thus, the team there may see clues that regular SMT and RBMT guys either don't see or don't understand. I recently sat down with their product management team to dig deeper into how they will enable customers to tune and adapt their Pure Neural MT systems, and they provided a clear outline of several ways by which they already address this problem today, and described several more sophisticated approaches that are in late-stage development and expected to be released later this year.

The critical requirement from the business use context for NMT customization is the ability to tune a system quickly and cost-effectively, even when large volumes of in-domain data are not available.

What is meant by Customization and Specialization?

It is useful to use the correct terminology when we discuss this, as confusion rarely leads to understanding and effective response or action. I use the term customization (in many places across various posts on this blog) as a broad term to point to the tuning process that is used to adjust an MT system for a specific and generally limited or focused business purpose, to what I think they call a "local optimum" in math. Remember in any machine learning process it is all about the data. I use the term customization to mean the general MT system optimization process which could be any of the following scenarios: Your Data + My Data, All My Data, Part of Your Data + My Data, and really, there is a very heavy SMT bias in my own use of the term customization.

SYSTRAN product management prefers the term "specialization" for any of several processes that perform different functions but all consist of adapting an NMT model to a specific translation context (domain, jargon, style). From their perspective, "specialization" reflects the way NMT models are trained better than "customization" does, because NMT models acquire new knowledge rather than being dedicated to production use in only one field. This may be a fine distinction to some, but it is significant to somebody who has been experimenting with thousands of NMT models over an extended period. The training process and its impact are quite different for NMT, and we should understand this as we go forward.

The Means of NMT Specialization - User Dictionaries

A user dictionary (UD) is a set of source language expressions associated with specific target language expressions, optionally with Part Of Speech (POS) information. This is sometimes called a glossary in other systems. User dictionaries are used to ensure a specific translation is used for specific parts of a source sentence within a given MT system. This feature allows end-users to integrate their own preferred terminology for critical terms. In SYSTRAN products, UDs have historically been available and used extensively in their Rule-Based systems. User Dictionary capabilities are also implemented in SYSTRAN SMT and Hybrid (SPE - Statistical Post-Editing) systems.

SYSTRAN provides four levels of User Dictionary capabilities in their products as described below:
  • Level 1: Basic Pattern Matching: The translation source input is simply matched against the exact source dictionary entries and the preferred translation target string is output.
  • Level 2: Level 1 + Homograph Disambiguation: Sometimes, a source word may have different part-of-speech (POS) associations. A Level 2 UD entry is selected according to its POS in the translation input, and then the right translation is provided. An example:
    • “lead” as a noun: Gold is heavier than lead > L'or est plus lourd que le plomb
    • “lead” as a verb: He will lead the meeting > Il dirigera la réunion
  • Level 3: Morphology: A single base form from a UD entry may match different inflected forms of the same term; the translation is then produced by properly inflecting the target form of the UD entry according to the matched POS and grammatical categories.
  • Level 4: Dependence: This allows a user to define rules to translate a matched entry depending on other conditionally matched entries.
As of April 2017, only the Level 1 User Dictionary capabilities are available for the PNMT systems, but the other levels are expected to be released during the year. We can see below that even the basic Level 1 UD functionality has an impact on the output quality of an NMT system.
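Conceptually, Level 1 UD support amounts to exact pattern matching on the source text: protect the matched span from the engine and force the preferred target into the output. A minimal sketch of that idea (the entries, placeholder scheme, and function names here are invented for illustration, and do not reflect SYSTRAN's actual implementation):

```python
import re

# Hypothetical Level 1 user dictionary: exact source expressions
# mapped to enforced target translations.
USER_DICT = {
    "purchase order": "bon de commande",
    "lead time": "délai de livraison",
}

def apply_user_dictionary(source, user_dict):
    """Replace each matched UD source entry with a placeholder so
    the MT engine treats it as a protected span, remembering the
    forced target translation for restoration after decoding."""
    protected = {}
    for i, (src, tgt) in enumerate(user_dict.items()):
        pattern = re.compile(r"\b" + re.escape(src) + r"\b", re.IGNORECASE)
        placeholder = f"__UD{i}__"
        if pattern.search(source):
            source = pattern.sub(placeholder, source)
            protected[placeholder] = tgt
    return source, protected

def restore_placeholders(translation, protected):
    """Swap the placeholders in the MT output for the forced targets."""
    for placeholder, tgt in protected.items():
        translation = translation.replace(placeholder, tgt)
    return translation

masked, protected = apply_user_dictionary(
    "Check the lead time on this purchase order.", USER_DICT)
# masked: "Check the __UD1__ on this __UD0__."
```

Levels 2 through 4 would additionally need POS tagging, morphological generation, and conditional rules on top of this basic matching, which is why they are harder to deliver for NMT.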

The Means of NMT Specialization - Domain Specialization & Hyper-Specialization

The ability to do “domain specialization” of an MT system is always subject to the availability of in-domain training corpora. But often, sufficient volumes of the right in-domain corpora are not available. In this situation, apparently, NMT has a big advantage over SMT. Indeed, it is possible with NMT systems to take advantage of a mix of both generic data and in-domain data. This is done by taking an already trained generic NMT engine and “overtraining” it on the sparse in-domain data that is available. (To some extent this is possible in SMT as well, but generally, much more data is required to overpower the existing statistical pattern dominance.)

According to the team: "In this way, it is possible to obtain a system that has already learned the general (and generic) structure of the language, and which is also tuned to produce content corresponding to the specific domain. This operation is possible because we can iterate infinitely on the training process of our neural networks. That means that specializing a generic neural system is just continuing a usual training process but on a different kind of training data that are representative of the domain. This training iteration is far shorter in time than a whole training process because it is made on a smaller amount of data. It typically takes only a few hours to specialize a generic model, where it may have taken several weeks to train the generic model itself at the first time."
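The mechanics here are essentially what is now commonly called fine-tuning: resume gradient descent from the generic model's weights using only the in-domain data, for far fewer iterations. A toy illustration with a one-parameter model (the data, learning rate, and epoch counts are invented; real NMT specialization applies the same idea to a full encoder-decoder network):

```python
def train(weight, data, epochs, lr=0.1):
    """Plain gradient descent on squared error for a one-parameter
    model y = weight * x; a stand-in for a full NMT training loop."""
    for _ in range(epochs):
        for x, y in data:
            pred = weight * x
            weight -= lr * 2 * (pred - y) * x
    return weight

# Generic training: lots of mixed-domain data, many epochs
# (weeks of training for a real NMT system).
generic_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # behaves like y = 2x
w_generic = train(0.0, generic_data, epochs=50)

# Specialization: resume from w_generic on sparse in-domain data,
# few epochs (hours for a real NMT system). This domain prefers y = 2.5x.
domain_data = [(2.0, 5.0)]
w_specialized = train(w_generic, domain_data, epochs=20)
```

The specialized weight ends up near the domain's preference while starting from, and needing far less work than, the generic training run, which mirrors the hours-versus-weeks contrast the SYSTRAN team describes.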

They also pointed out how this "over-training" process is especially good for narrow domains and what they call "hyper-specialization".  "The quality produced by the specialization really depends on the relevance of the in-domain data. When we have to translate content from a very restrictive domain, such as printer manuals for example, we speak of “hyper-specialization” because a very small amount of data is needed to well represent the domain and the specialized system is rather only good for this restrictive kind of content – but really good for it."

For those who want to understand how data volumes and training iterations might impact the quality produced by the specialization, SYSTRAN provided this research paper that describes some specific experiments in some detail and shows that using more in-domain training data produces better results.

The Means of NMT Specialization - Domain Control

This was an interesting feature and use-case that they presented to me to handle special kinds of linguistic problems. Their existing tuning capabilities can be extended to handle automated domain identification and special style and politeness-level issues in some languages. Domain Control is an approach that allows a single training run and model to handle multiple domains, with some examples given below.

There are languages where the translation of the same source may differ with the politeness context, or where a different style is required. The politeness level is especially important for Korean and Japanese (and Arabic too). This feature is one of many control levers that SYSTRAN provides customers who want to tune their systems to very specific use-case requirements.

"To address this, we train our model adding metadata about the politeness context of each training example. Then, at run-time, the translation can be “driven” to produce either formal or informal translation based on the context. Currently, this context is manually set by the user, but in the near future, we could imagine including a politeness classifier to automatically set this information."

They pointed out that this is also interesting because the same concept can be applied to domain adaptation. "Indeed, we took corpora labeled with different domains (Legal, IT, News, Travel, …) and fed them to a neural network training. This way, the model not only benefits from a large volume of training examples coming from several different domains, but it also learns to adapt the translation it generates to the given context.

Again, as of today, we only use this kind of model with the user manually setting the domain he/she wants to translate, but we’re already working on integrating automatic domain detection to give this information to the model at run-time. Thus a single model could handle multiple domains."

As far as domain control is concerned, training data tagging is a step taken during the training corpus preparation process. These tags/metadata are basically annotations. They denote many kinds of information such as politeness, style, domain, category, anything that might be suitable to drive the translation in a particular direction or another. Thus, this capability could also be used to handle the multiple domains and content types that you may find on an eCommerce site like Amazon or eBay.

The annotations may be produced either by human annotators or via automated systems using NLP processes. However, as this is done during training preparation, it is not something that SYSTRAN clients will typically do. Then, at run-time, the model needs the same kind of annotations for the source content to be translated, and again, this information may be provided either by a user action (e.g. the user selects a domain) or by an automated system (automatic domain detection).
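In practice, this style of domain control is often implemented by prepending special control tokens to each source segment, both at training-corpus preparation time and at run-time. A minimal sketch of the data preparation step (the tag vocabulary and function names are invented for illustration, not SYSTRAN's actual format):

```python
# Hypothetical control tokens; a real system defines its own vocabulary.
DOMAIN_TAGS = {"legal": "<dom:legal>", "it": "<dom:it>", "news": "<dom:news>"}
POLITENESS_TAGS = {"formal": "<pol:formal>", "informal": "<pol:informal>"}

def annotate(source, domain=None, politeness=None):
    """Prepend control tokens so the model learns to condition its
    output on them; the same function is applied at run-time, with
    the tags set by the user or by an automatic classifier."""
    tags = []
    if domain:
        tags.append(DOMAIN_TAGS[domain])
    if politeness:
        tags.append(POLITENESS_TAGS[politeness])
    return " ".join(tags + [source])

# Training corpus preparation: each labeled example gets its tags.
corpus = [("The parties agree to arbitration.", "legal", "formal")]
tagged = [annotate(src, dom, pol) for src, dom, pol in corpus]
# tagged[0]: "<dom:legal> <pol:formal> The parties agree to arbitration."
```

Because the tags are just tokens in the source sequence, a single model trained this way can serve several domains and politeness registers at once, which is the point of Domain Control.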
There is some further technical description on Domain Control in this paper.

The following graphic puts the available adaptation approaches in context and shows how they vary in terms of effort/time investment and in the impact the different strategies may have on output quality. Using multiple approaches at the same time adds further power and control to the overall adaptation scenario. The methods of tuning shown here are not incremental or sequential; they can all be used together as needed and as is possible. Those who have been following the discussion with Lilt on their "MT evaluation" study can now perhaps understand why an instant BLEU-snapshot evaluation is somewhat pointless and meaningless. Judging and ranking MT systems comparatively on BLEU scores, as Lilt has done, without using the tuning tools properly, is misleading and even quite rude. Clients who can adapt their MT systems to their specific needs and requirements will always do so, and will often use all the controls at their disposal. SYSTRAN makes several controls available in addition to the full (all my data + all your data training) customization described earlier. I hope that we will soon start hearing from clients and partners who have learned to operate these means of control and are willing to share their experience, so that our MT initiatives continue to gain momentum.

Tuesday, May 2, 2017

Creative Destruction Engulfs the Translation Industry: Move Upmarket Now or Risk Becoming Obsolete

This is a guest post by Kevin Hendzel whose previous post on The Translation Market was second only to the Post-editing Compensation post, in terms of long-term popularity and wide readership on this blog. This new post is reprinted with permission and is also available on Kevin's blog with more photos. I am always interested to hear different perspectives on issues that I look at regularly as I believe that is how learning happens.

As with many interesting posts previously published, I mistakenly thought this one started from some Twitter banter: I presumed that Kevin had seen this post, which sees MT driving translators (and the translation industry) into an Armageddon scenario. However, it turned out that Kevin wrote his post first, and the Armageddon post may have been a response to it. Steve presents a pretty grim outlook with not much hope in sight. Some of the comments on his post build on this gloomy and forbidding outlook and are worth a gander too. I myself felt that the author (Steve Vitek) of the Armageddon post seriously needed to lighten up, and maybe do some jumping jacks and sing "Hey Jude" (a song known to raise your spirits), but I truly do not mean to diminish his distress or mock him in any way. I too have tried in my own way to educate translators about how to deal with the misuse and abusive use of MT, understand post-editing "opportunities" and translation technology in general, because most translators I have met are really nice, worldly people.

Anyway, Kevin presents a much more pragmatic, and less submissive, response to the situation of MT taking rote, low-quality-focused "bulk translation" work away from translators, and suggests that translators "up their game" and gain deep subject matter expertise. Additionally, he provides some very specific examples to give fellow translators some hard clues on developing a better professional strategy than translating bulk content that MT can easily handle, or waiting for an LSP to call. In case you missed it on his blog, here it is in full, minus some of the photo images.


Heartfelt, Urgent Advice for Colleagues Stuck in the Downward Pricing Vortex of the Bulk Market

Imagine for a moment that you were in the business of manufacturing digital cameras. In fact, let’s take it a step further and say you invented digital photography. These cameras were a bit pricey at first, and the initial images were terrible, but their innovation was that they eliminated film and film processing; the pictures were immediately visible, could be stored, shared and posted almost anywhere, and the picture quality eventually became startlingly good.
Then along came the smartphone. Smartphones were unusually expensive five years ago, but the white-hot smartphone market is synonymous with brutal competition between innovative companies like Apple and Samsung, and pricing has fallen while features, longevity and physical resiliency have improved dramatically.

So today such innovation has resulted in a smartphone market where progress is rapid and picture resolution, pixel storage capacity and sharing capabilities have grown by leaps and bounds. Smartphone cameras today are more compact than digital cameras, are easier to use, faster in transporting and storing images, and sit in nearly everybody’s pocket.

So Why Would Anybody Buy a Digital Camera Today?

The answer is: Very, very few people do. Forbes called the collapse of the digital camera market one of the fastest and most startling devastations of a modern commercial market. The smartphone market has been on a tear not just to take market share, but to destroy the entire digital camera market, a process that began in 2010, eventually resulting in the bankruptcy of the company that invented digital photography: Kodak.

Steve Sasson, the Kodak engineer who invented digital photography in 1975, was told by Kodak executives to “keep it quiet,” because it endangered the sale of film, their principal revenue driver.

The “keep it quiet” strategy is a terrible defense against destructive innovation.

The pile of rubble and aged patents that constitute what is left of Kodak are a strong testimony to that reality.

Google Translate as Destructive Innovation

The analogy to the translation market is clear. Google Translate (GT), which has been on an inexorable climb toward better quality over the last fifteen years, contains billions of words in paired language strings in its enormous corpora, which of course are human-produced translations courtesy of our colleagues in such international organizations as the U.N., the E.U, the European Patent Office, and the Canadian Government, as well as multiple other organizations.

These are good, human-produced translations, and don’t even require statistical machine translation.

It’s beyond ironic that the collaboration translators are urged to undertake to produce higher quality – calls that are usually ignored – is exactly what GT is doing. It is leveraging the work of all your colleagues on a massive global scale, and giving it away for free, as a bundled product.

GT has become the modern translation equivalent of the smartphone. Translators in the bulk market are still trying to sell digital cameras, and are frantically watching prices continue to drop where “good enough quality” is sold. This is the same market where customers recognize that GT is often wrong, but it’s instant, free and “good enough quality.” Now is GT perfect? Of course not. But neither is a smartphone. The lighting is often off, or people feel that they are not flattered, so they take several pictures and pick the one they like best. Snapchat and other platforms provide filters to make people look good, or thinner, or to adjust the color of their face or even to turn them into various cute creatures. All for free.  

(KV - Actually, Google gets a meaningful amount of advertising revenue from the widespread use of their MT service, and from trends they uncover on international commerce activity by analyzing the big data generated by MT use.)

Shift in Expectations

People accept these imperfections in smartphones vs. what an exceptionally good high-end digital camera can produce – much as they accept translation imperfections with a shrug – because there has been a major shift in expectations.

Instant, free, convenient and “good enough” have changed what people expect, which is why GT famously translates more words every day than all human translators do in a year. Here’s an intriguing question: as a translator with a smartphone, would you spend several hundred dollars to buy a digital camera to take the same pictures you take today with your smartphone?

I think we can safely say that the answer to that question is “no.”

Yet that is what translators in the bulk “good enough to understand” market are asking their clients to do every day. Pay them to translate texts that GT may actually produce better (remember that GT is often simply leveraging the existing translations of your very skilled colleagues).

Clients Awake to the New Reality

And now we are witnessing clients in the bulk market – agencies, small businesses, even major corporations – waking up to this reality. Clients who “just need to know what the document says” are beginning to push back even on the idea that a human needs to take the lead. They often question what a translator produces if they’ve seen a different translation on GT. They demand low, single-digit rates that almost require the use of GT, which turns translators into unwilling post-editors.

These clients’ view is reflected in a famous quote by the photographer Ken Rockwell: “The best camera you can use is the one you have on you.” Increasingly in the translation world, that is GT – and translators who have not honed their skills to move upmarket are feeling the undertow. And it’s getting worse, with rates continuing to edge downward, and translators feeling like commodities, where every human translator is considered indistinguishable from every other.

Markets Where Smartphones Fall Short

To continue our analogy to smartphones, let’s recognize that there will always be markets where smartphones are simply not going to work as cameras. These are domains where quality really does matter and where the added expertise of the photographer is critical and well-compensated. For example:
  • Professional photo shoots of a wide range of products, from cars to food, for high-end professional use by companies;
  • Head-shots for professional portfolios;
  • Photojournalism where the impact of an image requires exceptional talent to capture;
  • Wedding and special-event photography;
  • Live-event coverage for media, sports, and entertainment for commercial purposes;
  • Studio settings for lighting, high-end equipment and the skills to use them.
In translation, those same markets also exist, but they lie several miles above the “good enough to understand” bulk market. These markets are referred to as the “value-added market” and the “premium market” (a distinction discussed below) and typical products include:
  • Annual reports and formal financial disclosure statements required by law that are issued by multinational corporations, where translators must master regulatory issues and complex financial rules;
  • High-profile advertising by Fortune 500 companies, investment banks, high-end consumer goods companies, etc. in high-prestige venues;
  • Professionally published journals, articles and documentation in the sciences and engineering, requiring advanced technical training on the part of the translators;
  • Diplomatic and intelligence data in a wide array of fields critical to national security, both classified and unclassified, where translation often blends into analysis, requiring special expertise;
  • Translations adapted across cultures in ways where the two products end up as completely different works of art.

In the same way that an exceptionally talented photographer can “make magic come alive” in the photographs taken in the examples above, an exceptionally talented translator can also “make the message come alive” in the translation examples.

Note I did not say make “text” or “words” come alive. Those are often translators’ worst enemies, as they are trapped into translating words rather than ideas – what those words express. As I’ve long argued: “Translation is not about words. It’s about what the words are about.”

The photographer Ken Rockwell also famously noted: “The camera’s only job is getting out of the way of making photographs.” This suggests what we have known all along – it’s the talent, expertise, experience and collaborative flair of both the photographer and the translator that makes it possible for these artists to create art: To be able to work in both the value-added market and at the very pinnacle of the industry: the premium market.

Soon these will be the only sectors of the industry that even exist for professional photographers or translators.

Moving Upmarket: Two Pathways out of the Bulk Market

Bulk Market. The market where GT poses a singular threat and is already having an immensely harmful impact on rates is in what we call the “bulk market” – the estimated 60% of commercial translation done “for informational purposes,” or to “convey basic information” where “good enough” is the standard and price is the primary basis of selection, because GT (free) is considered a serious option.

Uneducated clients are also increasingly using GT for “outbound” translations: Into the languages of their clients, for purposes of selling their products to their own customers in broad, general consumer markets, or for software and web content localization, in languages the uneducated clients don’t understand. The pitfalls of such an approach are obvious, as the client cannot judge the results, but the brand power of Google and the mind share grip that GT has on their view of translation has shoved the “for information” translators out of the picture.

Translation Fail: United Airlines

For example, United Airlines recently used GT and a tiny bit of post editing to translate their apology letter relating to the passenger violently dragged off a flight, resulting in a translation that, while understandable, was very far from polished or persuasive in the target languages. Count this as yet another PR blunder by United Airlines in their attempt to enhance their image. One would think such a sensitive and delicate communication would intuitively compel management to demand the best, but we are seeing the immense Google branding power behave like water on a flat surface – it finds the cracks and flows into them all.

The Value-Added Market

The “value-added market” is a higher-end sector that requires special expertise, experience and sensitivity. While not yet the premium market, it’s a solid step above the bulk market, and often where translators work for years as they hone their talents and expertise before moving into the premium market.

The value-added market is where the translations are typically in a specialized subject-area that is sensitive enough for clients to pause at the thought of using GT or any machine translation at all. The complexity of the subject is enough to sow doubt in clients’ minds, and the risk of a translation error can be significant. Translators who work in these markets (typical rates in the USD $0.15 – $0.20 range) have completed specialty training in the subject-matter at the university level, are exceptional writers, and have largely completed the switch to direct clients and the best boutique agencies, terminating their relationships with low-end bulk-market agencies and ceasing to consider random agency inquiries.

Examples of subject areas include:
  • The entire range of medical, pharmaceutical, and health-care translation, including clinical trials, medical devices and instruments, patient records and charts, physician notes; physician rater training, regulatory and compliance and other health-care specialties;
  • IT and telecom, with principal focus on innovation and next-generation technology solutions;
  • Accounting and auditing on the corporate, institutional and legal levels;
  • Environmental sciences, petroleum and industrial engineering;
  • Entertainment: subtitling, voice-over and A/V at the national network and media level.
While this is obviously a representative list, it is hardly exhaustive, and in several sectors overlaps with the list provided above to contrast with the bulk market. But this value-added market is not the province of generalists or bulk-market translators – those who market themselves in these areas without the true expertise to succeed will certainly fail. A ticket to success in this market is hard-won, and success takes talent, commitment, a thorough knowledge of the subject matter and an excellent record of performance.

The Premium Market

The ironclad way around becoming an unwilling post-editor or being stuck at low single-digit rates is to become an expert translator. A specialist. A true artist. This requires exceptional subject-area knowledge, exquisite writing skills, and a lifetime of collaboration with your most talented colleagues.

Here are some tough-love truths about the bulk vs. the premium market. This will help translators dodge the bulk-market trap of perpetual downward pricing competition.

One can tell the difference between an expert premium market translator’s work and the work of an average bulk-market non-specialist at a glance.

It’s a Picasso vs. a 5th-generation photocopy of a grainy black-and-white mess.

Now, obviously, we all started out as novices producing those grainy photocopies and took a lifetime of work, study, collaboration and the development of subject-matter expertise to get into the Picasso range — it’s important to make this clear.

These “side-by-side” translations have been done going back about a decade. Some come from “mystery shopper” experiments, and others from translation workshops which most of us in the premium market have taught for at least a decade, so we actually SEE the huge discrepancies in quality right there in the room! This is not a mystery or unknown in the industry at large.

Even raw GT output sometimes finishes ahead of translators who are still young, inexperienced, cannot write, or have no idea what they are translating.

Putting Your Translations “At Risk” For All Your Colleagues to See

Some translators engage in translation slams or other competitions where they take the same text and publicly compare their translations, with hundreds of other translators witnessing the process and a tough judge in the middle discussing alternate translations, shades of meaning, and the finer points of, say, legal and financial interpretation. These events show the dramatic differences that emerge when translators who are confident in their work, highly specialized, and regularly revised by colleagues are willing to put their translations out there for all their colleagues to see.

For the most part, these competitions occur in the premium market at such events as the “Translate in the…” series, which deals with Fr<->Eng exclusively, and some competitions sponsored by the SFT in France, as well as most recently at the ITI Conference.

Here’s an ironclad rule: If your translations are not out there “at risk” for evaluation by your colleagues, you are not doing it right. You need hands-on, hard-skills collaborative workshops; translation slams; and shared portfolios of your work for everybody to see.

The way to break into these markets is not to just keep translating in isolation, without subject-matter training or collaboration, because the clients paying serious rates (above USD $0.50 per word) are deeply engaged in work in law and banking and industry and technology and are not usually out there in Translatorland talking to translators.

Bemoaning the Lack of Serious Translation Talent

I can’t tell you how many lawyers — just to pick one profession at random — I know who became translators because they were disgusted and fed up with the “quality” they were getting on a regular basis over years from a huge number of different translators who had translated one contract and then marketed themselves as a “legal translator.”

So the way to higher rates starts with subject-matter expertise. You have to know the finer points of the law to offer, say, five different translations of a phrase and explain to your client exactly how they are different in your source language and what you — the expert — think is important in their language (your target language).

That’s right — you get to explain the finer points of the law to a lawyer who is discussing a text in his own native language.

That’s the premium market.

If you are not able to do that — if you are not able to discuss the law in that detail with 5 different translation options, and a solid reason based on the subject-matter with a lawyer (or engineer, or banker, or physicist, or certified financial analyst) – then you are not there yet.

So your level of specialization should be on a level equivalent to a practitioner in that field.

That requires REAL expertise, not lightweight CPD, so expect to minor in these subjects in college, or go back for formal training. That means formal university training. Without this level of knowledge and expertise you will NOT see the errors you are making every day in your text that a true subject-expert will spot in an instant. (Several translators with readily recognizable names on social media recently published books containing howling scientific errors, and they had no clue. Don’t be them.) Skipping this step means you are not only delivering a substandard product to your client; your competition out there that does have expertise in the field will soon enough take your clients away from you.

There are many more steps on that ladder and most of them involve daily collaboration with other, more experienced colleagues.

I recognize that people must tire of hearing me say this, but there is NO OTHER WAY to make it to the pinnacle of the craft without leveraging the expertise of other smart, creative, thoughtful and engaged colleagues.

Find one or more expert revisers who review every word you translate. Establish revision partnerships with translators whose skill sets complement yours, but whose experience is superior to yours. Ideally they should come from an institutional framework: if you translate physics into English, have your work revised by physicist-translators from the American Institute of Physics who have greater expertise than you do. Have them share their marked-up copy with you on every assignment. There is no other way to escape the “echo chamber” of translation that exists in your head, or to avoid repeating even glaring errors simply because nobody downstream has bothered to correct you.

Then you have to go out there and attend the same functions your clients do.

You also have to share your translations in public — perhaps do a slam or two as a form of practice — and see how you stack up against the 20 or 30 or 50 other translators who call themselves “experts.”

Some day — if you do this long enough — you will find your work has made incremental improvements over a very long time, and the distance between what you were producing five or ten years ago vs. what you are producing today will point you in the right direction for being successful in the premium market.

Other translators who have seen how good you are will begin to refer work to you that is too much for them to handle. Translators working in the opposite direction will also have heard of you, and will be glad to have a trusted name to recommend to their clients. Talent, experience, expertise and your final product drive rates. Not the other way around.

Rates: Welcome to the Premium Market

As you improve through specialization, collaboration and honed writing skills, raise your rates on a regular basis. Use earmuffs if you need them to combat howls of pain from low-ball clients. Use rates as a way to control workflow once it rises to the level where you are unable to service it all in a quality fashion.

If you make it into the premium market, be aware that $0.50 per word is a common rate, and quoting by project rather than by word is becoming the predominant practice.

Plus, the demand is intense, as there are simply not enough translators able to produce on this level. Translate a LOT. Every day. In the same way that professional athletes train hard every single day, your success in the market will be determined by your persistent dedication to translating regularly, being reviewed regularly, being revised/corrected regularly, raising your rates regularly, serving your clients exceptionally, and rising up the ladder in that fashion.

Final Thoughts

Finally, Smile. Laugh. Be nice. You are enormously fortunate to be succeeding in a field you love. How many people can say that about their work?

Kevin Hendzel is an Award-Winning Translator, Linguist, Author, National Media Consultant and Translation Industry Expert