Thursday, March 1, 2018

Machine Translation Maturity Model (MTMM)

This is a guest post by Valeria Cannavina, Project Coordinator at Donnelley Language Solutions, adapting the Common Sense Advisory’s Localization Maturity Model (LMM), which is itself an adaptation of the software industry’s Capability Maturity Model (CMM). The resulting Machine Translation Maturity Model (MTMM) is a way of assessing users’ understanding of the technology and whether they are using it efficiently and effectively, properly linked to other organizational processes. Valeria provides a framework for businesses to “identify where they are and what they can do to either significantly or modestly improve their existing production model to maximize the value that MT can provide to their organizations.”

This is, however, a quite localization-centric perspective; process alignment for a global Enterprise MT service, which might be used by thousands of users across an enterprise to translate hundreds of millions of words, could be quite different.

As we head into 2018, we continue to see excitement and hype around Neural MT (Machine Translation), which is a breakthrough approach on the verge of providing a wealth of possibilities for the creation and management of business content. But, because the technology is relatively new, many players in the translation industry are overlooking the importance of implementing aligned procedures to guide the use of MT.

Neural MT, or any other kind of MT on its own, is not a magic wand that can solve any and every translation problem. Production and work procedures need to be aligned in an informed and competent way to enable the technology to provide maximum benefits and also minimize risks and data security issues.

It is useful to always ask fundamental questions before embarking on new technology deployment initiatives. The most fundamental question for businesses looking to invest in MT may be:

Why do we want to translate the content at all? 

Content translation only makes sense if it furthers overall business objectives and improves the global customer experience in some way. Today’s markets are massively global and that means communication and collaboration need to happen at scale and in volumes that were inconceivable just a decade ago. Today, any business that seeks to have even a moderately global footprint must understand how MT can provide increasing volumes of relevant content to their customer base.

Customers all over the world expect relevant information at their fingertips as quickly as possible, and this information is increasingly more dynamic and also short-lived. Business information is very important for a brief instant and of very limited value after that. This ability to deliver the right content quickly and effectively is often critical to the impression the customer forms of the business and its product offerings.

In addition, businesses are rightly concerned about data security and privacy. Improperly implemented MT deployments, where key processes and systems are not properly aligned, can expose private and confidential data. As the sheer volume of information continues to increase, businesses need to ensure that security is not compromised when content flows through these new translation processes. This may be especially critical with new product/service developments, sensitive employment, credit-related, medical, and financial data.

As MT becomes much more pervasive, it is wise for us to understand the bigger picture. In this paper, Valeria provides a unique and valuable perspective on assessing organizational alignment with new technology deployments. I hope you find it a useful guide for assessing your business needs around MT.

This post was originally published last month and then removed so that Donnelley could prepare the more complete document that is referenced at the bottom.


5 different approaches to succeed with machine translation


With the ever-increasing volume and pace of global trade, the need to communicate to multiple markets simultaneously has never been greater. Add to this the huge technological advances of recent years, and it’s easy to see why machine translation (MT) has emerged as a translation tool of choice for high volume, high-speed translations, with ever-improving quality.

MT delivers big time-saving and money-saving benefits, plus big gains in productivity. But as is often the case when technology moves at speed, many businesses are lagging behind. While the demand for MT is growing very fast, there are still some basic challenges that clients are not aware of. For example, not all language combinations, document types, and formats are ideal for MT. In fact, the quality of the output can vary considerably based on these criteria, which can affect your workflow, time to market, quality, and business targets.

This is where partnering with a specialized Language Service Provider (LSP) can give you the upper hand. A professional LSP will not only help you understand the MT landscape together with the latest developments but also advise on how it can be best used to optimize your processes, productivity and profits.

MT engines, their output, and their training require skilled professionals and solid technologies to support automated workflows. Simply put, MT is currently far from being a plug-and-play technology.

This approach is paramount, especially when confidentiality is key to the process. Assessing the risks of publishing data and securing processes is not standard practice for all language service providers, so while clients and regulators are setting very strict measures for data breaches, vendors are struggling to create processes that ensure top-quality MT services.

Whether you are a large or a small business, and whether you have a little or a lot of knowledge of MT, this paper will show you how to take full advantage of it. We follow a Machine Translation Maturity Model (MTMM) which is based on the Localization Maturity Model[1] created by Common Sense Advisory, an independent market research company for the language services industry. This paper is a guide to help businesses identify where they are and what they can do to either significantly or modestly improve their existing model to maximize the value that MT can provide to their organizations.

[1] The Localization Maturity Model was created by Common Sense Advisory, an independent Massachusetts-based market research company for the language services industry. It is based on the Capability Maturity Model (CMM), a development model informed by a study of data collected from organizations contracted with the US Department of Defense, who funded the research. The term "maturity" relates to the degree of formality and optimization of processes – from ad hoc practices, to formally defined steps, to managed result metrics, to active optimization of the processes. The model's aim is to improve existing software development processes, but it can also be applied to other processes.

Machine Translation Maturity Model (MTMM)

The model has five maturity levels, each divided into different areas that we encourage companies to evaluate individually on their own merits. You can move freely from one level to another, not necessarily in successive steps, although an ideal path is shown below.

We will now walk through them so you can identify where you are and what you need to do to move to a different level, and enjoy the associated benefits.

Level 1: Initial

If you're at this level, you are requesting MT only when absolutely necessary. This could be due to time or budget constraints, or because your communication is internal only. It may be that you are not satisfied with the service that you are getting. Or it could be that you have no MT investment, resources or best practices in place. This may be because you don't have a localization department in place, or because it is not directly affecting a core activity within your organization.

This model may work for some organizations with very limited usage of MT, but they may benefit from taking the following steps to increase its value within their organization:

  • Governance: make a case to management for investment in the maturity process.
  • Organization: appoint a dedicated MT resource in the localization/translation/marketing department who will set the basis for the process investigation.
  • Process: document the main tasks of the outsourcing process so you can track those that can be repeated and those that can be deleted because they don't generate any added value. For example, track which department translation requests come from, the type of content you receive, and the turnaround times. This will help you start setting best practice for your process.

Level 2: Repeatable

The first step to maturity through process improvement is two-fold: describing what you do, and doing what you've described. If you are at this level, you have started documenting some tasks of the process which are repeatable - for example, your criteria for identifying texts that can be sent out to MT and how to store them by category. You view terminology management as a relatively low priority, whereas in reality, it's an investment that will pay dividends by ensuring consistency.
This is where most organizations may be at and would benefit from taking the following steps to evolve their existing processes:

  • Organization: define an internal process to track feedback on the source text from your LSP. For example, you might have received a comment saying that the text wasn't suitable for MT because it was too creative or overly complex in structure. Make a note of this in the process documentation.
  • Process:
    • analyze your existing content in order to understand exactly which documents can be translated with MT and how the source text is structured.
    • organize linguistic reviews on the translated material involving internal country reviewers, so that terminology management starts to become part of your internal process.
  • Governance:
    • track the costs of the process improvements you are implementing (expectations/forecasts vs. reality).
    • define KPIs for this process to track the ROI of activities involved at this level
  • Automation: investigate available tools for automating some tasks. For example, look for tools to help you build a repository of texts that have already been sent to MT and decide a naming convention. You will then be able to identify similar content and remove text that's not suitable for MT. The automation will be run parallel to the process of analyzing the texts.


Level 3: Defined

At this level, you will have clear goals around integrating MT into your business, in the form of a roadmap of tasks aimed at continuous improvement through collaboration with your LSP.
Your processes will be documented and fully executed. The internal process of outsourcing MT is defined, repeatable, and managed. You have best practices in place (a process to identify the MT content, a process to collect and implement feedback, a process for internal translation review, a process to track costs, and so on) and can now measure your process.

For example, to track productivity you might want to measure word output per hour. Before a process improvement is introduced, a baseline measurement is taken. At the end of the project, the process is measured again to show whether the change resulted in more words produced per hour.
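As a minimal sketch of the before/after measurement described above (all figures here are invented for illustration), the comparison might look like:

```python
# Hypothetical before/after productivity comparison. The word counts and
# hours are invented example figures, not data from the paper.

def words_per_hour(words_translated: int, hours_worked: float) -> float:
    """Simple productivity metric: words produced per hour worked."""
    return words_translated / hours_worked

baseline = words_per_hour(12_000, 40)   # measured before the process change
after = words_per_hour(18_000, 40)      # measured after the process change
improvement = (after - baseline) / baseline * 100

print(f"Baseline: {baseline:.0f} words/hour")
print(f"After:    {after:.0f} words/hour")
print(f"Change:   {improvement:+.0f}%")
```

The point is simply that the same metric, measured the same way at both ends of a process change, shows whether the change paid off.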

Terminology management is no longer seen as a secondary task, but as a fundamental step which adds value to your business and your brand. The benefits are now showing in your ROI. For example, you can identify which content has already been sent to MT, which means fewer words to process, fewer man-hours for both you and your LSP, reduced time to market and reduced costs.
This is the stage where most organizations should aim to reach in order to optimize their supplier relationship with their language service provider and maximize return on investment. That said, some organizations take additional steps to further mature their MT procurement strategy as follows:
  • Organization: supported by your LSP, hire specialized terminology management staff who will work closely with other departments to:
    • organize feedback received on source text for fields of application
    • pass the feedback to internal technical writers
    • check the feedback has been implemented
    • incorporate the feedback into your CMS or whichever tool you're using to store the source and target documents.
  • Process:
    • define the internal process to combat source linguistic inconsistency. For example, give clear guidelines on what to do if a word has more than one meaning, who is the decision maker, how many review cycles the work will go through, and how this will be implemented in the CMS
    • plan internal review cycles so you can send feedback to your LSP and implement it in the CMS.
  • Governance:
    • establish the budget for multilingual projects based on the forecast for previous MT projects in terms of volumes and languages
    • establish decision making to prioritize languages and markets.
  • Automation: identify a tool to automatically apply correct terminology to source content in the CMS.

Level 4: Managed

MT is now tied to your corporate goals and part of your production process. The different departments rely on the MT department to prepare documents before sending the files to translate. The idea of 'department' here is fluid; for example, it could take the form of resources performing MT tasks alongside non-MT tasks. Alternatively, if you don't have your own MT department you can contact your trusted LSP to help you manage this internally or serve as the department itself.

The size of your business will dictate the scope of this department.

The MT department has its own budget and schedule for incoming projects during the year, with automated checks in place for managing terminology and producing the source documents. The focus at this level is on automation; you will be working closely with the technology department to improve the source, e.g., with a style guide for writing source text to ensure consistency.

If your organization is in this camp, consider taking the following steps to improve the maturity of your sourcing model:
  • Organization: define roles within the process; for example, a project manager (PM) to handle requests from different departments, dedicated engineers for automation, and internal reviewers.
  • Processes:
    • define delivery parameters around new products/documents. For example, you're issuing a new set of letters to shareholders and, based on previous experience, you know they will need to go through X reviews. This knowledge enables you to set realistic deadlines, review cycles, and delivery volumes
    • define text structure rules (short sentences typically translate much better than long, complicated sentences).
  • Governance: measure business benefits vis-a-vis strategic use of MT budgets.
  • Automation: technology resources work on a roadmap to automation which integrates all the previous stages, from terminology management to checking that the source text follows the defined rules.

Level 5: Optimized

At level 5 you have a team of engineers, terminologists, internal reviewers and project managers running the content creation process for MT on a daily basis. You have rules in place for content editors writing source text, ad hoc terminology and an internal tool to check and select the right kind of source text for MT, extracting only the new parts to be translated. You are now looking at new ways to get the same quality output while trying to keep costs and content creation time to a minimum.

What to do to achieve continuous improvement beyond level 5?
  • Organization:
    • prepare training material for staff and build a career path in the MT office
    • plan to offer a global service 24/7/365.
  • Process
    • content creators work to minimize content sent to MT: less content = lower costs
    • customize writing rules based on target language to minimize linguistic disparity between source and target. You will already have writing rules for new content creation, but with your accrued experience, constant LSP feedback, and the help of internal reviewers with in-depth knowledge of the target languages, the rules can be customized further to optimize MT for each target market
    • allocate LSP resources strategically, based on the language combination that best fits your quality expectations.
  • Governance: based on your long-term business goals plan how to support everyone involved in the roadmap to continuous improvement.
  • Automation: connect your CMS with your LSP's Translation Management System (TMS) to:
    • speed up the process
    • send requests and import the MT content automatically
    • centralize review cycles in a familiar environment
    • ensure consistency across content with a shared repository of linguistic assets.


Even if localization is not central to your organization, it can and does have an impact on your business. To reap the maximum rewards from MT, working with a trusted LSP is key to strengthening your supply chain, improving ROI and protecting your brand and reputation.

The MTMM was designed for any size or type of organization to use to make the most of MT. It is an ideal path to maturity because it has built-in flexibility – each activity can be performed as a stand-alone step as well as in a sequential way.

We believe that MT is not just about getting the technology right. It’s also about having a strong relationship with your LSP; a partnership that is characterized by collaboration and constant feedback between the whole team. Only by analyzing your process and implementing some of the suggested tasks will you arrive at an MT roadmap which delivers against your expectations.

It is also possible to get a more complete version of this post directly from the RRD website at this link:


Valeria Cannavina holds a degree in language and culture mediation, and a master's degree in technical and scientific translation from Libera Università degli Studi “San Pio V” in Rome. She has spent 10 years in the GILT industry as a Project Manager, and while working for companies like SAP and Xerox she was involved in quite a few research projects on new process implementation and Machine Translation. At present she works for Donnelley Language Solutions as a Project Manager.

Wednesday, February 14, 2018

A Change in Status & The Larger Translation Market Beyond Localization

The Emerging MT-Driven Translation Market Opportunity

As we roll on into the New Year, it is clear that machine translation is now pervasive, universally available, and easily accessible to millions across the globe on all kinds of digital devices. It is often even accessible when we are not connected to the web. This widespread access to, and use of MT is true across the globe and recent advances in translation quality from Neural MT have raised the profile of MT all over again. And while Google is a large provider of generic MT, it is not dominant across the globe, and we see that regional MT portals dominate local markets, e.g. Baidu in China, Naver in Korea, and Yandex in Russia.

By my very conservative estimates, the sheer volume of translation done is astonishing, and I think the global daily use of MT is easily in excess of 500 billion words a day. This amounts to over 180 trillion words a year, and that is before accounting for further growth as new populations become more digital and connected! While the bulk of this use is by the consumer-at-large on the global internet, I think it is quite possible and even likely that a growing portion of this usage is uncontrolled use by a growing population of enterprise workers and many translators. The need for instant, largely accurate translation has now become a fundamental and often critical business requirement.
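A quick back-of-the-envelope check of that arithmetic, starting from the 500-billion-words-a-day estimate:

```python
# Annual MT volume implied by an estimated 500 billion words per day.
words_per_day = 500e9
words_per_year = words_per_day * 365

# 500 billion x 365 days = 182.5 trillion words per year
print(f"{words_per_year / 1e12:.1f} trillion words per year")
```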

MT is much bigger than the total translation need as we understand it from the localization industry perspective. The quality of MT can vary dramatically, be significantly less than "perfect", and still be hugely valuable. There are many business applications where MT adds value to the business mission even though the output quality is a long way from perfect post-editability. It is quite likely that the highest-quality MT output today is seen in systems developed by experts in localization, but the huge demand for raw MT use suggests that the real translation need is greater and broader than localization experts understand. Based on data provided by Common Sense Advisory on translation volumes in the localization industry, I estimate that the localization industry currently services less than 1% of the translation needs on the globe on any given day. Perhaps for business content that number is slightly higher, but ever so slightly.

This suggests that the market opportunity for business translation can grow substantially. This growth can only come from a different strategy, in terms of both what is translated and how it is done, than the historical approach and focus. Thus, the new leaders will possibly create and move to a new industry business model where MT drives other translation activity, rather than the other way around. While addressing this larger market opportunity is only a potential strategy for a select few in the translation industry, I think few are as well positioned to lead the way and define it as SDL is today. Thus, I have decided to join SDL in a senior marketing strategy role as a “Language Technology Evangelist”.

The need for translation goes far beyond the focus of the localization industry

 Why SDL?

I have had the good fortune to closely examine the MT offerings and strategies of many of the leading players and users in the industry over the years, and recently even compare the various solutions and capabilities from both vendor and buyer perspectives. Having an honest and objective outlook is fundamental to seeing accurately, and my role in maintaining this blog has enabled this clarity, as all claims need to be validated and reviewed. Based on my personal assessment over some time, I believe that SDL is in a unique position to capture much of this new MT market opportunity that lies beyond traditional localization and expand the worldview of the business translation market in general. Those who have interacted with me in my consulting role, have already seen that I have been advocating SDL as a superior MT solution (amongst others) for enterprise and professional translation use. There are three primary reasons that I am excited about my new role at SDL:
  1. The Executive Management commitment to the role of MT in the organization and in the future business strategy of the company, which I touched upon in a post early last year. Their vision and understanding of the possibilities of using MT technology to drive deeper and broader engagement with global enterprises is a refreshing contrast to the narrow localization MT perspective that still pervades much of the thinking in the industry. The relatively new management team has a strikingly different perspective from the prior management, and I believe they see the potential of using SDL’s substantial MT competence to drive a larger presence for business translation in the broader enterprise translation market that lies beyond localization.
  2. The depth and overall NLP (Natural Language Processing) and ML (Machine Learning) competence that exists within the technology team is, in my opinion, a significant long-term competitive advantage. In contrast to the teams at Google, FB, and Microsoft, the SDL team is focused on developing MT solutions that are optimized around global enterprise needs and use-cases. It is my opinion that the capabilities and competence of the SDL MT team are beyond the reach of other large LSPs, and quite probably of many smaller MT vendors too. The ability to tune and customize the technology to specialized business use-cases requires deep knowledge of the underlying technology, and those who work with open source toolkits are clearly at a disadvantage. SDL has regularly beaten out competitors in customer-driven quality evaluations, but much of this is under the radar and not well known. SDL also has a long history of deploying its MT technology in Government National Security user environments where system robustness, scalability, elasticity, and overall manageability REALLY matter. This experience is excellent preparation for the larger enterprise MT market, which has many of the same organizational needs. Today, SDL is the only MT technology vendor that has enterprise-grade SMT, Adaptive MT, and Neural MT solutions in place. This competence allows ongoing internal comparisons of how different data sets can be optimized across these technology alternatives for the business purpose, in ways that are not easily possible for competitors. It matters more that the business purpose is achieved and much less what technology is used. The engineering team here has many patents to its credit and has published hundreds of papers on ML strategies, SMT techniques, NMT, and Adaptive MT.
  3. The experience of Internal Translation teams working with MT over an extended period has built unique capabilities and a deep understanding of successful MT technology deployment. SDL is one of the few language service companies that have a large team of internal translators. This fact enhances the possibilities of building a more comprehensive Linguistic Steering & MT Development collaboration when the highest quality output is required. This is one of the reasons why SDL has such a robust Adaptive MT offering. The deep collaboration between expert NLP engineers and expert linguists has only just begun, and I believe that as this engagement expands, SDL will have the potential to build distinctively and consistently superior MT engines that are optimized for enterprise use. Enterprise MT solutions also require specialized process infrastructure that surrounds MT in a business use scenario, and the long-term internal MT use experience that SDL has had provides great insight into building this process infrastructure for clients whether they are global enterprises or LSPs. The quality and robustness of the surrounding processes around MT deployment are key to maximizing MT benefits. This interaction between computational linguists and human linguists where needed could also enable new kinds of AI and machine learning driven process improvements at several different points in new business translation workflows. Having competent linguists involved with your MT systems also results in data refinement over time. Many are now beginning to realize that the data is the highest value asset in a world where the best machine learning technology is only as good as the data it learns from. The existence of a linguist team that understands the learning implications of "good data" vs. "bad data" creates the possibility of having the most valuable, cleanest and most relevant data for the new AI frontiers.

When these three factors are placed in the core competency portfolio and context of a company that already has deep penetration into Global 1000 Enterprises and major Government Agencies around the world, the potential for further success is indeed great. The initial engagement that SDL already has with the Global 1000 makes it even more likely that broader and deeper business relationships will develop around new international initiatives and needs. From my historical vantage point as an independent industry analyst, it is easy to see that the SDL MT offerings are already formidable and unique, and hopefully, with my involvement, this will only become more so.

The many paths to driving SDL MT quality above generic public MT solutions

Personal Factors

While it is important to consider these relatively objective assessments of organizational strengths and characteristics in choosing one’s employment, I think the personal factors around the choice are possibly even more important. The following are the ones that immediately come to mind.

The People: I have enjoyed my interactions with many SDL employees in my various roles as an analyst, collaborator, and potential employee. The straightforward and clear communication styles and professional manner of the key executives I dealt with encouraged me to take on this new role with minimal hesitation. In a way this was also a homecoming, as I originally entered the world of translation in a leadership position with Language Weaver (acquired by SDL in 2010), except now it has better management, better focus, and more resources. And hopefully I too am wiser and smarter, as my new grey beard suggests.

I plan to keep this blog going and will continue to produce or curate what I consider to be high-value content on the subjects of Translation, MT, AI, ML, and Collaboration. This is done with SDL Executive permission (and possibly even their blessings) and again points to the openness of the SDL culture and new management. SDL does not necessarily agree with everything that I say on this blog but sees the value of serious content sharing forums. I will, of course, endeavor to keep this blog from becoming a marketing mouthpiece for SDL MT alone, and will try my best to keep it objective and issue focused. There is great value in intelligent dialogue and discussion, and I will strive to maintain this here. While this post clearly shows where my allegiances and biases lie from this point on, I will always strive to be fair and allow different viewpoints to be heard. I have always considered it a great and important skill to listen, to listen to everyone, especially those with a different worldview. I have been a long-term MT and translation technology evolution advocate, but I have also found that it is possible to communicate respectfully and meaningfully with translators and opponents of MT. The regular contributions by translators as guest writers or in the comments on this blog are a testimony to that. I hope to maintain that open dialogue quality and the overall respectful cultural characteristic of this forum. I invite guest writers who have an interest in sharing insights and new perspectives on the primary themes of this blog.

There is also great value in being an insider and seeing the challenges from the inside in contrast to the view that is offered to a consultant. As an insider one has the opportunity to shape and change the outcomes in a much more substantial and organic way. I look forward to this new insider role in helping to shape the future of SDL MT. This role will also allow me to engage with many more actual MT deployments and possibly have a role in helping to drive many more successful outcomes.

My first public-facing SDL event will be a webinar on February 27th. I look forward to seeing you there and I look forward to ongoing honest and forthright dialogue.


Thursday, January 18, 2018

Literary Text: What Level of Quality can Neural MT Attain?

Here are some interesting results from guest writer Antonio Toral, who gave us a good broad look at how NMT was doing relative to PBMT last year. His latest research investigates the potential for NMT to assist with the translation of literary texts. While NMT is still a long way from human quality, it is interesting to note that NMT very consistently beats SMT even at the BLEU score level. At the research level this is a big deal. Given that BLEU scores tend naturally to favor SMT systems, this is especially promising, and the results are probably even more strikingly better when compared by human reviewers.

I have also included another short post Antonio did on the detailed human review of NMT vs SMT output to show those who still doubt that NMT is the most likely way forward for any MT project today.


Neural networks have revolutionised the field of Machine Translation (MT). Translation quality has improved drastically over that of the previous dominant approach, statistical MT. This has been shown to be the case for several content types, including news, TED talks, and United Nations documents. At this point, we thus wonder how neural MT fares on what is historically perceived as the greatest challenge for MT: literary text, and specifically its most common representative, novels.

We explore this question in a paper that will appear in the forthcoming Springer volume Translation Quality Assessment: From Principles to Practice. We built state-of-the-art neural and statistical MT systems tailored to novels by training them on around 1,000 books (over 100 million words) for English-to-Catalan. We then evaluated these systems automatically on 12 widely known novels that span from the 1920s to the present day; from J. Joyce’s Ulysses to the last Harry Potter. The results (Figure 1) show that neural MT outperforms statistical MT for every single novel, achieving remarkable results: an overall improvement of 3 BLEU points.
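BLEU, the automatic metric behind these scores, is an n-gram precision score combined with a brevity penalty. A minimal single-reference sketch in Python (not the authors' evaluation code, which would typically use a standard toolkit) looks like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference string."""
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clipped matches: each hypothesis n-gram counts only up to its
        # frequency in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if matches == 0:
            return 0.0  # unsmoothed BLEU is zero if any precision is zero
        log_precisions.append(math.log(matches / sum(hyp_counts.values())))
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)
```

An identical hypothesis and reference score 1.0. Real corpus-level BLEU, as reported in papers like this one, aggregates the clipped counts over all segments before taking the geometric mean, which is what standard toolkits compute.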

Figure 1: BLEU scores obtained by neural and statistical MT on the 12 novels

Can humans notice the difference between human and machine translations?


We asked native speakers to rank blindly human versus machine translations for three of the novels. For two of them, around 33% of the translations produced by neural MT were perceived to be of equivalent quality to the translations by a professional human translator (Figure 2). This percentage is much lower for statistical MT at around 19%. For the remaining book, both MT systems obtain lower results, but they are still favourable for neural MT: 17% for this NMT system versus 8% for statistical MT.

Figure 2: Readers' perceptions of the quality of human versus machine translations for Salinger’s The Catcher in the Rye

How far are we?

Based on these ranks, we derived an overall score for human translations and the two MT systems (Figure 3). We take statistical MT as the departure point and human translation as the goal to be ultimately reached. Current neural MT technology has already covered around one fifth (20%) of the way: a considerable step forward compared to the previous MT paradigm, yet still far from human translation quality. The question now is whether neural MT can be useful [in future] to assist professional literary translators… To be continued.
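The "one fifth of the way" figure follows from normalising the three overall scores so that statistical MT sits at 0 and human translation at 1. A sketch with illustrative score values (not the paper's actual numbers):

```python
def progress(smt_score, nmt_score, human_score):
    """Fraction of the SMT-to-human quality gap covered by NMT."""
    return (nmt_score - smt_score) / (human_score - smt_score)

# Illustrative numbers only: any scores where NMT sits a fifth of the way
# from SMT to human reproduce the ~20% reported in the paper.
print(round(progress(smt_score=0.30, nmt_score=0.44, human_score=1.00), 2))  # 0.2
```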

Figure 3: Overall scores for human and machine translations

A. Toral and A. Way. 2018. What Level of Quality can Neural Machine Translation Attain on Literary Text? arXiv.

Fine-grained Human Evaluation of Neural Machine Translation

In a paper presented last month (May 2017) at EAMT we conducted a fine-grained human evaluation of neural machine translation (NMT). This builds upon recent work that has analysed the strengths and weaknesses of NMT using automatic procedures (Bentivogli et al., 2016; Toral and Sánchez-Cartagena, 2017).

Our study concerns translation into a morphologically-rich language (English-to-Croatian) and has a special focus on agreement errors. We compare 3 systems: standard phrase-based MT (PBMT) with Moses, PBMT enriched with morphological information using factored models and NMT. The errors produced by each system are annotated with a fine-grained tag set that contains over 20 error categories and is compliant with the Multidimensional Quality Metrics taxonomy (MQM).
These are our main findings:
  1. NMT reduces the number of overall errors produced by PBMT by more than half (54%). Compared to factored PBMT, the reduction brought by NMT is also notable at 42%.
  2. NMT is especially effective on agreement errors (number, gender, and case), which are reduced by 72% compared to PBMT, and by 63% compared to factored PBMT.
  3. The only error type for which NMT underperformed PBMT is errors of omission, which increased by 40%.
F. Klubicka, A. Toral and V. M. Sánchez-Cartagena. 2017. Fine-grained human evaluation of neural machine translation. The Prague Bulletin of Mathematical Linguistics.

This shows that NMT errors are greatly decreased in most categories, except for errors of omission.
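Reduction figures like the 54% and 72% above come from comparing raw annotated error counts between systems. A minimal sketch with hypothetical counts (the paper reports only the percentages, not the counts used here):

```python
def error_reduction(baseline_errors, new_errors):
    """Percentage of the baseline system's errors eliminated by the new system."""
    return 100 * (baseline_errors - new_errors) / baseline_errors

# Hypothetical counts chosen to mirror the reported reductions.
print(round(error_reduction(baseline_errors=500, new_errors=230)))  # 54 (overall, PBMT -> NMT)
print(round(error_reduction(baseline_errors=100, new_errors=28)))   # 72 (agreement errors)
```

A negative result from the same formula corresponds to an error increase, like the 40% rise in omission errors noted in finding 3.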

 Antonio Toral
Antonio Toral is an assistant professor in Language Technology at the University of Groningen and was previously a research fellow in Machine Translation at Dublin City University. He has over 10 years of research experience in academia, is the author of over 90 peer-reviewed publications, and is the coordinator of Abu-MaTran, a 4-year project funded by the European Commission.

Tuesday, January 16, 2018

2018: Machine Translation for Humans - Neural MT

This is a guest post by Laura Casanellas @LauraCasanellas  describing her journey with language technology. She raises some good questions for all of us to ponder over the coming year.

Neural MT is all the rage and now appears in almost every translation industry discussion we see today. It is sometimes depicted as a terrible job-killing force and sometimes as a savior, though I would bet that it is neither. Hopefully, the hype subsides and we start focusing on solving the issues that enable high-value deployments. I have been interviewed by a few people about NMT technology in the last month, so expect to see even more on NMT, and we continue to see that GAFA and the Chinese/Korean giants (Baidu, Alibaba, Naver) are also introducing NMT offerings.

Open source toolkits for NMT proliferate, training data is easier to acquire, and hardware options for neural net and deep learning experimentation continue to expand. It is very likely that we will see even more generic NMT solutions appear in the coming year, but generic NMT solutions are often not suitable for professional translation use, for many reasons: in particular the inability to properly secure data privacy, integrate the technology into carefully built existing production workflows, customize NMT engines for very specific subject domains, and implement the controls and feedback cycles that are critical to ongoing NMT use in professional translation scenarios. It is quite likely that many LSPs will waste time and resources with multiple NMT toolkits, only to find out that NMT is far from being a Plug'nPlay technology, and that real competence is not easily acquired without significant long-term knowledge-building investments. We are perhaps reaching a threshold year for the translation industry, in which skillful use of MT and other kinds of effective automation is a requirement, both for business survival and for developing a sustainable competitive advantage.

The latest Multilingual magazine (January 2018) contains several articles on NMT technology but unfortunately does not have any contributions from SDL or Systran, who I think are probably the companies most experienced with NMT technology in the professional translation arena. I have pointed out many of the challenges that still exist with NMT in previous posts on this blog, but the articles in Multilingual define some of those challenges better and raise some points that were new (to me), for example:

  • DFKI documented very specifically that even though NMT systems have lower BLEU scores, they exhibit fewer errors in most linguistic categories and are thus preferred by humans
  • DFKI also stated that terminology and tag management are major issues for NMT, and need to be resolved somehow to enable more professional deployments
  • Several people reported that using BLEU to compare NMT vs. SMT is unlikely to give meaningful results, but this is still often the means of comparison used in many cases
  • Capita TI reported that the cost of building an NMT engine is 50X that of an SMT engine, and the cost of running it is 70X the cost of an SMT engine
  • Experiments run at this stage of technology exploration by most in the professional translation world should not be seen as conclusive and final. Their results will often be more a reflection of the experimenters' lack of expertise than of the actual technology. As NMT expertise deepens and as the obvious challenges are worked out, we should expect that NMT will become the preferred model even for Adaptive MT implementations.
  • SMT took several years to mature and to develop the ancillary infrastructure needed to enable MT deployments at scale. NMT will do this faster, but it still needs some time for support infrastructure and key tools to be put in place.
  • MT is a strategic technology that can provide long-term leverage but is most often unlikely to deliver ROI on a single project; this, plus the unwillingness to acknowledge the complexity of do-it-yourself options, are key reasons that I think many LSPs will be left behind.

Anyway, these are exciting times, and it looks like things are about to get even more exciting.

I am responsible for all text that is in bold in this post.


2017 has been a year of reinvention. We thought we had it good and then, Neural MT came along.

Riding The Wave

I started in localization twenty years ago and I still feel like an outsider; I don't have a translation degree, nor do I have a technical background. I am somebody who came to live in a foreign country, liked it, and had to find a career path there in order to be able to stay. Localization was one of the options; I tried it and it worked for me. This business has had many twists and turns and has been forced to adapt and be flexible with each one of them. I think I have done the same, changing and adapting with every new invention. I have tried to ride the wave.

There were already translation memories when I started, but I remember big changes in the way processes worked and, at each turn, more automation was embraced and implemented: I remember the jump from static translation dumps to on-demand localization and delivery, and the implementation of sophisticated automatic quality checks. I progressed and evolved, mirroring the industry, and, after a brief period as a translator, I moved on to work in different positions and departments within the localization workflow. This mobility has given me the opportunity to gain a good understanding of the industry's main needs and problems.

Six years ago, I stumbled upon Machine Translation (MT). At the time it almost looked like chance, but having seen the evolution of the technology in this short period of time, I now know that I had it coming; we all did, we all do. It happened because a visionary head of localization requested the implementation of an MT program for their account. I was in the privileged position of being involved in that implementation, which meant that my colleagues and I could experiment with and experience Machine Translation output first hand. For somebody who can speak another language and who has a curious mind, this was a golden opportunity. For a couple of years, we evaluated MT output within an inch of its life: from a linguist's point of view (error typology, human evaluation), using industry standards (BLEU, yes, BLEU, and others…), setting up productivity tests (how much more productive post-editing is when compared with translation from scratch), etc. We learned to deal with this new tool and acquired experience that helped us set realistic expectations.

It feels like a lifetime ago. During the last few years, industry research has zoomed in on Machine Translation; as a consequence, a colossal amount of research has been done on the subject by both industry and academia, as we all know.

And I still haven’t mentioned Neural MT (NMT).

The Wondrous NMT

Geeky as it sounds, from the point of view of Machine Translation, I can consider myself quite privileged, as I have experienced directly the change from Statistical Machine Translation (SMT) to Neural while working for a Machine Translation provider. Again, I was able to compare the linguistic output produced by the previous system (SMT) and the new one (NMT) and see the sometimes very subtle, but significant differences. 2017 was a very exciting year.

NMT only really began to be commercially implemented in the last year but, after all the media attention (including in blogs like this one) and the focus at industry and research forums, it feels as if it has been here forever. Everything moves very quickly these days; proof of this is that most (if not all) Machine Translation providers have already adopted the new technology in one way or another.

Technology Steals The Show

Technology is all around us, and it is stealing the show. I would love to do an experiment: ask an outsider to read articles and blog posts related to the localization industry for a month and then ask them, based on what they had read, what they think the level of technology adoption is. I think they would say that the level of adoption (let's focus on MT) is very high.

I see a different reality though; from my lucky position, I see that many companies in the industry are still hesitant, and maybe one of the reasons is fear: fear of not fully understanding the implications of an implementation, the logistics of it, and, of course, fear of not really grasping how the technology works. It is easy to understand how Translation Memory (TM) leverage works, but Machine Translation is a different thing.

I have no doubt in my mind that in five years' time the gap will be closed; but at the moment there is still a large, not so vocal, group of people who are still not sure how to start. For them, it might feel a bit like a flu jab: it is painful, it may not really work, but most people are getting it, so it kind of has to be done. All other companies seem to be adopting it, so they feel they need to do the same, but how? And that "how" includes questions like: how is this technology going to connect with my own workflow; do I use TMs as well; how do I make it profitable; what is my ROI going to be; how do I rate post-edited words; what if my trusted translators refuse to post-edit; how many engines do I need: one per language, one per language and vertical, one per language and domain…?

MT for Humans

Many of the humans I have worked and dealt with are putting on a brave face, but sometimes they struggle with the concepts; a few years ago it was BLEU, now it is perplexity, epochs… Concepts and terms change very fast. For the industry to fully embrace this new technology, a bigger effort might be needed to bring it down to the human level. The head of a language company will probably know by now that NMT is the latest option, but might not really care to comprehend the intrinsic differences between one type of MT and the others. They might prefer to know what the output is like, how to implement it, and how to train their workforce (translators and everybody else in the company) on the technology from a practical point of view: is it going to affect the final quality; what does a Quality Manager or a Language Lead need to know about it; what about rates; can a Vendor Manager negotiate a blanket reduction for all languages and content types; how is it going to be incorporated into the production workflow?
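For readers wrestling with the newer jargon: an epoch is simply one full pass over the training data, and perplexity measures how "surprised" a model is by a text (lower is better). A minimal sketch of perplexity computed from per-token model probabilities, using illustrative values rather than output from any real system:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability per token."""
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# A model that assigns each of 4 tokens probability 0.25 is exactly as
# "perplexed" as a uniform guess over 4 options.
print(round(perplexity([0.25, 0.25, 0.25, 0.25])))  # 4
```

Intuitively, a perplexity of 4 means the model is, on average, as uncertain as if it were choosing between 4 equally likely words at each step.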

I think 2018 is going to be the year of mass adoption and more and more professionals are going to try to figure out all these questions. Artificial intelligence is all around us, the new generations are growing with it, but today this new bridge created by progress is still being crossed by very many people. Not everybody is on the other side. Yet.

Dublin, 12.I.18

Laura Casanellas is a localization consultant specialized in the area of Machine Translation deployment. Originally from Spain, she has been living in Ireland for the last 20 years. During that time, Laura has worked in a variety of roles (Language Quality, Vendor Management, Content Management) and verticals (Games, Travel, IT, Automotive, Legal) and acquired extensive experience in all aspects related to Localization. Since 2011, Laura has specialized in Language Technology and Machine Translation; until last year, Laura worked as Product Manager and head of Professional Services at KantanMT.

Outside of her professional life, she is interested in biodiversity, horticulture, apiculture, and sustainability.

The results of some of the evaluations mentioned in this post are collected in a number of papers:

Empirical evaluation of NMT and PBSMT quality for large-scale translation production
(2017) Shterionov, D., Nagle, P., Casanellas, L., Superbo, R., and O'Dowd, T.

Assumptions, expectations, and outliers in post-editing
(2014) Casanellas, L. and Marg, L. EAMT 2014, Dubrovnik

Connectivity, adaptability, productivity, quality, price: getting the MT recipe right
(2013) Casanellas, L. and Marg, L. XIV Machine Translation Summit, Nice