
Monday, August 5, 2019

Adapting Neural MT to Support Digital Transformation

We live in an era where digital transformation is increasingly recognized as a primary concern and a key focus of executive management teams in global enterprises. The stakes are high for businesses that fail to embrace change. Since 2000, more than half (52%) of Fortune 500 companies have gone bankrupt, been acquired, or otherwise ceased to exist as a result of digital disruption. It is also estimated that 75% of today’s S&P 500 will be replaced by 2027, according to Innosight Research.

Responding effectively to the realities of the digital world has now become a matter of survival as well as a means to build long-term competitive advantage.

When we consider what is needed to drive digital transformation, beyond structural integration, we see that large volumes of current, relevant, and accurate content supporting the buyer and customer journey are critical to enhancing the digital experience in both B2C and B2B scenarios.

Large volumes of relevant content are needed to enhance the customer experience in the modern digital era, where customers interact continuously with enterprises in a digital space, on a variety of digital platforms. To be digitally relevant in this environment, enterprises must increasingly be omni-market-focused and have large volumes of relevant content available, on a continuous basis, in every language and every market in which they participate.


This means the modern enterprise must create more content, translate more content, and deliver more content on an ongoing basis to remain digitally relevant and visible. Traditional approaches to translating enterprise content simply cannot scale, and a new approach is needed. These translation challenges cannot be addressed without automation, but what is required is a much more active man-machine collaboration that we at SDL call machine-first, human-optimized. Thus, the need for a global enterprise to escalate its focus on machine translation (MT) is growing and has become much more urgent.

However, the days of relying on generic MT alone to solve high-volume content translation challenges are over. To deploy an effective omni-market strategy, the enterprise must be able to apply MT in a far more optimal and agile manner across a range of different use cases.

A one-size-fits-all MT strategy will not enable the modern enterprise to deliver critical content to its target global markets in an effective and optimal way.

Superior MT deployment requires ongoing and continuous adaptation of the core MT technology to varied use cases, subject domains, and customer-relevant content needs. MT deployment also needs to happen with speed and agility to deliver business advantage, and few enterprises can afford the long learning and development timelines required by any do-it-yourself initiative.

The MT Adaptation Challenge

Neural machine translation (NMT) has quickly established itself as the preferred model for most MT use cases today. Most experts now realize that MT performs best in industrial deployment scenarios when it is adapted and customized to the specific subject domain, terminology, and use case requirements. Generic MT is often not enough to meet key business objectives. However, successful development of adapted NMT models is difficult for the following reasons:
  1. The sheer volume of training data required to build robust systems. This is typically in the range of hundreds of millions of words, which few enterprises will ever be able to accumulate and maintain. Models built with inadequate foundational data are sure to perform poorly, fail to meet business objectives, and provide little business value. Many AI initiatives fail or underperform because of data insufficiency.
  2. The available options to train NMT systems are complex, and almost all of them require that any training data used to adapt NMT systems be made available to the development platform, which then uses it to further enhance that platform. This raises serious data security and data privacy issues in a massively digital era, where the most confidential customer interactions and product development initiatives need to be translated on a daily basis. Customer interaction, sentiment, service, and support data are too valuable to be shared with open-source AI platforms.
  3. The cost of keeping abreast of state-of-the-art NMT technology is also high. For example, a current best-of-breed English-to-German NMT system requires tens of millions of training data segments, hundreds or even thousands of hours of GPU cycles, deep expertise to tune and adjust model parameters, and the know-how to bring it all together. It is estimated that this one system alone costs around $9,000 in training time on public cloud infrastructure and takes 40 days of training time (a rough back-of-envelope calculation follows this list). These costs are likely to be higher if the developer lacks real expertise and is learning as they go. They can be reduced substantially by moving to an on-premise training setup and by working with a foundation system that has been set up by experts.
  4. NMT model development requires constant iteration and ongoing experimentation with varying data sets and tuning strategies. There is a certain amount of uncertainty in any model development, and outcomes cannot always be predicted upfront, so repeated and frequent updates should be expected. Computing costs can therefore escalate rapidly when using cloud infrastructure.
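
To put the training-time figures above in perspective, here is a rough back-of-envelope calculation. The GPU count and hourly rate are illustrative assumptions, not SDL or cloud-provider figures; only the approximate 40-day training window comes from the estimate above.

```python
# Back-of-envelope estimate of cloud training cost for one NMT system.
# GPU count and hourly rate are illustrative assumptions; only the
# ~40-day training window comes from the estimate above.

TRAINING_DAYS = 40          # approximate wall-clock training time
GPUS = 4                    # assumed number of GPUs used in parallel
RATE_PER_GPU_HOUR = 2.50    # assumed public-cloud price per GPU-hour (USD)

gpu_hours = TRAINING_DAYS * 24 * GPUS
cost = gpu_hours * RATE_PER_GPU_HOUR

print(f"GPU-hours consumed: {gpu_hours:,}")   # 3,840
print(f"Estimated cost:     ${cost:,.0f}")    # about $9,600 at these rates
```

At these assumed rates a single training run lands in the same ballpark as the figure quoted above, and every re-training iteration incurs the cost again.
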
Given the buzz around NMT, many naïve practitioners jump into DIY (do-it-yourself) open-source options that are freely available, only to realize months or years later that they have nothing to show for their efforts. 

The many challenges of working with open-source NMT are covered here. While it is possible to succeed with open-source NMT, a sustained and ongoing research/production investment is required with very specialized human resources to have any meaningful chance of success.


The other option enterprises employ to meet their NMT adaptation needs is to go to dedicated MT specialists and MT vendors, and this approach carries significant costs as well. Ongoing updates and improvements usually come with a direct cost for each individual effort. These challenges have limited the number of adapted and tuned NMT systems that can be deployed, and have also created resistance to deploying NMT systems more widely as problems with generic systems are identified.

The most informed practitioners are just beginning to realize that using BLEU scores to select MT systems is usually quite short-sighted. The business impact of 5 BLEU points this way or that is negligible in most high value business use cases, and use case optimization is usually much more beneficial and valuable to the business mission.
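
For context, BLEU compares system output against reference translations using n-gram overlap, so a small score gap can mask, or exaggerate, real differences in terminology and style. A minimal comparison using the open-source sacrebleu package is sketched below; the sentences are invented for illustration only.

```python
# Minimal BLEU comparison of two hypothetical MT systems using sacrebleu
# (pip install sacrebleu). Sentences are invented for illustration only.
import sacrebleu

references = [[
    "Please restart the device before contacting support.",
    "The warranty covers parts and labour for two years.",
]]

system_a = [
    "Please restart the device before you contact support.",
    "The warranty covers parts and labor for two years.",
]
system_b = [
    "Restart device before support contact please.",
    "Warranty covers the parts and labour for two years.",
]

bleu_a = sacrebleu.corpus_bleu(system_a, references)
bleu_b = sacrebleu.corpus_bleu(system_b, references)
print(f"System A BLEU: {bleu_a.score:.1f}")
print(f"System B BLEU: {bleu_b.score:.1f}")
```

A few points of BLEU difference in a comparison like this says little about which system better handles an enterprise's critical terminology, which is exactly what use case optimization targets.
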


As a technology provider focused on enterprise MT needs, SDL already provides adaptation capabilities, which include:
  • Customer-created dictionaries for instant self-service customization – suitable for specific terminology enforcement on top of a generic model (a minimal sketch of this kind of terminology enforcement follows this list).
  • NMT model adaptation as a service, performed by the SDL MT R&D team.
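
One common way dictionary-based terminology enforcement is implemented around a generic engine is to protect glossary terms with placeholders before translation and restore the required target-language terms afterwards. The sketch below shows that generic pattern only; it is not a description of SDL's implementation, and the `translate` function is a stand-in for whatever MT API is actually used.

```python
# Generic placeholder-substitution pattern for dictionary-based terminology
# enforcement around an MT engine. Not SDL's implementation; `translate`
# is a stand-in for a real MT API call.
import re

glossary = {
    "machine translation": "Maschinelle Übersetzung",  # enforce target term
    "SDL": "SDL",                                      # do-not-translate entry
}

def translate(text: str) -> str:
    """Stand-in for a call to a generic MT engine."""
    return text

def translate_with_glossary(source: str) -> str:
    restores = {}
    protected = source
    # Replace glossary terms with placeholders the MT engine will leave alone.
    for i, (src_term, tgt_term) in enumerate(glossary.items()):
        token = f"__TERM{i}__"
        pattern = re.compile(re.escape(src_term), re.IGNORECASE)
        if pattern.search(protected):
            protected = pattern.sub(token, protected)
            restores[token] = tgt_term
    translated = translate(protected)
    # Restore the enforced target-language terminology after translation.
    for token, tgt_term in restores.items():
        translated = translated.replace(token, tgt_term)
    return translated

print(translate_with_glossary("SDL applies machine translation at scale."))
```
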


 

The Innovative SDL NMT Adaptation Solution

The SDL NMT Trainer solution provides the following:
  • Straightforward and simple NMT model adaptation without requiring users to be data scientists or experts.
  • Foundational data provided in the Adaptable Language Pairs to accelerate the development of robust, deployable systems.
  • On-premise training that ensures highly confidential training data (customer interactions, information governance, product development, and partner and employee communications) never leaves the enterprise.
  • Once created, the encrypted adapted models can be deployed easily on SDL on-premise or cloud infrastructure with no possibility of data leakage.
  • Multiple use cases and optimizations can be developed on a single language pair, and customers can re-train and adjust their models continuously as new data becomes available or as new use cases are identified.
  • A pricing model that encourages and supports continuous improvement and experimentation on existing models and allows for many more use cases to be deployed on the same language combination. 
The initial release of the SDL On-Premise Trainer is expected to be the foundation of an ever-adapting machine translation solution that will grow in capability and continue to evolve with new features.


Research shows that NMT outcomes depend heavily on the quality of the training data: the cleaner the data, the better the adaptation. After this initial product release, SDL therefore plans an update later this year that leverages years of experience in translation memory management to include the automated cleaning steps needed to make the data as good as possible for neural MT model training.
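
Typical automated cleaning steps for translation-memory-derived bitext include dropping empty or over-long segments, removing exact duplicates, and filtering pairs with implausible length ratios. The sketch below illustrates that kind of pass with made-up thresholds; it does not reflect the specific rules SDL will ship.

```python
# Illustrative bitext cleaning pass of the kind commonly applied to
# translation-memory exports before NMT training. Thresholds are
# illustrative assumptions, not SDL's actual cleaning rules.

def clean_bitext(pairs, max_ratio=3.0, max_words=200):
    """Yield (source, target) pairs that pass basic sanity filters."""
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            continue                                  # drop empty segments
        if len(src.split()) > max_words or len(tgt.split()) > max_words:
            continue                                  # drop over-long segments
        ratio = max(len(src), len(tgt)) / max(1, min(len(src), len(tgt)))
        if ratio > max_ratio:
            continue                                  # drop likely misalignments
        if (src, tgt) in seen:
            continue                                  # drop exact duplicates
        seen.add((src, tgt))
        yield src, tgt
```
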

The promise of the best AI solutions in the market is to continuously learn and improve with informed and structured human feedback, and the SDL technology is being architected to evolve and improve with this feedback. While generic MT serves the needs of many internet users who need only a rough gist of foreign language content, the global enterprise needs MT solutions that perform optimally on critical terminology and are sensitive to the linguistic requirements of the enterprise’s core subject domain. This solution leverages a customer’s ability to produce high-quality adaptations with minimal effort in as short a time as possible, and thus make increasing volumes of critical DX (digital experience) content multilingual.