Tuesday, May 10, 2022

MT as an Enabling Technology for Global eCommerce

We regularly hear about the digital disruption that eCommerce has caused in the retail industry, and there are many examples of retail giants of yesteryear that have been all but obliterated by the power shift to online stores.

This is not just true for B2C markets but is now increasingly a factor in B2B markets as well.

The pandemic has accelerated this trend, requiring more firms to establish an agile and comprehensive digital presence, even though there was already evidence that a substantial online presence was critical to delivering superior CX and building brand relevance.

As lockdowns became the new normal, businesses and consumers increasingly “went digital”, providing and purchasing more goods and services online, raising e-commerce’s share of global retail trade dramatically in a single year.

Source: Digital 2022 Global Overview

The preference for a customer-managed, online, self-service shopping experience has been a trend for over a decade. This is very clear to digitally-savvy executives who understand the building blocks needed to deliver superior CX and digital transformation strategies. Few customers are willing to trade a less time-consuming, more transparent online experience for a more time-consuming, less transparent, in-store-only experience.

Digitally-savvy executives understand that providing content relevant to the buyer journey matters, and is a key element of enabling digital online success.

Source: Shopify

The Pandemic Impact and eCommerce Megatrends

The pandemic restrictions forced consumers and workers around the globe to stay at home and thus dramatically accelerated the shift to online for both B2C and B2B markets.

COVID-19 essentially forced the trial of new behaviors, and consumers had no choice in the early part of the pandemic but to trial e-commerce as a shopping channel. And surprisingly—or not—they liked it.

The pandemic has accelerated the use of digital channels, and many believe that the opportunities and paradigm shifts that have emerged will persist post-crisis.

Companies will have to communicate, engage, and interact with customers in new ways because customer behavior has changed.

McKinsey described the pandemic impact as a “perfect storm for fashion marketplaces.” Fast-fashion brand Shein saw its valuation double to $30 billion and its revenue grow by over 170% during this period, making it the world’s largest online-only fashion retailer. Twenty other online retailers (mostly fashion) saw revenue growth of over 150%.

At the same time, consumer behavior is changing: consumers are looking for purpose and sustainability from the brands they support, and are willing to switch brands quickly if they do not see this.

Research shows that peer purchasing insights (customer and user reviews) seem to have more influence on consumers than any marketing strategy. Estimates say that global reviews doubled in the year after COVID-19 began, on top of an already very steady long-term growth trend.

The pandemic also caused B2B customer behavior to shift dramatically, favoring eCommerce and videoconference interactions with sales reps.

B2B sales are now resolutely omnichannel, with eCommerce, face-to-face, and remote videoconference sales all a necessary part of the buyers’ experience.

Omnichannel is a path to market share growth, analysts say. The more channels a sales organization deploys, the bigger the expected market share gains, and there is clear evidence that B2B customers also prefer online shopping, especially in the early phases of the buyer journey.

As Millennials rise in seniority in business settings, they’re questioning why the B2B buyer experience should be different, citing a disconnect between an archaic spreadsheet-driven B2B buying experience and the CX-friendly personal B2C experience they prefer. A recent study by Demand Gen found that 44 percent of Millennial respondents indicated they are primary decision-makers at their companies for purchases of $10,000 or more.



The B2B eCommerce Opportunity

While much of the focus is on the B2C progress, McKinsey states: “We now see a tipping point, with [B2B] eCommerce surpassing in-person selling as a sales channel, at 65 percent, versus 53 percent earlier this year (2021). Videoconferencing and online chat also rose during the year.”

In 2021, online sales on US B2B eCommerce sites, log-in portals, and marketplaces increased 17.8% to $1.63 trillion from $1.39 trillion in 2020. B2B eCommerce in 2021 grew 1.17 times faster than the growth of all U.S. manufacturing and distributor sales.

B2B eCommerce sales accelerated in 2021 in large measure because more business buyers and sellers now see digital commerce as a more efficient and effective way to research and purchase corporate goods and services, according to analysts.

  • B2B buyers are increasingly choosing to buy online rather than through phone and other offline channels. B2B buyers prefer digital, self-guided experiences because they can explore, research, and purchase on their own terms. According to TrustRadius, “87% of buyers want [the ability] to self-serve part or all of their buying journey.”
  • Forrester data shows that nearly 75% of B2B buyers prefer to buy online when purchasing products for work, yet just 25% of B2B companies actively sell online. McKinsey research shows that only 20% of B2B buyers say they hope to return to in-person sales. This proves to be especially true in sectors where field sales have long dominated, including pharma and medical products.
  • According to McKinsey, customers want an always-on, personalized, omnichannel experience, and ALL B2B customers prefer omnichannel, no matter their industry, country, size, or customer relationship stage. Customers are more willing than ever to switch suppliers to gain exceptional omnichannel experiences.
  • 51% of business buyers come to a B2B eCommerce site attracted by an excellent user experience.
  • Buyers are more willing than ever before to spend substantial amounts through remote or online sales channels. Globally, 62% of B2B decision-makers are now willing to spend $50,000 or more in online purchases—and one in five would spend more than $500,000.
  • Other benefits of B2B eCommerce, according to Forrester, include substantially (as much as 90%) lower selling and service costs. Additionally, 52% of B2B executives say they have reduced their customer-support costs by migrating offline customers online.
  • The new B2B buyer will only be loyal if customer needs are met: for example, eight in ten B2B decision-makers say they will actively look for a new supplier if performance guarantees (e.g., a full refund if a certain level of performance is not met) are not offered.
  • The number of channels needed to service customers effectively has increased over the last five years.

The Global Opportunity Outlook

At the start of 2020, 1.35 billion people were in the global “middle class” with the majority of the middle-class growth happening in the Asia Pacific region.

While the middle-class growth in Asia has slowed with the arrival of COVID-19, eCommerce’s center of gravity has already moved East and will continue to do so as large population centers continue to generate more disposable income.

For the first time in history, the world is within reach.

For example, Patrick Coddou, founder and CEO of Supply, said: “Early last year [2021], we decided to duplicate our domestic ads, change the targeting from [the] US to worldwide, set language to English, and hit go. The end result was 30% of revenue coming from international markets.”

This kind of success justifies further investment and improvement in the global outreach process. However, there is much evidence that sustaining global success needs investment, which starts with providing growing volumes of content in the local languages of the target markets.

Early SEO-based success can quickly flounder if the larger buyer and customer journey information needs are not addressed, which means substantially more content translation is needed.

The success of early-mover eCommerce marketplaces has also bred increasing competition, and suppliers have to continue to improve their global digital footprint and DX to be noticed or even to maintain gains.

Providing local language content means more than quickly passing content through Google Translate. Raw MT without proper refinement and optimization can undermine global sales efforts and build a negative brand reputation that is difficult to remedy.

Research from CSA states that there is “a strong preference for local language and localization, even if it costs the buyer more. Nearly two-thirds (66%) of business users told us they’d pay up to 30% more for a localized product (2020 survey), and just a bit more than one-third (34%) of consumers said they would also be willing to dig deeper into their wallets for products adapted to their language and market.”

Source: CSA Research

In a previous survey, CSA also found that 65% of consumers prefer content in their native language, even if it is of poor quality. Moreover, 40% will not buy from websites in other languages.

In another survey by Flow.io, over 67% of global consumers surveyed said they’d made a cross-border purchase in their lives. Almost one in five respondents stated that lack of language translation was a big barrier to purchasing on a foreign site.

Shopify data shows a 13% relative increase in conversion when buyers were shown a store translated into their language compared to the same one in the default language. Best practice shows that “properly localized content” creates a good customer experience from first impression to checkout.

In fact, the majority of shoppers in Flow.io’s survey agreed that the following website pages, which include both corporate and user-created content, needed to be in their local language:

  • Product descriptions (67%)
  • Product reviews (63%)
  • Checkout process (63%)


Where do we buy from?

The USA, China, and the UK consistently ranked among the top three countries purchased from in most markets. Japan and Korea are markets where translation is even more critical to success, as cultural reluctance to make cross-border purchases is significantly higher.

Source: Flow.io

There have never been as many opportunities in the eCommerce space, nor has there been as much competition. Plummeting return on ad spend is pushing brands to prioritize customer lifetime value and promote brand loyalty. High-quality multilingual content is essential to this effort.

Global customers care about the content that brands share with them, and brands committed to being international need ongoing translation strategies that go beyond “MT-once and forget” approaches. They need to be listening, sharing, and actively communicating multilingually on an ongoing basis to understand changing needs and concerns.

Sales through social media channels around the world are expected to nearly triple by 2025. Customer reviews in social media are recognized as key influencers of brand perception.

Bad CX will be shared vigorously, and can rapidly undermine business success and revenue. Customer reviews influence consumers to try competing products. 

In many eCommerce markets, customer reviews are THE primary driver of purchase behavior.

Being aware of these dynamic perceptions will be critical to long-term success, and being able to listen, understand, and respond to problems with agility and speed is crucial.

As more of the world gets more accustomed to a digital-first buyer journey, companies must adapt to stay relevant. As brands face unmatched logistical and communication challenges in the new millennium, they have focused on more engagement with their customers via digital channels.

Success in eCommerce means building ongoing relationships with customers, which in turn requires increasing volumes of content sharing and more localized infrastructure.

“I believe we'll see more local brands branching out and offering customized shopping experiences for international customers to remain competitive. This will include things like geo-targeted domain names, pricing in local currency, and local product shipping, with the help of third-party distribution or company-owned warehouses.”

Leanne Lee, Marketer at Blue Bungalow

The Content Focus

Success in online businesses is increasingly driven by careful and continued attention to providing a good overall customer experience throughout the buyer journey.

Customers want personalized, relevant information to guide their purchase decisions, and also want the self-service support content to be able to be as independent as possible after they buy a product. Thus sellers need to provide much more content, both in terms of volume and relevance, than they traditionally have provided.

Much of the customer journey today involves a buyer interacting independently with content related to the product of interest, and digital leaders increasingly understand that, on digital platforms, useful and relevant content is how this journey is enhanced and improved.

Understanding and providing content that matters to the customer is a prerequisite for providing superior DX and CX and enabling customer success.

At a recent Gartner conference, Brent Adamson presented a study of B2B digital buying behavior whose findings clearly show the increasing value of content in the buyer journey. Some of the highlights:

“Customers spend much more time doing research online -- 27% of the overall purchase evaluation and research [time]. Independent online learning represents the single largest category of time-spend across the entire purchase journey.”

In surveying 750 customers making a large B2B purchase, it was found that the proportion of time they spent working directly with salespeople -- both in-person and online -- was just 17% of the total purchase research and evaluation process time spent.

This fractional time is further diluted when you spread the total salesperson contact time across the three or more vendors that are typically involved in a B2B purchase evaluation. In a typical large B2B transaction, it is clear that an individual seller's sales representative gets a very small fraction of the total time that a buyer spends in the purchase and evaluation process.
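As a rough illustration of this dilution, the sketch below splits the 17% salesperson share evenly across the vendors in an evaluation. The even split and the function name are simplifying assumptions for illustration, not figures or methods from the Gartner research.

```python
# Share of total purchase-research time spent with salespeople (17%, from the study).
SALES_REP_SHARE = 0.17


def per_vendor_share(total_share: float, vendors: int) -> float:
    """Split the salesperson share evenly across vendors (an assumed simplification)."""
    if vendors < 1:
        raise ValueError("vendors must be at least 1")
    return total_share / vendors


if __name__ == "__main__":
    for vendor_count in (3, 4, 5):
        share = per_vendor_share(SALES_REP_SHARE, vendor_count)
        print(f"{vendor_count} vendors: {share:.1%} of buyer time per vendor")
```

Under these assumptions, with three vendors each sales representative gets roughly 5.7% of the buyer's research and evaluation time, which is why digital content carries so much of the seller's access to the customer.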

This research also points out that a huge proportion of a seller's total access to a customer happens through digital content rather than in-person channels. This means that any B2B supplier without a coherent digital marketing strategy specifically designed to help buyers through the buyer journey will rapidly fall behind those that have one.

Digital leaders also understand that the start of in-person contact does not mean that online exploration ends. Even long after engaging supplier sales reps in direct in-person conversations, customers continue their digital buying journey, making use of both human and digital buying channels. This is why B2B eCommerce in particular is so omnichannel-focused.

Forrester suggests that product information is more important to customer experience than any other source or type of information including all sales or marketing content.

A digital online platform enables an enterprise to quickly establish a global presence. However, the global customer requires all the same content that a US customer does in the buying journey. This requirement for voluminous multilingual content presents a significant translation challenge for any enterprise that seeks to build momentum in new international markets and thus the right translation technology is critical.

Thus, eCommerce giants like eBay, Amazon, and Alibaba are among the largest users of machine translation technology in the world today. There is simply too much content that needs to be multilingual to do this any other way.

However, the translation challenge, even with MT, is significant and requires deep expertise and competence to do well. The skills needed to do this efficiently and cost-effectively are not easily developed, and many B2B sellers are beginning to realize that they do not have the in-house competence to do this and could not effectively develop it in time. Working with comprehensive language translation platforms or translation engine experts is wise.

Providing local language content means more than quickly passing your content through Google Translate

To participate in new global opportunities, digital leaders should be preparing the following elements in their expanding online digital footprint:

  • Make substantial amounts of relevant content available to support the buyer journey in both B2C and, especially, B2B markets.
  • Make this content available in all the markets and languages in which they wish to participate, to maximize their global presence.

Technology alone, or human translation alone, is not enough. Competitive advantage comes from working with experts who provide comprehensive human-machine translation collaboration at scale.


The MT Strategy Beyond Google Translate

Given the increasing volumes of content required, suppliers must use machine translation (MT). And while MT does not replace humans, it enables suppliers to make 100X+ more content available to support international customers cost-effectively.

Today, there is a need for growing volumes of both corporate and customer-created content (UGC). The benefit of making more relevant content available outweighs the limitations of imperfect machine translation. This is especially true for suppliers with large catalogs, extensive customer support needs, and high-impact UGC that global customers find useful.

Fundamentally, eCommerce marketplaces must understand the importance of the content in enabling customers to achieve their goals (e.g., what content is critical in their path to purchase) considering the volume, necessary translation turnaround, and languages needed.

They should then build translation production models directly related to the value, criticality, and longevity of the information. The emerging best-practice models use human-machine collaboration mixes that can be adapted to content type and value. Also, rapidly improving MT technology like ModernMT assures successful outcomes.
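One way to picture a production model tied to value, criticality, and longevity is a simple routing rule. The sketch below is purely illustrative: the `ContentItem` fields, the 1 to 5 scoring scales, the thresholds, and the workflow names are all hypothetical assumptions, not a description of how any particular platform works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentItem:
    name: str
    business_value: int   # hypothetical scale: 1 (low) to 5 (high)
    criticality: int      # hypothetical scale: 1 (low risk if mistranslated) to 5 (high risk)
    longevity_days: int   # how long the content is expected to stay live


def choose_workflow(item: ContentItem) -> str:
    """Pick a human-machine mix for one content item (assumed thresholds)."""
    # High-risk content always gets full human review of the MT output.
    if item.criticality >= 4:
        return "human_review_of_mt"
    # Valuable, long-lived content justifies a light post-edit pass.
    if item.business_value >= 3 and item.longevity_days >= 90:
        return "mt_with_light_post_edit"
    # Everything else (high-volume, short-lived content) ships as raw MT.
    return "raw_mt"


if __name__ == "__main__":
    samples = [
        ContentItem("refund policy", business_value=4, criticality=5, longevity_days=365),
        ContentItem("product description", business_value=4, criticality=2, longevity_days=180),
        ContentItem("customer review", business_value=2, criticality=1, longevity_days=30),
    ]
    for sample in samples:
        print(f"{sample.name}: {choose_workflow(sample)}")
```

In practice the scores and thresholds would be tuned per content type and market; the point of the sketch is only that the human effort is allocated by rule rather than applied uniformly.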


This can sometimes mean that MT output is less polished in the initial stages but, with the right technology and process, it is possible to quickly establish a beachhead that evolves continuously.

UGC is a dominant element of the eCommerce content landscape and presents special challenges for MT technology. UGC is often written by non-native speakers and, most likely, by non-professional content writers. Marco Trombetti, CEO of Translated, said: “There is a lot of flexibility that the AI [MT] needs to learn to translate UGC content well. It is not like training a custom model on a very narrow terminology.”


Tech-savvy localization managers who understand this “start now and improve gradually” approach are now being seen as vital partners in global growth strategies. Best practices suggest that the most effective strategy is to have MT and human translators working together to build a continuous improvement cycle.

The strategy to translate a billion new words every month has to be different from the typical localization TEP (translate-edit-proof) production process.

Airbnb is an example where the localization team is seen as a vital partner in enabling global growth. The Airbnb localization team oversees both typical localization content and user-generated content (UGC), across the organization, which means they oversee billions of words a month being translated across 60+ languages using a combined human plus continuously improving MT translation model. The localization team enables Airbnb to translate customer-related content across the organization at scale. High-value external content is often accorded the same attention as internally produced marketing content.

Airbnb runs on ModernMT, the Translated-led, open-source project, co-founded by Fondazione Bruno Kessler, the University of Edinburgh, and the European Commission. ModernMT is an adaptive neural machine translation system with a degree of flexibility that allows it to be used for hundreds of different use-cases, including IP and life sciences translations.

Trombetti added, “The indirect challenge with UGC is scale. Often UGC scale can be a million times bigger in volume than content produced by localization teams, and the volume spikes are much more unpredictable.”

Continuously improving, responsive MT is a critical foundation for building better global CX in eCommerce settings. With ModernMT, optimized raw MT output can be delivered at scale to the customer without additional human intervention, although ongoing MT quality monitoring is recommended.

The linguistic human oversight process and approach will likely change over time.

Often, upfront human feedback investments are needed, in addition to selective pre-emptive translation, to make MT engines perform optimally on a range of unique and specialized enterprise content.

It is possible to post-edit 1M words a month with a team, but it is not easy to do this for a billion words a month.
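A back-of-the-envelope calculation makes the gap concrete. The per-editor throughput (5,000 words per day) and the 20 working days per month used below are illustrative assumptions, not figures from this post.

```python
def editors_needed(
    words_per_month: int,
    words_per_editor_per_day: int = 5_000,  # assumed post-editing throughput
    working_days_per_month: int = 20,       # assumed working days
) -> int:
    """Return the number of full-time post-editors needed, rounded up."""
    monthly_capacity = words_per_editor_per_day * working_days_per_month
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-words_per_month // monthly_capacity)


if __name__ == "__main__":
    for volume in (1_000_000, 1_000_000_000):
        print(f"{volume:,} words/month -> {editors_needed(volume):,} editors")
```

Under these assumptions, one million words a month needs a team of about ten post-editors, while a billion words a month would need about ten thousand, which is why full post-editing cannot be the default at that scale.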

Rebecca Ray of CSA eloquently describes the impact of producing relevant content for modern eCommerce marketplaces:
“The [most] significant implications revolve around the recognition given to the business value of multilingual content by a high-tech company such as Airbnb through its financial investment in a cross-functional collaboration initiative to greatly expand language accessibility. They must recognize that, in many cases, their products and services do not function independently of information about them and that the most valuable content and code are often generated by third parties [UGC].
Both Airbnb and Expedia recognize that they are not lodging companies, but rather high-tech (multilingual) content companies. And Chesky [Airbnb CEO] certainly understands this very well as he touts Translation Engine [powered by ModernMT] availability, which will deliver five million listings in 62 languages and 500 million reviews without customers having to tap on a translate button.”

Any eCommerce marketplace with global ambitions will need to have a comprehensive language strategy that is flexible and sophisticated as digital leaders like Airbnb change and raise customer expectations.

As more senior executives in the global enterprise ask questions like:

  • How do we integrate our international strategy with our overall corporate strategy?
  • What will this take in terms of people, process, and technology?

We should expect a shift to language as a feature at the platform level wherein language is designed, delivered, and optimized as a feature of a product and/or service from the beginning.

Language accessibility is no longer relegated to a lowly translation task outside the bounds of product or service development and delivery. Rather, it is integrated into content and procedural workflows that affect almost everyone within the organization at some point, something that analysts call a "language platform."

Perhaps the recent interest in language operations and translation operating systems reflects steps in the same direction, pointing to a more globally embedded and pervasive translation-focused ecosystem.

The Airbnb deployment is a pioneering example that shows how extensive and deep-reaching translation workflows can be when the value is understood at executive levels.

ModernMT is uniquely positioned to be THE optimal platform-level MT element of such a comprehensive vision. It has flexibility, scalability, straightforward adaptability to scores of use-cases, and low overhead management and maintenance features that allow it to be a “translation engine” at scale.

This is in stark contrast to typical MT systems built by customizing generically focused public MT engines requiring specialized MT expertise, having costly, cumbersome management and maintenance requirements, yet not having the dynamic, real-time learning and adaptation capabilities that ModernMT has.

For more detailed information on ModernMT including white papers, technical product overviews, case studies, best practices, quality comparisons, and presentation material, please contact us here or at info@translated.com

Friday, March 11, 2022

The Evolving Relationship of MT with the Translator

Machine translation is pervasive today, and even the most conservative estimates say that MT is “translating” trillions of words a month across multiple large public MT portals and is used by hundreds of millions of internet users daily at virtually no cost.

As more of the global population comes online, people need MT to access the content that interests them, even if only in a gist sense. Today there is growing momentum in advancing the state of the art (SOTA) for “low-resource” languages (those with limited or scarce data) to further accelerate global MT use.

MT technology has been around in some form for the last 70 years and unfortunately has a long history of over-promising and under-delivering. A history of eMpTy promises as it were. However, the more recent history of data-driven MT has been especially troubling for translators, as SMT and NMT pioneers have repeatedly claimed to have reached human parity.

These over-exuberant claims about the accomplishments of MT technology have driven translator compensation down and have made many would-be translators reconsider their career choices.

It does not help that a more careful examination of the human parity claims by experts shows that these claims are not true, or perhaps only true for a tiny sample of test sentences.

Many say that the market perception of exaggerated MT capabilities has damaged translator livelihoods. There is also often great frustration among those who use MT in production environments, where high-quality, human-equivalent translation is expected but never delivered without significant additional effort and expense.

To add insult to injury, the overly optimistic MT performance claims have also resulted in many technology-incompetent LSPs attempting to use MT to reduce costs by forcing translators to post-edit low-quality MT output at low rates.

It does not seem to matter that most LSPs have yet to properly learn to use MT in localization production work, according to a survey of MT use by LSPs done by Common Sense Advisory last year.

It is also very telling that a blog post the author wrote on MT post-editing compensation in March 2012 has had the widest readership of any post he has ever written, and continues to be actively read even in 2022!

Thus, "monolithic MT" is often considered a dark, unhelpful, and unwelcome factor in the lives of translators. However, this state of affairs is often a result of incompetent and unethical use of the technology rather than a core technology characteristic.


The Content and Demand Explosion

However, the news on MT is not all doom and gloom from the translator's perspective. There is a huge demand for language translation as evidenced by the volume of use of public MT, and by the digital transformation imperatives for global enterprises driving the need for better professional MT.

Both public MT and enterprise MT are building momentum. The demand for content from across the globe is growing exponentially, which means that translation volumes will also likely explode. And, while much of it can be handled with carefully optimized enterprise MT, it will also need an ever-growing pool of tech-savvy translators to drive continuously improving MT technology.

World Bank estimates say that by 2022, yearly total internet traffic is projected to increase by about 50 percent from 2020 levels, reaching 4.8 zettabytes, equal to 150,000 GB per second. The growth in global internet traffic is as dazzling as the volume. Personal data are expected to represent a significant share of the total volume of data being transferred cross-border.
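The two traffic figures can be cross-checked with a short unit conversion. The sketch below assumes decimal units (1 ZB = 10^12 GB) and a 365-day year; the function name is just for illustration.

```python
GB_PER_ZETTABYTE = 10**12              # 1 ZB = 10^21 bytes, 1 GB = 10^9 bytes
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def zb_per_year_to_gb_per_second(zettabytes_per_year: float) -> float:
    """Convert a yearly traffic volume in zettabytes to gigabytes per second."""
    return zettabytes_per_year * GB_PER_ZETTABYTE / SECONDS_PER_YEAR


if __name__ == "__main__":
    rate = zb_per_year_to_gb_per_second(4.8)
    print(f"4.8 ZB/year is about {rate:,.0f} GB/s")
```

Under these assumptions, 4.8 zettabytes a year works out to a little over 152,000 GB per second, consistent with the rounded figure of 150,000 GB per second quoted above.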


It is estimated that the amount of digital data created over the next five years will be more than twice the amount created since the advent of digital storage. Global data creation and replication will experience a compound annual growth of 23 percent in the 2020–2025 forecast (IDC, 2021a). Data traffic trends are related to economic development, value creation, and prosperity.
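To see what a 23 percent compound annual growth rate implies over the forecast window, the sketch below compounds it over five yearly steps (2020 to 2025). Treating the forecast as exactly five compounding periods is an assumption made for illustration.

```python
def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Return the total growth multiple after compounding annual_rate for years."""
    return (1 + annual_rate) ** years


if __name__ == "__main__":
    factor = compound_growth_factor(0.23, 5)
    print(f"23% CAGR over 5 years multiplies annual volume by about {factor:.2f}x")
```

Under that assumption, the yearly volume of data created and replicated in 2025 would be roughly 2.8 times the 2020 level.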

The explosion in content volumes driven by these trends is already creating an increasing awareness of the supply shortage of translators. The furor around the poor quality of the translation of the Korean hit show “Squid Game” is a telling example of this changing scene.

LSPs and translators are critical to the distribution of that local content on a global scale. But because of a labor shortage and no viable automated solution, the translation industry is being pushed to its limits.

“I can tell you literally, this industry will be out of supply over demand for the upcoming two to three years,” David Lee, the CEO of Iyuno-SDI, one of the industry’s largest subtitling and dubbing providers, said recently. “Nobody to translate, nobody to dub, nobody to mix –– the industry just doesn’t have enough resources to do it.” Interviews with industry leaders reveal most streaming platforms are now at an inflection point, left to decide how much they are willing to sacrifice on quality to subtitle their streaming roster.

So while it is true that as we enter 2022 most LSPs have yet to learn how to use MT efficiently for production use, and that translator compensation at the word level has been decreasing over the last five years, there are also positive changes.

The Translated Srl experience with ModernMT shows that it is possible to use MT effectively for production localization work. Translated uses MT in 95% of its production workload, mainly because the technology is flexible, easy to set up, highly responsive, and agile enough to handle the variations typical in production work.

This is the result of superior architecture, better process integration, and sensitivity to human factors, refined over decades, to ensure sustainable and increasing productivity improvements.


The Translated Srl experience is also direct proof that MT can be a valuable assistive technology tool for serious, i.e., professional, human translation work.

The ModernMT technology is perhaps the only MT technology optimized for production localization work and is already in the process of being extended to work with video content (MateDub & MateSub). Video adds time synchronization challenges to the basic translation tasks.


The Importance of the Human-In-The-Loop

Exploding content volumes, and the enterprise CX demand to provide more relevant content to customers, also suggest that there is potential for rates to rise as more enterprises begin to understand that improving translation quality has to be linked to an increased role for humans-in-the-loop who make MT perform better on the specific content that matters to the enterprise.

As we consider the possibility of MT achieving human parity on language translation at production scale, we need to remind ourselves of the following: language is the cornerstone of human intelligence.

The emergence of language was the most important intellectual development in our species’ history. It is what separates us from all other species on the planet. It is through language that we formulate thoughts and communicate them to one another. Language enables us to reason abstractly, to develop complex ideas about what the world is and could be, and to build on these ideas across generations and geographies. Almost nothing in modern civilization would be possible without language.

Building machines that can “understand” language has thus been a central goal of the field of artificial intelligence dating back to its earliest days, but this has proven to be maddeningly elusive. The current state of MT is the result of 70 years of effort, and having a machine master language may either be impossible or simply much farther out in the future than the ML-focused singularity-is-nigh fanboys can envision.

This is because mastering language is what is known as an “AI-complete” problem: that is, an AI that can understand language the way a human can, would by implication be capable of any other human-level intellectual activity. Put simply, to solve the language challenge is to create human-equivalent machine intelligence.

Competent linguistic feedback is needed to improve the state of MT technology, and humans are needed to improve the quality of MT output for enterprise use.

We see today that machine translation is ubiquitous, and by many estimates is responsible for 99.5% or more of all language translation done on the planet on any given day. But we also see that MT is used mostly to translate material that is voluminous, short-lived, transitory and that would never get translated if the machine were not available to help.

Trillions of words are being translated by MT weekly, yet when it matters, there is always human oversight on translations that may have a high impact, or when there is great potential risk or liability from mistranslation.

While machine learning use-cases continue to expand dramatically, there is also an increasing awareness that a human-in-the-loop is necessary since the machine lacks comprehension, cognition, and common sense, all elements that constitute “understanding”.

As Rodney Brooks, the co-founder of iRobot said in a post entitled - An Inconvenient Truth About AI: "Just about every successful deployment of AI has either one of two expedients: It has a person somewhere in the loop, or the cost of failure, should the system blunder, is very low."

As the use of machine learning proliferates, there is an increasing awareness that humans working together with machines in an active learning contribution mode can often outperform the possibilities of machines or humans alone.

Many of the public generic MT engines already have billions of sentence pairs that underlie and “train” the model. Yet, we see an increasing acknowledgment from the AI community that language is indeed a hard problem. One that cannot necessarily be solved by using more data and algorithms alone, and a growing awareness that other strategies will need to be employed.

This does not mean that these systems cannot be useful, but we are beginning to understand that while language AI tools are useful, they have to be used with care and human oversight, at least until machines have more robust comprehension and common sense.

Effective human-in-the-loop (HITL) implementations allow the machine to capture an increasing amount of highly relevant knowledge and enhance the core application as ModernMT does with MT.

Another way to look at this is to see the Language AI or MT model as a prediction system, rather than as a representative model of a human translator.

Very simply put, we are using information that we do have to generate information that we don’t have.

MT models are built primarily with translation memory (a.k.a training data) and are most successful with material that is most similar to this training data. MT models take the new source material and produce a prediction of this material into a target language based on what it knows from what it has been explicitly trained with.

With deep learning, pattern detection and prediction have become more sophisticated, but we are still quite some distance from actual understanding, comprehension, and cognition.

The cognitive flow of a competent human translator has significantly more sophisticated capabilities across the many translation-related sub-tasks that require actual intelligence, gathered from multisensory life experience and common sense.

Human translators understand the relevant document, historical, and situational context even though it may not be explicitly stated. They identify semantic intent, and add cultural context into the translation, reading between the lines to ensure overall accuracy, guided by common sense, on what may not be stated but can be “understood” from life experience, insight, and deep comprehension.

This is in stark contrast to just performing the literal conversion of word strings and patterns from the source language to a target language that MT systems are limited to. Systems trained on billions of example "training" sentences have yet to capture what humans do. More data is not enough.

To restate, it is more accurate to see the MT model as a prediction system rather than an understanding system. Much of the recent success with AI and machine learning is a result of converting problems that were not historically prediction problems into prediction problems e.g. self-driving cars, fraud detection, and automated email replies.

MT systems are most useful when they produce a large number of useful predictions, even if these are not "perfect". MT is as useful to a translator as TM, maybe even more so, when it is responsive, continuously learning, and a true assistant.

The overview of the development and deployment of the prediction model can be seen in this generic graphic overview which is true for MT and many other ML use cases.


Once a model has been deployed, ongoing improvement in its prediction ability can be driven by more data, better learning algorithms, more computing power, and ongoing corrective feedback that becomes increasingly important as an ML model evolves in competence and performance.

After 70 years of MT research, it is increasingly clear that the efficient incorporation of human corrective feedback is one of the fastest and most useful ways available to improve an MT system's performance.

The following chart shows what happens at the monitor stage where human judgment and active corrective feedback on model outputs begin to drive improvements on the specific material in focus. The best systems will take feedback and process, learn, update, and incorporate new learning quickly to improve the predictions of the model in real-time.


The speed and ease with which new learning can be incorporated into an MT system are critical determinants of the value of the MT system to an individual translator. There is great value for all stakeholders in improving the predictive capabilities of an MT system.
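
The feedback loop described above can be illustrated with a deliberately tiny sketch. This is not how ModernMT or any neural MT system works internally; it is a toy that predicts by reusing the translation of the most similar sentence it has seen, using Python's standard `difflib`. The class name `ToyAdaptiveMemory`, its methods, and the example sentences are all hypothetical and exist only to show why immediate incorporation of a correction matters to a translator.

```python
from difflib import SequenceMatcher


class ToyAdaptiveMemory:
    """Toy stand-in for an adaptive MT system (hypothetical, for illustration).

    It predicts by reusing the translation of the most similar source
    sentence it has seen so far.
    """

    def __init__(self):
        self.pairs = []  # list of (source, target) tuples

    def learn(self, source, target):
        # Corrective feedback is stored immediately; there is no retraining step.
        self.pairs.append((source, target))

    def predict(self, source):
        # Return (best_target, similarity); similarity is 0.0 when nothing is known.
        best_target, best_score = None, 0.0
        for known_source, known_target in self.pairs:
            score = SequenceMatcher(None, source.lower(), known_source.lower()).ratio()
            if score > best_score:
                best_target, best_score = known_target, score
        return best_target, best_score


memory = ToyAdaptiveMemory()
memory.learn("Press the power button.", "Premere il pulsante di accensione.")

new_source = "Release the safety valve."
_, before = memory.predict(new_source)

# A translator corrects the output, and the correction is fed straight back.
memory.learn(new_source, "Rilasciare la valvola di sicurezza.")
after_target, after = memory.predict(new_source)

print(f"similarity before feedback: {before:.2f}")
print(f"similarity after feedback:  {after:.2f} -> {after_target}")
```

In this toy, the second prediction is available the moment the correction is stored, so the translator never has to fix the same sentence twice. A real adaptive system generalizes far beyond exact reuse, but the value to the translator comes from the same property: new learning is incorporated quickly and without a separate retraining project.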


ModernMT: An MT system designed for the translator

The modern translator's work experience often involves the use of translation memory (TM), since it improves translator productivity when the TM is related and relevant to the new translation work that a translator undertakes.

MT is used less often by professional translators in general for the following reasons:

  • Generic MT output is of limited value.
  • Most MT systems have a very limited ability to customize and adapt the generic system to the translator's area of focus and specialization.
  • The typically complex customization process often requires that translators have skills that are typically outside of the scope of translator education.
  • A large volume of data (more than most translators can summon) is needed to have any impact on generic engine performance. This also makes it difficult for most LSPs to customize an MT engine, as most of the MT models in the market require tens of thousands or more segments of training data to have an impact.
  • The very slow rate of improvement of most MT engines means that translators must correct the same errors over and over again. The whole improvement process can itself be a significant engineering undertaking.
  • The open admission of MT use is often penalized with lower compensation and lower word rates.
  • The inability to control and improve MT output predictably means that translators themselves have a higher level of uncertainty about the utility of MT given project deadlines, and thus fall back on traditional approaches.

For MT to be useful to a translator it needs the following attributes:

  • Tight integration with CAT tools that are the primary work environment for translators.
  • Easy to start using without geeky technical preparation and ML-customization-related work.
  • Rapid learning of new material and incorporation of any corrective feedback so that the MT system is continuously improving, by the day or even the hour.
  • The ability to handle project-related terminology with ease.
  • The ability to keep translator data private and secure.

ModernMT is an MT system that is designed to adapt to the unique needs and focus of an individual translator in essentially the same way that TM does. In many ways, it is a next-generation TM technology that has predictive capabilities.

ModernMT is a translator-focused MT architecture that has been built and refined over a decade with active feedback and learning from a close collaboration between translators and MT researchers.

ModernMT has been used intensively in all the production translation work done by Translated Srl for over 15 years and was a functioning human-in-the-loop (HITL) machine learning system before the term was even coined.

ModernMT is perhaps the only MT system that was designed by translators for translators rather than by pure technologists working in isolation with data and algorithms.

This long-term engagement with translators and continuous feedback-driven improvement process also results in creating a superior training data set over the years. This superior training data enables users to have an efficiency and quality advantage that is not easily or rapidly replicated.

This is also the reason why ModernMT does so consistently well in third-party MT system comparisons, even though evaluators do not always measure its performance optimally. ModernMT simply has more informed translator feedback built into the system.

The following is a summary of features in a well-designed Human-in-the-loop (HITL) system, such as the one underlying ModernMT:

  • Easy setup and startup process for any and every new adapted MT system that allows even a single translator to build hundreds of domain-focused systems.
  • Responsive: Active and continuous corrective feedback is rapidly processed so that translators can see the impact of corrections in real-time and the system improves continuously without requiring the translator to set up a data collection and re-training workflow.
  • An MT system that is continuously training and improving with this feedback (by the minute, day, week, month). Small volumes of correction can improve the ongoing MT performance.
  • Tightly integrated into the foundational CAT tools used by translators who provide the most valuable system-enhancing feedback.
  • Different engagement and interaction with MT than a typical PEMT experience. 

I recently interviewed several translators who are active ModernMT users and have summarized their comments (+ve and -ve) below. Their comments contain pearls of wisdom and anecdotal experience that may be useful to other translators who are still considering MT.

Subject focus by those who shared their usage patterns with me included accounting/finance, legal contracts, complex engineering equipment-related content, marketing content, product manuals, newsletters & press releases, medical information for patients, and even Buddhism & meditation-related content. Many simply provided categories like Law, Medical, Technical.

The extent of use: Used in the large majority of work they did, except for DTP or very specialized domain content that they did on an infrequent basis. Many said that the real benefits start to accrue after one builds up some TM and that over time ModernMT learns to support your primary workload.

How is MT engaged: CAT Tools (Trados), ModernMT GUI, and MateCat

Why: Work volumes and turnaround requirements and high-level data privacy and availability of TM to enable adaptation.

Competitive systems evaluated: Google, DeepL, Systran, Kantan

“I have used DeepL and Google, which can be very useful, although I still find ModernMT to have better overall accuracy compared to both of them. DeepL is a good alternative for comparing output, although it is much less consistent compared to ModernMT when working on large documents e.g. consistency of terminology etc.”

“I can tell you this with peace in my mind that nothing can replace ModernMT. ModernMT has magic that no one can describe. It really adapts to contexts and stores my previous translations and yields me 99% accurate translations.”

Improvements needed: Word case handling for acronyms and abbreviations, handling of short phrases and titles, the lack of persistence of terms across documents, better format preservation, better dashboard.

Desirable New features: Glossary and terminology handling, a dashboard on data and usage, more robust punctuation handling, real-time predictive capabilities, pre-translation quality assessment.

“I consider MT as a development tool, making our job easier, but not a tool that gives the final product. It is like an advanced medical tool used by a surgeon during surgery, which helps the surgeon to make fewer mistakes, to save time, and to save the life of the patient.”

A strong positive comment came from a translator who also suggested constructive areas for improvement: “I have noticed incredible improvement [in the MT quality] as if it is my roommate who was trying to get to know me and my translation style and way of constructing the sentences.”

Many were surprised to find out that glossary and terminology terms are best introduced to ModernMT in sentence form rather than as short phrases, as the context and variants shown in a full sentence ensure faster pick-up and learning.

Several expressed surprise that more translators did not realize cost/benefit and productivity advantages to be gained by using a responsive MT system like ModernMT and also mentioned that success with ModernMT required investment in one or all of the following: time, corrective feedback, and personal TM but can yield surprisingly good results in as little as a few weeks.


To close this post, I include a podcast done with ProZ last year that drew very positive feedback from many translators.

Conversation with Paul Urwin of ProZ on MT

Paul talks with machine translation expert Kirti Vashee about interactive-adaptive MT, linguistic assets, freelance positioning, how to add value in explosive content situations, e-commerce translation, and the Starship Enterprise.


Paul continues the fascinating discussion with Kirti on machine translation. In this episode, they talk about how much better MT can get, which languages it works well for, data, content, pivot languages, and machine interpreting.

Wednesday, January 12, 2022

Most Popular Blog Posts of 2021

Here is the list of most popular blog posts in 2021. The only theme that I can discern in the list is that there is a greater focus on better understanding what is real and viable from a technology viewpoint and looking beyond the hype. The secondary theme is more exploration into the "how" to do it right which is all about better human-machine collaboration and creating a more robust assistant role for MT.

I have noticed that these lists tend to favor the posts that were published earliest in the year and in 2020 the post would have easily been the top post had it been published earlier in the year. 

The most popular post for the year was:

1. The Quest for Human Parity Machine Translation 


Over the last few years, especially since the emergence of Neural MT, we have seen several claims of MT systems having reached human parity. Anyone could show that this was not true within minutes of submitting a few sentences to verify this. The basis of the claim typically is the performance of MT systems on certain measured metrics (scores) on tiny test sets. NLG rankings have the same problem with leaderboards with over-exuberant claims of having reached human parity. Thus, the extrapolations of achieving human-level performance are extravagant, to put it mildly. However, as soon as you move away from data that is typical in the training data, one notices how brittle and fragile these systems really are.

MT developers should refrain from making claims of achieving human parity until there is clear evidence that this is happening at scale. Most current claims on achieving parity are based on laughably small samples of 100 or 200 sentences. I think it would be useful to the user community-at-large that MT developers refrain from making these claims until they can show all of the following:
    • 90% or more of a large sample (>100,000 or even 1M sentences) that are accurate and fluent and truly look like they were translated by a competent human
    • Catch obvious errors in the source and possibly even correct these before attempting to translate 
    • Handle variations in the source with consistency and dexterity
    • Have at least some nominal amount of contextual referential capability
Note that these are things we would expect without question from an average translator. So why not from the super-duper AI machine? 
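
One reason, among others, that 100 or 200 sentences is too small a sample is plain sampling error. The sketch below uses the standard normal-approximation confidence interval for a proportion and assumes the test sentences are independent random draws, which real test sets rarely are; the function name `margin_of_error` is hypothetical. It says nothing about content diversity, which is a separate and arguably larger problem with tiny test sets.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Half-width of the normal-approximation 95% confidence interval
    for a proportion p_hat observed on n independent sentences."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


# Compare the sample sizes mentioned above for an observed 90% "human-like" rate.
for n in (100, 200, 100_000, 1_000_000):
    moe = margin_of_error(0.90, n)
    print(f"n={n:>9,}: 90% +/- {100 * moe:.2f} percentage points")
```

Under these assumptions, a 90% result on 100 sentences carries an uncertainty of roughly six percentage points either way, while the same result on 100,000 sentences is pinned down to a fraction of a point.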

 



The second most popular post was a guest post by @VeredShwartz on the challenge of building AI that has common sense.

Common sense has been called the “dark matter of AI” — both essential and frustratingly elusive. That’s because common sense consists of implicit information — the broad (and broadly shared) set of unwritten assumptions and rules of thumb that humans automatically use to make sense of the world. Critics of over-exuberant AI claims frequently point out that two-year-old children have more common sense than existing deep-learning-based AI systems whose "understanding" is often quite brittle and easily distracted and deranged.

Common sense is easier to detect than to define. The implicit nature of most common-sense knowledge makes it difficult and tedious to represent explicitly. 

"The great irony of common sense—and indeed AI itself—is that it is stuff that pretty much everybody knows, yet nobody seems to know what exactly it is or how to build machines that possess it," said Gary Marcus, CEO and founder of Robust.AI. "Solving this problem is, we would argue, the single most important step towards taking AI to the next level. Common sense is a critical component to building AIs that can understand what they read; that can control robots that can operate usefully and safely in the human environment; that can interact with human users in reasonable ways. Common sense is not just the hardest problem for AI; in the long run, it's also the most important problem." 


The third most popular post was based on some research I did on ModernMT which impressed me enough that I decided to join the company that built it. This decision was further validated when they announced that they were the heart of the "translation engine" that Airbnb uses to power UGC translation and ensure an optimal global CX for all their customers. This is done by translating billions of words a month through a continuously improving MT infrastructure and is quite likely to be one of the largest deployments of MT technology in the world for UGC by any global enterprise.

3. ModernMT: A Closer Look At An Emerging Enterprise MT Powerhouse

The ModernMT system was used heavily by translators who worked for Translated and the MT systems were continually adapted and modified to meet the needs of production translators. This is a central design intention and it is important to not gloss over this, as this is the ONLY MT initiative I know of where Translator Acceptance is used as the primary criterion on an ongoing basis, in determining whether MT should be used for production work or not. The operations managers will simply not use MT if it does not add value to the production process and causes translator discontent.

The long-term collaboration between translators and MT developers, and the resulting system and process modifications, are the key reasons why ModernMT does so well in generic MT system comparisons by independent testers. This is especially pronounced in adapted/customized MT comparisons.

Over the years the ModernMT product evolution has been driven by changes to identify and reduce post-editing effort rather than optimizing BLEU scores as most others have done. This makes it the best system available for translators in my opinion as all the heavy lifting for customization is done in the background, seamlessly and transparently.

ModernMT has reached this point with very little investment in sales and marketing infrastructure. As this builds out and expands I will be surprised if ModernMT does not continue to expand and grow its enterprise presence, as enterprise buyers begin to understand that a tightly integrated man-machine collaborative platform that is continuously learning, is key to creating successful MT outcomes.

This was followed by:

4. Building Equity In The Translation Workflow With Blockchain


and an interview with ProZ which was well received and which continues to regularly generate feedback from readers. It includes links to the original podcast.


Midway through the year, I started engaging with ModernMT and Translated in a much more substantial way, and thus there was a continuity break and publishing hiatus for a while. 

The posts since my engagement with Translated are influenced by my increasing exposure to ModernMT, but they are still honest opinions that I would stand by. I expect that these posts will become much more popular as they have time to circulate.

The most popular in 2021 are:



Ideally, the “best” MT system would be identified by a team of competent translators who would run a diverse range of relevant content through the MT system after establishing a structured and repeatable evaluation process. 

This is slow, expensive, and difficult, even if only a small sample of 250 sentences is evaluated.

Thus, automated measurements that attempt to score translation adequacy, fluency, precision, and recall have to be used. They attempt to do what is best done by competent humans. This is often done by comparing MT output to a human translation in what is called a Reference Test set. These reference sets cannot provide all the possible ways a source sentence could be correctly translated. Thus, these scoring methodologies are always an approximation of what a competent human assessment would determine, and can sometimes be wrong or misleading. Small differences in scores are particularly meaningless.
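
To make the limitation of reference-based scoring concrete, here is a minimal sketch. It is not BLEU or hLepor; it is a much simpler clipped unigram precision, written only to show the general mechanism of comparing MT output to a single reference. The function name `unigram_precision` and the example sentences are made up for illustration.

```python
from collections import Counter


def unigram_precision(hypothesis, reference):
    """Clipped unigram precision: the share of hypothesis words that also
    appear in the reference (each reference word can be matched only once)."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    matched = sum(min(count, ref[word]) for word, count in hyp.items())
    total = sum(hyp.values())
    return matched / total if total else 0.0


reference = "the meeting was postponed until next week"
close_copy = "the meeting was postponed until next month"  # wrong meaning
paraphrase = "they pushed the meeting back to the following week"  # right meaning

print(f"close copy, wrong meaning: {unigram_precision(close_copy, reference):.2f}")
print(f"paraphrase, right meaning: {unigram_precision(paraphrase, reference):.2f}")
```

In this example the output with the wrong meaning scores higher than the correct paraphrase, simply because it shares more words with the one available reference. Production metrics are more sophisticated than this, but they inherit the same dependence on the reference set.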

Thus, identifying the “best MT” solution is not easily done. Consider the cost of evaluating ten different systems on twenty different language combinations with a human team versus automated scores. Even though it is possible to rank MT systems based on scores like BLEU and hLepor, they do not represent production performance. The scores are a snapshot of an ever-changing scene. If you change the angle or the focus the results would change.

It has recently become common practice to use "MT routers" that select the "best" MT system for you, but I maintain that this is a practice that will often lead to sub-optimal choices, as your rankings and selections are only as good as your test set selections, and you are always looking at old, out-of-date data. MT systems are always evolving, and how quickly and easily systems learn to do what you focus on is much more relevant than a score from an old ranking.


The final post in the popularity list for 2021 is this one:

8. The Human-In-The-Loop Driving MT Progress


I expect this post will be an evergreen post since the issues raised are of long-term if not perennial interest. As we see Tesla Self Driving, Alexa, GPT-3, and the other AI fads of the day regularly fumble and fall, more and more people realize that AI can be a super assistant if properly built, but that it is wise and even imperative to keep a human-in-the-loop to keep the AI from doing dangerous or stupid things.

Neural MT “learns to translate” by looking closely (aka "training") at large datasets of human-translated data. Deep learning is self-education for machines; you feed the system huge amounts of data, and it begins to discern complex patterns within the data.

But despite the occasional ability to produce human-like outputs, ML algorithms are at their core only complex mathematical functions that map observations to outcomes. They can forecast patterns that they have previously seen and explicitly learned from. Therefore, they’re only as good as the data they train on and start to break down as real-world data starts to deviate from examples seen during training.

In most cases, the AI learning process happens upfront and only takes place in the development phase. The model that is developed is then brought onto the market as a finished program. Continuous “learning” is neither planned nor does it always happen after a model is put into production use. This is also true of most public MT systems. While these systems are updated periodically, they are not easily able to learn and adapt to new, ever-changing production requirements. 

With language translation, the critical training data is translation memory. However, the truth is that there is no existing training data set (TM) that is so perfect, complete, and comprehensive as to produce an algorithm that consistently produces perfect translations.

 Human-in-the-loop aims to achieve what neither a human being nor a machine can achieve on their own. When a machine isn’t able to solve a problem, humans step in and intervene. This process results in the creation of a continuous feedback loop that produces output that is useful to the humans using the system.

With constant feedback, the algorithm learns and produces better results over time. Active and continuous feedback to improve existing learning and create new learning is a key element of this approach.

As Rodney Brooks, the co-founder of iRobot said in a post entitled - An Inconvenient Truth About AI:

 "Just about every successful deployment of AI has either one of two expedients: It has a person somewhere in the loop, or the cost of failure, should the system blunder, is very low."

In the translation context, with ModernMT, this means that the system is designed from the ground up to actively receive feedback and rapidly incorporate this into the existing model on a daily or even hourly basis.

 AI lacks a theory of mind, cognition, common sense and causal reasoning, extrapolation capabilities, and a physical body collecting multi-sensory contextual data, and so it is still extremely far from being “better than us” at almost anything slightly complex or general.

This also suggests that humans will remain at the center of complex, knowledge-based AI applications even though the way humans work will continue to change. The future is more likely to be about how to make AI be a useful assistant than it is about replacing humans. 


For those who wonder which post has gotten the most readership over the 12 years that this blog has been in place, the answer is a post I wrote on post-editor compensation in 2012. This is unfortunate as it suggests that this is an issue that people are still grappling with in 2021 and that it remains unresolved for many. It is still being read thousands of times a year if Google Analytics is to be believed:

Exploring Issues Related to Post-Editing MT Compensation



My final post of 2021, which I wrote during downtime over the holiday season, has little or nothing to do with MT, but I somehow managed to link it to some musings on the limits of AI and machine learning. It is the post that I had the most fun writing and, based on initial feedback, one that people actually enjoyed reading. I would not be surprised if it is an evergreen post, i.e. one that continues to be popular over many years. I recommend it, as it is about human connection and is something that could be shared with anyone, even those who have little interest in AI, MT, or translation. It is primarily about music, which many have said is the universal language, and about how music connects us to feeling and emotion where language is unnecessary:

The Human Space Beyond Language

  

Peace.


Wishing you a wonderful and successful New Year.