Translated Srl is a pioneer in using MT in professional translation settings at a production scale. The company has a long history of innovation in the effective use of MT technology (an early form of AI) in production settings. It has deployed MT extensively across much of its professional translation workload for over 15 years and has acquired considerable expertise in doing this efficiently and reliably.
Machine Translation IS Artificial Intelligence
One of the main drivers behind language AI has been the ever-increasing content volumes needed in global enterprise settings to deliver exceptional global customer experience. The rationale behind the use of language AI in the translation context has always been to amplify the ability of stakeholders to produce higher volumes of multilingual content more efficiently and at increasingly higher quality levels.
Consequently, we are witnessing a progressive human-machine partnership where an increasing portion of the production workload is being transferred to machines as technology advances.
Research analysts have pointed out that, even as recently as 2022-23, LSPs and localization departments struggled to use generic (static) MT systems in enterprise settings for the following reasons:
- Inability to produce MT output at the required quality levels, most often due to a lack of the training data needed to see meaningful improvement.
- Inability to properly estimate the effort and cost of deploying MT in production.
- Static MT cannot adapt easily to the ever-changing needs and requirements of different projects, which creates a mismatch of skills, data, and competencies.
The Adaptive MT Innovation
In contrast to much of the industry, Translated was a first mover in the production use of adaptive MT and has used it since the Statistical MT era. The adaptive MT approach is an agile and highly responsive way to deploy MT in enterprise settings, and it is particularly well-suited to rapidly changing enterprise use cases.
From the earliest days, ModernMT was designed to be a useful assistant to professional translators to reduce the tedium of the typical post-editing (MTPE) work process. This focus on building a productive and symbiotic human-machine relationship has resulted in a long-term trend of continued improvement and efficiency.
The ModernMT approach to MT model adaptation is to bring the encoding and decoding phases of model deployment much closer together, allowing dynamic and active human-in-the-loop corrective feedback, which is not so different from the in-context corrections and prompt modifications we are seeing being used with large language models today.
It is now common knowledge that machine learning-based AI systems are only as good as the data they use. One of the keys to long-term success with MT is to build a virtuous data collection system that refines MT performance and ensures continuous improvement. This high-value data collection effort has been underway at Translated for over 15 years and is a primary reason why ModernMT outperforms competitive alternatives.
This is also a reason why it makes sense to channel translation-related work through a single vendor so that an end-to-end monitoring system can be built and enhanced over time. This is much more challenging to implement and deploy in multi-vendor scenarios.
The existence of such a system encourages more widespread adoption of automated translation and enables the enterprise to become efficiently multilingual at scale. Such a technological foundation allows the enterprise to break down language as a barrier to global business success.
The MT Quality Estimation & Integrated Human-In-The-Loop Innovation
As MT content volumes rapidly increase in the enterprise, it becomes more important to make the quality management process more efficient, as human review methods do not scale easily. Any multilingual-at-scale initiative benefits from rapidly identifying the MT output that most needs correction and focusing corrective feedback primarily on these lower-quality outputs, so that the MT system continually improves and overall quality rises across a large content volume.
The basic idea is to enable the improvement process to be more efficient by immediately focusing 80% of the human corrective effort on the 20% lowest-scoring segments. Essentially, the 80:20 rule is a principle that helps individuals and companies prioritize their efforts to achieve maximum impact with the least amount of work. This leveraged approach allows overall MT quality, especially in very large-scale or real-time deployments, to improve rapidly.
Human review of all content at a global scale is costly and, given ever-increasing volumes, probably a physical impossibility. As the use of MT expands across the enterprise to drive international business momentum, and as more automated language technology is used, MTQE technology offers enterprises a way to identify which content needs the least and which needs the most human review and attention before it is released into the wild.
When a million sentences of customer-relevant content need to be published using MT, MTQE is a means to identify the ~10,000 sentences that most need human corrective attention to ensure that global customers receive acceptable quality across the board.
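The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not Translated's implementation: the function name `select_for_review`, the `review_fraction` parameter, and the assumption that each segment already carries a numeric MTQE score (higher meaning better) are all hypothetical.

```python
from typing import List


def select_for_review(scores: List[float], review_fraction: float = 0.01) -> List[int]:
    """Return the indices of the lowest-scoring segments, worst first.

    `scores` holds one (assumed) MTQE score per segment, higher = better.
    `review_fraction` is the share of the corpus humans can afford to review.
    """
    if not 0.0 <= review_fraction <= 1.0:
        raise ValueError("review_fraction must be between 0 and 1")
    budget = round(len(scores) * review_fraction)
    # Rank segment indices by score, ascending, so the weakest come first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:budget]
```

With a `review_fraction` of 0.01, a corpus of one million scored sentences yields a review queue of 10,000, which matches the proportions in the example above.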
This informed identification of the problems that need human attention allows for a more efficient allocation of resources and improved productivity. It enables much more content to be published without risking brand reputation, while ensuring that desired quality levels are achieved. In summary, MTQE is a useful risk management strategy as volumes climb.
Routing content with lower MTQE scores into a workflow that connects a responsive, continuously learning adaptive MT system like ModernMT with expert human editors creates a powerful translation engine. This combination allows for handling large volumes of content while maintaining high translation quality.
When a responsive adaptive MT system is integrated with a robust MTQE system and a tightly connected human feedback loop, enterprises can significantly increase the volume of published multilingual content.
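The integrated loop described here (translate everything, score it, route only the weak segments to an editor, and feed the corrections back) can be sketched as below. Everything in this example is a hypothetical stand-in: `ToyAdaptiveMT`, `run_batch`, `toy_qe`, and `toy_editor` are illustrative names, not the ModernMT API, and the "adaptation" is reduced to a simple lookup of stored corrections.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAdaptiveMT:
    """Hypothetical stand-in for an adaptive MT engine; not the ModernMT API."""
    memory: Dict[str, str] = field(default_factory=dict)

    def translate(self, source: str) -> str:
        # Reuse a stored human correction when one exists, else emit a raw draft.
        return self.memory.get(source, f"<draft:{source}>")

    def learn(self, source: str, corrected: str) -> None:
        # The correction is available to the very next request.
        self.memory[source] = corrected


def run_batch(
    engine: ToyAdaptiveMT,
    sources: List[str],
    estimate_quality: Callable[[str, str], float],
    human_edit: Callable[[str, str], str],
    threshold: float,
) -> List[str]:
    """Translate everything; send only low-scoring drafts to a human editor."""
    published: List[str] = []
    for source in sources:
        draft = engine.translate(source)
        if estimate_quality(source, draft) < threshold:
            draft = human_edit(source, draft)
            engine.learn(source, draft)  # feed the correction back to the engine
        published.append(draft)
    return published


# Toy scorer and editor, for illustration only.
def toy_qe(source: str, draft: str) -> float:
    return 0.0 if draft.startswith("<draft:") else 1.0


def toy_editor(source: str, draft: str) -> str:
    return source.upper()
```

Running `run_batch` twice over overlapping content shows the intended effect: segments corrected in the first pass are not sent to the editor again in the second, so the human workload shrinks as the engine accumulates feedback.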
The conventional method, involving various vendors with different and distinct processes, is typically slow and prone to errors. However, this sluggish and inefficient method is frequently employed to enhance the quality of MT output, as shown below.
Rapidly pinpointing errors and concentrating on minimizing the size of the data set requiring corrective feedback is a crucial aim of MTQE technology. The business goal centers on swiftly identifying and rectifying the most problematic segments.
Speed and guaranteed quality at scale are highly valued deliverables. Innovations that decrease the volume of data requiring review and reduce the risk of translation errors are crucial to the business mission.
Using an adaptive rather than a generic MTQE process further extends the benefit of this technology by reducing the amount of content that needs careful review.
The traditional model of post-editing everything is now outdated.
The new approach entails translating everything and then only revising the worst and most erroneous parts to ensure an acceptable level of quality.
For example, suppose that reviewing the 40% of sentences with the lowest scores from a generic MTQE model identifies 60% of the major problems in a corpus. An adaptive MTQE model informed by customer data can identify 90% of the major translation problems while reviewing only the 20% of sentences with the lowest scores.
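The arithmetic behind this comparison can be made explicit. The short sketch below uses only the percentages quoted in the example above; the function name `yield_per_review` is a hypothetical helper, and the ratio it computes is an illustrative measure of review efficiency rather than a metric defined in this article.

```python
from fractions import Fraction


def yield_per_review(review_pct: int, problems_found_pct: int) -> Fraction:
    """Share of major problems found per unit share of the corpus reviewed."""
    return Fraction(problems_found_pct, review_pct)


generic = yield_per_review(review_pct=40, problems_found_pct=60)   # 3/2
adaptive = yield_per_review(review_pct=20, problems_found_pct=90)  # 9/2
improvement = adaptive / generic                                   # 3
```

Under these figures, each reviewed sentence surfaces three times as many major problems with the adaptive model as with the generic one, while the total review workload is halved.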
This innovation greatly enhances overall efficiency. The chart below shows how adaptive MT, MTQE, and focused human-in-the-loop (HITL) work combine to build a continuously improving translation production platform.
The capability to enhance the overall quality of translation in a large, published corpus by analyzing less data significantly boosts the efficiency and utility of automated translation. An improvement process based on Machine Translation Quality Estimation (MTQE) is a form of technological leverage that benefits large-scale translation production.
The Evolving LLM Era and Potential Impact
The emergence of Large Language Models (LLMs) has opened up thrilling new opportunities. However, there is also a significant number of vague and ill-defined claims of "using AI" by individuals with minimal experience in machine learning technologies and algorithms. The disparity between hype and reality is at an all-time high, with much of the excitement not living up to the practical requirements of real business use cases. Beyond concerns of data privacy, copyright, and the potential for misuse by malicious actors, issues of hallucinations and reliability persistently challenge the deployment of LLMs in production environments.
Enterprise users expect their IT infrastructure to consistently deliver reliable and predictable outcomes. However, this level of consistency is not currently easily achievable with LLM output. As the technology evolves, many believe that expert use of LLMs could significantly and positively impact current translation production processes.