Machine translation is pervasive today and even the most conservative
estimates say that MT is “translating” trillions of words a month
across multiple large public MT portals and is used by hundreds of
millions of internet users daily at virtually no cost.
As more of
the global population comes online, people need MT to access the content
that interests them, even if only to get the gist, and today we see
growing momentum in developing and advancing the
state-of-the-art (SOTA) for “low-resource” languages (those with limited or
scarce training data) to further accelerate global MT use.
MT
technology has been around in some form for the last 70 years and
unfortunately has a long history of over-promising and under-delivering.
A history of eMpTy promises, as it were. However, the more recent
history of data-driven MT has been especially troubling for translators,
as SMT and NMT pioneers have repeatedly claimed to have reached human parity.
These
over-exuberant claims about the accomplishments of MT technology have
driven translator compensation down and have made many would-be
translators reconsider their career choices.
Many
say that the market perception of exaggerated MT capabilities has
damaged translator livelihoods, and there is often great frustration among
those who use MT in production environments where high-quality,
human-equivalent translation is expected but never delivered without
significant additional effort and expense.
To add insult to injury, the
overly optimistic MT performance claims have also resulted in many
technology-incompetent LSPs attempting to use MT to reduce costs by
forcing translators to post-edit low-quality MT output at low rates.
It is also very telling that a blog post the author wrote on MT post-editing compensation
in March 2012 has had the widest readership of any post he has
ever written, and even in 2022 it continues to be actively read!
Thus,
"monolithic MT" is often considered a dark, useless, and
unwelcome factor in the lives of translators. However, this state of
affairs is often the result of incompetent and unethical use of the
technology rather than a core characteristic of the technology itself.
The Content and Demand Explosion
However,
the news on MT is not all doom and gloom from the translator's
perspective. There is huge demand for language translation, as
evidenced by the volume of public MT use and by the digital
transformation imperatives of global enterprises driving the need for
better professional MT.
Both public MT and enterprise MT are building momentum. The demand for content from across the globe is growing exponentially, which means that translation volumes will also likely explode.
And, while much of this volume can be handled with carefully optimized
Enterprise MT, it will also require an ever-growing pool of tech-savvy
translators to drive continuously improving MT technology.
World
Bank estimates project that by 2022, yearly total internet traffic will have
increased by about 50 percent from 2020 levels, reaching 4.8
zettabytes, equal to roughly 150,000 GB per second. The growth in global
internet traffic is as dazzling as the volume. Personal data are
expected to represent a significant share of the total volume of data
being transferred across borders.
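As a quick back-of-the-envelope sanity check on that figure (a minimal sketch, assuming the 4.8 zettabytes are spread evenly over the year and using decimal units):

```python
# Back-of-the-envelope check: 4.8 zettabytes per year expressed in GB per second,
# assuming traffic is spread evenly over the year (decimal units: 1 ZB = 10^12 GB).
ZB_IN_GB = 1e12
SECONDS_PER_YEAR = 365 * 24 * 3600

yearly_traffic_gb = 4.8 * ZB_IN_GB
per_second_gb = yearly_traffic_gb / SECONDS_PER_YEAR

print(f"{per_second_gb:,.0f} GB per second")  # ~152,000 GB/s, in line with the quoted 150,000
```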
Comments

MT definitely needs the human in the loop.
This has been the result of many MT providers doing a poor job over the past 25 years of building relationships with professional human translators. I warned in some LinkedIn posts in 2009 and 2010 that the result would be professional translators becoming enemies rather than allies. Professional translators have a very large influence on customers. Some LSPs even train their salespeople to show poor-quality MT output in order to sell more higher-quality translation services without MT.
There are very few university training programs that work with MT software, and very little involvement in building up the next generation of MT post-editors, or whatever the participants end up being called.
And the marketing materials continue to promise very good to excellent MT without extra effort by humans.
Some MT providers are trying to change this, but they are the minority.
I agree with Jeff Allen that the MT technology community has created much of the enmity, helped by incompetent LSPs who used "bad MT" simply to force compensation down. But highly responsive MT is now available, and freelancers can gain more control and leverage if they learn to use it skillfully.
I am not sure if it is just my personal perception, but it seems that the level of collaboration between programmers and translators is low when it comes to developing MT/AI. I would love the idea of having an AI translator capable of adapting to my style of working, one that I can train to deliver a faster and more efficient service to LSPs - but somehow translators and linguists have the feeling that we have been ignored and that MT providers have gone directly to LSPs.
I think you are correct; there is very little collaboration between MT engineers and translators. Also, very few engineers understand the translation work process, so what they develop may not be that useful to translators. ModernMT is one of the very few exceptions that tunes into each individual's work portfolio.
Kirti Vashee, I entirely agree with you. But as I have always said, translation is not a verbatim act. Eight words in L1, seven to nine in L2, and the algorithm then says it's a good translation. Verbatim translation is a failure and is unacceptable. This is where Google and DeepL score: their billions of parameters ensure that the translation reads 'human' and not just verbatim, which is what low-end NMT systems provide.
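As a purely hypothetical sketch of the kind of naive length-ratio check the commenter is describing (the ratio bounds below are illustrative assumptions, not any particular system's rule):

```python
# Hypothetical sketch of a naive length-ratio "quality" check:
# e.g. 8 words in L1 and roughly 7-9 words in L2 is judged "a good translation".
def looks_acceptable(source: str, target: str,
                     min_ratio: float = 0.85, max_ratio: float = 1.15) -> bool:
    src_words = len(source.split())
    tgt_words = len(target.split())
    return min_ratio <= tgt_words / src_words <= max_ratio

# A fluent, non-literal translation can easily fail this check, while a word-for-word
# (verbatim) rendering passes - which is exactly the commenter's objection.
```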
Thanks Kirti Vashee for sharing. A very good read indeed.
Google Alerts for “translation technology” focus only on MT, because what players in this industry know as translation technology is irrelevant to most customers. Alerts for “machine translation” indicate steady growth in interest around MT, because the relationships most customers have been experiencing with LSPs are disappointing when not downright poor, despite all the BS. Or maybe because of it.
There should be no surprise in considering how enduring the interest in MT has been over more than 70 years. The whole language community should have anticipated it, embraced it, and exploited it rather than fought it. How stupid!
There is no human in the loop; there never has been: no technology can exist, work, and be useful without human beings devising it, implementing it, and using it. It is simply too late for linguists: machines are already doing most of the work, with humans being paid peanuts for a job that has remained invariably the same for decades. There are no contrarians or visionaries in this industry, only one or two Cassandras, and many Trojans. The Achaeans have been in for years and still too many pretend they’re safe. Possibly waiting for someone to show them the way to El Dorado (the mythical premium market). Fools.
In the final analysis, the human touch is always needed. No NMT engine, however good, can replace a human being. In my opinion, it is a question of trust. I will trust a reputable agency to translate a text, but I will not trust an NMT engine. The day we start trusting the output of these engines (as we largely do with Google Maps), the days for translators, I am afraid, are numbered, and a translator will be called upon only for material that has legal or economic repercussions.
I have been researching contexts in which the human touch is not present, yet MT is satisfactorily fulfilling a need. My own idea is that some of the responsibility a translator would carry in other situations is carried by the consumer of the unedited, raw MT. In a good situation, they are fully aware they are consuming raw MT, and they approach it in a different way than they would human translation - with caution and an awareness that any given passage could contain mistakes. I studied one context in which raw MT is used extensively (the research work done in patenting processes). One of my goals was to learn about the role of trust in their use of raw MT. What I eventually concluded was that it's not fully about trust - it's about risk management. Trust plays a role there too, but the more important factor is the calculation of the risk of relying on raw MT in any given situation. When the risk is too high, patent professionals turn to human translators.
I agree with you. Blind trust may apply to languages like French where there is considerable data. And even then, both DeepL and Google goofed up on the word bâtiment: 'building', but also a 'navire' (ship). The context allowed me to know it was a ship and not a building. In low-resource languages, the situation is worse. And the need for PEMT is always present. But for how long?
I tried ModernMT years ago. I prefer DeepL. Why do you prefer ModernMT? Are you sure that it is consistent with the terminology inside the TM?
ModernMT is a significantly more adaptive MT system than DeepL, but it requires that the user teach the system with TM, corrective feedback, and glossary terms entered in full-sentence form. If a translator makes this investment, the quality improvement will generally be MUCH greater than with any generic system. Read the comments by the translators who have made this investment in the last part of the post.