Machine translation technology has an unfortunate history of overpromising and underdelivering. We have had at least 50 years of this, and sometimes it seems that the torture will never stop. MT enthusiasts continue to make promises that often greatly exceed the realistic possibilities. Recently, in various conversations, I have seen the level of unwarranted exuberance around the possibilities of the Moses Open Source SMT technology rising to peak levels. This is especially true in the LSP community. While most technologies go through a single hype cycle, MT seems destined to go through several of these cycles with each new approach, and the latest of these is what I call Moses Madness. It has become fashionable of late to build instant DIY MT engines, using tools that help you with the mechanics of running the software that is “Moses”. While some of these tools greatly simplify the mechanical process of running the Moses software, they do not give you any insight into what is really going on inside the magic box, or any clues as to what you are actually doing. Moses is a wonderful technology and it enables all kinds of experimentation that furthers the art and science of data-driven MT, but it does require some knowledge and understanding for real success. It is possible to get a quick and dirty MT engine together using some of these tools, but whether that provides long-term strategic translation production leverage, I am not so sure. Thus it is my sense that we are at the peak of the hype cycle for DIY Moses.
I would like to present a somewhat contrarian viewpoint to much of what you will hear at TAUS (“Let a thousand MT systems bloom”) and in other online forums on getting started with instant MT approaches. IMO Moses, and especially instant Moses, is clearly not the final answer. While Moses is a starting point for real development, it should not be mistaken for the final destination. I think there are a number of reasons that you should pause before you jump in, and at least build up some knowledge before taking the dive. I have attempted to enumerate some of these reasons, but I am sure some will disagree. Anyway, I hope an open discussion will be valuable in reaching a more sustainable and accurate view of the reality, and so here goes, even though perhaps I am rushing in where angels fear to tread. And of course my opinion on this matter is not impartial, given my involvement with Asia Online.
The Sheer Complexity
As you can see from the official description, Moses is an open source project that makes its home in the academic research community. This link describes some of the conferences where people with some expertise and understanding of what Moses actually does convene and share information. Take a look at the program committee of these conferences to get a sense of what the focus might be. Now take a look at the “step-by-step guide”, which students in NLP are expected to be able to handle. It is what you would have to do to build an MT system if you did not have the DIY kit. Most of the instant/simplified Moses engine services in the market focus on simplifying this, and only this, aspect of developing an MT engine.
Clearly it would be good to have some knowledge of what is going on in the magic box BEFORE you begin, and perhaps it would even be really nice to have some limited team expertise with computational linguistics to make your exploration more useful. Remember that hiding complexity is not quite the same as removing complexity, and it would be smart not to underestimate this complexity BEFORE you begin. Anybody who has ventured into this has probably realized already that while some of the complexity has been hidden, there is still much that is ugly and complicated to deal with in the Moses world, and often it feels like the blind leading the blind.
I have noticed that many in the professional translation industry have trouble even with basics like MT system BLEU scoring, and even some alleged MT experts barely know how to measure BLEU accurately and fairly. Thus I am skeptical that LSPs will be able to jump into this with any real level of competence in the short term, that is, a level of competence that assures or at least raises the probability of business success by enhancing long-term translation productivity. Though it is possible that a hardy few will learn over the next 2-5 years, it is also clear that NLP and computational linguistics are not for everyone. The level and extent of knowledge required is simply too specialized and vast. As Richard Feynman said: “I think it’s much more interesting to live not knowing, than to have answers which might be wrong.” (Though he was talking about beauty, curiosity and mostly about doubt.)
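To make the BLEU point a little more concrete, here is a minimal sketch of what a corpus-level BLEU calculation involves. It is not the scoring script of any particular toolkit: the function names (ngrams, corpus_bleu) are my own for illustration, and it assumes a single reference per segment, pre-tokenized input and no smoothing. Real evaluations differ on exactly these details, which is one reason scores from different sources are so often not comparable.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU, one reference per segment, no smoothing.

    hypotheses, references: lists of token lists, aligned by segment.
    Returns a score between 0.0 and 1.0.
    """
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, summed over the corpus
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            totals[n - 1] += sum(hyp_counts.values())

    # Unsmoothed BLEU is zero if any n-gram order has no match at all.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

A fair comparison would call corpus_bleu on the output of each engine against the same references, with the same tokenization and casing applied to every system. Change any one of those and the numbers move, even though the translations have not.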
Alon Lavie, AMTA President, CMU NLP professor and President of Safaba (which develops hosted MT solutions that are largely built on top of Moses) says:
“I am of course a strong supporter, and am extremely enthusiastic about Moses and what it has accomplished in both academic research and in the commercial space. I also think there is indeed a lot of value in the various DIY offerings (commercial and Achim's M4L efforts). But these efforts primarily target and solve the *engineering complexity* of deploying Moses. While this undoubtedly is a critical bottleneck, I think there is a potential pitfall here that users that are not MT experts (the vast majority) would come to believe that that's all it takes to build a state-of-the-art MT system. The technology is actually complex and is getting more complex and involved to master. Users may be disappointed with what they get from DIY Moses, and more detrimentally, become convinced that that's the best they can accomplish, when in fact letting expert MT developers do the work can result in far better performance results. I think this is an important message to communicate to potential users, but I'm not sure how best to communicate this message.”
Thus, I will join Alon in trying to convey the message that Moses is a starting point in your exploration of MT and not the final answer, and that experience, expertise and knowledge matter. Perhaps a way to understand the complexity issue better is to use some analogies.

The sewing machine/tailor analogy: Moses can perhaps be viewed as a very basic sewing machine. You still need to understand how to cut cloth, stitching technique, fabric and lining selection, measurement, pocket technique (?), final fit modifications and so on to make clothes. Tailors do it better, and expert tailors that only focus on men's suits do it even better than you or I would with the same sewing equipment. The closest to a ready-made suit would be the free MT engines, except in this analogy they are only available in one size. Expertise really does matter, folks, if you want to customize-to-fit.
The DIY car analogy: In this analogy, Moses is the car engine and perhaps a very basic chassis, one that would be dangerous on a highway or bumpy roads. The DIY task is to build a car that can actually be used as transportation. This will require some understanding of auto systems design, matching key components to each other, tires, braking systems, body design and so on. Finally, you also need to learn to drive, and you would want the car to turn right when you want it to. Again, expert mechanics are more likely to be successful, even though there are some great DIY kits out there for NASCAR enthusiasts.
The Learning Curve
Even if you do have a team with some NLP expertise, remember that working with any complex technology involves a process of learning and usually an apprenticeship to get to a point of real skill. The people who build SMT engines at Microsoft, Google, Asia Online and other MT research teams have built thousands of MT engines during their MT careers. The skills developed and lessons learned during this experience are not easily replicated and embedded into open source code. Failure is often the best teacher, and most of these teams have failed often enough to understand the many pitfalls along an SMT engine development path. To expect that any “instant” Moses solution is going to capture and encapsulate all of this is naïve and somewhat arrogant. This is the kind of skill where expertise builds slowly, and comes after much experimentation across many different kinds of data and use case scenarios. Just as professional tailors and expert mechanics are likely to produce better results, MT experts who work across many different use scenarios are likely to produce much better results than a do-it-yourself enthusiast might. These results translate into long-term savings that should far exceed an initially higher price.
The objective of MT deployment for most LSP users is to increase translation productivity. (Very few have reached the next phase, where they are translating new content that would never be translated were it not for MT.) Thus getting the best possible systems, those that produce the highest possible MT output quality, really matters for this core objective of measurable translation productivity. To put this in simpler terms, the difference between instant Moses systems and expert MT systems could be as much as 4,000 words/day versus 10,000+ words/day. Expert MT engine developers like Asia Online have multi-dimensional approaches, NLP skills, and many specialized tools in place to extract the maximum amount of information out of the data they have available. The use of these tools is guided by two team members with deep expertise in the inner workings of Moses and SMT in general. The learning process driving the development of these comprehensive tools takes years, and they enable Asia Online custom systems to consistently produce translation output superior to that of the free online MT engines. One team member has literally written the book on SMT and created Moses, and thus one could presume is quite likely to have the expertise to develop better MT systems than most.
I have already heard from several translators who, when asked to post-edit “instant Moses” output they know is inferior, simply run the same source material through Google/Bing and edit that instead, to improve their own personal productivity and save themselves some anguish. So if your Moses engine is not as good as these public engines, you will find that translators will simply bypass it whenever they can. And they may not actually tell you that they are doing this. Post-editors will generally choose the best MT output they can get access to, so beware if your engine does not compare well. And buyers, insist on seeing how these instant MT engines compare to the public free engines on a meaningful and comprehensive test set, not just 100 or so sentences.
However, I am also aware that some Moses initiatives have produced great results, e.g. Autodesk (for you doubters on the value of PEMT, here is clear evidence from a customer viewpoint), but I would caution against extrapolating from these results and expecting to achieve them with any and every Moses attempt. The team that produced these systems was more technically capable and knowledgeable than most, and I am also aware that their training data was better suited for SMT than most of the TM you will find in the TDA or on the web. And even here, I would argue that MT experts would probably produce better results with the same data, especially with the Asian languages, where other support tools and processes become much more imperative.
As others have stated before me, the global population of people who actually understand how these data-driven systems work is really quite tiny, minuscule in fact. If you are building Moses systems, you should be comparing yourself to the public free engines, as you may find that all your effort was much ado about nothing. One would hope that you will produce systems that compare favorably to these “free” options. And if your competition includes the lads and lassies at Microsoft and Google, one would hope that you know more about how to do this than pushing the instant Make-my-engine button. The financial cost of ignorance is substantially higher than most are able to define in terms of lost opportunity costs, and learning costs (a.k.a. mistakes) should be factored into a real TCO (Total Cost of Ownership).
The bottom line: Success with SMT requires very specialized skills that include some NLP background, massive data handling skills, knowledge of parallel computing, linguistic data management tools, corpus analysis, and linguistic structural analysis capabilities for optimal results, not to mention a culture that nurtures collaboration with translators.
The Data, the Data, the Data
Moses is a data-driven technology and thus is highly dependent on the data that is used. Data volume is required to get good output from the systems, so users often have to gather data from public sources, and it is important to normalize and prepare that data for optimal performance. Most LSPs will not have the data or the skills needed to gather the data in an optimal way. I have seen two major SMT engineering initiatives up close, one where training data was scraped off the web by spider programs, and another where data was not allowed to go into training data if it had not passed several human linguistic quality assessment checks. The differing impact of these approaches is quite striking. The dirty data approach requires substantially larger amounts of new data to see any ongoing improvement, while the clean data approach can produce compelling improvement results with much less new data.
This ability to respond to small amounts of corrective feedback is a critical condition for ongoing improvement, and for continued improvements in productivity, e.g. raising PEMT throughput up to 15,000+ words/day in the shortest time possible. I have already stated that I was surprised how little attention is paid to data quality in the instant Moses approaches presented at TAUS. And while data volume matters, for high quality domain-focused systems, the data you exclude may be more important than what you include. We are in a phase of the web's development where “Big Data” is solving many semantic and linguistic problems, but we have also seen that more data is not always the solution to better MT systems.
The upfront data analysis and data preparation, and the development of “good” tuning and test sets, are critical to the short and long-term quality and evolution of an MT engine. This is something that takes experience and experimentation to understand and be skillful at. Experts can add huge value at this formative stage. Remember that this is a technology where “Garbage In Garbage Out” (GIGO) will be particularly true. Many who understand how bad TM can get don’t need any further elaboration on this, even though some people in the SMT community remain unconvinced that clean data does matter.
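As a rough illustration of what even the most basic data preparation looks like, here is a small sketch of a parallel-corpus filter. It is not the cleaning pipeline of Moses, Asia Online or anybody else: the function name clean_parallel_corpus and the thresholds (max_len, max_ratio) are assumptions chosen purely for illustration. It also counts whitespace-separated tokens, which is meaningless for languages such as Chinese or Japanese that need proper segmentation first.

```python
def clean_parallel_corpus(pairs, max_len=80, max_ratio=3.0):
    """Filter (source, target) sentence pairs with simple heuristics.

    The thresholds are illustrative assumptions, not recommended values.
    """
    seen = set()
    kept = []
    for src, tgt in pairs:
        # Normalize whitespace so trivial variants compare as equal.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        src_len, tgt_len = len(src.split()), len(tgt.split())

        if src_len == 0 or tgt_len == 0:
            continue  # one side is empty
        if src_len > max_len or tgt_len > max_len:
            continue  # overly long segments tend to align poorly
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # implausible length ratio, likely misaligned
        if src == tgt:
            continue  # source copied to target, i.e. never translated
        if (src, tgt) in seen:
            continue  # exact duplicate pair
        seen.add((src, tgt))
        kept.append((src, tgt))
    return kept
```

Filters like this only catch the obvious garbage. Judging terminology consistency, translation quality and domain fit is where human linguistic review and real experience come in, and no handful of heuristics substitutes for that.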
Many of the people who have jumped into instant Moses do not realize that getting your initial MT engine to improve will require very large amounts of new data with a standard Moses approach. The rule of thumb I have heard used frequently is that you need 20-25% of the initial training data volume to see meaningful improvements. Thus, if you used 10 million words to build your system, you will need roughly 2 to 2.5 million new words to see the system noticeably improve. So most of these instant systems are as good as they are ever going to get when the first engine is produced. In contrast, Asia Online systems can improve dramatically with as little as a few thousand sentences (a single project) and are architected and designed from the outset to improve continuously over time with focused and targeted corrective feedback.
Given the difficulty of getting large amounts of new data, users need systems that can respond to small amounts of corrective feedback and yet show noticeable improvements. One of the major deficiencies of historical MT systems has been the lack of user control, the inability of users to make any meaningful impact on the quality of raw output produced on an ongoing basis. This ability to CONTINUALLY steer the MT engine with financially feasible (i.e. relatively small) amounts of corrective feedback is a key to getting the best long-term productivity results and ROI. I think as users get more informed on how to work with this technology, they will zero in on this ability of some expert MT systems. IMO, it is the single most important criterion when evaluating competitive MT systems.
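The arithmetic behind that rule of thumb is simple enough to sketch. The helper below is hypothetical (the name new_data_needed and the default percentages are just the 20-25% figure quoted above, taken at face value), and it says nothing about whether the rule holds for any given engine.

```python
def new_data_needed(initial_words, low_pct=20, high_pct=25):
    """Return the (low, high) range of new words implied by the rule of thumb.

    Integer arithmetic keeps the word counts exact.
    """
    return initial_words * low_pct // 100, initial_words * high_pct // 100


low, high = new_data_needed(10_000_000)
print(f"{low:,} to {high:,} new words")  # prints "2,000,000 to 2,500,000 new words"
```

Seen this way, the problem is obvious: a system built on 10 million words needs something on the order of a couple of million new, relevant words before a retrain is worth the trouble, and very few LSPs produce that much new data in a single domain in a year.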

Control & Data Security
One of the reasons why it may make sense to use Moses sometimes is to keep your data and training and translation activity REALLY REALLY private (e.g. translations of interrogation transcripts where persuasion involving water might be used). The need for security and privacy makes sense for national security applications, but I find it hard to understand the resistance some global companies have to working in the cloud when a lot of this MT and PEMT content ends up on the web anyway. For most companies cloud computing simply makes sense and spares the user from the substantial IT burden of maintaining the hardware infrastructure needed to play at the highest professional level. (Asia Online actually makes its full training and translation environment available for on-premise installation for large enterprise customers like LexisNexis, who process hundreds of millions of words a day and have suitable computing and human resource expertise to handle this.)
I have heard of several LSPs who have spent $10K–$20K on servers that will probably only do Moses training once a year. If you do not have the data to drive an improvement in your Moses engine, what is the point of having these kinds of servers? There is no point in trying to re-train an engine when you don’t have enough new data to make any noticeable impact. This is a technology that just makes much more sense in the cloud, for scalability, extensibility, security and effective control. Cloud solutions are often more secure than on-premise installations at LSPs because cloud service providers can afford IT staff with deep expertise in computer security, data protection and data availability management. (BTW, I have also seen what happens when hacks try to manage 200 servers: not pretty.) Like many other things in today’s world, IT (Information Technology) has become so specialized and complex that it makes more sense to outsource much of it, and work in the cloud, rather than try to do it on your own with a meager and barely trained staff. Compare your IT staff capabilities to any cloud service provider. Even Microsoft Office is finally making the transition to the cloud. Some analysts are even saying that the shift to the cloud will challenge the dominance of older stalwarts like HP, Microsoft, Intel, SAP, RIM, Oracle, Cisco and Dell, and that a third of these companies may not be around in 2020. Remember DEC and Wang? In a world where tablets, smartphones and mobile platforms will increasingly drive global commerce, the desktop/server perspective of traditional IT is already fading, and makes less sense with each passing day. It is ironic to see LSPs jumping on the “On-Premise Server” train just as it is about to reach the end of the line.
Cloud-based MT can also be set up to be always improving (assuming you have more than basic Moses MT) as new data is added regularly and feedback is gathered from users, as Google and Bing do. Setting up this kind of infrastructure is a significant undertaking and most Moses users will never get to that point, but this is how the best MT systems will continue to evolve. What some may find is that their domain-focused MT system may be better than the public engines in January, but by June this may no longer be true. You should realize that you are dealing with a moving target and most public engines will continue to improve. All the expert MT developers are constantly updating and enhancing their technology; most have already moved beyond the phrase-based SMT that Moses is today, and are incorporating linguistics in various forms. This can only be done because they understand what they are doing. Some of these enhancements may make it back to Moses years later, but the productivity edge will remain with experts for the foreseeable future, and I expect in 2012 we will see several case studies where expert MT systems outperform instant Moses systems by significant margins. So my advice: be wary of any kind of instant MT solution that is not free.
I started the eMpTy Pages blog in early 2010, and one of my earliest posts was on the importance of clean data for SMT. It was blasphemy at the time to question the value of sheer data volume for SMT, but in the period since then, many have validated that working with consolidated TM from multiple sources, trusted though they may be, is a tricky affair and data quality does matter. Pooling data can work sometimes but will also fail often without cleaning and standardization.
The origin of the phrase “Let a thousand flowers bloom” is attributed to a misquote of Mao Zedong. The results for Chinese intellectuals who took Mao seriously were quite unfortunate. Fortunately we live in better times (I think?) and this phrase is not likely to have such dire consequences today. However, while a thousand MT systems may bloom (or at least be seeded), I predict that many will fade and die quickly. This is not necessarily bad, as hopefully institutional, community and industry learning will take place, and some practitioners may actually discover that they now have a much better appreciation for corpus linguistics and some of the skills that drive the creation of better MT systems. The experimental evidence from many failed experiments with Moses will also provide useful information for MT experts and further enhance the state of the art and science of MT. The learning curve for this technology is long and arduous and it may take a while for the dust to settle from the current hype, but I fully expect that by December 21st, 2012 it will be clear that expertise, experience and knowledge do matter with something as complex as Moses. Dead flowers are also used to fertilize gardens and help other plants thrive, and as long as we have the long view, we will continue to move onward and upward. I will restate my prediction that the best MT systems will still come from close collaboration between MT experts and linguists, translators and LSPs, and from insight drawn from experience and failure.
The Sheer Complexity
As you can see from the official description, Moses is an open source project that makes its home in the academic research community. This link describes some of the conferences where people with some expertise and understanding of what Moses actually does convene and share information. Take a look at the program committee of these conferences to get a sense of what the focus might be. Now take a look at the “ step-by-step guide”, which students in NLP are expected to be able to handle. It is what you would have to do to build an MT system if did not have the DIY kit. Most of the instant/simplified Moses engine services in the market focus on simplifying this and only this aspect of developing an MT engine.
Clearly it would be good to have some knowledge of what is going on in the magic box BEFORE you begin, and perhaps it would even be really nice to have some limited team expertise with computational linguistics to make your exploration more useful. Remember that hiding complexity is not quite the same as removing complexity, and it would be smart to not underestimate this complexity BEFORE you begin. Anybody who has ventured into this has probably realized already, that while some of the complexity has been hidden, there is still much that is ugly and complicated to deal with in Moses world, and often it feels like the blind leading the blind.
I have noticed that many in professional translation industry have trouble even with basics like MT system BLEU scoring, and even some alleged MT experts barely know how to measure BLEU accurately and fairly. Thus I am skeptical that LSPs will be able to jump into this with any real level of competence in the short term. A level of competence that assures or at least raises the probability of business success i.e. enhances long-term translation productivity. Though it is possible that a hardy few will learn over the next 2-5 years, it is also clear that NLP and computational linguistics is not for everyone. The level and extent of knowledge required is simply too specialized and vast. As Richard Feynman said:”I think it’s much more interesting to live not knowing, than to have answers which might be wrong.” (Though he was talking about beauty, curiosity and mostly about doubt).
Alon Lavie, AMTA President, CMU NLP professor and President of Safaba (which develops hosted MT solutions that are largely built on top of Moses) says:
“ I am of course a strong supporter, and am extremely enthusiastic about Moses and what it has accomplished in both academic research and in the commercial space. I also think there is indeed a lot of value in the various DIY offerings (commercial and Achim's M4L efforts). But these efforts primarily target and solve the *engineering complexity* of deploying Moses. While this undoubtedly is a critical bottleneck, I think there is a potential pitfall here that users that are not MT experts (the vast majority) would come to believe that that's all it takes to build a state-of-the-art MT system. The technology is actually complex and is getting more complex and involved to master. Users may be disappointed with what they get from DIY Moses, and more detrimentally, become convinced that that's the best they can accomplish, when in fact letting expert MT developers do the work can result in far better performance results. I think this is an important message to communicate to potential users, but I'm not sure how best to communicate this message.”
Thus, I will join Alon in trying to convey the message that Moses is a starting point in your exploration of MT and not the final answer, and that experience, expertise and knowledge matter. Perhaps, a way to understand the complexity issue better, is to use some analogies.

The sewing machine/tailor analogy: Moses can be perhaps be viewed as a very basic sewing machine. You still need to understand how to cut cloth, stitching technique, fabric and lining selection, measurement, pocket technique (?), final fit modifications and so on to make clothes. Tailors do it better and expert tailors that only focus on men's suits do it even better than you or I would with the same sewing equipment. The closest to a ready made suit would be the free MT engines, except in this analogy they are only available in one size. Expertise really does matter folks if you want to customize-to-fit.
The DIY car analogy: In this analogy, Moses is the car engine and perhaps a very basic chassis, one that would be dangerous on a highway or bumpy roads. The DIY task is to build a car that can actually be used as transportation. This will require some understanding of auto systems design, matching key components to each other, tires, braking systems, body design and so on. Finally you also need to learn to drive and you would want the car to turn right when you want to. Again, expert mechanics are more likely to be successful even though there are some great DIY kits out there for NASCAR enthusiasts.
The Learning Curve
Even if you do have a team with some NLP expertise, remember that working with any complex technology involves a process of learning and usually an apprenticeship to get to a point of real skill. The people who build SMT engines at Microsoft, Google, Asia Online and other MT research teams have built thousands of MT engines during their MT careers. The skills developed and lessons learned during this experience are not easily replicated and embedded into open source code. Failure is often the best teacher and most of these teams have failed often enough to understand the many pitfalls along an SMT engine development path. To expect that any “instant” Moses solution is going to capture and encapsulate all of this is naïve and and somewhat arrogant. This is the kind of skill where expertise builds slowly, and comes after much experimentation across many different kinds of data and use case scenarios. Just as professional tailors and expert mechanics are likely to produce better results, MT experts who work across many different use scenarios are likely to produce much better results than a do-it-yourself enthusiast might. These results translate into long-term savings that should far exceed an initially higher price.
The objective of MT deployment for most LSP users is to increase translation productivity. (Very few have reached the next phase where they are translating new content that would never be translated were it not for MT). Thus getting the best possible systems that produce the highest possible MT output quality really matters to achieve this core objective of achieving measurable translation productivity. To put this in simpler terms, the difference between instant Moses systems and expert MT systems could be as much as 4,000 words/day versus 10,000+ words a day. Expert MT engine developers like Asia Online have multi-dimensional approaches, NLP skills, and many specialized tools in place to extract the maximum amount of information out of the data they have available. The use of these tools is guided by two team members with deep expertise on the inner workings of Moses and SMT in general. The learning process driving the development of these comprehensive tools takes years, and they enable Asia Online custom systems to produce superior translation output to the free online MT engines consistently. One team member has literally written the book on SMT and created Moses and thus one could presume is quite likely to have the expertise to develop better MT systems than most.
I have already heard from several translators who when asked to post-edit “instant Moses” output they know is inferior, simply run the same source material through Google/Bing and edit that instead, to improve their own personal productivity and save themselves some anguish. So if your Moses engine is not as good as these public engines you will find that translators will simply bypass them whenever they can. And they may not actually tell you that they are doing this. Post-editors will generally choose the best MT output they can get access to, so beware if your engine does not compare well. And buyers, insist on seeing how these instant MT engines compare to the public free engines on a meaningful and comprehensive test set, not just a 100 or so sentences.
However, I am also aware that some Moses initiatives have produced great results e.g. Autodesk,(for you doubters on the value of PEMT, here is clear evidence from a customer viewpoint) and here I would caution against any extrapolation of these results and expectation to achieve this for any and every Moses attempt. The team that produced these systems were more technically capable and knowledgeable than most, and I am also aware that that their training data was better suited for SMT than most of the TM you will find in the TDA or on the web. And even here, I would argue that MT experts would probably produce better results with the same data especially with the Asian languages where other support tools and processes become much more imperative.
As others have stated before me, the global population of people who actually understand how these data-driven systems work is really quite tiny, miniscule in fact. If you are building Moses systems you should be comparing yourself to the public free engines, as you may find that all your effort was much ado about nothing. One would hope that you will produce systems that compare favorably to these “free” options. And if your competition includes the lads and lassies at Microsoft and Google, one would hope that you know more about how to do this than pushing the instant Make-my-engine button. The financial cost of ignorance is substantially higher than most are able to define in terms of lost opportunity costs, and learning costs (a.k.a. mistakes) should be factored into a real TCO (Total Cost of Ownership).
The bottom line: success with SMT requires very specialized skills, including some NLP background, massive data handling skills, knowledge of parallel computing, linguistic data management tools, and corpus analysis and linguistic structural analysis capabilities for optimal results, not to mention a culture that nurtures collaboration with translators.
The Data, the Data, The Data
Moses is a data-driven technology and thus is highly dependent on the data that is used. Data volume is required to get good output from the systems and thus users have to gather the data from public sources and it is important to normalize and prepare the data for optimal performance. Most LSPs will not have the data or skills needed to gather the data in an optimal way. I have seen two major SMT engineering initiatives up close, one where training data was scraped off the web by spider programs, and another where data was not allowed to go into training data if it had not passed several human linguistic quality assessment checks. The differing impact of these approaches is quite striking. The dirty data approach requires substantially larger amounts of new data to see any ongoing improvement, while the clean data approach can produce compelling improvement results with much less new data.
This ability to respond to small amounts of corrective feedback is a critical condition for ongoing improvement, and for continued improvements in productivity, e.g. raising PEMT throughput up to 15,000+ words/day in the shortest time possible. I have already stated that I was surprised how little attention is paid to data quality in instant Moses approaches presented at TAUS. And while data volume matters, for high quality domain-focused systems, the data you exclude may be more important than what you include. We are in a phase of the web's development where “Big Data” is solving many semantic and linguistic problems, but we have also seen that data is not always the solution to better MT systems.
The upfront data analysis and data preparation, and the development of “good” tuning and test sets, are critical to the short and long-term quality and evolution of an MT engine. This is something that takes experience and experimentation to understand and be skillful at. Experts can add huge value at this formative stage. Remember that this is a technology where “Garbage In Garbage Out” (GIGO) will be particularly true. Many who understand how bad TM can get don’t need any further elaboration on this, even though some people in the SMT community remain unconvinced that clean data does matter.
Many of the people who have jumped into instant Moses do not realize that getting an initial MT engine to improve will require very large amounts of new data with a standard Moses approach. The rule of thumb I have heard used frequently is that you need 20-25% of the initial training data volume to see meaningful improvements. Thus, if you used 10 million words to build your system, you will need roughly 2 to 2.5 million new words to see the system noticeably improve. So most of these instant systems are as good as they are ever going to get when the first engine is produced. In contrast, Asia Online systems can improve dramatically with as little as a few thousand sentences (a single project) and are architected and designed from the outset to improve continuously over time with focused and targeted corrective feedback.
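To make the rule of thumb concrete, here is a minimal Python sketch of the arithmetic. The function name and its percentage parameters are illustrative assumptions; only the 20-25% range and the 10 million word example come from the figures quoted above.

```python
def new_words_needed(initial_words, low_pct=20, high_pct=25):
    """Return the (low, high) volume of new training words implied by
    the 20-25% rule of thumb. Integer percentages keep the math exact."""
    low = initial_words * low_pct // 100
    high = initial_words * high_pct // 100
    return low, high


# The example from the text: an engine built on 10 million words.
low, high = new_words_needed(10_000_000)
print(f"{low:,} to {high:,} new words needed")
```

The point of the exercise is the scale: a single translation project of a few thousand sentences is orders of magnitude short of this volume.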
Given the difficulty of getting large amounts of new data, users need systems that can respond to small amounts of corrective feedback and yet show noticeable improvements. One of the major deficiencies of historical MT systems has been the lack of user control, the inability of users to make any meaningful impact on the quality of raw output produced on an ongoing basis. This ability to CONTINUALLY steer the MT engine with financially feasible (i.e. relatively small) amounts of corrective feedback is a key to getting the best long-term productivity results and ROI. I think as users get more informed on how to work with this technology, they will zero in on this ability of some expert MT systems. IMO, it is the single most important criterion when evaluating competitive MT systems:
- What do I have to do to improve the raw system output quality once an initial engine is in place?
- And, how much effort/data is required to get meaningful and measurable improvements?
- Measurable = Rising average throughput of post-editors (By hundreds or thousands of more words a day, and often a multiple of what is possible with instant MT).

Control & Data Security
One of the reasons why it may make sense to use Moses sometimes is to keep your data and training and translation activity REALLY REALLY private (e.g. translations of interrogation transcripts where persuasion involving water might be used). The need for security and privacy makes sense for national security applications, but I find it hard to understand the resistance some global companies have to working in the cloud when a lot of this MT and PEMT content ends up on the web anyway. For most companies cloud computing simply makes sense and spares the user from the substantial IT burden of maintaining the hardware infrastructure needed to play at the highest professional level. (Asia Online actually makes its full training and translation environment available for on-premise installation for large enterprise customers like LexisNexis who process hundreds of millions of words a day and have suitable computing and human resource expertise to handle this).
I have heard of several LSPs who have spent $10K–$20K on servers that will probably only do Moses training once a year. If you do not have the data to drive an improvement in your Moses engine, what is the point of having these kinds of servers? There is no point in trying to re-train an engine when you don’t have enough new data to make any noticeable impact. This is a technology that just makes much more sense in the cloud, for scalability, extensibility, security and effective control. Cloud solutions are often more secure than on-premise installations at LSPs because cloud service providers can afford IT staff with deep expertise in computer security, data protection and data availability management. (BTW I have also seen what happens when hacks try and manage 200 servers = not pretty). Like many other things in today’s world, IT (Information Technology) has become so specialized and complex that it makes more sense to outsource much of it, and work in the cloud rather than try and do it on your own with a meager and barely trained staff. Compare your IT staff capabilities to any cloud service provider. Even Microsoft Office is finally making the transition to the cloud. Some analysts are even saying that the shift to the cloud will challenge the dominance of older stalwarts like HP, Microsoft, Intel, SAP, RIM, Oracle, Cisco, Dell and that a third of these companies may not be around in 2020. Remember DEC and Wang? In a world where tablets, smartphones and mobile platforms will increasingly drive global commerce, the desktop/server perspective of traditional IT is already fading, and makes less sense with each passing day. It is ironic to see LSPs jumping on the “On-Premise Server” train just as it is about to reach the end of the line.
Cloud based MT can also be set up to be always improving (assuming you have more than basic Moses MT), as new data is added regularly and feedback is gathered from users, as Google and Bing do. Setting up this kind of infrastructure is a significant undertaking and most Moses users will never get to that point, but this is how the best MT systems will continue to evolve. What some may find is that their domain-focused MT system may be better than the public engines in January, but by June this may no longer be true. You should realize that you are dealing with a moving target and most public engines will continue to improve. All the expert MT developers are constantly updating and enhancing their technology; most have already moved beyond the phrase-based SMT that Moses is today, and are incorporating linguistics in various forms. This can only be done because they understand what they are doing. Some of these enhancements may make it back to Moses years later, but the productivity edge will remain with experts for the foreseeable future, and I expect in 2012 we will see several case studies where expert MT systems outperform instant Moses systems by significant margins. So my advice: be wary of any kind of instant MT solution that is not free.
I started the eMpTy Pages blog in early 2010, and one of my earliest posts was on the importance of clean data for SMT. It was blasphemy at the time to question the value of sheer data volume for SMT, but in the period since then, many have validated that working with consolidated TM from multiple sources, trusted though they may be, is a tricky affair and data quality does matter. Pooling data can work sometimes but will also fail often without cleaning and standardization.
The origin of the phrase “Let a thousand flowers bloom” is attributed to a misquote of Mao Zedong. The results for Chinese intellectuals who took Mao seriously were quite unfortunate. Fortunately we live in better times (I think?) and this phrase is not likely to have such dire consequences today. However, while a thousand MT systems may bloom (or at least be seeded), I predict that many will fade and die quickly. This is not necessarily bad, as hopefully institutional, community and industry learning will take place, and some practitioners may actually discover that they now have a much better appreciation for corpus linguistics and some of the skills that drive the creation of better MT systems. The experimental evidence from many failed experiments with Moses will also provide useful information for MT experts and further enhance the state of the art and science of MT. The learning curve for this technology is long and arduous and it may take a while for the dust to settle from the current hype, but I fully expect that by December 21st, 2012 it will be clear that expertise, experience and knowledge do matter with something as complex as Moses. Dead flowers are also used to fertilize gardens and help other plants thrive, and as long as we have the long view, we will continue to move onward and upward. I will restate my prediction that the best MT systems will still come from close collaboration of MT experts with linguists, translators and LSPs, and from insight drawn from experience and failure.
And you can send me dead flowers every morning        
Send me dead flowers by the mail
Send me dead flowers to my wedding
And I won't forget to put roses on your grave
Send me dead flowers by the mail
Send me dead flowers to my wedding
And I won't forget to put roses on your grave

 
 
Kirtee,
I take exception to pretty much all that you say here. Far from being "naive" or "arrogant", ALS' self-serve MT platform SmartMATE is founded on a group of MT expertise which you would be hard-pressed to beat anywhere, even in Asia Online. I've been working on MT for nearly 25 years now, and have over 200 peer-reviewed published papers. My colleagues, including Jie Jiang, Sergio Penkale and Rejwanul Haque, and I have a massive amount of MT experience with Moses and many other systems, and I assure you we know exactly what we're doing.
Your article ignores completely the fact that for most people, access to _customised_ SMT solutions is pretty much nil. This is where your analogy with Google & Microsoft fails; with self-serve MT, clients without the necessary MT and computing expertise to install Moses themselves, have for the first time the ability to build an MT system based on their own user requirements pretty much instantly. My bet would be that the vast majority of those engines would deliver better performance than Google or Bing.
Asia Online seems to be the only company on the Language Technology LinkedIn discussion who think that empowering translators in this way is a bad thing. This article adds to that scaremongering; the way forward is not to keep the doors closed and say that the only way in which people can access state-of-the-art MT solutions is by leaving it all to the experts who know better: that's real arrogance, IMHO ...
If you really want me to, I could go through your article and disassemble your arguments one by one. I know it's only a blog, and your opinion, but I think most people will see through it for what it is. You can expect many more flames on this, for sure ...
Andy.
Hey, what a lovely plea for quality in the implementation of MT! I'm sure that none of us translator "quality wimps" could have phrased it more eloquently!
Sorry about the sarcasm spawned by my rather strange British sense of humour. If you come looking for me you'll find me hiding under the table! ;-) ;-) ;-)
@Andy
Thanks for your comments. You are right, it’s only a blog and only my opinion and I guess we have different views on the value of DIY Moses.
If you look at the post more closely, you may see that my comments are directed at casual users of instant/DIY Moses solutions, not the developers of the tools. I do not question your (team) competence and credibility in any way, and clearly you have much more experience with MT than I do for certain. I am sure all the other developers are also competent to use the tools themselves.
However, I remain skeptical that a casual user (an average LSP or a power translator) who has little or no understanding of SMT/NLP is likely to benefit from throwing in his bucket of data and producing an instant engine that will beat what he could get for free. I am even more skeptical that they will know what to do next. I am saying that it does matter that they understand the technology to some extent, and that this understanding and informed actions thus will produce better systems. I am also saying that working with experts will produce the “best” systems in terms of long-term business value which you can of course disagree with.
To the extent that your tools help create superior systems, I expect we will see long-term use, as long as these MT systems do in fact improve the throughput of translators/post-editors using them. It is possible I might be wrong in my assumption, but at this point in time I would bet I am right.
As far as empowering translators, I think that train has left the station. GTT allows individual translators to customize already usable MT engines instantly, for free, and has been allowing this for years now. The fact that they might keep and use your TM data for years after does not seem to stop translators, and I would not be surprised if tens of thousands of translators use the public engines regularly as part of their standard work process.
DIY Moses is not equivalent to empowering translators in my opinion. Of course any and all opinions can be wrong, and if I see evidence to the contrary I will admit I was wrong.
We live in a world where complexity is everywhere. In medicine we have reached a point where no individual doctor can handle or even know all the diagnosis codes (12,000+ by the way). Specialization is necessary and specialists have to work together with general family doctors to solve patient problems.
In the same way, it is my opinion that specialists (MT experts, LSPs, Translators and linguists) working together are likely to produce much better results, especially in the long-term. And it is my opinion that people who go down this path are likely to get the maximum benefit from MT and produce the “best” and “most effective” systems in terms of business value and ROI.
Honestly, I am not even sure that everybody at Asia Online agrees with me, these are really only personal opinions. I speak as an individual observer and I do understand that my views are not completely impartial.
I couldn't agree more. SMT is a complex domain. However, I think I missed the "hype" in the DIY crowd claiming "near human quality." Oh! For fair and full disclosure, my company owns and distributes DoMY, the first package distribution of all Moses components.
The Moses Toolkit is comprehensive in its abilities. However, achieving excellent results across all possible language combinations requires expertise that exceeds that of localization engineers, system administrators and project managers at most of today's LSPs. Companies considering its benefits should tread carefully, work with one language pair and learn what's involved, including using the technology themselves at their own pace and in their own budgets.
However, I believe a 2-5 year outlook is over-simplistic and shortsighted. To say that only academics and NLP/computational linguists can unlock the secrets of SMT or should experience the technology in that period smacks of the stereotyping at the turn of the 20th century regarding women drivers. Are there similarities to a 1998 paper titled, The Automobile and Gender: An Historical Perspective, by Martin Wachs of the University of California, Berkeley?
"While women who drove in the first decades of the century were assumed to have at least some interest in the mechanical properties of automobiles, during the twenties in order to preserve the boundaries between mens’ and women’s spheres it was increasingly asserted that women lacked interest in or aptitude for mechanical devices."
Here we are 110 years later. Women driver jokes still abound in some circles but, like the stereotyping, many consider them to be in poor taste.
In a recent linkedin.com discussion about "licence-based model for MT procurement" by Andy Way, Mr Wiggins chimed in "I do not think that the market is mature enough to say that one model is out and the other is in... This position presented in this topic (discussion) is a marketing and sales line, not reality in the market." -- ditto
@Tom
I think that your analogy is inaccurate and irrelevant in this case.
I feel that a better way to describe this is: Moses is a sewing machine that can empower people who are interested in sewing or empower professional tailors to produce custom clothes faster and more efficiently. My point is that owning or having access to a sewing machine does not make you a tailor. Even when somebody gives you a pattern. You still need interest and some knowledge.
While some amateurs can definitely learn to sew and produce professional looking clothes after they get some experience, professional tailors are the ones who are most likely to make the best custom fit clothes, and benefit the most from new sewing machine technology. At least for the initial period until the knowledge of tailoring becomes more widespread.
Anybody interested in learning to be a tailor is likely to benefit, just as anybody who is willing to make the investment to learn about SMT/NLP could benefit from Moses. However, simply being able to push the button to start Moses (without understanding anything else) is far from this, and in most cases is unlikely to produce anything of real business value as I stated in the post.
Knowledge matters. Experience matters and learning takes time.
As Kirti is part of the Asia Online team, we talk often about many issues facing the translation industry. Kirti does not write his posts or debate points on a whim. He gives his perspective from firsthand experience, much of which is dealing with Asia Online customers and prospective customers.
I commend Kirti for taking his personal time to help the market as a whole to better understand the issues in MT. As this is his personal blog, he does not always discuss with me what he is writing about. When I saw this article, I immediately thought this was going to be one that was provocative as it addresses the issue of skill and knowledge that the DIY MT community seems to brush under the rug a little too easily. I waited to see the response, fully expecting Kirti’s opinions to be challenged, especially by those in the DIY MT community, in particular those that are marketing DIY MT solutions. And I can see from the responses already that I was correct. Already 2 DIY MT providers have come out on the attack.
I would like to state for the record that I do agree with nearly all that Kirti has posted in this blog post. As this is Kirti’s personal blog and his personal opinion, I seldom comment on his posts. However in this case, @Andy has commented with information that is taken out of context and states a position of Asia Online that is factually incorrect and misleading. @Tom has also taken my words from another context and used them for a different purpose. As such, I am compelled to respond to set the record straight and clear up any misleading information from these 2 DIY MT promoters.
@Andy, I think you are taking Kirti’s words and using them out of context. My read of the above is that you are actually both in agreement with each other.
1. SKILLS AND KNOWLEDGE
Kirti says “The people who build SMT engines at Microsoft, Google, Asia Online and other MT research teams have built thousands of MT engines during their MT careers. The skills developed and lessons learned during this experience are not easily replicated and embedded into open source code.”
You talk about your 25 years’ experience and the experience of your colleagues. Surely that would classify you in the “other research teams” group mentioned by Kirti.
Kirti’s point is that “To expect that any “instant” Moses solution is going to capture and encapsulate all of this is naïve and somewhat arrogant.” In other words, it is naïve to think that it is possible to capture and encapsulate the equivalent of your 25 years of experience simply by clicking a button.
So in this instance, you are both saying that you need experience to do MT properly.
I agree with both you and Kirti on this point. You certainly have the experience to understand and build MT systems, as do people like Asia Online’s Philipp Koehn and Hieu Hoang, 2 of the main developers of the Moses decoder. With resources such as Philipp and Hieu on our team, we have a deeper understanding of what Moses can and cannot do than most. Without these skills and deep understanding of SMT, we could not deliver the quality systems that we have for many of our customers.
The reality is that using Moses alone or DIY Moses solutions is a small part of the overall SMT challenge. Your 25 years of study, research and knowledge is not encapsulated in an instant click of a button. Building a high quality MT solution requires software such as Moses, but that software will not deliver without experience and knowledge, especially in the context of data.
The naivety that Kirti is referring to would be the equivalent of installing Microsoft Word and expecting that you are instantly a good writer and can author a best-selling novel simply because you have installed word processing software. To be a good writer, you need training and experience. Similarly, building a high quality SMT system also requires training and experience. Making the installation easier does not make the “art” any easier.
An SMT system will not deliver to most of its potential unless the people deploying it have:
• Linguistic knowledge
• Natural language processing (NLP) knowledge
• Knowledge of how SMT works internally (at least to a basic level)
• Knowledge of what data will work well with SMT
• Knowledge of what data will not work well or could even negatively impact SMT
• Knowledge of data cleaning
• Knowledge of the client’s requirements
Yes, DIY Moses tools encapsulate some of the complexities of the software, but none of the complexities of data management for building and refining SMT systems. As Alon states, this solves engineering complexity, but leaves much more that is unsolved.
In addition, knowledge is required of what SMT cannot do or does not do well. With this knowledge, technical and linguistic skills need to be combined to overcome those limitations. As a simple example, performing linguistic analysis on German when translating into English, and reordering the German verbs into a more English-like position before translation, increases the quality of the translations. This is relatively simple, but in order to do this, you would need programming skills, linguistic skills and an understanding of how SMT reordering works. As such, it would be naïve to think that this could be done out of the box. The preparation of the training data is specific to the language pair, as is the runtime processing of the data.
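For readers unfamiliar with source-side reordering, the toy Python sketch below shows the idea on a single German subordinate clause. Everything here is an illustrative assumption: the tokens are tagged by hand, the tag names (SUBJ, VERB, etc.) and the function name are made up for the example, and a real system would rely on a parser and far more rules. It is not how Moses or any vendor implements this.

```python
def reorder_clause(tokens):
    """Move the clause-final verb cluster to just after the subject.

    tokens: list of (word, tag) pairs for one German subordinate clause,
    tagged by hand for this toy example.
    """
    # Find where the trailing run of verbs starts.
    start = len(tokens)
    while start > 0 and tokens[start - 1][1] == "VERB":
        start -= 1
    # German puts the finite verb last ("gelesen habe"); English-like
    # order wants it first ("habe gelesen"), so reverse the cluster.
    verbs = tokens[start:][::-1]
    rest = tokens[:start]
    for i, (_, tag) in enumerate(rest):
        if tag == "SUBJ":
            return rest[: i + 1] + verbs + rest[i + 1:]
    return tokens  # no subject found: leave the clause untouched


clause = [("weil", "CONJ"), ("ich", "SUBJ"), ("das", "DET"),
          ("Buch", "NOUN"), ("gelesen", "VERB"), ("habe", "VERB")]
reordered = " ".join(word for word, _ in reorder_clause(clause))
print(reordered)  # weil ich habe gelesen das Buch
```

The reordered clause now lines up word for word with "because I have read the book", which is the kind of input a phrase-based system handles far more easily. The same transformation would have to be applied to both the training data and the text sent for translation at runtime.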
Continued in next post...
Surely you are not promoting that your 25 years of experience is encapsulated behind a single button that makes instant high quality MT systems. As a person with 25 years’ experience, you would know the example above from research published by others and probably even tried it yourself. Yet, this is one of the simplest examples where linguistic knowledge, an understanding of SMT concepts and technical programming knowledge are required. This level of knowledge cannot be expected from the average LSP. In fact it cannot be expected from almost any LSP. At Asia Online we work with some of the largest LSPs in the world and I can tell you as a matter of fact that they do not have these skills. Some LSPs are certainly capable and could be trained or in some cases we are training them directly. However the vast majority of LSPs do not and will not ever have these skills.
The reality is that most do not want to have these skills. The metaphor of a car comes back to mind. In this instance the driver does not need to assemble the car from parts and does not need to gather the parts from places around the globe or scavenge in the trash can and local tips (equivalent of crawling the web for dirty data SMT) for parts. Instead, they buy a mature technology designed to give them a driving experience. They may customize with accessories to make it their own, but they are not starting from the requirement that they must understand how the car works in order to drive it. They are starting from the perspective that trusted professionals have built the car as part of a production line.
2. ACCESS TO CUSTOMIZED SMT SOLUTIONS
You make the point that for most people access to customized SMT solutions is pretty much nil. This is not accurate. Moses has been around for many years. Putting a tool over Moses that makes it easier to install makes it more accessible for less technical people. But all this has really achieved is lowering a technical barrier at the time of installation. The issues faced with data quality and quantities have not changed just because you have a smoother install process for the software. There have been thousands of downloads of Moses by companies and academics alike. But making the install process easier is a tiny part of a huge challenge.
The real question should be “do you need to do this yourself or should you engage a professional?” There is a DIY home improvement model also. Many may have tried this. Some succeed, but many are not satisfied with the results. Due to mistakes, learning curve and lack of experience, most projects are less than satisfactory and can even cost more than if a professional with experience had been engaged in the first place.
The analogy that Kirti provides of the suit and the sewing machine makes a point very clearly. If you have the tools (sewing machine / Moses), it does not make you an instant expert in use of the tools. You need training, experience, skill and knowledge. You can take an off the shelf suit (Google or Bing) and have some level of control.
We have talked with many who don’t have the experience and go ahead and “customize” their MT with Moses and their data. They don’t know how to prepare the data properly. They don’t have all the knowledge listed above. They don’t even know if they have enough data (usually they do not) and they don’t know which data should be used for best results. Naturally they do not get the full potential of their system, even with their “custom” data. Often they buy expensive hardware and spend months trying to make things work. This is despite using a DIY software installer for Moses or other translation tools. They find out the hard way that the data is just as important, if not even more important than the software. So they are no better off with their “custom” SMT system than when they started.
Continued in next post...
So why did these people take this path? Was it simply because it was there? Perhaps they were excited by the promise of “custom” MT and the promise that a DIY solution would magically make up for 25 years of experience. This is the false promise that is being portrayed by many of the DIY or one click solutions. Kirti’s blog post provides a simple explanation to those considering going the DIY path to understand what they are going to need to do and understand in order to be successful. By not providing this information, the promoters of DIY are leaving a gap in the full story of what is required to customize MT. The fact is that software alone is not enough. Throwing random collections of data at this software is also not enough. Any MT vendor that claims it is as simple as uploading your data and having an instant MT system that is customized to a customer’s needs and high quality is misleading their customers. I am surprised that the DIY tools promoters are not more up front about this, as it would only help the LSPs and others who “customize” MT with these solutions have a better chance of success.
If you follow Kirti’s blog back through the various articles he has published, you will find that he often raises key issues that are pertinent for the industry to understand that few others have addressed. This is one of them. Having a trail of MT failures helps no one and follows in the path of MT history – 50 years of empty promises. Discussing issues with and informing those who are considering the use of MT for their business is the right thing to do. By not making clear the challenges and issues, it only increases the chance of failure and frustration, while increasing the already existing perception of many that MT cannot deliver. In a LinkedIn post you stated “I'm an MT 'veteran', but new to the industry. To me, we don't do ourselves any favors by overselling our capabilities, or by failing to acknowledge, what is after all, fairly recent MT history” and this leads into an article about “Overgilding the lily”. By not fully informing potential users of MT of any form, but especially users of DIY MT of what they are engaging in, this is the exact kind of hype that you spoke out against.
In terms of delivering better performance or quality translations, the reality is that even some of the biggest LSPs do not have the data volume to do so. We work with many LSPs from large to small and deliver quality solutions. Many of the top 50 LSPs globally simply do not have enough data in a single language pair and a single domain to achieve their quality goals and as such frequently fall short on quality. Many try to pool data from multiple customers and that often makes things worse, not better. They then download data from TAUS and sometimes this helps, sometimes it makes things worse. We did an in-depth study with TAUS some time back and showed clearly how lower quality data from one source can have a significant negative impact. This report is titled “Study on the Impact of Data Consolidation and Sharing for Statistical Machine Translation” and can be downloaded from http://www.asiaonline.net/resources/reportID4523.aspx. This is a comprehensive report with 29 different SMT engines created with different combinations of the same data. The results speak very clearly for themselves that clean data is essential and that outside data can do more harm than good if it is not managed properly.
The reality is that many LSPs would like to do SMT, but most lack the experience in how to work with the data. Providing easy install software does not change the experience level or increase knowledge. Most LSPs claim their TMs are very high quality. Yet, they forget that each project they do has different quality goals and budget. Often LSPs do not manage their TMs very well. This results in mixed domain data and even mixed languages. Asia Online’s data cleaning process typically rejects 20-40% of TMs that an LSP sends us. This is fact, not an opinion.
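As a rough illustration of what rejecting TM segments can involve, here is a minimal Python sketch of a few generic sanity checks on (source, target) pairs. The specific checks, the `max_ratio` threshold and the function name are assumptions chosen for the example; this is not a description of Asia Online's actual cleaning process, which is not detailed in this thread.

```python
def clean_tm(pairs, max_ratio=3.0):
    """Split (source, target) pairs into kept and rejected lists.

    Rejected entries carry a reason so the rejection rate can be audited.
    """
    kept, rejected, seen = [], [], set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if not src or not tgt:
            reason = "empty side"
        elif src == tgt:
            reason = "untranslated"  # target is a copy of the source
        elif (src, tgt) in seen:
            reason = "duplicate"
        elif max(len(src), len(tgt)) / min(len(src), len(tgt)) > max_ratio:
            reason = "length mismatch"  # likely misaligned segments
        else:
            reason = None
        if reason:
            rejected.append((src, tgt, reason))
        else:
            seen.add((src, tgt))
            kept.append((src, tgt))
    return kept, rejected


sample = [
    ("Press the button.", "Drücken Sie die Taste."),
    ("Press the button.", "Drücken Sie die Taste."),
    ("Save", ""),
    ("OK", "OK"),
    ("Yes", "Bitte lesen Sie die gesamte Anleitung sorgfältig durch."),
]
kept, rejected = clean_tm(sample)
print(f"kept {len(kept)}, rejected {len(rejected)}")
for src, tgt, reason in rejected:
    print(f"  {reason}: {src!r} -> {tgt!r}")
```

Checks like these only catch mechanical problems. Mixed domains, mixed languages and inconsistent terminology need language identification and linguistic review on top.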
Continued in next post...
Directly addressing the point of translator empowerment (perhaps the term “enablement” is more clear), this should be in the form of tools that make their job easier. One of the criticisms of MT is that it makes the same mistake over and over.
This is where we have focused on a rapid improvement process that takes direct translator feedback. As Kirti points out, the DIY solutions require huge volumes of data, as do solutions from SDL and others that take the dirty data SMT approach. We recently replaced an SMT system provided by SDL that had shown very little improvement over a 2 year cycle. This was despite several hundred thousand dollars of expense in post editing feedback being provided to improve the engine. In the end, the client was told to provide even more data to get any real improvements. Naturally the client was disillusioned with MT and the promises made.
We knew going in that we had a challenge with this customer’s perspective of MT. We did a pilot with our unique approach to clean data and the quality was immediately better from the very first system. We extended the pilot and took in editing feedback and got significant improvement within weeks. This is the area where translators are empowered, getting rapid improvement from their editing efforts and providing the means to control the quality of future translations. This case study will be presented in one of our webinars shortly.
Our approach focuses on enabling greater productivity of the translator, not empowering them to have ownership of the entire translation process. In our approach, every edit counts and improves future quality. If the data that the engine is trained on is not right from the outset, this approach will fail. It is all too easy to mistake the ability to train with the ability to improve quickly. As Kirti correctly states, many of our competitors tell their clients they will need to add the equivalent of 20-25% of the initial training data volume to see meaningful improvements. Given this reality, being able to train often is no substitute for being able to improve quality often and at meaningful levels.
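To put the 20-25% figure in perspective, a small worked example follows. The engine size of 2,000,000 segments is a made-up assumption chosen only to make the arithmetic concrete, and `extra_data_needed` is a hypothetical helper name.

```python
def extra_data_needed(initial_segments, low=0.20, high=0.25):
    """Return the (low, high) range of additional segments implied by
    the 20-25% rule of thumb quoted above."""
    return round(initial_segments * low), round(initial_segments * high)


# Hypothetical engine trained on 2,000,000 segments.
low, high = extra_data_needed(2_000_000)
print(f"{low:,} to {high:,} additional segments")
```

Under that assumption, a team would need several hundred thousand new segments before expecting a meaningful change, which is far more than routine post-editing produces in a short cycle.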
Continued in next post...
4. SCAREMONGERING
You state that “This article adds to that scaremongering; the way forward is not to keep the doors closed and say that the only way in which people can access state-of-the-art MT solutions is by leaving it all to the experts who know better: that's real arrogance, IMHO ...”
Kirti does not state that it should be left to the experts such as those with 25 years’ experience like you. The point he makes is about the many things you will need to know, the skills you will need to acquire and the tasks you will need to perform. The DIY promoters, such as yourself, pay little attention to these issues and instead keep sending messages that mislead and overpromise, “overgilding the lily” by not giving potential users of the technology the full picture of what is involved.
Educating potential users about the issues, and about the things that will help make their DIY efforts more successful, increases the chances of success for projects using DIY approaches. This is the opposite of scaremongering. Brushing the issues under the rug or ignoring them will only come back to bite in the future.
With a clear understanding of all that is involved and what systems can and cannot do, the prospective user of MT (in any form) can make better informed decisions.
@Tom
You are correct, we do use the term “near human quality” in our marketing, but it is far from hype. We have multiple case studies from LSPs and their end customers who agree and are on record agreeing. We even have case studies with some customers where the MT system was delivering better quality than the human translators. The client I am referring to is a major LSP, Sajan. Their client is one of the largest IT companies globally; it translated millions of words and achieved a 60% saving on costs and a 77% saving on time. The client, not the LSP, came back and reported that the MT output from English into Chinese was beating their first pass human translators. It is easy for an MT vendor to make these claims and easy for competitors to call them hype. But these claims are verified by both the LSP and a major multinational IT client directly. They are on record here in both slides and video at the LRC conference in Limerick earlier this year.
Sajan: “The client told us that the quality of the Chinese language machine translation is better than the human translators for the first translation phase of the TEP (Translate, Edit and Proof) process. In other words, it is absolutely near human and the post editing is only needed. It is absolutely there and that was corroborated by the customer.”
Slides: http://www.localisation.ie/resources/conferences/2011/presentations/LRCXVI_Sajan_MT_LRC_2011.pdf
Video: http://www.youtube.com/watch?feature=player_detailpage&v=hjK17GWynoU#t=1535s
So yes, we can claim and deliver “near human quality”. No, it is not hype. When you have a sophisticated LSP like Sajan, who went through the training process with Asia Online and learned how to use the tools and worked with Asia Online to build their data correctly, this is indeed possible. If Sajan had just thrown the data at a server to create an “instant” solution, then they would not have achieved this level of quality and the customer would not have been as satisfied with the result. The reality is that most LSPs are not as sophisticated as Sajan. This is where Asia Online works even closer with the LSP to build the system and train the LSP and work with the data.
Sajan’s experience was not a fluke. We have many other case studies also. In a few days from now the latest Asia Online newsletter will be sent out. Inside is another example of “near-human quality”, with the LSP (Hunnect) going on the record about their experience with Asia Online. Their language pair is English into Hungarian, which is known to be particularly difficult for MT. On Hunnect’s very first project they were able to save 46% on time and increase their profit margins from 25% to 45%. Hunnect also put the effort in and worked with Asia Online to understand how to build high quality MT systems. They even created an online training course for their translators.
I think the analogy that you provided does not fit at all. The automobile hides much of the complexity of its inner workings. Even the engine, you buy as a complete unit. You don’t buy it in small bits. DIY Moses is the equivalent of packaging the engine as a complete unit. It still requires the user to learn how to drive once the car is assembled. Kirti is pointing out the challenges that must be overcome in order to deliver a system. “Learning how to drive” your MT system is not as easy as learning how to drive your car. It may be in time, but the reality is it is not simple today. Any driver can put gasoline in the tank. But what if the driver is new and puts kerosene, diesel or perhaps even water in the tank instead of gasoline? This is a more pertinent analogy. In Kirti’s messaging, he is not prohibiting people from trying to build their own MT systems. Rather he is ensuring that they are aware of the challenges. If they wish to attempt the challenge with this knowledge in hand, then they will have a better chance of success.
Continued in next post...
You state in your response that expertise is required that “exceeds most localization engineers, system administrators and project managers of today’s LSPs.” Along with @Alon’s statement that “Users may be disappointed with what they get from DIY Moses, and more detrimentally, become convinced that that's the best they can accomplish, when in fact letting expert MT developers do the work can result in far better performance results.” Neither you, Kirti, Alon or I are in any way stating that people should not try. What I believe we are all saying is that expertise is needed. If an LSP wants to try, it should acquire the expertise, either by hiring them or from third parties. Those who attempt MT without acquiring the necessary expertise have a higher risk of failure and will be unlikely to achieve the best MT results possible with their data.
Using the car analogy, today anyone with a small amount of training can drive a car. But how many of these drivers understand enough about the car to give it a tune-up? Instead the driver takes the car to a specialist. In the past the car was easier to tune. Today, as with MT, the complexity of a car’s engine is increasing. More sophistication and knowledge is required in order to tune the car to its optimal performance. Often complex and expensive tools and equipment are required to tune the car. Again, this is like MT, with professionals such as Asia Online developing comprehensive tools that optimize and tune the engine and data to give optimal performance. These tools do not ship in a DIY solution and are proprietary, several of which are in the process of being patented. You will recall that a car is supposed to be maintained by a specialist on a regular basis and drivers take their cars to a service center for this. A small percentage of drivers may change their own oil. An even smaller percentage will work directly on their car's engine to improve it. Most take their car to a specialist for this.
As Alon points out “these [DIY] efforts primarily target and solve the *engineering complexity* of deploying Moses.” … “I think there is a potential pitfall here that users that are not MT experts (the vast majority) would come to believe that that's all it takes to build a state-of-the-art MT system.” The DIY community keeps ignoring this very important point and frequently even pushes against it.
With respect to my comment that you quote “I do not think that the market is mature enough to say that one model is out and the other is in... This position presented in this topic is a marketing and sales line, not reality in the market." – this comment was in the context of @Andy’s statement that “Looks like the licence-based model for MT procurement is on its way out in favour of pay as you go”, where he immediately supported his statement with pricing information. The DIY model is one of several and suitable for some and not others. In no way am I advocating that some must not try. What I am stating is that those who do try should be aware of the issues and the challenges and come prepared to address them. Ignoring them will be costly and most likely result in failure.
Continued in next post...
This is corroborated by your statement (which I agree with) that “companies considering its benefits should tread carefully, focus on one language pair and learn what's involved, including the possibility to use the technology themselves at their pace in their budgets.” To expect to achieve results at the levels Sajan was able to achieve with a single click solution in a language pair such as Chinese (or any for that matter, we repeated this in other languages also such as Spanish) is naïve. DIY promoters that advocate knowledge and experience are doing their target audience a favor. Unfortunately there are very few that address the real issues.
Holding up the Autodesk case study as evidence of what can be done with Moses is good. But ignoring the skill, effort and knowledge that Autodesk put into these systems to achieve this result and making prospective MT users believe they can do the same without the effort is simply misleading. Omission of this information by DIY promoters may not be deliberate, but now that this message has been raised by Kirti, Alon and others, there is no excuse from the DIY community for omitting this important information in future.
@ Mr Wiggins. Got it.
Thanks for getting to the heart of the matter Kirti and Dion.
And indeed this is where we believe TAUS has and will continue to play a pivotal role - working hard to ensure the informed use of translation automation.
This is what Jaap and Andrew began fostering when large organizations first shared their experiences at early TAUS meetings in 2005.
It's what reports like Manager's Guide to Implementing Open Source SMT (http://tinyurl.com/c4pccvg), How to Implement Open Source MT Solutions (http://tinyurl.com/bpxtbfl), among many others (http://tinyurl.com/bpkps8r) are aimed at.
It's also why we began providing workshops (hands on tutorials) a couple of years ago (http://tinyurl.com/bux46ch). Something we will probably (at least partially) move online next year to open up access for many more.
It's what the members' inspired TAUS Tracker (http://taustracker.com/) - a set of free translation and language directories, which enables users to provide feedback on a myriad of tools - is all about.
Not to mention TAUS Labs (http://tauslabs.com) where we begin to work with members to operationalize our and members' visions of better applications of dynamic quality, interoperability and open source MT.
You are all no doubt aware of our events at which so many have shared (http://www.youtube.com/user/TAUSvideos)
All the above seek to help tackle the skills and knowledge gap.
TAUS Data Association (http://tausdata.org) is of course aimed squarely at opening up data - for the long haul.
No hype, no sleight of hand - just plain insights. This will remain a nuanced field for some time. We believe that by sharing knowledge and resources we will all grow our share of a growing pie.
It's great to see that TAUS members are contributing to this discussion thread!
When I wrote about "The beginning of the MT wars" in April 2010 (http://blog.pangeanic.com/2010/10/04/final-dominance-by-final-technology-the-beginning-of-the-mt-wars-ii/), I did not expect that the "wars" would come in this shape.
I find the original entry too opinionated and biased, Kirti. LSPs like Pangeanic, who have run the full circle of learning, adopting, implementing and exporting the technologies, would not put their money in operations if the technology had not been tested.
You make a case for AO working closely in a quality feedback loop with translators and LSPs, yet it is precisely those LSPs which develop tools around the Moses kit that, it seems, deserve public scorn, despite having gone through extensive product testing. I know everyone else who has replied here has done proper product testing, too.
I do not know of any company who claims "instant DIY Moses deployment" without an initial level/stage of customization. We do claim to provide the tools to clean and update the engine(s), which is something any good data-driven MT implementer has developed. Other initiatives provide a harness on Moses (Adobe, see their ppt in Xiamen and the EU-sponsored LetsMT).
Personally, I spent a significant part of my life working with machines that made machines people drive (your analogies to the car industry) - so I am a big fan of automation.
I do believe the future is in empowering the community and what we are witnessing here is a case of professional jealousy or conflicts of business models.
About the comment about how long these initiatives/companies will last, we plan to stick around for a pretty long term. Our business model is not so much output-driven (sell words cheap to LSPs) but about empowering them with the tools we test and develop.
Ljubomir Lukov •
Another good analysis on the blossoming of the DIY Moses engines we see these days. Still, I think that much of the success of the MT tools depends on the people who work on their output and their motivation to improve the text quality.
@Manuel
I think you are mistaking my comments regarding the unrealistic expectations some LSPs have about Moses instantly delivering value for criticism of your capabilities. If you look more closely at the post, you will see that the bulk of my comments are directed to users (especially LSPs), not to the tools developers. I am aware that most DIY Moses developers understand what they themselves are doing, but many of their customers have very little understanding of what SMT involves. My point is that the path to good MT engines that provide business value involves more than pushing a few buttons in a DIY Moses offering.
There were in fact some DIY Moses presentations at TAUS 2011 that promised that usable MT systems can be produced almost instantly. This is the hype I refer to. Check out the videos – you will see some do say this.
The Adobe and Autodesk efforts referenced by several here, ignore the fact that their team members are sophisticated technical users, or in the case of Adobe actually have NLP experience. Thus their efforts cannot be equated to what an average LSP might do.
Moses can work for an LSP, but like every other MT effort, it requires knowledge, data, some experience and A STEERING EFFORT. The tools are only as good as the people using them; having a sewing machine does not automatically make you a tailor. I am only claiming that pushing a button that sets Moses into motion, without understanding how it works and what might happen, is not likely to give you MT engines that improve translation production.
@Manuel
I am not sure what war you are referring to. What I see is a maturing of the industry. In fact, what I see is that the industry has passed a tipping point in the year 2011, with a great many steps forward:
1. Google starting to charge for MT, which led to Microsoft starting to charge for MT and many of their users looking to alternatives. If they had to pay, they wanted better quality and more control.
2. Google and Microsoft charging for MT was seen as a sign of maturity: a technology that was once viewed as a joke may now be worth investigating.
3. As investigation grew, companies began popping up to take advantage of these new opportunities. Some packaged the work of others to make it easier, while others built their own MT solutions and invested heavily into R&D.
4. Proof points and case studies began appearing in the market more than ever before – for both failure and success.
5. With the wave of investigation and case studies, came new hype, new discussion and a natural pushback from various sources that one model is better than the other.
What is happening in the MT industry is far from a war. On the contrary, there are clear signs of a rapidly maturing industry, as was noted in our recent newsletter (http://www.asiaonline.net/newsletters/201110.htm#3). This is a good thing for industry players and end users of MT.
Additionally, I do not see Kirti’s post as biased. I see it as an open exploration of what is really and truly required to build successful MT systems. Other than some DIY/Self Service promoters, nearly all comments are in alignment with this, not flaming or in disagreement. Total Cost of Ownership (TCO) is being investigated and Return On Investment (ROI) metrics are being sought. In order to determine these, honesty is required from the market players, where the real effort, time and costs are expressed to the market. This is what Kirti and others have been exploring in this blog post. Through this exploration, issues are being identified and then addressed. The value and issues encountered with a DIY MT installer is one such example. As @Tom pointed out in one of his posts “Mastery takes time… time to learn, time to practice, time to fail, time to experiment and time to contribute.” All of these come at a cost.
As I said to @Tom, omission of this information by DIY/Self Service promoters (thus far) may not be deliberate, but now that this message has been raised by Kirti, Alon and others, there is no excuse from the DIY/Self Service promoters for omitting this important information in future. Now that the realities are being discussed openly, continued failure to make prospective customers aware of these issues would truly be misleading.
With respect to your last point about “how long initiatives/companies will last”, again this has been taken out of context. @Kirti said “The learning curve for this technology is long and arduous and it may take a while for the dust to settle from the current hype, but I fully expect that by December 21st, 2012 it will be clear that expertise, experience and knowledge does matter with something as complex as Moses.” This does not in any way refer to the companies providing these technologies or individuals driving these initiatives within those companies such as yourself. Nor does it refer to how long any such company will exist. It refers to the end users, who are the actual users of these tools, finding out for themselves the appropriate skill level, knowledge and expertise required through actual real world experience.
What surprises me is that some DIY and Self Service MT promoters seem to be afraid to discuss these issues and instead convey the unrealistic message of “upload your TM, 1 click and you have a fantastic MT system almost instantly”. As Kirti mentioned, this is claimed in TAUS videos and even in one of the responses to this blog entry.
@Dion It is not unrealistic, on the contrary it is very realistic!
@Gavin and @Andy, - I guess I (like others) am having trouble reconciling some of the mixed messages emanating from your company. They are confusing and I personally feel misleading in many cases due to either conflict, lack of clarity or over simplification. Examples provided below within quotes are taken verbatim from recent discussions:
@Gavin: "Firstly I think it is worth saying that I completely agree with the fact that a completely customized SMT engine with lots of work on data cleaning will always give the best results."
@Andy: "with self-serve MT, clients without the necessary MT and computing expertise to install Moses themselves, have for the first time the ability to build an MT system based on their own user requirements pretty much instantly."
@Gavin: "The point and click type solutions, ourselves included are not suggesting that it will be as good as full customization."
So if I am to understand the message coming from your company correctly it is:
"We process the data that you upload quickly to make an MT system – we call this “self-service” because the only thing you need to do is upload the data and you do not need MT or computing expertise to build a system. However the fact is that a completely customized SMT engine with lots of work on data cleaning will always give better results than our systems.”
Another way to say this more simply would be:
“The best MT systems are always those that are fully customized.”
“The best MT systems are always those that have been built through a focus on the data.”
Pretending for a moment that an “upload your TM, 1 click and you have a fantastic MT system almost instantly” approach actually was realistic, must it then also be realistic in this hypothetical scenario that the LSP has all the knowledge necessary to answer the following questions?
1. What is the right data to upload for my MT system?
2. How should I prepare my data?
3. What cleaning can I do that the magic 1 click button does not do?
4. What impact will my data have on the MT system?
5. Will the data I upload improve or decrease quality?
6. What will mixing data from multiple domains do to my MT system?
7. Should I add some or all of the TAUS data to my system?
8. Once I have a system, how can I make it better?
9. When I see an error in my MT output, how can I know the cause of the error?
10. When I see an error in my MT output, how can I fix the error?
11. …
I could keep listing many areas of knowledge that builders of MT engines using any MT technology will need in order to build a quality engine, and reasons why a “1 click instant MT” approach will always deliver lower quality. Of course it is not realistic for the LSP to possess all this knowledge – at least not without training and experience. Just to be clear – this is experience the LSP needs, not the MT provider. Several have been very quick to claim their own level of expertise, but this is no substitute for the LSP having their own expertise and knowledge. It does not matter how experienced the team is at the MT provider if the LSP has no concept or knowledge of what happens when they upload a specific set of data. In the report “Study on the Impact of Data Consolidation and Sharing for Statistical Machine Translation” (http://www.asiaonline.net/resources/reportID4523.aspx) this is analyzed in great depth and shows conclusively that the wrong data can have a vastly negative impact on engine quality.
I do agree with @Gavin that a fully customized system will always give the best results. However, where I disagree is that, in my view, knowledge is required even in a single click approach, whereas the opposing viewpoint put forward by @Gavin and @Andy is that knowledge is not required and a single click will deliver a fantastic MT system.
@Dion in a response to @Andy earlier in this blog post “Surely you are not promoting that your 25 years of experience is encapsulated behind a single button that makes instant high quality MT systems.”
Continues in next post...
...Continued from previous post.
@Dion: "What surprises me is that some DIY and Self Service MT promoters seem to be afraid to discuss these issues and instead convey the unrealistic message of ‘upload your TM, 1 click and you have a fantastic MT system almost instantly.’"
@Gavin: It is not unrealistic, on the contrary it is very realistic!
I was originally encouraged by the post from @Andy in his “Overgilding the lily” blog entry. Excerpts are below:
@Andy: “As an academic, for over 20 years I’ve been educating students – many of whom now work in the industry – to not over-hype the capability of their companies’ tools and services.”
@Andy: “We must all act collectively to make realistic claims about what we can and cannot achieve, and to be aware when engaging in publicising recent developments in our respective companies to be aware of the rich history of developments in our field, and to ensure that claims are in synch with what has gone before, lest we alienate the very people we’re all trying to attract.”
However, my enthusiasm was rapidly eroded as I read recent posts. Claiming that “upload your TM, 1 click and you have a fantastic MT system almost instantly” is realistic, while also stating in other recent posts that “completely customized systems will always give the best results”, sends a mixed and somewhat contradictory message. Even those who try DIY solutions will note that after the installation is done (or, in the self service model, the data uploaded) nothing is instant – training an MT system takes time, CPU resources, disk resources and human resources.
I agree with Andy that we should not overhype and when statements are made, back them up with fact and proof points. That is why I replied to @Tom when he challenged Asia Online's claim of being able to deliver “near-human quality” MT output with actual case studies published or presented by third parties other than Asia Online. This showed clearly where our technology was used successfully and provided actual proof points without any hype. In one case (http://bit.ly/s2KPyq), a major multinational was able to deliver English->Chinese MT output that beat their first pass human translators and Sajan, the LSP that we worked with to create the fully customized engines for this client is on video record with their presentation at the LRC conference in Limerick earlier this year.
@Gavin "There was a comment made in another post which suggested that anything less than a hand tailored suit was not worth having." The source of the sewing machine and tailor metaphor was Asia Online’s recent presentations on ROI (http://bit.ly/t7f7wf). Not once did we say what @Gavin has suggested. It seems @Gavin has not reviewed our presentation (slides and video) properly or he would have understood what was meant by our reference in a bullet to “off the rack suit”. In the presentation, we refer to the “off the rack suit” as representing Google and Bing. A DIY or self-service solution is not the equivalent of an off the rack suit. We point out in the presentation that a DIY or self-service solution is the equivalent of being presented with a sewing machine and a bunch of fabric and then being told to make your own suit with it.
Continued in next post...
...Continued from previous post.
The reality is that Google and Bing will often give better results than a DIY or self-service solution, especially when the LSP does not have an understanding of the data and the necessary knowledge that is outlined above. Good MT systems can come from both DIY and self-service with this knowledge and skill. But this knowledge cannot be encapsulated behind a single click button now or ever. It takes time to build the necessary knowledge. There are some great results from DIY solutions such as Autodesk's example, but that was achieved via a lot of hard work, some failures, some experimentation and learning. Uploading data with a single click can be achieved, as can some cleaning of the data. But the real work on the data is not to remove formatting tags, it is to apply the right data necessary in the right quantities to build the best engine for a specific task. This is why, as @Gavin correctly acknowledges, a completely customized solution will always give the best results.
Any MT vendor that tries to convince you that you can “upload your TM, 1 click and you have a fantastic MT system almost instantly” is misleading customers and will ultimately disappoint. Once the MT system is built, customers must use it and will immediately find that they have to spend far more on post editing the MT output than if they had built a system properly in the first place. The effort must be put in somewhere. There is no magic wand for great MT systems. They come from hard work and knowledge of what it takes to build an MT system. If you don’t put the effort into building the MT system, you will put even more effort into editing the MT output and ultimately end up with higher costs.
@Mr Wiggins. Your claim that @Mr Hoar challenged AO's claim to "near-human quality" is blatantly false and intentionally misleading.
Please read my post carefully. I wrote I "missed the 'hype' from the DIY crowd claiming 'near human quality.'" In the entire post, I did not reference AO or challenge its claims to "near human quality." My comments about 'near human quality' were clearly limited to the DIY community. Surely, a word craftsman and former VP of Gartner Research can see the distinction. It was your comments that brought AO into the discussion, not once, but twice.
By the way, has Gartner commented on AO's leveraging its Hype Cycle methodology in the blog's opening graph?