Saturday, December 29, 2012

Annual Review–Most Popular Posts of 2012

“Blogs are about sharing with authenticity. A good blog can help you really connect deeply with your audience in a meaningful way because the content is not only relevant but insightful and personal. I think most enterprises miss that point. When you do it right, your customers will walk away not only having learned something new but will also feel much more connected to your brand.”
David Armano, EVP, Global Innovation & Integration at Edelman Digital

It seems like it was just a moment ago that I summarized the most interesting blog posts of 2011, but here we are again and the world has not ended. I was not as active writing in 2012 as I was in 2011, as I felt that I had said much of what I had to say, and there is only so much one can write about machine translation without being repetitive. The topic has had more coverage across the industry and is perhaps slightly better understood now than it was last year. I am limiting the list to the top 6 since I had fewer new posts this year. Since Google has killed the PostRank service, I am now reduced to providing only a list of the most popular blog posts. PostRank used to give us much better insight into the broader influence of any web content and helped identify seminal and influential content rather than simply popular content. I resolve to be more active in the new year if I have ideas for new material, and I am always open to suggestions. There are still many misconceptions about MT, and I think it would be useful to cover them in more detail; perhaps I will delve into that in 2013.

Here is the list of most popular posts in order of popularity:

  1. Exploring Issues Related to Post-Editing MT Compensation: This article continues to get attention even though it was written early in the year, and it still shows up in the top 3 almost every week. The post links to several interesting comments on post-editing, and I think this is one of the reasons it remains popular: it gathers different opinions and viewpoints in a useful and unbiased way. The popularity of this post suggests that compensation is an important issue to resolve fairly in order to enable broader MT adoption. All parties involved need to work together to establish trusted and equitable compensation for this process. I hope that others will step forward to share opinions and approaches that might further the dialogue. It would be especially useful for translators to suggest ways to do this more efficiently and accurately. For example, this post by Jason Hall shows that simply equating MT output quality to TM matches may not make sense, and that leveraging MT is entirely different from leveraging TM.

  2. The Moses Madness and Dead Flowers: This post was written very late in 2011 and thus its popularity was not reflected in the 2011 list. It is another post that has continued to see regular traffic as more people wade through the Moses technology and realize that “free” and “DIY” is still really a pipe dream with MT. Whipping up some sort of MT system by throwing data into a computer has become very easy, but the technology is still very complex and hairy, and it requires at least "some" fundamental knowledge for any real success. I remain very skeptical about any instant MT approaches and I think we will continue to see a market where you get what you pay for. I would avoid any LSP whose strategy is based around instant MT solutions.

  3. Emerging Language Industry & Language Technology Trends: This post seemed to strike a chord and very rapidly rose to become one of the most popular posts of the year. Thanks to all those who shared their opinions to provide broader context. In case you missed it, you may also wish to take a look at Translation Guy’s humorous take on the post. You may also find the Asia Online Trends and Translation Industry predictions interesting, and you can access the webinar and slides through the link provided.

  4. A Short Guide to Measuring and Comparing Machine Translation Engines: This post provided specific and constructive advice on using BLEU scores correctly to assess your MT systems in a fair and accurate way. I regularly see BLEU scores being used to mislead gullible users, and there were even some presentations at the AMTA 2012 conference that claimed systems scoring .90 (or 90), which to my mind is only possible if you cheat. In short, BLEU measures the quality of MT system output against one or more human reference translations of the same material. The measurement needs to be done carefully if you want meaningful and accurate results. It is possible to calculate BLEU scores on two human translations of the same material, and even there I have never seen a score higher than .7 (or 70), since humans do things quite differently. There is a great discussion of the many issues with BLEU in this article, and I recommend it so that you can understand the increasing number of discussions where BLEU is referenced today.

  5. The Relationship Between Productivity and Effective Use of Translation Technology: MT should only be used when it actually provides measurable productivity advantages. Higher quality MT systems generally provide a much higher return on investment (ROI), and this post explores this issue in some detail. MT is a means to build long-term production advantage, but only when you do it well, so if you are going to invest in this technology my advice is to do it as well as possible. Most of the shortcuts will lead to dead ends. Remember that with MT, you are competing with smart people at Microsoft and Google who are doing the best they can for a general internet user population. Most translators will likely prefer these "free" engines to crappy LSP-produced Moses and RbMT engines.

  6. Understanding Post-Editing: This is one of several posts on the subject of post-editing. The subject is worth exploring further, as there are many misconceptions about the nature of the process, and it would be useful for more voices to air both good and bad post-editing experiences so others can learn. Jost Zetzsche has written about this in some detail in his newsletter, but the scope and understanding of the role of language experts is still evolving and it is a worthwhile discussion to continue. I have not seen anything really useful coming out of conferences, so I suspect the best material on the subject will appear in blogs and LinkedIn discussion forums.
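
For readers who want to see what the BLEU description in item 4 amounts to in practice, here is a minimal sketch in Python. It is not the scoring tool used in any of the posts above; the function names `ngrams` and `bleu` are hypothetical, tokenization is plain whitespace splitting, and there is no smoothing. It simply illustrates the usual recipe: clipped n-gram precision for n = 1 to 4, combined as a geometric mean and multiplied by a brevity penalty. Real evaluations are normally run at corpus level with standardized tokenization, so treat the numbers from this sketch as illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU: whitespace tokens, no smoothing."""
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # For each n-gram, keep its highest count across all references.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated words cannot inflate precision.
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0  # unsmoothed BLEU is zero if any n-gram order has no match
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: compare against the reference closest in length.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)
```

With this sketch, a candidate that is word-for-word identical to a reference scores 1.0, and any change in wording or word order pulls the score down quickly. That is consistent with the point above that even two competent human translations of the same text rarely score very high against each other.
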
I once again invite any interested guest authors who might wish to use this blog as a way to share an idea or an opinion on the translation industry. (There is a good blend of buyers, LSPs and translators who watch this blog.) I am not looking only for those who agree with me, and in fact I hope that some who disagree will also step forward. I have always thought that it is useful to hear many different opinions to better understand a subject. So please don’t hesitate to send me contributions that you think might be interesting to the audience that has been following this blog. I thank you for your support, and I hope that the content here will continue to earn your interest and your comments, which extend the discussion beyond my own thoughts on key translation automation issues.

It is also interesting to note that some older posts continue to strike a chord with readers and remain visible because their themes are longer lived and perhaps also because they ring true. The original post on standards, the analysis of why Google changed the use model of their MT systems, and some of the posts that discuss the reaction to automation or industry disintermediation continue to generate interest and show up high in the list in Google Analytics.

I found a very interesting blog post that I think is worth a read, as it points to the changes that widespread information availability and ease of access bring to traditional commerce among socially engaged human beings. There is also a link to the research data from Mary Meeker on the changing online world that is worth at least a quick look. I think we are heading back to a world where it is more important to understand how people connect than to assume that technology and data will solve every problem known to man. I have always preferred the emphasis on Why? rather than How?

Commerce in 2013 is about integrating the whole experience around the customer -- social, local, and mobile, bricks and clicks, in real life, in real time, and over time.  
Finally, I want to share a beautiful piece of music by Mercedes Bahleda that I discovered through Pandora. The video is also very evocative and sublime, with scenes of inter-species communication and a languorous swim dance. Those of you who find the sight of a female human breast offensive (there are unfortunately many in America who actually do) may wish to avoid looking at the video. I suggest you turn the volume up and play this on good speakers for maximum effect.

Happy New Year – I wish you health, happiness and joy

I don’t tell the murky world
To turn pure.
I purify myself
And check my reflection
In the water of the valley brook.

Zen Master Ryokan

“If the light’s not in you, you’re in the dark.”
Marty Rubin

