
Wednesday, June 1, 2011

Analysis of the Shutdown Announcements of the Google Translate API


There has been some buzz about what this means to the translation industry, so I thought it would be good to have a detailed and in-depth analysis of this announcement. This is an insightful post by guest writer Dion Wiggins, CEO of Asia Online (dion.wiggins@asiaonline.net) and a former senior Gartner analyst. The opinions and analysis are those of the author alone.

Reviewing the Facts of the Announcement

A simple read of the announcement would lead one to believe that Google has merely shut the door for developers who wish to integrate automated Google translations into their code and products, while still allowing users to translate web content on-the-fly with either the Google Translate web page or the Google Translate Widget/Web Element. However, there is more to this announcement than one might realize at first.

Google has recently made a number of formal announcements, in addition to a few quiet actions, affecting users of its various language translation tools. On Thursday, May 26, 2011, Adam Feldman, Google’s APIs Product Manager, announced in a blog post (http://googlecode.blogspot.com/2011/05/spring-cleaning-for-some-of-our-apis.html) that Google was adding 7 new Application Programming Interfaces (APIs), but that 18 Google APIs covering a variety of areas would first be deprecated and many then shut down. One of the APIs destined to be terminated is the Translate API.

Google updated the Google Translate API, Transliteration API and Translator Toolkit API webpages with the following messages:
Important: The Google Translate API has been officially deprecated as of May 26, 2011. Due to the substantial economic burden caused by extensive abuse, the number of requests you may make per day will be limited and the API will be shut off completely on December 1, 2011. For website translations, we encourage you to use the Google Translate Element.
Important: The Google Transliteration API has been officially deprecated as of May 26, 2011. It will continue to work as per our deprecation policy.
Google frequently offers alternatives to deprecated APIs. An alternative to the Google Translate API would have been the Google Translator Toolkit API. However, without making any announcement, Google has also quietly modified access to the Translator Toolkit API, removing all documentation and restricting access.
Important: The Google Translator Toolkit API is now a restricted API. However, we have no current plans to remove the functionality for current users. If you are a current user of the API or are interested in access to the documentation, please let us know.
These changes do not mean that Google is going to do any less with its own machine translation efforts. It simply means that all forms of translation API are now being progressively deprecated or restricted for use by developers. 

The impact on Google’s translation services as a whole can be summarized as follows:
  • Google Translate web page (http://translate.google.com/) will still translate text that is typed into the text box and will also translate an HTML web page when a URL is submitted.
  • Google Translate Widget/Web Element (http://www.google.com/webelements/#!/translate) will continue to function and will still translate content on-demand when a viewer of a web page requests a translation.
  • Google Transliteration API (http://code.google.com/apis/language/transliterate) will continue to function up until May 26, 2014 as per the deprecation policy.
  • Google Translator Toolkit (http://translate.google.com/toolkit) will still function as before and users can submit TMX and other document formats for translation.
  • Google Translator Toolkit API (http://code.google.com/apis/gtt/) will continue to function as before and documents can be submitted and retrieved as previously. However new development using this API has been restricted.

Abuse, Economic Burden and Google’s Right to Shut down the Translate API

Google has been offering the Translate API free of charge. The Terms of Use (http://code.google.com/apis/language/translate/terms.html) discuss deprecation of the service. In the terms Google states:
For a period of 3 years after an announcement (the "Deprecation Period"), Google will use commercially reasonable efforts to continue to operate the Deprecated Version of the Service

Google has, however, noted that it will continue providing the service only until December 1, 2011, as it considers that there is indeed a substantial economic burden. The blog entry, which says that “Following the standard deprecation period – often, as long as three years – some of the deprecated APIs will be shut down”, conflicts with the December 2011 shutdown date. The two statements are contradictory and bring into question the extent to which users can rely on Google’s Terms of Use.
Google has worded the announcement carefully so as to allow the use of the following clause:
Google reserves the right in its discretion to cease providing all or any part of a Deprecated Version of the Service immediately without any notice if:
d. providing the Deprecated Version of the Service could create a substantial economic burden on Google as determined by Google in its reasonable good faith judgment; 

While Google may be within its rights to shut down the service if there is indeed a substantial economic burden, lack of clarity means that Google risks further upsetting, confusing and frustrating its users and developer community by not being fully transparent on the reasons for the decision.

The vagueness of the announcement and the lack of information on the rationale behind it are already giving rise to speculation as to what the actual reasons could be:

Substantial Economic Burden: The amount of text translated via Google’s Translate API is believed to be only a fraction of the volume translated by other means that Google provides, such as the web interface or the Google Translate Widget/Web Element. Costs incurred by Google would include bandwidth and processing capacity, but the Translate API expenses overall would be minuscule when compared to those of Search, YouTube, Gmail and Google Apps. It is therefore reasonable to infer that the substantial economic burden is not related to the operational costs of the API itself.

Extensive Abuse: This could be interpreted as users of the API not following the Terms of Use (http://code.google.com/apis/language/translate/terms.html) of the Google Translate API. Abuses would include activities such as using the Translate API for commercial purposes in a manner that Google deems in violation of its Terms of Use or incorporating output from the Translate API in website content.

Deeper Analysis

Google’s stated mission is to “organize the world's information and make it universally accessible and useful.” Language translation is one method for the creation of content for this purpose that helps achieve this mission. However, with this announcement, Google is making it very clear that it reserves the right to exclusively use this method within its own applications such as the Translate Widget/Web Element. Google wants to control how and when content is translated into another language and by whom.


In order to better understand the announcement and analyze possible reasons for Google shutting down the API, it is important to understand Google’s products and customers. Google’s primary revenue source is advertising driven by content and the contextual analysis of said content.

In this model, Google’s customers are advertisers who purchase ads, not users of Google services. Google’s products are not really search, translation or Gmail – these are tools that Google offers to users. In fact, Google’s product that it is selling to its advertising customers is the large number of users of various Google tools. 

When Google was a young company, users adopted the new search tool rapidly because it provided superior quality search results over alternatives. Increasingly Google is now being challenged by other search tools, especially in non-English markets. In addition Google is frequently criticized for delivering lower quality content in its results than alternative offerings in the Search arena. 

On February 24, 2011, Google’s official blog discussed many of the issues that it faces in delivering high-quality results in a blog entry entitled “Finding more high-quality sites in search” (http://googleblog.blogspot.com/2011/02/finding-more-high-quality-sites-in.html). The post states:
Our goal is simple: to give people the most relevant answers to their queries as quickly as possible. This requires constant tuning of our algorithms, as new content—both good and bad—comes online all the time.
But in the last day or so we launched a pretty big algorithmic improvement to our ranking—a change that noticeably impacts 11.8% of our queries

This was followed by a major algorithmic update announcement on April 11, 2011 in a blog post entitled “High-quality sites algorithm goes global, incorporates user feedback”, which was more commonly known as the “Farmer Update”. The aim of this update of the Google search algorithm was to downplay the influence of mass-produced, low-quality content specifically created for Search Engine Optimization (SEO) purposes and also to reduce the rankings of search results associated with content farms such as Demand Media. The update impacted a further 12% of Google US search queries and was initially only applicable to English language content, but was followed shortly afterwards by the “Panda Update” for non-English content.

It is clear from these actions that Google has realized that while it has significant market dominance in search, emerging competitors are starting to gain ground by delivering higher quality results, just as Google did when it first launched its search tool many years ago.

Google remains the undisputed leader in most major markets. However, there are some notable exceptions, which include Russia, China, Japan and South Korea. In these markets, local search operators such as Yandex, Baidu, Yahoo! Japan and NHN respectively have dominant market share. It is easy to blame local factors (such as governmental policy or influence) or other restrictions for lower market share. But Google is in reality playing catch-up in many of these non-English markets. Local search operators have both deep market insights and an even deeper understanding of their own language and culture. These advantages have allowed these companies to deliver higher quality search results in their local language targeted at specific local audiences.
In order for Google to deliver high quality search results, Google relies on high quality web content. But the forces of globalization may actually be leading to lower quality web content through automated translation without the human post-editing process that would ensure quality. With the rapid expansion of the Internet and globalization for many companies, enterprises and website owners have been increasingly translating their content from English and other languages into the languages of new markets across the world.

Translating content professionally is both slow and expensive. Depending on the domain of translation and the language pairs being translated, professional translation can cost as much as US$0.50 per word for a language such as Japanese. For European languages, costs typically range from US$0.08 to US$0.20 per word. For many publishers, this translation expense is too high and cannot be justified. Human translators typically translate around 2,000 to 3,000 words per day. Rapid translation of time-sensitive content using machine translation is an alternative. Being first to market with new information can be a significant competitive advantage and bring significantly greater advertising revenue to online publishers.
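To put these figures in perspective, the following minimal sketch estimates the cost and turnaround of human translation using the rates and throughput quoted above. The 100,000-word site size and the function name are illustrative assumptions only, not figures from any real project.

```python
def estimate_translation(words, rate_per_word, words_per_day):
    """Return (cost in US$, working days) for a single human translator."""
    cost = words * rate_per_word
    days = words / words_per_day
    return cost, days


# Hypothetical website size, assumed purely for illustration.
SITE_WORDS = 100_000

# Rates (US$ per word) and throughput (words per day) from the ranges in the text.
low_cost, fast_days = estimate_translation(SITE_WORDS, 0.08, 3000)
high_cost, slow_days = estimate_translation(SITE_WORDS, 0.20, 2000)
japanese_cost, _ = estimate_translation(SITE_WORDS, 0.50, 2000)

print(f"European languages: US${low_cost:,.0f} to US${high_cost:,.0f}")
print(f"Japanese (upper bound): US${japanese_cost:,.0f}")
print(f"One translator: {fast_days:.0f} to {slow_days:.0f} working days")
```

Under these assumptions, a single target language costs somewhere between roughly US$8,000 and US$20,000 and takes one translator between about 33 and 50 working days, which illustrates why a free and instant alternative is so tempting for publishers.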

It would therefore be no surprise that some online publishers are abusing the free Google Translate API to translate content and then publish local language content to complement existing content that has been created by human translators or authored in local language. This is a common technique that SEO companies have applied to bring more users to a website and then in turn link through to premium content.
Google’s Terms of Use state clearly that this use of the Translate API is not permitted and there are very good reasons for doing so beyond the financial cost of bandwidth and processing power.

Internet Stats (https://supplygem.com/internet-usage-statistics/) estimates that more than 50% of Internet users, more than 2 billion people, come from Asia, with continued rapid growth expected. Other non-English speaking markets in Europe, Africa and the Middle East are also growing quickly. Like the websites that are abusing the Google Translate API to deliver local language content and gain users in global markets, Google also aspires to compete more effectively in many non-English markets and grow its worldwide user base.

Google’s ultimate product for its advertising customers is the expansion of its ability to secure large volumes of users in non-English language markets to complement its English centric origins where it has already established a dominant market share.

In order to achieve this aim, Google must control the means of access and the quality of the content. Google’s Translate Widget/Web Element offers a real-time translation alternative to the Translate API with many additional benefits for Google: it gives Google not only this control, but also control of who accesses translation functionality. It is therefore no surprise that Google is deprecating the Google Translate API in favor of translation methods that it controls directly.

Advantages and Disadvantages of Google’s Strategy
  • Content that is translated and published via the Google Translate API and then stored by publishers reflects the quality of Google’s translation technologies at the time the translation was performed. In many cases, this is sometime in the past (most users of the API create static content that is not updated often), and does not reflect the improvements in translation quality as Google updates its translation technologies on a continuous basis. When this old content then ranks highly on a Google search result, users may be frustrated at the lower quality result. 
  • User frustration at lower quality local language content is only a part of the issue. Advertisers want to target their advertisements onto high-quality local language content that will in turn drive users to click on their advertisements. Lower quality sites have a negative impact on the click-through rate that disappoints Google’s customers – the advertisers.
  • By shutting down the Translate API, Google forces web publishers to find an alternative translation tool or use the Google Translate Widget/Web Element. Using the Google Translate Widget/Web Element has the advantage that the user can still see the content translated into their local language, but Google’s crawlers do not see this content as it is only translated for users on demand on a one-time basis. On demand translation also means the user is seeing the most updated output from Google’s latest translation technology that will usually be better than an older translation.
The last benefit above is huge, and it is the most likely, but not the only, reason for shutting down the Google Translate API.

Polluting Its Own Drinking Water
Google crawls and gathers data from many sources. In turn this data is used for a variety of purposes. In order to deliver high-quality search and local language content results, Google needs high-quality data. In recent times, it can be assumed that an increasing amount of the website data that Google has been gathering has been translated from one language to another using Google’s own Translate API. Often, this data has been published online with no human editing or quality checking, and is then represented as high-quality local language content. Google represents that data in its search results and also integrates this mix of local language content into tools such as Google Translate. 

It is not easy to determine whether local language content has been translated by machine or by human, or whether it is in its original authored language. By crawling and processing local language web content that has been published without any human proofreading after being translated using the Google Translate API, Google is in reality “polluting its own drinking water.” By indexing local-language content translated in this manner, Google delivers a mix of very different quality local language search results, which are often frustrating for users in many parts of the world.

This problem only gets worse when you consider that the same data that Google crawls for indexing websites is also used to improve its language translation technologies. Using the technique of statistical machine translation (SMT), Google relies on huge quantities of local language content sourced from its crawlers of the web to continually improve and enhance its language translation software.

The higher the quality of input to this training process, the higher the quality of the translations the resulting engine can produce. So the increasing amount of “polluted drinking water” is becoming more statistically relevant. Over time, instead of improving each time more machine learning data is added, the opposite can occur. Errors in the original translation of web content can result in good statistical patterns becoming less relevant, and bad patterns becoming more statistically relevant. Poor translations are feeding back into the learning system, creating software that repeats previous mistakes and can even exaggerate them. This results in potentially lower quality translations over time, rather than improvements.

One of Google’s key differentiators has been its ability to efficiently and effectively process extremely large volumes of data. While Google has not publicized how much data it has gathered to train its SMT engines, various articles indicate that Google has scanned about 11% of all printed content ever published. Google has access to widely varying volumes of data depending on the language pair involved. While there are massive amounts of online language content for some languages (such as Tier 1 languages which include English, Spanish, Chinese, French, Japanese, etc.), this is not true for the vast majority of languages in the world. In general, the glass ceiling of data limitations is relatively low for many languages. More data is becoming available online every day, but the challenge for Google is getting sufficient data volumes for Tier 2 and Tier 3 languages in order to reach a quality level that is acceptable for users.

But even for Tier 1 languages, Google is facing a significant data glass ceiling. In a Guardian article entitled “Can Google break the computer language barrier?”, Google’s Andreas Zollmann discussed the data glass ceiling issue and stated that “Each doubling of the amount of translated data input led to about a 0.5% improvement in the quality of the output.” He made a very important point about the limits of Google’s approach: “We are now at this limit where there isn't that much more data in the world that we can use, so now it is much more important again to add on different approaches and rules-based models.”

Putting this in context, each time Google doubles the data, it gets diminishing returns. If Google doubles the data 3 times (11% x 2 = 22%, 22% x 2 = 44%, 44% x 2 = 88%), it quickly reaches the limit of data that it can collect. But despite this vast volume of data, the quality improvement is just 1.5%. Kirti Vashee blogged on this topic back in January this year when Google first publicly discussed the data glass ceiling issue.
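The doubling arithmetic above can be checked with a short sketch. It assumes, as the paragraph does, that the gain of about 0.5% per doubling simply adds up across doublings and that the 11% figure is the starting share of available data; both are simplifications for illustration rather than a model of how Google's engines actually behave.

```python
import math

SCANNED_SHARE = 0.11      # starting share of content already collected (from the text)
GAIN_PER_DOUBLING = 0.5   # quality gain in percent per doubling (from the quote)

# Number of times the data can double before it would exceed 100% of what exists.
max_doublings = math.floor(math.log2(1.0 / SCANNED_SHARE))

share = SCANNED_SHARE
for n in range(1, max_doublings + 1):
    share *= 2
    # Assumption: gains are additive across doublings.
    print(f"doubling {n}: {share:.0%} of content, cumulative gain {n * GAIN_PER_DOUBLING}%")

total_gain = max_doublings * GAIN_PER_DOUBLING
print(f"doublings possible: {max_doublings}, total gain: {total_gain}%")
```

Only three doublings fit before the data runs out, so the total gain under this reading stays at 1.5%, no matter how efficiently the additional data is processed.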

So Why Is Google Shutting Down the Translate API?
What Google did not anticipate was the extent of abuse of the Google Translate API in a manner prohibited by its Terms of Use. This has resulted in such a significant mass of poorly translated content that the impact on Google’s core search business is notable and poses a significant threat to the quality of Google’s search results and the quality of its future translation initiatives. Given how important search and translation are to Google’s current and future business, this is most likely the “substantial economic burden” and “abuse” that Google refers to in its shutdown announcement. With this realization, it makes sense that Google is taking action to rectify the problem.

Possible Additional Reasons for Shutting Down the Translate API

Google’s market-beating revenue growth in 2010 can be attributed to three key business pillars:
  • Search – Google’s core revenue stream
  • Video – Short-term revenue
  • Mobile and Google Apps – Long-term revenue
Each pillar in turn has already integrated machine translation and Google is expected to add further functionality in future as demand expands further beyond English only content. 
The one thing that Google has most notably not yet been successful in, despite several attempts, is social networking. Facebook’s rapid rise and entrance into the online advertising space means that there is now a real competitor for online advertising dollars beyond Google’s AdSense. 
Many of Google’s users are already Facebook users. Facebook has already mastered the use of crowdsourcing. With the Google Translate API arguably offering the best translation quality of any of the free translation tools, developers had already created products that integrate Google’s Translate API with the Facebook API to deliver a bridge between languages for Facebook users.

With Microsoft as an investor in Facebook and Facebook being a significant threat to Google’s market share for both users (product) and advertising (customers), helping Facebook become even more popular via the free use of Google Translate API is certainly something that Google would not find desirable.

What About Google’s Software Developer Community?
Google has come to understand the strategic benefits of limiting the reach of machine translation, but cannot limit it to some (i.e. Facebook) and keep it open for others. By shutting down the Translate API to developers, Google is now the only software developer that can develop applications using Google Translate. 

By allowing use of the Google Translate API, Google has successfully seeded the market with applications that leverage machine translation as a core function within a product. There will be a literal smorgasbord of great applications that no longer function as a result of the Translate API shutdown that Google can take the best elements from and launch its own products without competition. 

This is going to be particularly important in the browser and mobile application space.
  • Web Browsers: Google Chrome already has embedded foreign language auto-detection and translation of content. Similar third-party plugins for Microsoft Internet Explorer and Firefox built using the Translate API will cease to function at the beginning of December.

  • Mobile: Developers that have built products for Android using the Translate API will have the same problem, but expect Google to deliver increasing functionality that incorporates translation in the Android OS. Developers who built applications for Android competitors such as Apple iPhone and iPad will be harder hit when mobile translation applications cease functioning on their platforms at the beginning of December. It is unlikely that Google will build similar applications and functionality when it can maintain significant advantage for Android by withholding such functionality.

By deprecating and then shutting down the API, Google reduces the capability of abusers to freely use automated translation to produce content and slows the rate of low quality content appearing on the Internet. However in doing so, Google is taking a risk.

Developers have invested money and time into their software products, many of which will cease to function (or need to be updated) with the shutdown of the Google Translate API. Just as Google underestimated the abuse of the Translate API, Google may also have underestimated the backlash from the developer community for what is seen by many as one of the most valuable Google APIs. Hundreds of postings have been made already discussing the shutdown of the APIs. Most of the posters are upset about the Translate API, with very few comments on the shutdown of the other 17 APIs. Emotions range from surprise, anger and distrust of Google to bewilderment. 

Google shutting down key tools such as the Translate API without offering an alternative to developers decreases confidence in the use of any Google API. Due to the dominance of Google and its tools, development and innovation in Internet applications may slow or be stifled as a result of trust issues now brought to the forefront by the Google Translate API shutdown.

The impact will likely reach beyond just the Google APIs and have knock-on effects on the use of all free APIs irrespective of who is providing them. There is a risk of a perception being created with developers that if a company as large as Google can pull the rug out from under developers, then any company could do so, with hundreds of hours of software development and marketing costs being wiped out with a simple shutdown notice.

When the Google Translate API was released in March 2008, Google released a train from the station that went hurtling down the tracks at a pace that Google had not anticipated. Developers quickly integrated the technology into their applications. With little management and oversight on the Google Translate API, Google quickly lost control of how, when and by whom it was used. 

The developer outcry in response to the shutdown of the Google Translate API is a clear indicator that Google’s attempt to recall the train back to the station is not going to be taken lightly. Many developers will not allow their hard work to be wasted. Developers will simply switch to other competing technologies. 

Smart developers have already built support for multiple free translation technologies into their products. The loss of functionality provided by Google Translate may mean fewer language pairs or lower quality translation in some cases, but once the train has left the station, there is no turning back. Microsoft Bing will be the most likely beneficiary and it would be no surprise to see Microsoft investing even more into its translation technologies as a result. 

Google’s attempt to control access to machine translation may be too late. In opening up the Google Translate API, controls should have been in place from the outset. It would not be surprising to see either a commercial or an open source API for translation appear in the near future that encapsulates all remaining free translation technologies in addition to many commercial translation technologies in a single consolidated API. The demand is clearly there and such an API would make it even easier for developers to integrate translation into their products. Indeed, the result may be the exact opposite of what Google intended, with the further proliferation of machine translation products and content at an even more rapid pace.

Conclusions

  • Google is shutting down the Translate API, but Google Translate will continue to exist and improve in a manner that allows Google to leverage Google Translate in its own applications, but will not allow third-party developers to leverage the technology. Although late in applying controls to translation and with some risk, this is probably the best strategy for Google as a business.
  • Eliminating language as a barrier to knowledge and communication is one of the last great challenges of the Internet. Google most certainly understands the benefit and potential of its automated translation technology and is now trying to regain a level of control over it.
  • By shutting down the Translate API, Google is able to control when users access its translation functionality and when it can deliver advertisements to these users. Google also benefits by reducing the quantity of machine translated content that is misrepresented by websites as quality local language content.
  • The “substantial economic burden” that Google refers to is not related to the operational costs of the API itself, but the burden and risk to Google’s business as a whole that uncontrolled access to Google Translate functionality represents.
  • It is clear that Google understands the potential for translation. But it is also clear that Google understands the potential for abuse of translation and the knock-on impact that it is facing or may face. Not having control of what is translated and how the translations are used creates a threat to Google’s core revenue streams and potentially helps competitors such as Facebook to increase their value at Google’s expense. This is most likely the substantial economic burden that Google refers to in its announcement.
  • The shutdown of the Translate API is truly a shame for the many software developers that did not violate the Terms of Use and used the Translate API in a manner permitted. Developers should be aware of the limitations of free APIs where they have no control or say in the future of the service. Business models built around a free API with little other value-add are doomed to failure from the outset. If a free API must be used, then developers should try to look for multiple providers of similar functionality and build in support for as many APIs as possible in order to reduce risk. Developers should anticipate the possibility of competition from Google in applications that leverage automated translation and move to protect themselves via patents and by offering features that go beyond those of interest to Google.
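The advice to build in support for multiple providers can be sketched as a simple fallback chain. This is a minimal illustration, not any vendor's actual API: the provider functions below are hypothetical stand-ins, and real code would wrap each service's own client behind the same three-argument signature.

```python
from typing import Callable, Sequence

# A provider is any callable taking (text, source_lang, target_lang) and
# returning translated text, or raising an exception on failure.
Provider = Callable[[str, str, str], str]


def translate_with_fallback(text: str, source: str, target: str,
                            providers: Sequence[Provider]) -> str:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(text, source, target)
        except Exception as exc:  # a shut-down or failing service falls through
            name = getattr(provider, "__name__", repr(provider))
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all translation providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for real services (assumptions for illustration).
def retired_provider(text, source, target):
    raise ConnectionError("service has been shut down")


def backup_provider(text, source, target):
    return f"[{source}->{target}] {text}"


result = translate_with_fallback("hello", "en", "fr",
                                 [retired_provider, backup_provider])
print(result)
```

With this arrangement, the loss of any single free API degrades quality or language coverage rather than breaking the product outright, because the calling code never depends on one provider directly.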

The analysis in this post has been focused on gaining a clearer understanding of Google’s announcement as well as the probable reasons for Google shutting down the Google Translate API. A follow-up post that analyses the impact on the language services and translation industry will be posted shortly.

32 comments:

  1. What an amazingly thorough analysis of the Google Translate API situation, and of Google's strategy in general. Thank you, Dion for your thoughtful essay!

  2. @Charles - thank you for your kind words. It was fun to put my old Gartner analyst hat back on for a while :)

    My next post, in a few days' time, will provide analysis of what it means for the professional translation industry.

  3. Nice and logical analysis, Dion. I like it!

  4. Sorry, but human translation is "slow and expensive" for a reason -- because it's a PROFESSION that requires intellectual engagement. It is precisely the opposite -- in moral as well as literal terms -- of "machine" translation (which is not translation at all and should not be called translation; it is, in essence, nothing more than word substitution, a very complex macro). The fact that open heart surgery, plumbing, or teaching a course in American History can be "slow and expensive" isn't a good argument for replacing professionals with robots. One of the "economic impacts" that Google has perhaps not considered is the damage its translation tools have done to the translation profession: unscrupulous "translators" google-translate documents and then charge for them, while clients with no knowledge of English (or whatever target language is required) accept google-translations and put them on their websites or in their publications without thought, having been convinced entirely by the "cost" argument. The fact that "free" translation is widely and falsely believed to be available (because, again, what Google Translate provides is NOT translation) has played a major role in lowering the fees that translators can ask and, thus, contributes to reducing our ability to earn a living as skilled workers. Google, by asking people to contribute for free to its product, has only continued this form of unfair competition. Nor has Google ever made any attempt to collaborate with professional translators in providing appropriate education to the public regarding how to use Google Translate in a way that does not abuse translators, clients, or language itself. Google should have closed this app long ago; it has always done more harm than good.

  5. Dion - timely, thorough, and comprehensive writing - we at dotSUB have had discussions about this over the past few days, and your words add insight to our thinking.

    Thanks

  6. @No Peanuts! – I think you are reading this very wrong. Google has never tried to compete with professional translators or act as a substitution. Some may have tried to use Google that way, but that is not the purpose of Google Translate. You are correct in stating that there is a reason for human translation to be slow and expensive and that it is the level of professional skills required in order to be able to deliver a high quality translation.

    However, there are many scenarios where “fast and somewhat understandable” can be better than “slow but high quality”. Example: Asia Online has a client translating billions of words (tens of millions of patents) into English from other languages so that they can be searched and found using English search tools. Once a document is identified to be of value, it can then be human translated. In this scenario, the patent document would not previously have been accessible, as it was written in a language that was not understood by the person searching. But once it was identified, the professional translator got work that would not have existed without the use of machine translation first.

    There is a very big difference between “translation so that you can understand something” – aka “gist translation” – and “translation so that you can publish something” – aka “professional translation”. What Google Translate is aimed at is giving someone the ability to understand content in another language, not to publish content in another language. Google’s Terms of Use state clearly that publishing content is not permitted. But this is unfortunately what has happened. By your own description in your comment, it is clear that many of the “unscrupulous translators” you refer to are actually professional translators that are taking short cuts – and I must admit that I have met several of these people myself.

    Google is no more to blame for the actions of these individuals than a broadband provider or the manufacturer of the PC that submitted the content. In its Terms of Use, Google set out how the Translate API is permitted to be used, and users publishing content that has been translated using Google Translate are in violation of those terms. To blame Google for the actions of “unscrupulous translators” is unfair. However, what Google can be blamed for is not putting the necessary controls on its tools to ensure this kind of activity did not occur. Controls it is now attempting to put in place.

    I have many thoughts on how Google has impacted the professional translation industry and I will be posting them in part 2 of my analysis of the Google announcement in the next few days. Thank you for your thoughts on Part 1. I look forward to your comments and thoughts once you have had a chance to read Part 2 also.

  7. Hi Dion,

    Thanks very much for your thoughts on this matter, very insightful indeed. But I'm not really sure about this statement: "Google has never tried to compete with professional translators or act as a substitution".
    Because if we take "substitution" (not for freelancers but for translation companies) to mean a marketplace, like Proz for instance, remember the rumours, 3 years ago, about the Google Translation Center (see http://blogoscoped.com/archive/2008-08-04-n48.html ), which was a far more ambitious concept than the current Google Translator Toolkit.
    So what if Google decided to run the GTC, allowing translators to use its API only on Google servers, and so decided to first shut down all Google Translate integrations...
    It would be a phenomenal disruption in our field!
    Jean-Marie

  8. Excellent analysis, Kirti, very interesting and thought-provoking. As a professional translator, I feel this is a good thing.

    Posted by Nicholas Ferreira

  9. Somebody mentioned to me that Google has only deprecated Google Translate API V1 and that Google will charge for access to Google Translate API V2.

    I did a little double-checking. As can be seen from the link below, Google Translate API V2 has also been deprecated and will shut down on December 1, 2011.
    http://code.google.com/apis/language/translate/v2/getting_started.html

  10. @Jean-Marie - Thanks for your feedback. There has been much speculation since the launch of Google Translate about competing with professional translators or the professional translation industry.

    The reality is that, first of all, the quality is not that of a human and, second, that Google is not in the business of translation. Google is in the business of organizing the world's information so that it can attract advertising revenue (customers) to the content that it has helped organize so that users (product) can view it. Translation is a tool for Google that helps to achieve this goal.

    In the upcoming Part 2 of my analysis I will cover many of the issues that relate to the professional translation industry.

  11. Dion, thank you for your answer, but I don't think it's a matter of competition between Google and professional translators or the professional translation industry. As Google initially stated in its "Google Translation Center's Role":

    Google Translation Center provides a venue for you to enter into and complete translation transactions. Except when you use Google Translation Center as provided in Section 4 (...), Google is not involved in any transactions in Google Translation Center.

    Moreover, if the burden depended only on the poor quality of MT-translated and published text, there would be no difference from the abuses committed with other MT systems (MSFT first of all), and I think it would be quite easy for Google to detect all MT-translated text with a 100% rate and exclude it from any statistical computation.

    The opening of GTC would disintermediate the current business models of translation companies and act as a "universal Proz", with the advantage for Google of keeping all memories on its servers and hugely increasing its Universal Translation Memory.

    I made a post on my blog about that, but it's in French, sorry :-)

  12. For some reason many people seem to be interpreting this announcement as "no more free MT from Google".

    Google intends to keep the interactive and dynamic real-time capabilities very much in place. Also, the basic GTT appears to be untouched; it is only the programmatic access (APIs, or Application Programming Interfaces, i.e. software controls that allow two software programs to talk to each other and hand each other tasks) that is affected. Even products like Trados will no longer be able to access this, though the same chunks of text could easily be translated through the Google Translate interface.

  13. The members of this list are a very small subset of users, and we know what APIs are about, and that MT is even embedded into other apps via APIs.
    In the poll on ProZ, the comments show users who clearly state that they don't know what API stands for.
    Earlier this week I had breakfast with a friend/colleague who has been attending a big international linguistics conference all week, and when this Google API topic was brought up, one person interpreted it as Alphabet Phonétique International (API), which steered the entire discussion toward the International Phonetic Alphabet (IPA), not related to the initial API question at all.
    Not sure that the majority of MT users out there even understand what this means with regard to the other apps they regularly use which consume the MT technologies via APIs.
    In any case, this is an amusing week on the MT front.

  14. Good points, Jeff ... There is probably a need for a clear, plain-English discussion piece that puts all of this in context without the jargon.

    At the same time, I find it a bit sad that "API" is so poorly understood by people who want to use MT. In my opinion, it is this ability to embed an automated step in another workflow (with or without traditional human translator involvement depending on the case in point) that is the key to productivity gains. And I suspect that some of the crowdsource portal plays out there were among the "abusers" of the Google API along those lines.

    As for the "abuse" ... It seems to me that the essence of the abuse was that it separated the Google MT technology from the Google revenue model ... they do not charge for the MT, but MT does fit into the way they build revenue. As long as everyone acts per their expectations, they benefit, but when the capability can be abstracted and repurposed beyond their control, that begins to look (to them) like "abuse". It will be interesting to see whether Kirti's suggestion becomes reality.

  15. Google has only one main revenue model, and that is the advertising that succeeds by getting millions of eyeballs to Google and the ads it displays. The MT API added no new eyeballs to those ads, so it was superfluous. From my discussions with Google, I can surmise that they spend a great deal of money all over the world on their quest to produce high quality MT (note that I said "quest", not "achieved"), both in hardware and in human labor. Without the ability to show Google ads through the API, there is no incentive for them to support the API. I would suggest that those costs were significantly increased by these non-lookers using the API. Not a terribly complex problem or solution.

  16. Does it mean that GT embedded in Wordfast, for example, will be dead ASAP?

    BTW, I was never able to use GT to translate & deliver, but I found it very useful to pretranslate/postedit & deliver, sparing a lot of keystrokes/time, but I'll survive easily if it is stopped.

  17. Many comments seem to indicate that the most determined opponents of machine translation do not really know much about it, and that they are even less interested in knowing. They mostly seem distressed by the harm that a software application that they do not hesitate to define as poor can cause them, as if their work were shoddy, and then they have to compete with this demon that they are eager to kill or at least see drop dead. It is clear that they do not even know what an API is, what to do with it, how to use it, and how it can be used, and they are not able to grasp the meaning of Dion’s acute analysis. Pulli ad margaritas. The biggest nonsense I've read here is that Google Translate (and why not Bing, or Babelfish, or Systran as well?) would be a major cause of the price drop, the same drop that others who are also fighting against the same enemy obstinately reject. Sometimes I think the best way to explain why translators receive (and deserve) so little consideration is to let them speak up.
    By the way, to write this comment I used Google Translate.

  18. Excellent analysis in the article on Kirti's blog. So Google's own translated data is polluting their database.
    I was wondering about this in a post I made earlier about the Translate widget/Element and I now assume that pollution is only an issue if people put Google Translate results on a static web page and not when translating on the fly.

    Posted by Ray Lloyd

  19. And now a new announcement:
    Google not killing translate API, will develop paid version

    http://news.cnet.com/8301-1023_3-20068839-93/google-not-killing-translate-api-will-develop-paid-version/

    Posted by Jeff Allen

  20. I think converting the API into a paid service does change the picture here quite substantially, and I am curious to hear thoughts from others about this. The most important aspect that I see here is that this will simultaneously (1) make it financially impossible for a large number of applications - including popular professionally used CAT tools - to offer a free connection to Google MT; and (2) create a legally sound basis for commercial use of Google translations by developers of commercial MT products and services.

    Nobody (consumer or enterprise business) likes to pay for something they can get for free elsewhere. That means that, in the future, anyone embedding Google translations within their application will have to demonstrate clear added value. Just retrieving a translation from Google won't be enough. The added value could be in how the application incorporates the translations it retrieves from Google into what it provides the user, or something else, but it clearly has to be there, or users will just flock away and not pay.

    Posted by Alon Lavie

  21. Very interesting subject and comments!

    If we interpret the facts in the simplest possible way (i.e., from free to paid version), don't you think it could also indicate a slight change in the future business model of Google: fewer free services, more paid services?

    Related to that matter, it is impressive how many sales-related positions Google offers today as compared to a few years ago.

    What do you think of that?

    My kind regards to Dion, Bob and Kirti.

    Dominique MARET

  22. @Dominique - it is quite possible that, given the growing influence of Facebook and the possibility that FB becomes an increasingly preferred avenue for advertising over search, Google is thinking about starting up new revenue streams. Thus you may be right.

    This could very well be the beginning of many "free" Google services turning into paid services.

    I have always thought that Google's real motto was: Don't be evil unless financially inconvenient, and we will see if this is true as more advertisers shift to Facebook as a more effective means to reach target customers.

  23. James Fallows of The Atlantic magazine just did an article on the "economic burden" that Google has been claiming with many references to the eMpTy Pages blog posting by Dion.

    http://www.theatlantic.com/technology/archive/2011/06/an-economic-burden-google-can-no-longer-bear/240283/

  24. I think the Atlantic article is off-base, in two ways.

    First, I don't think Google is seriously concerned about "polluting the water" by allowing the translations from Google Translate (GT) to be added to their language corpora as if they were high-quality human translations. They have never claimed to add translated material taken off the Internet without review; instead, they usually rely on officially translated materials from known organizations (United Nations, EU).

    Also, the cost to Google to operate the Google Translate API is tiny compared to Google Translate itself (the web version), and especially as part of the Google empire (Gmail, YouTube, ...).

    Instead, there's another side to Google's business that is being affected, involving non-English speaking markets. See this article:

    http://kv-emptypages.blogspot.com/2011/06/analysis-of-shutdown-announcements-of.html

    Andrew: There will always be a place for human translators, but it is certainly true that computers will assume an ever-increasing amount of the load. I have used a TM program for almost 10 years. The latest versions can look up passages at Google, so I'm no longer limited to my own TM, biological memory (brain), and reference materials. I still have final say but GT has more than once given me exactly the phrase or term I meant to say.

    The automobile (or truck or train or airplane) did not make human drivers or walking obsolete; it enables humans to travel farther and faster and to carry heavier loads farther and faster. Computers are already doing the same for translations.

    Posted by Steven Marzuola

  25. Google frequently offers alternatives to deprecated APIs. An alternative to the Google Translate API would have been the Google Translator Toolkit API. However, without making any announcement, Google has also quietly modified access to the Translator Toolkit API, removing all documentation and restricting access.
    Really helpful information, keep it up.

  26. Wow, great job. Thanks for "translating" Google's announcement for me (sorry for the pun). I didn't have the patience to sift through all of it, but your analysis was definitely helpful. I wonder how this will affect the Best PPC Company and firms that had their ads on those pages.

  27. While I think that the author's comment about "Polluting Its Own Drinking Water" is true, I also think that Google can't turn the clock backwards. With the current tools (e.g. Moses) there will always be free machine translations available. Deprecating the Translate API will temporarily reduce the use of machine translation in generating translated web sites, but the effect will not be permanent. Google has now created room for new MT services.

  28. Language always plays an important role in our day-to-day lives. Language can be expressed as the sum total of a set of rules, regulations and symbols. We manipulate, modify and improvise these rules and symbols to suit our needs and requirements. You can say language is the single most important thing that differentiates humans from other species on the earth. It’s very hard to imagine the proper functioning of any social life without language. In view of the immense importance of languages and the developments that have taken place in the social, economic, natural and technological spheres of life, Croatian translation services have a significant place.

  29. Nice info. Thanks for your post.

  30. Many thanks for the amazing essay. I really gained a lot of the info that I was searching for.

  31. Thank you for this informative post!!
