Comments on eMpTy Pages: Most Popular Blog Posts of 2010
Kirti Vashee (feed updated 2024-03-29T00:21:17.976-07:00)

Unknown, 2011-01-05T05:23:36.719-08:00

Kirti, thank you for the link, this is highly intriguing. My gut feeling says that your suspicion is spot on. Best wishes for 2011! Christian

Kirti Vashee, 2011-01-04T09:37:52.831-08:00

Christian

Thanks for catching the bad link. I have fixed it and supplied two links. Here is the direct quote:

Andreas Zollmann, who has been researching in the field for many years and working at Google Translate for the last year, suggests, along with Blunsom, that the idea that more and more data can be introduced to make the system better and better is probably a false premise. "Each doubling of the amount of translated data input led to about a 0.5% improvement in the quality of the output," he suggests, but the doublings are not infinite. "We are now at this limit where there isn't that much more data in the world that we can use," he admits. "So now it is much more important again to add on different approaches and rules-based models."

This, of course, is only true for high data density languages (FIGS, CJK, Portuguese). Many of the Google systems will continue to improve as they climb in data volume.

Ultan's article is also a good summary of the issues:
http://blogs.oracle.com/translation/2011/01/where_next_for_google_translate_and_what_of_information_quality.html

To improve in the future, I think they will need clean data, IQ, and more skilled human feedback on linguistic issues, to add linguistic structural knowledge rather than just adding the old RbMT stuff to the SMT foundation.

Unknown, 2011-01-04T08:48:10.478-08:00

Hi Kirti, thanks for the summary. The link to Google admitting to having reached limits does not seem to be working. Can you post the correct URL? Christian.