Sunday, December 19, 2010

Using Translation as a Force to Address Information Poverty: AGIS 2010

I have the good fortune to reflect and report on the AGIS 2010 conference as we approach “the holiday season,” which is a time of reflection for many in the world. A time of goodwill and at least temporary good deeds for some. The conference was held in India, which can be a challenge because the basic infrastructure is still primitive, but the event went off well with very few glitches and I think AGIS is slowly building momentum.

AGIS stands for Action for Global Information Sharing, and has been conducting a resolute crusade against Information Poverty since its inception. The overall tone and tenor of this conference is very different from the typical conference in the world of enterprise localization (LocWorld, GALA, LISA). The focus is on making all kinds of knowledge and information accessible in places where it has never been available before, not just on selling products. Many speakers shared clear evidence that access to information creates the conditions for economic prosperity, or perhaps even drives it. In some parts of the world localization is all about reducing information poverty and improving the human condition. Reinhard has provided a summary of the highlights of the event in his blog. There was also coverage here. And for those who ask why we need yet another conference in the industry, Reinhard explains below:
You might be asking yourself, “Why AGIS, why YALC (yet another localisation conference)? What makes AGIS so different?” Well, first of all, it is not owned by any particular organisation, it is not run for profit, and it is (almost) free to attend. Then, it takes place where people need localisation, not where people are rich enough to pay for it. Nothing is sold, nothing is bought at AGIS. And last but not least, AGIS attendees have a social agenda, not (just) a commercial one.
The highlights can also be found in the Twitter stream on my Friendfeed (Scroll back to newer items to see the chronological sequence. I don’t know why Twitter has already made much of the data unavailable).

In the keynote, Dr. Vijay Bhatkar (a digital visionary of India) pointed out how globalization and localization are tightly linked and how NLP, MT and language technologies are only just beginning their evolutionary march. He pointed out how Japan’s hopes for world dominance were stymied by linguistic issues, and that access to knowledge and information in the 21st century will be key to building prosperity, as an increasing part of the GDP of many nations will come from knowledge services. This is already true for India. He told the Indians in the room that India cannot consider itself an IT power when 350 million people are illiterate, and urged the community to preserve the Indian languages while continuing to push forward with English education. He also pointed out that both telephone and TV are mostly language neutral but information cannot be, and localization is critical to broad access.

Reinhard Schaler and the Rosetta Foundation are leading an initiative to build a platform to facilitate self-configurable, distributed and shared-data-based global localization initiatives. CNGL and University of Limerick students provided overviews and demonstrations of these tools. Reinhard highlighted that each day 24,000 children die because of lack of access to basic healthcare information. That is about one every 3.6 seconds! These deaths could be avoided if information were more easily available. This appears to be a primary motivator and raison d'etre for the Rosetta Foundation.

Ms. Swaran Lata painted a clear picture of the amazing complexity of the Indian linguistic landscape: there are 20+ major languages, with some states having 3-4 languages and multiple scripts. The CDAC organization is attempting to solve the linguistic computing issues to ensure that Indian languages gain a stronger digital presence and are preserved. As Prof. Bhatkar asked: "Can you really say you know Hindi if you cannot use it on a computer? This is key in the information age." Ms. Lata described initiatives that focused on the digital education of youth, which interestingly also resulted in the knowledge being passed on to illiterate parents and grandparents. She also talked about initiatives to reach out to the “other side of India” to ensure that illiterate people are not left behind. As Indian consumers become more powerful, Indian languages are critical to reaching their purchasing power.

I spoke about how the Asia Online vision is finally coming to fruition, as we start rolling out a Thai Wikipedia comprising translations of 3.5 million articles starting in January 2011. When all these articles are up and ready, Thailand will have the second largest Wikipedia in the world after the English one. This is a huge boost from the current 60,000-article Thai Wikipedia, many of whose articles are barely more than stubs. In contrast, the index alone for just the article titles in the English Wikipedia is in excess of 600,000 pages! The Asia Online project is an initiative that directly addresses information poverty. Shockingly, it was also uncovered that the Hindi Wikipedia only has about 50,000 articles for a population of almost 400 million people! This means that a child who does not speak English is deprived of basic educational information access and has a fraction of the content available to an English-speaking child.
Some other interesting information from the conference:
  • Ravi Gupta pointed out there are 62,000 newspapers in India and 92% of these are not in English
  • Subtitles do not work well in India because of literacy issues but they can also be a means of building literacy
  • There are no English TV channels in the Top 100 TV channels in India but English speaking consumers are the wealthiest consumers
  • Ravi Kumar, the President of the Indian Translators Association, made an impassioned plea asking that buyers and the community at large respect translators as professionals
  • The CNGL team showed various elements of the open SOLAS platform they are making available to anybody who needs it
  • Mahesh Kulkarni gave a wonderful presentation on standards, which he called traffic rules that ease both user and creator experience. He has a much more holistic and systematic view of standards than we see from the feeble standards initiatives in the traditional localization industry, but he too expressed the difficulties of getting good standards in place.
  • He also pointed out that there are 670 million mobile phones in India and asked: is this the end of the internet as we know it?
Mahesh Kulkarni and Raimond Doctor are a joy to behold: passionate, knowledgeable and driven in spite of having to deal with Indian governmental bureaucracy as part of their daily lives. Raimond is perhaps the most erudite and knowledgeable person I have met on comparative linguistics. He shares his deep knowledge and insight with a verve that draws you right into his delight for language. I hope that CDAC realizes what treasures these men are, and gives them room and resources to execute on their vision and passion.

Reinhard also pointed out that the non-profit world is substantially larger than the Fortune 500 market in terms of both market potential and actual localization activity. Non-profit does not mean no payment, no recognition and no jobs; it is in fact bigger than the energy sector in the US.

Take a look at this TED video to see how information access can change lives and empower people to learn, take control and transform their own lives.
If there is a revolution coming in translation, I think one is much more likely to see the first signs of it at a conference like AGIS than at more mainstream localization conferences. We see all the key elements lining up here: people focused on large-scale collaboration infrastructure, community and crowdsourcing management, massive translation automation and standards, and the most important ingredient of all: PASSION. We see people at this conference who are driven by a passion to change the world. We see people who are not making a lot of money but are still working long and diligent hours. We see people undertaking translation projects that will involve hundreds of millions of words on a routine basis. We see technology, collaboration tools & infrastructure and community coming together in ways that just do not happen at traditional professional translation events. We see people who want to make an impact on the human condition. We hear and see people talking about nation building and the human right to knowledge. This kind of talk gets me all warm inside, and I think this is what we all had in common: a “higher” sense of purpose and mission, which does not equate to a stance of moral superiority as some might think. Many of the people here have a soul-satisfying answer to the question: Why do you do what you do? Isn't that enough?

My interest in automated translation has always been related to the potential impact this technology has on improving information access, and thus improving human lives across the world, and also potentially improving the quality and depth of communication between linguistic groups, cultures and nations. One step on the way to Pacem in Terris? Foolish and idealistic perhaps, but we need to dare to dream first before we can actually make it happen. When I was a teenager, a wise man once told me, “You are the world and the world is you,” and I have explored that statement ever since, holding it close to heart as a seminal influence in my life.

Join and support the Rosetta Foundation and help make this into a movement that cannot be stopped. If you have the ability to influence a major corporate entity to get involved and support this please do so now, and join Reinhard as he forges and builds this new path to change the world for the better.
And as if that were not enough, I even had a brief meeting with former Indian Prime Minister I.K. Gujral, who even in his nineties has a stature, grace and humility that is disarming. He was impressed by the Thai Wikipedia project I am involved with, and said it would be wonderful for India to do the same in Hindi. And thanks to my friend Vishal I also got to meet several industry leaders of emerging India who seek to build transparency and a relatively corruption free government.

I was also greatly heartened to see the corruption establishment take a serious blow when Minister Raja was exposed for taking obscenely huge bribes, in excess of $40 billion I believe. What makes corruption in India especially horrific is the complete lack of remorse and shame that these public officials have. India is on the move but still has far to go as the culture of corruption is everywhere you turn, and will not die easily. One of the other benefits of free flowing information is that it also makes this kind of self dealing and abuse of trust harder to maintain. Information poverty is also an enabler and friend of corrupt officials and thus this is yet another reason to address this issue. 

Happy Holidays to you all and I hope that you explore and find "goodness" in your life. And here is the link to holiday greetings in many languages.


  1. Thanks, Vashee, for sharing. I'm confident that the fast-growing need for information around the world will create a new category of translators who will get from making information widely available the same thrill that professionals get from their jobs. The usefulness of a translation will take priority over how well written it is, because a good-enough translation that brings down the number of deaths from diarrhea will have more value. The value will be in making a difference in people's lives. Congratulations on the Thai Wikipedia. Wonderful project!

  2. I was alerted to this wonderful blog entry by George Weyman who regaled us with his flute playing in Ireland at the last AGIS conference.

    Translation as Means of Increasing Intellectual Production in the Middle East | Meedan Blog

    Some excerpts from this blog:

    "Translation is a pressing duty of those interested in increasing access to knowledge around the world. Societies are not going to learn English en masse at a rate capable of checking these growing knowledge imbalances. Instead, we need methods for scaling translation.

    The other issue is to do with the value of translation itself. It would be a reasonable premise to suggest that in periods of high intellectual activity, societies invest in translation. Why? There is evidence to suggest that access to diverse perspectives enables better intellectual outputs.

    ....access to diverse networks and forms of knowledge improves our ability to innovate.

    Translation increases network diversity and it reduces knowledge divides.

    Translation today can be conducted cheaply and to high quality by a combination of machines and humans. Automated translation can provide a first draft which can be edited by one or many individual translators working together on small chunks of text – much like a Wikipedia page entry. Translation revisions show the lineage of the translation, and help alert moderators to problems or vandalism. And the fun part is that each translation contributes to the improvement of the automated translation – so you can continually translate more, and better. The humans focus increasingly on the really hard bits.

    The list of organizations working on this model is wide, and Meedan is in the thick of it. If we want to make the web more polyglot, and increase the amount of Arabic content in the next ten years, we need to put energy into the tools to make scaled social translation a natural and intuitive publishing gesture on the web. This is not to suggest that professional translators are going to be out of a job any time soon – quite the reverse, the more we translate content the more demand there will be for the services of translators with real expertise for the most difficult translation problems. But respect for the translation profession should not be a straitjacket into which we put the vision of a polyglot web."

    And just in case you still have doubts about the value of the AGIS mission, consider the following facts:

    There are 250 million Arabic speakers in the world, but only a very small proportion of translated foreign material available to read.

    To put this into context:
    -- Spain translates in one year the number of books that have been translated into Arabic in the past 1000 years and
    -- For every one million Arabs only one book is translated into Arabic each year

    (Source: UNDP Arab Human Development Report, 2003)

  3. The largest translation project in the world!