Monday, April 11, 2011

The Rush to Manage and Control Standards

There has been a lot of talk about standards since the demise of LISA, perhaps because the collapse of LISA was announced almost immediately after their final event, a “Standards Summit” in early March 2011. We are now seeing something of a rush, with industry groups staking out positions (perhaps even well-intentioned ones) to establish a controlling interest in “what happens next with standards”. There is still much less clarity on which standards we are talking about, and almost no clarity on why we should care or why it matters.


What are the standards that matter?

The post I wrote on the lack of standards in May 2010 is the single most influential (popular?) post I have written in this blog according to PostRank. So what is all this new posturing on standards about? From my vantage point (as I stated last year), standards are important to enable information to flow from the information creators to the information consumers as efficiently as possible. Thus my view of standards is about those rules and structures that enable clean and efficient data interchange, archival, and reuse of linguistic assets in new language and text technology paradigms. Search, semantic search, SMT, language search (like Linguee) and text analytics are what I have in mind. (You may recall that I had much more clarity on what standards matter and why than on how to get there.) Good standards require that vendors play well with each other, and that language industry tools interface usefully with corporate content management systems and make life easier for both the information creators and consumers, not just the people involved in translation.

However, I have also seen that there is more conflation on this issue of standards than on almost any other issue (“quality” of course is the winner) amongst localization professionals. I am aware of at least three different perspectives on standards:

1. End to End Process Standards: ISO 9001, EN15038, Microsoft QA and LISA QA 3.1. They have a strong focus on administrative, documentation, review and revision processes, not just the quality assessment of the final translation.
2. Linguistic Quality of Translation (TQM): Automated metrics like BLEU, METEOR, TERp, F-Measure, ROUGE and several others focus only on rapidly scoring MT output, while human measurements look at linguistic quality through error categorization and subjective human quality assessment, usually at a sentence level. Examples of the latter are SAE J2450, the LISA Quality Metric and perhaps the Butler Hill TQ Metric.
3. Linguistic Data Interchange: These standards facilitate data exchange from content creation and enable transformation of textual data within a broader organizational data flow context than just translation. Good interchange standards can ensure that fast flowing streams of content get transformed more rapidly and get to customers as quickly as possible. XLIFF and TMX are examples of this, but I think the future is likely to be more about interfacing with “real” mission-critical systems (DBMS, Collaboration and CMS) used by companies rather than just TMS and TM systems, which IMO are very likely to become less relevant and important to large scale corporate translation initiatives.

It is my sense that we have a lot of development on the first kind of "standard" listed above, but have a long way to go before we have meaningful standards in the second and third categories listed above.
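
To make the third category a little more concrete, here is a minimal sketch in Python (standard library only) of what reading a translation memory in a TMX-style format involves. The sample document is invented for illustration, the element names (tu, tuv, seg, bpt, ept) follow TMX 1.4 as I understand it, and the helper names read_tmx and segment_text are hypothetical. It is a sketch under those assumptions, not a conforming TMX implementation.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample: one translation unit with inline formatting codes.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Press the <bpt i="1">&lt;b&gt;</bpt>Start<ept i="1">&lt;/b&gt;</ept> button.</seg></tuv>
      <tuv xml:lang="de"><seg>Drücken Sie die <bpt i="1">&lt;b&gt;</bpt>Start<ept i="1">&lt;/b&gt;</ept>-Taste.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def segment_text(seg, keep_inline_codes):
    """Flatten a <seg> element to a string, with or without inline codes."""
    parts = [seg.text or ""]
    for child in seg:
        if keep_inline_codes:
            # Content of inline elements such as <bpt>/<ept> (native markup).
            parts.append("".join(child.itertext()))
        # Translatable text that follows the inline element.
        parts.append(child.tail or "")
    return "".join(parts)


def read_tmx(xml_text, keep_inline_codes=True):
    """Return one {language: segment text} dict per translation unit."""
    root = ET.fromstring(xml_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                unit[tuv.get(XML_LANG)] = segment_text(seg, keep_inline_codes)
        units.append(unit)
    return units


if __name__ == "__main__":
    for keep in (True, False):
        print(keep, read_tmx(TMX_SAMPLE, keep_inline_codes=keep))
```

Two tools that make different choices about the inline codes will produce different segments from the same file, which is one plausible way that matches get lost when data moves between systems.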

So it is interesting to see the new TAUS and GALA initiatives to become standards leaders when you consider that LISA was actually not very effective in developing standards that really mattered. LISA was an organization that apparently involved buyers, LSPs and tools vendors but was unable to produce standards that really mattered to the industry (in spite of sincere efforts to the contrary). TMX today is a weak standard at best, and there are many variations that result in data loss and leverage loss whenever data interchange is involved. (Are the other standards they produced even worth mentioning? Who uses them?) Are we going to see more of the same with these new initiatives? Take a look at the TAUS board and the GALA board, as these people will steer (and fund) these new initiatives. Pretty much all good folks, but do they really represent all the viewpoints necessary to develop standards that make sense to the whole emerging eco-system?


Why do standards matter?

Real standards make life easier for the whole eco-system, i.e. the content creators, the professional translation community, the content consumers and everybody else who interacts with, transforms or modifies valuable content along the way. Standards matter if you are setting up translation production lines and pushing translation volumes up. At AGIS2010, Mahesh Kulkarni made a comment about standards in localization. He called them traffic rules that ease both user and creator experience (and of course these rules matter much more when there is a lot of traffic), and he also said that standards evolve, have to be tested and need frequent revision before they settle. It is interesting to me that the focus in the non-profit world is on studying successful standards development in other IT areas, in contrast to what we see at TAUS and GALA, where the modus operandi seems to be to create separate new groups with new missions and objectives, though they both claim to be acting in the interest of “everyone”.

There was a great posting by Arle Lommel on why standards matter on the LISA site (which is now gone), and there is also a perspective presented by Smith Yewell on the TAUS site on why we should care. I hope there will be more discussion on why standards matter, as this may help drive meaningful action on what to do next and produce more collaborative action.

So today we are at a point where TAUS says that it is taking on the role of an "industry watchdog for interoperability" by funding activities that will track compliance of the various tools and by appointing a person as a full-time standards monitor. Jost Zetzsche has pointed out that this is fabulous, but that the TAUS initiative only really represents the viewpoint of “buyers”, i.e. localization managers (not the actual corporate managers who run international businesses). The REAL buyers (Global Customer Support, Global Sales & Marketing Management) probably care less about TM leverage rates than they do about getting the right information to the global customer in a timely and cost-effective way on internet schedules, i.e. really fast, so that it has an impact on market share in the near term. Not to mention the fact that compliance and law enforcement can be tricky without a system of checks and balances, but it is good to see that the issue has been recognized and a discussion has begun. TAUS is attempting to soften the language they use in defining their role, as watchdogs are often not very friendly.

GALA announced soon after that it would also start a standards initiative, which will "seek input from localization buyers and suppliers, tool developers, and various partner localization and standards organizations." Arle Lommel, the former director of standards at LISA, will be appointed as the GALA standards guy. Their objective, they say, is: “The culmination of Phase I will be an industry standards plan that will lay out what standards should be pursued, how the standards will be developed in an open and unbiased way, and how the ongoing standards initiative can be funded by the industry.” Again, Jost points out (in his 188th Tool Kit Newsletter) that this will be a perspective dominated by translation service companies, and asks how the needs and views of individual translators will be incorporated into any new standards initiatives. He also appeals to translators to express their opinions on what matters to them and suggests that a body like FIT (International Federation of Translators) should perhaps also engage in this dialogue to represent the perspective of translators.

There are clearly some skeptics who see nothing of substance coming from these new initiatives. Ultan points out how standards tend to stray and how expensive this is for users, and also raises some key questions about where compliance might best belong. However, I think it is worth at least trying to see if there is some potential to channel this new energy into something that might be useful for the industry. I, too, see some things that need to be addressed to get forward momentum on standards initiatives, which I suspect get stalled because the objectives are not that clear. There are at least three things that need to be addressed.

1) Involve Content Creators – Most of the discussion has focused only on translation industry related players. Given the quality of the technology in the industry I think we really do need to get CMS, DBMS and Collaboration software/user perspectives on what really matters for textual data interchange if we actually are concerned with developing meaningful standards. We should find a way to get them involved especially for data interchange standards.
2) Produce Standards That Make Sense to Translators – The whole point of standards is to ease the data flow from creation to transformation to consumption. Translators spend an inappropriately huge amount of time on format-related issues, rather than on translation and linguistic issue management. Standards should make it easier for translators to deal ONLY with translation-related problems and allow them to build linguistic assets that are independent of any single translation tool or product. A good standard should enhance translator productivity.
3) Having Multiple Organizations Focused On The Same Standards is Unlikely to Succeed – By definition standards are most effective when there is only one. Most standards initiatives in the information technology arena involve a single body or entity that reflects the needs of many different kinds of users. It would probably be worth taking a close look at the history of some of these to understand how to do this better. The best standards initiatives have hard-core techies who understand how to translate clearly specified business requirements into a fairly robust technical specification that could evolve but leaves some core always untouched.

One of the problems in establishing a constructive dialogue is that the needs and the technical skills of the key stakeholders (content creators, buyers, LSPs, translators) differ greatly. A clearer understanding of this is perhaps a good place to start. If we can find common ground here, it is possible to build a kernel that matters and is valuable to everybody. I doubt that TAUS and GALA are open and transparent enough to really engage all the parties, but I hope that I am proven wrong. Perhaps the first step is to identify the different viewpoints and clearly identify their key needs before coming together and defining the standards. It is worth speaking up (as constructively as possible) whatever one may think of these initiatives. We all stand to gain if we get it right, but functioning democracy also requires vigilance and participation, so get involved and let people know what you think.


  1. Many good points here, Kirti. Thanks for the summary. I agree with the "three things that need to be addressed", and also with your closing point that we need a "clearer understanding" of the way in which the various perspectives on standards differ. It has always seemed to me that "standards matter" but that the ones we have in this industry do not (or at least not very much), precisely because we have not done the hard work of reconciling all the perspectives.

    When LISA first announced its demise, Common Sense Advisory suggested that a standards body such as OASIS might be a good home for language industry standards. I do think, though, that GALA (at least) has the intention of bringing all perspectives to bear. We will learn more about that when they announce the results of Phase 1 in 2-3 months' time. Meanwhile, I hope that you, Jost, Jaap and others will continue to drive the focus toward "why". If we get the "why" right, then coming up with the "what" will not be too difficult.

  2. "Why" - this is my belief in an independent, wealthy and speedy translator. "Independent" meaning not bound to the usage of any translation software, working anywhere and anytime, setting his own rates; "wealthy" - understandable; "speedy" - 2, 3, 4, 5 times higher translation productivity.

    A translator is the main hero in the translation industry. He receives information from a content creator, buyer or LSP. He KNOWS the needs, problems and expectations of his customers. He DOES the job. Therefore, the problems translators face daily are the starting points for creating innovative products and services and developing new standards.

  3. This is a good overview, but I think it misses one absolutely crucial point: certification. Without certification there can be no standard, since no one can be sure that it is really supported. LISA was always pretty weak on certification, and I am not hearing any of the new bodies making strong statements on certification - which is the unpopular "policing" part of being a standards watchdog.

  4. There is an interesting but long article by Rustin Gibbs on The Importance of Interoperability on the TAUS site that provides more details on why with many analogies.

  5. Andrew

    Unless I am mistaken I think that TAUS is planning to develop a Certification Meter of some sort. They say they will track compliance of the various tools and hopefully give the tools some kind of rating or score which indicates compliance with a specific and well defined standard. Thus 100% compliance would mean certification.

    But perhaps it is too early to tell.

  6. Perhaps it would be good to at least take a look at "Achieving Technical Interoperability - the ETSI Approach"?
    ETSI (mother of the GSM standard, 700 members, 62 countries, free download of all standards) has a lot of members in common with LISA, TAUS and GALA. ETSI is creating an ETSI Industry Specification Group (ISG) on Localization Industry Standards (LIS) to offer the 19 common stakeholders and all non-ETSI members a way to continue, support, host, park, maintain... the LISA standards in ETSI ISG LIS. It is up to the members to decide to join this initiative. ETSI already collaborates with OASIS, ISO, ICANN. We are involved in global ICT standards, including localisation (3GPP/LTE, TC HF, TC MCD). It would be a shame if I cannot demonstrate how nice, quick, easy and cheap it is to park these standards at ETSI.

    Posted by Patrick GUILLEMIN

  7. This is a great post Kirti. It is nice to see the interest in this topic, and I suggest that we push the associations who represent our industry to aggregate our collective needs and interests to effect change. This is how standards have become more widely adopted and successful in other industries.

    Posted by Smith Yewell

  8. Standards are important as long as they represent a means to assess the quality of the product AND the process. In my opinion, LSP, translators, firms and all professionals involved in the content creation should choose a model that they can share, a model able to assess the quality of the process and the product (translation). A model like the CMMI used by CSA to create the LMM could be a good starting point. The CMMI should be used as a "bridge" between clients and suppliers because the requirements it uses to evaluate the overall process are standard principles that can be applied in every production cycle (translation and content creation process included). Many of the standards available on the market do not take into consideration both of these aspects, so they are very poor.

  9. A great post, Kirti. What you write is very valid for the localization side of translation automation. But there are those of us who use translation automation for "intelligence gathering", and we have no say whatsoever in the format or quality of the material we have to handle. We can't dictate anything: we work with what we're given. Indeed, even today we have to handle PDFs generated from terrible scans!

  10. @Valeria

    Your suggestion sounds intriguing and I am sure that others would be interested to see it more fully explained. Please consider writing it up in more detail as it could help guide the things that are being considered everywhere.


    I would bet that your feedback on what is most difficult to handle would be very valuable (as a list of things to avoid) to all of us who are trying to provide some feedback here. Please share what not to do.

  11. interesting article and cool for translation