There has been a lot of talk about standards since the demise of LISA, perhaps because the collapse of LISA was announced almost immediately after its final event, a “Standards Summit” in early March 2011. We are now seeing something of a rush, with industry groups setting up positions (perhaps even well-intentioned ones) to establish a controlling interest in “what happens next with standards”. There is still much less clarity on which standards we are talking about, and almost no clarity on why we should care or why it matters.
What are the standards that matter?
The post I wrote on the lack of standards in May 2010 is the single most influential (popular?) post I have written on this blog, according to PostRank. So what is all this new posturing on standards about? From my vantage point (as I stated last year), standards are important to enable information to flow from the information creators to the information consumers as efficiently as possible. Thus my view of standards is about the rules and structures that enable clean and efficient data interchange, archival, and reuse of linguistic assets in new language and text technology paradigms. Search, semantic search, SMT, language search (like Linguee) and text analytics are what I am thinking about. (You may recall that I had much more clarity on what standards matter and why than on how to get there.) Good standards require that vendors play well with each other and that language industry tools interface usefully with corporate content management systems, making life easier for both the information creators and the consumers, not just the people involved in translation.
However, I have also seen that there is more conflation around this issue of standards than around almost any other issue amongst localization professionals (“quality”, of course, is the winner). I am aware that there are at least three different perspectives on standards:
1. End-to-End Process Standards: ISO 9001, EN 15038, Microsoft QA and LISA QA 3.1. These have a strong focus on administrative, documentation, review and revision processes, not just the quality assessment of the final translation.
2. Linguistic Quality of Translation (TQM): Automated metrics like BLEU, METEOR, TERp, F-Measure, ROUGE and several others that focus only on rapidly scoring MT output, and human measurements that assess linguistic quality through error categorization and subjective human quality assessment, usually at the sentence level: SAE J2450, the LISA Quality Metric and perhaps the Butler Hill TQ Metric. (A minimal sketch of how one of the automated scores is computed follows this list.)
3. Linguistic Data Interchange: These standards facilitate data exchange from content creation onward and enable transformation of textual data within a broader organizational data flow context than just translation. Good interchange standards can ensure that fast-flowing streams of content get transformed more rapidly and get to customers as quickly as possible. XLIFF and TMX are examples of this, but I think the future is likely to be more about interfacing with “real” mission-critical systems (DBMS, collaboration and CMS) used by companies rather than just TMS and TM systems, which IMO are very likely to become less relevant and important to large-scale corporate translation initiatives.
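To make the second category concrete, here is a minimal, self-contained sketch of what an automated score like BLEU actually measures: clipped n-gram overlap with a reference translation, combined with a brevity penalty. This is an illustrative toy with invented example sentences (real toolkits add smoothing and corpus-level aggregation), but it shows why such scores are so fast and cheap to compute, and also why, on their own, they say little about whether content is fit for purpose.

```python
# Toy sentence-level BLEU: modified n-gram precision up to 4-grams plus a
# brevity penalty. Real evaluation toolkits add smoothing and work over
# whole test corpora; this sketch only illustrates the mechanics.
import math
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis, reference, max_n=4):
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())   # clipped counts
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)        # avoid log(0)
    # Geometric mean of the n-gram precisions.
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty punishes hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * geo_mean

# Invented example: an MT output scored against one human reference.
print(sentence_bleu("the quick brown fox jumped over a lazy dog",
                    "the quick brown fox jumps over the lazy dog"))
```

The point of showing this is that the score is purely a string-matching exercise against a reference; nothing in it knows whether the translation is accurate, usable or appropriate for the reader, which is exactly why these metrics cannot stand in for real quality standards.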
It is my sense that we have seen a lot of development on the first kind of "standard" listed above, but we have a long way to go before we have meaningful standards in the second and third categories.
So it is interesting to see the new TAUS and GALA initiatives to become standards leaders when you consider that LISA was actually not very effective at developing standards that really mattered. LISA was an organization that apparently involved buyers, LSPs and tools vendors, but it was unable, in spite of sincere efforts, to produce standards that really mattered to the industry. TMX today is a weak standard at best, and there are many variations that result in data loss and leverage loss whenever data interchange is involved. (Are the other standards they produced even worth mentioning? Who uses them?) Are we going to see more of the same with these new initiatives? Take a look at the TAUS board and the GALA board, as these are the people who will steer (and fund) the new initiatives. Pretty much all good folks, but do they really represent all the viewpoints necessary to develop standards that make sense to the whole emerging ecosystem?
Why do standards matter?
Real standards make life easier for the whole ecosystem, i.e. the content creators, the professional translation community, the content consumers and everybody else who interacts with, transforms or modifies valuable content along the way. Standards matter if you are setting up translation production lines and pushing translation volumes up. At AGIS2010, Mahesh Kulkarni made a comment about standards in localization: he called them traffic rules that ease both the user and the creator experience (and of course these rules matter much more when there is a lot of traffic), and he also said that standards evolve, have to be tested, and need frequent revision before they settle. It is interesting to me that the focus in the non-profit world is on studying successful standards development in other IT areas, in contrast to what we see at TAUS and GALA, where the modus operandi seems to be to create separate new groups with new missions and objectives, though both claim to act in the interest of “everyone”.
There was a great post by Arle Lommel on why standards matter on the LISA site (now gone), and there is also a perspective presented by Smith Yewell on the TAUS site on why we should care. I hope there will be more discussion on why standards matter, as this may help drive meaningful action on what to do next and produce more collaborative action.
So today we are at a point where we have TAUS saying that it is taking on the role of an “industry watchdog for interoperability” by funding activities that will track the compliance of the various tools and by appointing a full-time standards monitor. Jost Zetzsche has pointed out that this is fabulous, but the TAUS initiative only really represents the viewpoint of “buyers”, i.e. localization managers (not the actual corporate managers who run international businesses). The REAL buyers (Global Customer Support, Global Sales & Marketing Management) probably care less about TM leverage rates than they do about getting the right information to the global customer in a timely and cost-effective way on internet schedules, i.e. really fast, so that it has an impact on market share in the near term. Not to mention that compliance and enforcement can be tricky without a system of checks and balances, but it is good to see that the issue has been recognized and a discussion has begun. TAUS is attempting to soften the language it uses in defining its role, as watchdogs are often not very friendly.
GALA announced soon after that it would also start a standards initiative, one that will “seek input from localization buyers and suppliers, tool developers, and various partner localization and standards organizations.” Arle Lommel, the former director of standards at LISA, will be appointed as the GALA standards guy. Their stated objective is: “The culmination of Phase I will be an industry standards plan that will lay out what standards should be pursued, how the standards will be developed in an open and unbiased way, and how the ongoing standards initiative can be funded by the industry.” Again, Jost points out (in his 188th Tool Kit Newsletter) that this will be a perspective dominated by translation service companies and asks how the needs and views of individual translators will be incorporated into any new standards initiative. He also appeals to translators to express their opinions on what matters to them and suggests that a body like FIT (the International Federation of Translators) perhaps also engage in this dialogue to represent the perspective of translators.
There are clearly some skeptics who see nothing of substance coming from these new initiatives. Ultan points out how standards tend to stray, how expensive this is for users, and also raises some key questions about where compliance might best belong. However, I think it is worth at least trying to see if there is some potential to channel this new energy into something useful for the industry. I, too, see some things that need to be addressed to get forward momentum on standards initiatives, which I suspect stall because the objectives are not that clear. There are at least three:
1) Involve Content Creators – Most of the discussion has focused only on translation-industry players. Given the quality of the technology in the industry, I think we really do need to get CMS, DBMS and collaboration software/user perspectives on what really matters for textual data interchange if we are actually concerned with developing meaningful standards. We should find a way to get them involved, especially for data interchange standards.
2) Produce Standards That Make Sense to Translators – The whole point of standards is to ease the data flow from creation to transformation to consumption. Translators spend an inappropriately huge amount of time on format-related issues rather than on translation and linguistic issue management. Standards should make it easier for translators to deal ONLY with translation-related problems and allow them to build linguistic assets that are independent of any single translation tool or product. A good standard should enhance translator productivity (the short TMX sketch after this list illustrates the kind of tool-independent asset reuse I mean).
3) Having Multiple Organizations Focused on the Same Standards Is Unlikely to Succeed – By definition, standards are most effective when there is only one. Most standards initiatives in the information technology arena involve a single body or entity that reflects the needs of many different kinds of users. It would probably be worth taking a close look at the history of some of these to understand how to do this better. The best standards initiatives have hard-core techies who understand how to translate clearly specified business requirements into a fairly robust technical specification that can evolve but leaves some core always untouched.
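As a concrete illustration of point 2, here is a minimal sketch of the kind of tool-independent reuse a clean interchange standard should enable: reading a TMX translation memory into plain (source, target) segment pairs with nothing but a standard XML parser. The file name and language codes are placeholders, and real-world TMX files vary far more than this (which is exactly the leverage-loss problem described earlier), but the point stands: a translator's linguistic assets should survive outside any single TM product.

```python
# Sketch: extract (source, target) segment pairs from a TMX translation
# memory so the data can be reused in any tool. "memory.tmx" and the
# language codes below are placeholders for illustration only.
import xml.etree.ElementTree as ET

# TMX 1.4 marks the language of each <tuv> with xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def read_tmx(path, source_lang="en-US", target_lang="de-DE"):
    """Yield (source, target) segment pairs from a TMX file."""
    tree = ET.parse(path)
    for tu in tree.getroot().iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions may use a plain "lang" attribute instead.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None and seg.text:
                segments[lang.lower()] = seg.text
        src = segments.get(source_lang.lower())
        tgt = segments.get(target_lang.lower())
        if src and tgt:
            yield src, tgt

if __name__ == "__main__":
    # Dump the pairs as tab-separated text, a form any tool can consume.
    for src, tgt in read_tmx("memory.tmx"):
        print(f"{src}\t{tgt}")
```

When the interchange format is honored consistently by every tool in the chain, a script this small is all it takes to move assets around; when each tool writes its own TMX dialect, this is where the data loss and leverage loss creep in.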
One of the problems in establishing a constructive dialogue is that the needs and the technical skills of the key stakeholders (content creators, buyers, LSPs, translators) differ greatly. A clearer understanding of this is perhaps a good place to start. If we can find common ground here, it is possible to build a kernel that matters and is valuable to everybody. I doubt that TAUS and GALA are open and transparent enough to really engage all the parties, but I hope that I am proven wrong. Perhaps the first step is to identify the different viewpoints and clearly articulate their key needs before coming together and defining the standards. It is worth speaking up (as constructively as possible), whatever one may think of these initiatives. We all stand to gain if we get it right, but a functioning democracy also requires vigilance and participation, so get involved and let people know what you think.