What a huge improvement over the sorry mess that we call standards in localization, e.g. TMX, TTX, TBX, etc. I loved the fact that I could edit a document downstream with an application that did not create the original data and send it on to others, who could continue the editing in whatever applications they preferred. That is a big deal, and I think it is the future, as data flows more and more freely in and out of organizations.
I do not know very much about translation industry standards except that they do not work very well, and I invite anybody who reads this to come forward and comment, or even write a guest post, to explain what does work to me and others who are interested. I was involved with the logical file system standard ISO 9660 early in my career, so I know what a real working standard looks and feels like. That standard allows CDs to be read across hundreds of millions of devices and across multiple versions of PC, Mac, Unix and mainframe operating systems. Data recorded on a DOS PC in 1990 can be read today on a Mac or a Windows 7 machine without problems. (Though if you saved WordPerfect files you may still have a problem.) The important point is that your data is safe: it can still be read today.
The value of standards is very clear in the physical world: electric power plugs, shipping containers, tires, CD and DVD discs, etc. Life would indeed be messy if these things were not standardized. Even in communications we have standards that enable us to communicate easily: GSM, TCP/IP, HTTP, SMTP and the whole set of OSI layers. Even regular people care and know about some of these. These standards make many things possible: exchange, interoperability, integration into larger business processes, evolving designs and architectures. In the software world it gets murkier; standards are often de facto (RTF, SQL?, PDF, DOC?, DOCX?) or just really hard to define. In software it is easier to stray, so MP3 becomes WMA and AIFF, and there is always a reason, usually involving words like “better” and “improved”, to move away from the original standard. The result: you cannot easily move your music collection from iPod to Zune or vice versa, or on to a new, better technology, without some pain. You are stuck with data silos or a significant data conversion task.
The closest thing we have to a standard in the translation industry is TMX 1.4 (not the others), and with all due respect to the good folks at LISA, it is a pretty lame “standard”, mostly because it is not actually standard: some vendors choose to break away from the LISA specification. It does sort of work, but it is far from transparent. SDL has its own variant and so do others, and data interchange and exchange is difficult without some kind of normalization and conversion effort, even amongst SDL products! And data exchange among tools usually means at least some loss in data value. Translation tools often trap your data in a silo because the vendors WANT to lock you in and make it painful for you to leave. (Yes Mark, I mean you.) To be fair, this is the strategy that IBM, Microsoft and especially Apple follow too. (Though I have always felt that SDL is more akin to DEC.) Remember that a specification is not a standard: it has to actually be USED as a matter of course by many to really be a standard.
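For readers who have never looked inside one, here is a minimal sketch of what a TMX 1.4 file looks like; the tool name and the sentences are invented, but the structure is what the specification defines, and it is precisely this shared structure that the vendor variants undermine:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- A minimal TMX 1.4 document: one translation unit, two language variants.
     Tool name and segment text are made up for illustration. -->
<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="ExampleTM" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <!-- One <tuv> per language; the segment text lives in <seg>. -->
      <tuv xml:lang="en"><seg>The printer is out of paper.</seg></tuv>
      <tuv xml:lang="de"><seg>Der Drucker hat kein Papier mehr.</seg></tuv>
    </tu>
  </body>
</tmx>
```

In principle, any compliant tool can import a translation unit like this one; in practice, vendor-specific attributes and segmentation differences are where the pain begins.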
In a world with ever-increasing amounts of data, the data is more important than the application that created it.
For most people it is becoming more and more about the data. That is where the long-term value is. As tools evolve, I want to be able to take my data to new and better applications easily. I want my data to be in a state where it does not matter if I change tools, and where all related applications in the chain can easily access my data and process it further as needed. I want to be able to link my data up, down, backward and forward in the business process chain I live in, and I want to be able to do this without asking the vendor(s). I care about my data, not the vendor or the application I am using. If better tools appear, I want to be able to leave with my data, intact and portable.
So what would that look like in the translation world? If a real standard existed for translation data, I would be able to move my data from authoring and IQ systems to CMS to TM to TMS to DTP or MT or Web sites, and back, with relative ease. And the people in the chain would be able to use whatever tool they preferred without issue. (Wouldn’t that be nice?) It could mean that translators could use a single TM tool of their choice for every job they did. The long-term leverage possible from this could be huge in terms of productivity improvements, potential new applications and making translation ubiquitous. The graphic below is my mental picture of it. (Who knows if it really makes sense?)
None of the “standards” in the picture today would be able to do this, and perhaps real standards will come from the CMS world, or from elsewhere where standards are more critical. @Localization pointed out a good article on translation-related standards at Sun. I think a strong and generic XML foundation (DITA compliant, according to an IBM expert I talked to) will be at the heart of a “meaningful” standard. Ultan (aka @localization) has an interesting blog entry on DITA that warns about believing the (over)promises. I keep hearing that XLIFF and possibly OAXAL could lead us to the promised land, but of course they require investment. To work, any of these needs commitment and collaboration from multiple parties, and this is where the industry falls short. We need a discussion focused on the data and on keeping it safe and clean, not on the tools. Let the tools add value internally, but they should always hand the data back in a standard format so that other applications can use it. Again, Ultan, who knows much more about this issue than I do, says: “We need to move from bringing data to people to bringing people to data. Forget XML as a transport. Use it as structure... :)”
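To make the XLIFF idea concrete, here is a minimal sketch of an XLIFF 1.2 document; the file name and the strings are invented, but the shape is what the OASIS specification defines. Each trans-unit pairs a source segment with its target, so any conformant tool in the chain can pick up the file without knowing which application created it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- A minimal XLIFF 1.2 document: one file element, one translation unit.
     File name and text are made up for illustration. -->
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="help/about.html" source-language="en"
        target-language="de" datatype="html">
    <body>
      <trans-unit id="1">
        <source>About this product</source>
        <target state="translated">Über dieses Produkt</target>
      </trans-unit>
    </body>
  </file>
</xliff>
```

The format itself is not the hard part; the commitment to round-tripping files like this without loss, across competing tools, is.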
Meanwhile, others are figuring out what XML-based standards can do. XBRL is set to become the standard way of recording, storing and transmitting business financial information. It can be used throughout the world, whatever the language of the country concerned, for a wide variety of business purposes. It will deliver major cost savings and gains in efficiency, improving processes in companies, governments and other organizations. Check out this link to see how powerful and revolutionary this already is and will continue to be.
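As a rough illustration of why XBRL travels so well (the company, the taxonomy element and the figures below are all invented), here is a minimal sketch of an XBRL instance document: each fact is tagged against a shared taxonomy with an explicit reporting period and unit, so any XBRL processor can consume it regardless of the language or tool that produced it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- A minimal XBRL instance: one context, one unit, one tagged fact.
     The acme taxonomy, company and numbers are made up for illustration. -->
<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
      xmlns:acme="http://example.com/taxonomy/acme">
  <context id="FY2009">
    <entity>
      <identifier scheme="http://example.com/companies">ACME</identifier>
    </entity>
    <period>
      <startDate>2009-01-01</startDate>
      <endDate>2009-12-31</endDate>
    </period>
  </context>
  <unit id="EUR">
    <measure>iso4217:EUR</measure>
  </unit>
  <!-- The fact itself: revenue for fiscal 2009, in euros, to the nearest unit. -->
  <acme:Revenue contextRef="FY2009" unitRef="EUR" decimals="0">1000000</acme:Revenue>
</xbrl>
```

Notice that the tags describe the data, not the application that produced it; that is exactly the property translation data lacks today.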
As we move to more dynamic content and into intelligent data applications in the “semantic web” of the future, standards are really going to matter, as continuous data interchange between key applications, from content creation to SMT and Web sites, will be necessary, and I for one hope that the old vanguard (yes, it starts with an S) does not lead us down yet another rabbit hole with no light in sight. You can vote with your wallet by insisting on standards-based data preservation in any product you buy and use. I hope you do.
I would love to hear from any others who have an opinion on this; as you may have gathered, I am really fuzzy on how to proceed on the standards issue. (My instincts tell me the two that matter most are generic, standard XML and XLIFF, but what do I know?) Please enlighten me (us).