Friday, May 14, 2010

The Importance of Information Quality & Standards

I spent a day at the Adobe headquarters in San Jose this week at the joint aQuatic (acrolinx quality assurance tool users) and Bay Area MT User Group (BAMTUG) conference, and thought I would share some of the highlights. The conference focused on “Information Quality” (IQ for short), and several users of the acrolinx quality assurance tool described their experiences and best practices, including users from Adobe, Cisco, Autodesk, IBM, Symantec and John Deere.

It is clear that any effort to improve content early in the content development process makes subsequent processes like translation and MT much easier, and is worthwhile. IQ, as acrolinx calls it, is sometimes mixed up with “controlled language”, which I would characterize as a first-generation approach to making content more easily leverageable for translation and other downstream processes. Controlled language is a strategy that was most frequently used with RbMT systems, but some of its basic principles, in a less stilted form, can be useful for SMT as well. If you are interested in a more detailed discussion and many useful links on controlled language vs. simplified English vs. source cleanup, check out the MT group in the LinkedIn forum: Discussion on the use of Controlled Language in various kinds of MT approaches.

The presentations showed that documentation creation is now a much more dynamic process, and how the community and Web 2.0 concepts are affecting the content production process. I have provided links to what I felt were the best presentations below. Hopefully the links work even if you are not a BAMTUG member:

Get Ready for Socially-enabled Everything – Scott Abel, The Content Wrangler (starts at 5:10). This covers how social networks and Web 2.0 concepts are accelerating B2C change and how important community content and social networks have become (it includes the overplayed and now somewhat tired video from Socialnomics, which was played twice during the day!). This is useful for those who still think developing corporate websites is just an internal affair.

MT Best Practices: Pre-editing, Common MT Errors and Cures - Mike Dillinger, Ph.D., Translation Optimization. Interesting anecdotes about how content cleanup can have a huge impact on success with MT, and on usability for any general reader. Lots of good advice in here (including me rudely interrupting on how to interpret Microsoft KB satisfaction ratings).

The two most interesting and instructive presentations (for me anyway) were:

acrolinx Roadmap and Future Directions - Andrew Bredenkamp, CEO, acrolinx. I was happy to see that an API is being developed that would allow automated cleanup of content prior to MT, and that there is a growing awareness of how IQ technology can be used to make “community content” more useful. (Andrew and I have the same birthday, so that could also have given him the edge.)

Adobe Technical Publication Suite 2 - Mahesh Kumar Gupta, Product Manager, Adobe Systems. I was struck by how relatively open, portable and REALLY standards-based this product was for content creation and organization. What a huge improvement over the sorry mess that we call standards in localization, e.g. TMX, TTX, TBX etc. I loved the fact that I can edit a document in process with an application that did not create the original, and send it on to others who can continue the editing in their own preferred applications. I think this kind of flexibility is a huge deal and I will explain why in a future entry. Having a real standards foundation (generic XML, DITA) gives you the agility to respond easily and effectively to all kinds of change and to dynamically shifting processes and situations. This is in dramatic contrast to the lame standards that exist in the TM and TMS world. Standards allow you to use the best tools, and to change these tools without compromising or losing your data. Standards give you flexibility.

Anyway, it was an interesting day and I learnt much. The presentations are worth a look.
