Tuesday, May 24, 2011

A Case Study on the Use & Benefits of Controlled Language

This is a post by guest writer Anna Fellet, whom I met in Rome last month. It further explores the theme of Controlled Language and Process Standards and continues the themes presented by Valeria Cannavina in her posting earlier this month. This slide presentation provides additional background on this case study.

=================================================================================

This article presents a case study to show the benefits of Controlled Language strategies, and highlights the key lessons learnt in the pilot project on a dedicated MT workflow created for ARREX Le Cucine, a leading Italian furniture company. This post also contains a reply to Laura Rossi’s comments on Valeria Cannavina’s previous posting on standards and the application of the CMMI model to translation.

A Business Case for Controlled Language

The goal of the ARREX project was the development of a corporate controlled language for Italian to be used in a customized authoring and (machine) translation workflow.

Why a Controlled Language?

A CL was chosen to eliminate ambiguity and complexity in product data sheets, installation and maintenance instructions (for support), catalogs, price lists, orders, reports, memos, and compliance documents. We chose to test our CL with both RbMT provided by Synthema and SMT provided by Asia Online. We increased the repetitiveness of ARREX texts, and the results with RbMT were successful.

As for SMT, which is a typical brute-force, data-driven computing application, most of the difficulties come from the high degree of unpredictability in searching through a massive set of possible options for even the simplest word combinations. As the number of words in a sequence increases, the precision score decreases because longer matching word sequences are more difficult to find. A controlled language increases predictability and statistical density, and thus raises the probability of finding matching sequences and boosts SMT success.
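The effect of sequence length on precision can be illustrated with a minimal sketch. It assumes a BLEU-style clipped n-gram precision and two invented English sentences; neither the function names nor the sentences come from the ARREX project.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous word sequences of length n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(candidate, reference, n):
    """Share of candidate n-grams that also occur in the reference (clipped counts)."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total

# Hypothetical sentences: the candidate differs from the reference by one word.
reference = "remove the two screws from the door panel"
candidate = "remove the two screws of the door panel"

precisions = [ngram_precision(candidate, reference, n) for n in range(1, 5)]
for n, p in enumerate(precisions, start=1):
    print(f"{n}-gram precision: {p:.3f}")
```

A single differing word costs one unigram match but breaks every longer sequence that contains it, so the score drops as n grows. More predictable source text makes long matching sequences more likely.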

So, the single most powerful rule for authors/writers still holds its validity: one idea per sentence makes text that is easier for humans to understand also easier for MT engines to understand.

Poor source quality can lead to low-quality target language content (e.g. SAP translations often result in hardly translatable/comprehensible Italian). Technical documents, however, are ideally all written in the same “language”, even if with different idioms. Setting up terminology resources and developing writing rules enables the Italian text to be more easily handled by the MT system.

Moreover, language combinations with English are more commonly implemented, so by translating Italian into a terminologically coherent and syntactically simple English target we could use it as a starting point for other potentially successful combinations.

At the end of our preliminary investigations, we found that ARREX CL adds value to technical documentation as it allows:

Increase in the perceived value of the product and of the whole brand: consistent, stylistically uniform, and controllable documentation (user-targeted material) is easier for a user/client to understand and thus helps to build customer loyalty;
More efficient communication with clients/distribution partners/maintenance staff, thus reducing customer support calls and general costs associated with customer service;
Reduction of translation costs (see table below);

Source text	Without CL	With CL	Difference
Words to be translated	70.000	64.000	-8%
Repetitions	39.800	38.300	+3%
Words to be translated from scratch	32.100	25.700	-5%
Human translation costs (250 words/hour)	280 hours	255 hours	-9%

	Human translation costs	MT post editing cost	Difference
Translation Costs	280 hours	50 hours	-80%
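The arithmetic behind the two tables can be retraced with a short sketch. The 250 words/hour throughput and the word counts are taken from the tables. Reading the "Difference" for repetitions as a shift in their share of the total word count is an assumption on my part, not something the tables state. The computed values land close to, but not always exactly on, the rounded figures reported above (for instance 256 hours rather than 255).

```python
WORDS_PER_HOUR = 250  # human translation throughput stated in the table

def hours(words, words_per_hour=WORDS_PER_HOUR):
    """Hours needed to translate a given number of words."""
    return words / words_per_hour

def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100

def share(part, total):
    """Part as a percentage of total."""
    return part / total * 100

words_without_cl, words_with_cl = 70_000, 64_000
reps_without_cl, reps_with_cl = 39_800, 38_300

hours_without_cl = hours(words_without_cl)
hours_with_cl = hours(words_with_cl)
print(f"Human translation: {hours_without_cl:.0f} h -> {hours_with_cl:.0f} h "
      f"({pct_change(hours_without_cl, hours_with_cl):+.1f}%)")
print(f"Word count change: {pct_change(words_without_cl, words_with_cl):+.1f}%")

# Assumed reading: repetitions as a share of all words, before and after CL.
rep_shift = share(reps_with_cl, words_with_cl) - share(reps_without_cl, words_without_cl)
print(f"Repetition share shift: {rep_shift:+.1f} percentage points")

# Second table: 280 hours of human translation vs 50 hours of MT post-editing.
print(f"Post-editing vs human translation: {pct_change(280, 50):+.1f}%")
```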

We moved beyond the study on Italian CL for customized MT, and discovered that much can be done with a holistic approach to authoring workflows.

We became convinced that by adopting an ad hoc CL, creating reusable corporate-specific terminology resources, and training internal corporate staff on authoring strategies for MT, we could influence the company authoring workflow to a greater extent. By ad hoc CL we mean that rules are created specifically for ARREX. These rules may be valid for other domains as well, but we had to focus on and improve inefficient writing practices unique to ARREX’s own internal corporate-speak. We were led to focus on critical aspects that the company may not have had clear at the beginning of the project, e.g. improving ARREX terminology standards by extracting the most frequently used terms and analyzing synonyms and non-standard/irregular uses of terms that ARREX had already implemented.
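As a rough illustration of the terminology step, the sketch below counts terms under a preferred form and tallies the non-preferred variants found in a text sample. The synonym map and the Italian sentences are invented for the example and are not taken from the ARREX glossary. The function name `term_report` is likewise hypothetical.

```python
import re
from collections import Counter

# Hypothetical synonym map: variant -> preferred term (illustrative only).
SYNONYMS = {"sportello": "anta"}

def term_report(texts, synonyms):
    """Count terms under their preferred form and tally non-preferred variants."""
    term_counts = Counter()
    variant_counts = Counter()
    for text in texts:
        for token in re.findall(r"\w+", text.lower()):
            preferred = synonyms.get(token, token)
            term_counts[preferred] += 1
            if preferred != token:
                variant_counts[token] += 1
    return term_counts, variant_counts

# Invented sample sentences standing in for corporate documentation.
texts = [
    "Montare l'anta sul mobile.",
    "Regolare lo sportello.",
    "Pulire l'anta con un panno.",
]
term_counts, variant_counts = term_report(texts, SYNONYMS)
print(term_counts.most_common(3))
print(variant_counts)
```

The frequency list points to candidate terms for the glossary, and the variant tally shows where authors drift from the approved form.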

Not only did this project help us understand how to clean data for MT, create resources for MT (glossaries, TM, post-editing guidelines), and highlight costly and time-consuming translation processes (i.e. outsourcing translation/editing and publishing); it also helped us adapt our work seamlessly to the existing corporate strategy by addressing the internal staff’s needs directly.

WORKFLOW & VALUE

The graphic below shows how we changed the company’s workflow and the results we achieved.

Activity	Goal	Before	After
WORKFLOW ANALYSIS	Exhaustive map of how ARREX processes are organized and who is in charge of what.	Texts were translated externally or (sometimes) internally.	MT (internal); monolingual review with support of term base and glossaries approved by ARREX.
Value	Faster, cheaper and more accurate translations and reviewer’s feedback for continuous improvement of CL and MT.

Activity	Goal	Before	After
RESOURCES ANALYSIS	Measuring ARREX staff performance.	No trained technical writers and no unique point of reference for the production of technical material and translated material.	Training of staff to repeat and manage the process; new professionals (pre-editing, post-editing).
Value	Involvement through requalification of internal staff in charge of documentation. Resources are the only feature of the whole process that cannot be cut or reduced. People will always be the key element to deliver ‘quality’. Internal staff are the best people to talk to, to understand the quality level expected. We are only providing the right tools and the right knowledge to achieve such ‘quality’ (e.g. glossaries, CL style guide, MT workflow, QA report). We offer an improvement of the quality of the process, not of the product. Products can always change.

Activity	Goal	Before	After
PLANNING	Address the workflow step by step to build long-term relationships with valid collaborators.	Undocumented processes, undefined organization of roles for technical writing and translation.	Ad hoc procedures for each phase of the process, from technical writing to delivery of translated material.
Value	The only possible way to deal with planning is to set a common framework to communicate with ARREX to find the appropriate strategy for text editing and MT.

Activity	Goal	Before	After
ACTION	Write a protocol of requirements suitable for new requests.	Fragmented process.	Independent and autonomous management.
Value	Flexible processes become repeatable.

Repeatable processes

Process documentation;
Roles definition and (re)qualification;
Building of internal writing team;
Internal terminology approval procedure;
Target Language Monolingual reviewer selection;
Target Language Monolingual reviewer feedback.

Answers to questions posed by Laura Rossi in comments on the previous post.

Laura Rossi: Will translation software developers be ready to provide their customers just with what they need, instead of trying to ‘impose’ an overall comprehensive solution, which, in fact, force them to follow a specific process and workflow?

Will the definition of a standard model not be another reason for them to justify this rigidity?

ARREX was anchored to old trusted but imperfect and inefficient processes, and the change we introduced was sometimes shocking for internal technical writers. “The difficulty lies, not in the new ideas, but in escaping from the old ones” (John Maynard Keynes), but if one sees the new idea as a means to improve one’s work (and save time), participation will be natural. In this sense, ARREX drove its own change.

Laura Rossi: As long as translation and localization will be considered as an accessory activity and a cost by the customers, more than a possibly business-driving and revenue-generating task, there won't be much interest from side of the customers to rethink their internal processes and organization, as well as from side of the LSPs and translation software companies to really act as part of their customers' development and production cycle.

I fully agree with Laura’s response to Valeria’s post: it’s time for “translation (software) companies to really act as part of their customers' development and production cycle”. However, I do not agree that service providers should “teach customers to involve LSPs in an early stage of development”. I think that it’s the other way around.

Providers should be able to integrate seamlessly in a company strategy for content, and detect processes that can be improved. This is what we could define as a holistic approach to content creation, where translation is only one piece of a broader internal and external corporate communication puzzle.

Laura Rossi: I think the landscape is actually changing, but the change is still quite slow, especially from the side of the customers, and I wonder what will be able to cause the shift on a massive scale from the traditional way of seeing translation as a 'service industry' to consider it an essential part of a business.

I might be wrong, but I suspect this shift is driven by economic imperatives, and MT offers terrific time and cost savings. Now MT is the right technology, handling repetitive tasks to let humans do what they are best at, but technology can be applied to processes, not to outcomes. One useful approach to a realistic, sustainable translation market is to explicitly differentiate between processes and outcomes.

This is why we focus on the quality of the process, instead of the quality of the product/output, and think of a Customer Centered Business Model, with single services satisfying multiple needs.

“Quality in a product or service is not what the supplier puts in. It is what the customer gets out and is willing to pay for. A product is not quality because it is hard to make and costs a lot of money, as manufacturers typically believe. Customers pay only for what is of use to them and gives them value. Nothing else constitutes quality” (Peter Drucker).

We see that:

translation students are not trained on MT (in Italy), and mostly they don’t have a sense for the realities of the professional translation workplace after graduation;
professional translators are very suspicious of MT, and generally do not welcome new ways of approaching the job, perhaps because they don’t have direct access to the client company, due to agencies (LSPs) intermediation;
agencies (LSPs) see MT only as a means to pay translators much less than what they pay them presently.

In this scenario, translators with the skills required to offer premium services will just abandon the industry. Underpaid and undervalued, they will simply disappear with no one to replace them. Those hard-to-acquire skills will be transferred to other areas.

We believe there are possibilities for new approaches to content creation, and translation management, and that companies wishing to change the way they write and translate their content, like ARREX did, will drive this change, not LSPs.

Laura Rossi: Can we avoid the trap ‘we-are-following-a-standard-or-model-therefore-we-are-good’, which, as you say, can ‘hook’ the customers, but, in my view, does not necessarily ensure their satisfaction?

How can we make sure that a possible specific translation industry standard process model will be flexible and modular enough so as to avoid the risk of LSPs and customers ‘anchoring’ to that as a ‘given’ and a ‘must’?

During the ARREX project we saw that it was hard for clients to ask for specific services since they see translation as marginal and take it for granted, and as Renato Beninatto often says, “translation is really like toilet paper, it’s only important when it’s not there when you need it.”

We ended up seeing translation as a product, and not as a service provided with methods akin to those of industrial production. With the commoditization of translation, i.e. with almost no difference between suppliers, there is an undue and ineffective emphasis on prescriptive standards and the ‘we-are-following-a-standard-or-model-therefore-we-are-good’ scenario. Prescriptive standards, though, are useless for those LSPs wishing to differentiate and to adapt their service to the client’s needs, because they may have to change their service and approach for the unique requirements of each client. Process standards are not, cannot be, and should not be laws, or even strict regulations, because every company is different. The general state of information asymmetry between the LSP and the client makes a process standard useful only if it implies transparency and flexibility. Reiterative and rigid procedures, instead, lead to static monolithic workflows.

This is why a scalable and transparent path like CMMI is useful. Only if client and provider are transparent in their processes can they find the most adequate ways to interact. The client will explain (and understand) its own level of maturity (requirements) to leverage the service provided, and the provider will be able to address the unique needs of the customer.

When it comes to adopting standards, a company does not know exactly where the ensuing process changes will lead. They can also lead to other changes that were not originally envisioned. This happened to ARREX as well: when they saw that, along with improving their authoring and writing strategy, they could also act on other issues, i.e. translation, they did not hesitate to consider that as well.

Laura Rossi: Is it really possible to capture in a standard something as ‘subjective’ as quality?

One could wonder what standards really mean to customers. Are they all concerned about “quality”? Quality is subjective in the sense that it depends on each customer.

A list of ‘ad hoc’ requirements to measure the level of adequacy of the service for the particular customer is useful both to assess customer satisfaction and service improvements and to define the client’s profile and demands in different domains. It is also true that while the quality of the product is subject to the assessment of the client, the same cannot be said of the evaluation of the process. In fact, process standards should aim at increasing and improving the quality of the process. This can only be done with transparency in client-vendor relationships.

In our experience, we saw that processes such as post-editing can be measured both by the customer, in terms of satisfaction with the final result (via a series of requirements that must be met by the MT output), and by the monolingual reviewer, in a questionnaire on ‘linguistic’ aspects of translation and measurements of price/time/productivity. In this sense, requirements can be defined differently for MT output intended for publication, pre-translation or internal use.

What I think would be extremely useful, and hope to see promoted in the industry, is a framework for pricing, especially for post-editing, to help customer-vendor relationships be more transparent. Crowdsourcing, as well, could be better and more widely accepted and used with a clear, simple, and common standards framework. Repeatable processes are worth sharing.

Mark Zuckerberg said “By giving people the power to share, we're making the world more transparent.” It has proved to be a very profitable strategy, as well.

Anna graduated in 2007 in modern languages and cultures at the University of Padua, and in 2009 in technical and scientific translation at LUSPIO of Rome with a final dissertation on ‘Machine Translation: productivity, quality, customer satisfaction’. At present she works as a freelance translator and subtitle translator and on a pilot project on Italian Controlled Language and Machine Translation with LUSPIO University, Asia Online, Synthema and ARREX Le Cucine. She can be reached at anna@s-quid.it or via http://www.s-quid.it

Wednesday, May 18, 2011

Can a Controlled Language Help Machine Translation?

Here is another guest posting initiated from the LTAC conference at LUSPIO in Rome. This post, authored by Orlando Chiarello, provides an overview of the benefits of controlled language, or in a broader sense of improved/standardized source material, for any translation process. Historically CL has often been associated with RbMT, but cleaning and standardizing source material benefits SMT as well, as the example below shows. Any efforts made to improve and/or standardize source material are very likely to result in better MT quality and help any ongoing translation automation process. While the degree of control suggested by CL is not always possible with dynamic customer content, this post presents some examples of where this approach does make sense.

For a more complete set of links and further discussion on this subject, some may also wish to refer to the old but still relevant discussion in the LinkedIn Automated Translation Group (requires membership): Discussion on the use of Controlled Language in SMT vs RbMT. This link has a detailed discussion on how CL or source language simplification can improve the results obtained from MT.

==================================================

The ASD* Simplified Technical English Maintenance Group, or STEMG (www.asd-ste100.org) is having its Spring Meeting these days (17 – 20 May) at Airbus in Toulouse. I am the Chair of this group and I would like to take this opportunity to give a brief overview of ASD Simplified Technical English, ASD-STE100 (STE).

STE is an international specification for the preparation of maintenance documentation in a controlled language.

It was developed in the early Eighties (as AECMA Simplified English) to help the users of English-language documentation to understand what they read. The STE provides a set of Writing Rules and a Dictionary of controlled vocabulary.

The Writing Rules cover aspects of grammar and style; the Dictionary specifies the general words that can be used. These words were chosen for their simplicity and ease of recognition. In general, there is only one word for one meaning, and one part of speech for one word. In addition to the specified general vocabulary, STE accepts the use of company-specific or project-oriented technical words (Technical Names and Technical Verbs), provided that they fit into one of the categories listed in the specification.
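The way a controlled dictionary and a writing rule work together can be sketched as a simple checker. This is only an illustration under stated assumptions: the word lists below are invented and are not taken from the ASD-STE100 Dictionary, the 20-word limit is an assumed parameter, and `check_sentence` is a hypothetical helper rather than part of any STE tool.

```python
import re

# Invented mini word lists for illustration; not the ASD-STE100 Dictionary.
APPROVED_WORDS = {"the", "a", "do", "not", "remove", "install", "make",
                  "sure", "that", "is", "are", "before", "you", "from", "and"}
TECHNICAL_NAMES = {"panel", "screw", "screws", "door"}  # company-specific terms
MAX_WORDS = 20  # assumed sentence-length limit

ALLOWED = APPROVED_WORDS | TECHNICAL_NAMES

def check_sentence(sentence):
    """Return a list of issues found in one sentence (empty if none)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    issues = []
    if len(words) > MAX_WORDS:
        issues.append(f"too long: {len(words)} words (limit {MAX_WORDS})")
    unknown = sorted({w for w in words if w not in ALLOWED})
    if unknown:
        issues.append("not in dictionary: " + ", ".join(unknown))
    return issues

print(check_sentence("Remove the screws from the door panel."))
print(check_sentence("Detach the screws from the door panel."))
```

Because only one approved word exists for the action, the second sentence is flagged and the author is steered back to the approved verb.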

The international language of many industries and specifically of the aviation industry is English and English is the language most used for technical documentation. However, it is often the native language neither of the readers nor of the authors of such documentation. Many readers have knowledge of English that is limited, and are easily confused by complex sentence structures and by the number of meanings and synonyms which English words can have.

The controlled grammatical structures and vocabulary – on which STE is based – have the purpose of producing texts that are easily understandable and, consequently, STE reduces errors during the maintenance tasks.

Although this controlled language was originally designed for the aviation industry, companies from other industries and domains use it to standardize their documentation in an easy, understandable and unambiguous way. As an example, in March, I gave a two-day training course on STE to a company located in Munich producing medical devices. The course turned out to be a great success.

Also, the LUSPIO University in Rome was involved in a project with an Italian company producing furniture for the development of a controlled Italian to be used by that company in all their documentation. The STE principles and rules have been the primary basis for the creation of this Controlled Italian. The results of this project were presented at the LTAC (LUSPIO Translation Automation Conference) on 5 and 6 April where I was also invited and made a presentation of STE.

STE can also really help Machine Translation, which was one of the primary objectives when this Controlled Language was developed. As an example, the following is a paragraph in STE (taken from a component maintenance manual) translated into Spanish by simply pasting the text into the web version of Google Translate and running it:

Original text in STE:

The procedures in this manual are a guide to do the correct maintenance of the component. Some equivalent procedures - that come from the experience and skills of the maintenance personnel - are also satisfactory.

Text translated by Google:

Los procedimientos en este manual son una guía para hacer el mantenimiento correcto de los componentes. Algunos procedimientos equivalentes - que vienen de la experiencia y habilidades del personal de mantenimiento - son también satisfactorios.

As we can see, the result is quite impressive.

To conclude, the above example proves that if the "source text" is English and the text is written in STE, Machine Translation can be dramatically helped by the principle of "one word = one meaning". A further help to Machine Translation could be the availability of a "mirror" Controlled Language based on STE. For example, the French Aviation Industries (GIFAS) in the Eighties created the "Rationalized French" based on STE. They actually used the same structure of the Writing Rules and Dictionary and adapted them to French. The result was exceptionally good, benefiting translation in both directions. Other attempts were made, and others are currently in progress, with other languages including Swedish, German, Spanish, Chinese and Italian.

* ASD represents the aeronautics, space, and defense industries in Europe. ASD has 28 member associations in 20 countries, representing over 2000 companies with a further 80 000 suppliers, many of which are SMEs. Total annual industry turnover is over €137 billion.

Orlando Chiarello is the Product Support Manager of Secondo Mona, an Italian aerospace equipment manufacturer. He is responsible for the aftermarket support of the company products.

He is also the Chairman of the ASD Simplified Technical English Maintenance Group (STEMG), responsible for the development and maintenance of the ASD-STE100 Specification.

Monday, May 9, 2011

Standards: the Importance of Measurement

While in Rome early last month, I had an interesting discussion with Valeria Cannavina about standards, and I thought it would be useful to have her provide her perspective, which is a blend of theory and observations from an actual case study on the value of standards-in-use and implementation at Arrex Le Cucine. I thought that her view of standards driving ongoing improvements in quality, in addition to making other interaction processes smoother, was interesting and worth sharing with the broader standards discussion going on in the community. The following posting is a detailed response from Valeria on the discussion started in previous posts on standards-related issues.

=====================================================

Can you control what you can't measure?

Following the discussion started on this blog posting, I would like to make some comments and hopefully present some new perspectives on this subject, based on my personal experience with standards.

I believe that standards should be a means to assess both the quality of products and processes. All LSPs, ranging from multi-language vendors to individual freelance translators, should have a common reference framework to pursue quality.

In 2008, for my final dissertation titled ‘The adoption of the CMMI: perspective and benefits for LSPs’, I investigated a maturity model for LSPs to measure processes, starting from a few preliminary considerations:

measuring a process will improve it and enable clients and suppliers to create a reference standard to assess any process;
the measurement of a business process should not be an absolute end in itself (as in ISO 9000 or EN15038), but rather an ultimate goal in a path to continuous improvement;
data must be interpreted and linked to overall project goals that are pivotal in understanding and assessing the value added to a product;
process measurement and improvement actions should be conducted by all the parties that contribute to a transaction, i.e. customer(s) and supplier(s);
customers and suppliers should adopt the same model and/or the same best practices, to provide assessment parameters.

Why should an LSP choose CMMI?

When I started my research, no standard for the continuous improvement of processes was yet available in the localization industry; in fact, the LMM (Localization Maturity Model) described by Common Sense Advisory was applicable to clients only. A major obstacle to adopting a common standard in the translation industry is the information asymmetry between clients and suppliers: a standard like EN 15038 allows the stronger contractor/vendor to impose its own metrics, making it sometimes difficult to ensure full customer satisfaction. The lack of guidelines to regulate the customer-LSP relationship also does not take the freelancer into account. These freelancers are appointed only as third parties, and are responsible for the performance of a very small part of the project. EN 15038 also provides for the possible establishment of a service agreement between the customer and the TSP. Although it is important for an industry standard to regulate this aspect, it is also true that this would pose a hazard because of a lack of detail. Indeed, the major limitation of EN 15038 is the lack of specific and accurate metrics that can help regulate processes and tasks. This may limit the efficiency of processes and the ability to create value.

CMMI (Capability Maturity Model Integration) is a suite of standards focusing on asset development and maintenance across a product’s complete life cycle, from initial conception to sales and from maintenance to withdrawal and end of life. CMMI is divided into process areas, with their own goals and tasks, and is based on task repeatability. CMMI consists of five maturity levels; to ascend from one level to another, each task of the process area of the lower level must be accomplished. The continuous improvement model is best suited for innovation processes, and it is no coincidence that CMMI companies operate in very different fields (Nokia, Siemens, Motorola, Reuters, Deloitte, BMW, General Motors, TATA, Canon, Light Pharma, and many others).

Source: Wikipedia http://en.wikipedia.org/wiki/CMMI
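The rule for ascending between maturity levels can be sketched as a small data structure. This is a deliberate simplification: the level names follow the CMMI staged representation, while the task names and the `maturity_level` function are hypothetical examples for an LSP and are not CMMI process areas.

```python
LEVEL_NAMES = {
    1: "Initial",
    2: "Managed",
    3: "Defined",
    4: "Quantitatively Managed",
    5: "Optimizing",
}

def maturity_level(status):
    """Return the highest level reached.

    status maps a level number to {task name: accomplished?}. A level is
    reached only if all of its tasks, and those of every level below it,
    are accomplished. Level 1 is the starting point.
    """
    level = 1
    for target in range(2, 6):
        tasks = status.get(target, {})
        if not tasks or not all(tasks.values()):
            break
        level = target
    return level

# Hypothetical task status for an LSP (names are illustrative only).
lsp_status = {
    2: {"document workflow": True, "assign roles": True},
    3: {"approve terminology": True, "define post-editing guidelines": False},
    4: {"measure post-editing time": True},
}
level = maturity_level(lsp_status)
print(level, LEVEL_NAMES[level])
```

In this example the level 4 task is done, but the open task at level 3 keeps the organization at level 2: levels cannot be skipped.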

The goals of my investigation were to demonstrate that freelancers and LSPs can both adopt CMMI, and to illustrate how LSPs can benefit from a model aiming at the continuous improvement of processes, improving efficiency and containing costs while increasing profits and the quality of finished products. At this point in time, CMMI has been adopted mostly by manufacturers, but this does not mean that it cannot be adopted by service companies. In fact, the key process areas taken into consideration (i.e. organization, process, technology and finance) can also be applied to the GILT industry, since translation is also an economic activity/product. Not surprisingly, CSA used it as a reference to develop its own maturity model for the localization industry, and we can see that there is value in using these constructs to understand and drive continued improvements in processes.

The way ahead?

Four years have passed since I completed my research, but I still see the same landscape: LSPs and their clients still look at translation industry standards with the same “traditional approach,” considering only the basic tasks (analysis, translation, review, delivery) of the typical TEP model. In my humble opinion, translation industry standards lack a larger perspective, since standards should be used internally and externally, and should be very flexible to address different roles, skills and processes. A larger standards perspective will include many related tasks, ranging from content creation to translation and raw/post-edited machine translation.

Deeper and tighter collaboration is needed between vendors and clients, involving all departments of a company, especially export/international sales and global customer support managers, who will be key in leading the information flow between external and internal resources in translation processes.

The flexibility of a process definition based on concurrent tasks is a key differentiator in an industry whose clients are increasingly averse to improvisation and disorganization. Adapting CMMI to MT processes could be a way to assemble and link translation processes and new market trends into a single solution. The tasks involved in a typical MT process, from training resources and terminology work to support, audit and editing, are also run according to a model based on continuous improvement.

By implementing a process-oriented model, classic-flavored translation with a touch of MT can also be performed as a process based on client-specific best practices. The implementation of a process-oriented model allows for breaking down translation into discrete tasks. What I suggest, for a company wishing to adopt a standard and learn how to take advantage of certification, is:

document all activities involved in the production cycle
highlight implicit and explicit tasks
identify all tasks that can generate/add value and those that are redundant and that can be reduced or deleted
make a list of best practices that can be repeated
make a list of any possible improvements (i.e. investments in training and linguistic data preparation technology) that can help increase the overall efficiency of a process.

At present, I am working on a pilot project for Arrex Le Cucine (a leading Italian kitchen furniture manufacturer with a top 5 home market standing that also exports to 35 countries all over the world) to create a controlled language to boost machine translation.

My previous research on CMMI helped my colleague Anna Fellet and me to analyze the translation process for Arrex by:

breaking down the documentation process into tasks;
documenting every stage of the process;
suggesting ways to improve the documentation process;
customizing the translation workflow;
creating a repeatable methodology.

For ARREX, we made things as simple as possible - yet no simpler, and we are now analyzing results and working on a business case with practical examples, which we look forward to presenting in the next few weeks.

Valeria holds a degree in language and culture mediation, and a master's degree in technical and scientific translation from LUSPIO in Rome. Her final dissertation was titled "Adoption of CMMI to the GILT industry: benefits and prospects for the language service providers." She collaborated with Renato Beninatto and Don DePalma of Common Sense Advisory in a research project. After two years as Project Manager at ILT Group, she now works on a pilot project on Italian Controlled Language and Machine Translation with Asia Online, Synthema and ARREX Le Cucine. She can be reached at fellet.cannavina@gmail.com
