Here is another guest post arising from the LTAC conference at LUSPIO in Rome. This post, authored by Orlando Chiarello, provides an overview of the benefits that controlled language (CL), or in a broader sense improved and standardized source material, brings to any translation process. Historically, CL has often been associated with RbMT, but cleaning and standardizing source material benefits SMT as well, as the example below shows. Any effort made to improve or standardize source material is very likely to result in better MT quality and to help any ongoing translation automation process. While the degree of control suggested by CL is not always possible with dynamic customer content, this post presents some examples of where this approach does make sense.
For a more complete set of links and further discussion on this subject, some may also wish to refer to the old but still relevant thread in the LinkedIn Automated Translation Group (requires membership), "Discussion on the use of Controlled Language in SMT vs RbMT". It discusses in detail how CL or source language simplification can improve the results obtained from MT.
The ASD* Simplified Technical English Maintenance Group, or STEMG (www.asd-ste100.org), is currently holding its Spring Meeting (17 – 20 May) at Airbus in Toulouse. I am the Chair of this group and I would like to take this opportunity to give a brief overview of ASD Simplified Technical English, ASD-STE100 (STE).
STE is an international specification for the preparation of maintenance documentation in a controlled language.
It was developed in the early Eighties (as AECMA Simplified English) to help the users of English-language documentation to understand what they read. STE provides a set of Writing Rules and a Dictionary of controlled vocabulary.
The Writing Rules cover aspects of grammar and style; the Dictionary specifies the general words that can be used. These words were chosen for their simplicity and ease of recognition. In general, there is only one word for one meaning, and one part of speech for one word. In addition to the specified general vocabulary, STE accepts the use of company-specific or project-oriented technical words (Technical Names and Technical Verbs), provided that they fit into one of the categories listed in the specification.
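As a rough illustration of how the Dictionary and the Technical Names work together, a vocabulary check could be sketched in Python as follows. This is only a sketch under stated assumptions: the word lists are tiny, made-up samples and are not taken from the actual STE Dictionary, and the names `APPROVED_WORDS`, `TECHNICAL_NAMES`, `SUGGESTED_ALTERNATIVES` and `check_sentence` are hypothetical.

```python
import re

# Hypothetical sample of general vocabulary. NOT the real ASD-STE100 Dictionary.
APPROVED_WORDS = {
    "the", "a", "are", "do", "of", "in", "to", "this", "use",
    "correct", "procedures", "manual", "guide", "maintenance",
}

# Hypothetical company-specific technical words, allowed in addition
# to the general vocabulary.
TECHNICAL_NAMES = {"component"}

# Hypothetical mapping from a word that is not approved to the single
# approved word with the same meaning ("one word = one meaning").
SUGGESTED_ALTERNATIVES = {"utilize": "use", "perform": "do"}


def check_sentence(sentence):
    """Return (word, suggestion) pairs for each word that is not allowed.

    The suggestion is None when the sample lists offer no alternative.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    allowed = APPROVED_WORDS | TECHNICAL_NAMES
    return [
        (word, SUGGESTED_ALTERNATIVES.get(word))
        for word in words
        if word not in allowed
    ]


if __name__ == "__main__":
    for text in (
        "The procedures in this manual are a guide to do the correct "
        "maintenance of the component.",
        "Utilize the manual to perform maintenance.",
    ):
        print(text, "->", check_sentence(text))
```

A real checker would also have to look at the part of speech of each word and apply the Writing Rules, which this sketch does not attempt.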
English is the international language of many industries, the aviation industry in particular, and it is the language most used for technical documentation. However, it is often not the native language of either the readers or the authors of such documentation. Many readers have a limited knowledge of English and are easily confused by complex sentence structures and by the number of meanings and synonyms that English words can have.
The controlled grammatical structures and vocabulary – on which STE is based – have the purpose of producing texts that are easily understandable and, consequently, STE reduces errors during the maintenance tasks.
Although this controlled language was originally designed for the aviation industry, companies from other industries and domains use it to standardize their documentation in a simple, understandable and unambiguous way. As an example, in March, I gave a two-day training course on STE to a Munich-based company that produces medical devices. The course turned out to be a great success.
Also, LUSPIO University in Rome was involved in a project with an Italian furniture manufacturer to develop a controlled Italian for use in all of that company's documentation. The STE principles and rules have been the primary basis for the creation of this Controlled Italian. The results of this project were presented at the LTAC (LUSPIO Translation Automation Conference) on 5 and 6 April, where I was also invited to give a presentation on STE.
STE can also be a real help to Machine Translation, which was one of the primary objectives when this Controlled Language was developed. As an example, the following is a paragraph in STE (taken from a component maintenance manual) translated into Spanish simply by pasting the text into the web version of Google Translate and running it:
Original text in STE:
The procedures in this manual are a guide to do the correct maintenance of the component. Some equivalent procedures - that come from the experience and skills of the maintenance personnel - are also satisfactory.
Text translated by Google:
Los procedimientos en este manual son una guía para hacer el mantenimiento correcto de los componentes. Algunos procedimientos equivalentes - que vienen de la experiencia y habilidades del personal de mantenimiento - son también satisfactorios.
As we can see, the result is quite impressive.
To conclude, the above example illustrates that if the "source text" is English and the text is written in STE, Machine Translation can be dramatically helped by the principle of "one word = one meaning". A further help to Machine Translation could be the availability of a "mirror" Controlled Language based on STE. For example, in the Eighties the French Aviation Industries (GIFAS) created "Rationalized French", based on STE. They used the same structure of Writing Rules and Dictionary and adapted them to French. The result was exceptionally good, with benefits for translation in both directions. Other attempts have been made, and others are currently in progress, with other languages including Swedish, German, Spanish, Chinese and Italian.
* ASD represents the aeronautics, space, and defense industries in Europe. ASD has 28 member associations in 20 countries, representing over 2000 companies with a further 80 000 suppliers, many of which are SMEs. Total annual industry turnover is over €137 billion.
Orlando Chiarello is the Product Support Manager of Secondo Mona, an Italian aerospace equipment manufacturer. He is responsible for the aftermarket support of the company products.
He is also the Chairman of the ASD Simplified Technical English Maintenance Group (STEMG), responsible for the development and maintenance of the ASD-STE100 Specification.