Tuesday, June 7, 2016

The ABRATES Conference in Rio: Translators focusing on MT

I had the honor of participating in the 7th ABRATES International Translation and Interpreting Conference in Rio de Janeiro last week, an event that had over 500 attendees, based on my casual observation. A large portion of the attendees were translators, but there were also some LSP and enterprise representatives. As much of the information was presented in Portuguese, I had direct experience with simultaneous translation via a headset, which was also kind of cool, and it was fun to switch around when I was less interested in the actual subject matter.

The formidable, emotion packed sign language interpretation by Paloma Bueno, intensely focused simultaneous interpreter volunteers in the booth, and the abounding loveliness of Rio.

I found the conference surprisingly refreshing for several reasons including:
  • The high level of understanding that many translators had about MT, Post-Editing practice and their general attitude that it is better to understand and use translation technology than fight it or fear it.
  • The beautiful location, as Rio is a naturally scenic and inviting spot.
  • An emotionally powerful sign language interpretation of the keynote session by Paloma Bueno who I cannot believe was doing this in real time.
  • The eagerness and openness of many translators present to try to understand how they, as translators, could engage and work with MT and develop meaningful expertise in MT related skills.
  • The willingness to explore and understand how translation technology will continue to evolve and possibly impact their professional work.
  • Several conversations with translators who had long term experience with MT and thus had direct knowledge of MT systems that improved over time and had also seen both good and bad MT engines over the years, so were much more coherent in their criticism.
  • The shared experience of many different kinds of MT encounters from a variety of translators, ranging from DIY horror stories, to expert systems that evolved gradually in quality over the years, to some proprietary efforts that produce astonishing quality.
  • The presence of several very competent presentation sessions on developing MT related skills including:
    • Corpus Preparation for MT training
    • Working with the varying quality of PEMT output that translators get from LSPs
    • Using REGEX (Regular Expressions) to develop more powerful text based editing skills when dealing with corpora
    • PEMT best practices and tools and shared experiences
I also found this conference special because I personally had no corporate allegiance at the event and was truly just an independent spokesperson with some knowledge of MT technology and its potential and place within the context of many of the attendees' professional lives. As I am no longer employed by or affiliated with Asia Online, I felt very comfortable sharing my opinions, with no concern about persuading anybody to go one way or another. My opinions were all truly independent and the truest expression of what I try to do in this blog, i.e. provide useful and relevant information to inquiring minds. So while I am indeed looking for professional work, I am really enjoying this independence and focus on what really matters.

It was interesting to find that when one has this kind of openness and lack of bias as a presenter, there is an opening of the perception, and I was able to see much of what I was saying with a new and fresh eye. It was like playing improvised music to a keen and attentive audience: the shared attention of the musician and the audience creates a new, more evolved version of an existing musical idea. I will share some of those insights in upcoming posts.

I also understood much more clearly that translators most often have very little control over the content they are given to translate, because of the current structure of the professional translation business, which is usually: Enterprise > MLV (Big Agency) > SLV (Small Agency) > Translator. Thus translators are often left to deal with poor quality source that cannot by contract be corrected or changed, to work with crappy MT output produced by DIY practitioners who do not really know how to build MT systems, and to have no say in how the MT engines evolve, since they are so far down the production line. Thus we have the current situation of unnecessarily mind-numbing PEMT work, rather than rapidly evolving MT technology and more efficient production processes. And very often the extremely valuable linguistic feedback that translators provide is lost or ignored. An MT paradigm that organizes and collects valuable translator feedback will surely be more competitive and produce higher quality and benefit for all concerned. Not to mention that it will be personally rewarding for the many translators who will need to be involved, as the nature of the problems they solve will grow in value and impact beyond the typical LSP project.

Plenary session on MT
I had an interesting experience during a plenary session panel on MT where all the other speakers were speaking in Portuguese, so I had to have a headset to understand what they were saying. When I started speaking, the interpreters of course started speaking in Portuguese, and I found it very strange and unsettling to hear a voice repeating in Portuguese, in real time, everything I was saying in English. Somebody once said that MT is magic, which I felt deserved some scorn, but to me this act of listening and translating into another language in the instant, not knowing what I was going to say, was surely closer to magic.

If this conference is an indicator of what is happening in the professional translation world, it is very promising for several kinds of translation technology initiatives. I have always felt, much to the chagrin of my former employers, that the real promise of MT will be seen when translators seek it out and learn to steer, drive and enhance the ongoing evolution of this technology. If this conference is really only representative of the Brazilian reality with translation technology, then I predict that the most exciting advances in MT will come from those working with Portuguese. This community is primed for the most interesting new Adaptive MT initiatives like Lilt which can empower motivated and technically savvy translators.

You can find some Twitter coverage of the event by searching on the hashtag #abrates16.


Ipanema Street Market


  1. Kirti, Congrats on your new walk on the wild side.

    Continuing our brief exchange in your last post, The Concept of MT Maturity, anonymous testimonials are okay but they have limits. Professionals are beginning to speak out openly about their experiences with Slate Desktop.

    Emma Goldsmith's "Slate Desktop and SDL Language Cloud Custom Machine Translation Engines" describes her experiences. It's a true David vs Goliath lineup of two very new products.

    As you're well aware, the confidentiality issue is the longest-standing resistance to Cloud-based translator tools (MT, CAT or others) among translators and agencies. That includes confidentiality regarding client content and confidentiality regarding the translator's/agency's trade secrets.

    In one of Emma's closing comments she concludes: "Slate [Desktop] has solved the dichotomy between confidentiality and machine translation because the entire process takes place on your local machine. Client confidentiality cannot be breached."

    Igor Goldfarb in the comments section gives real numbers from his personal experience.

    Also in the comments, someone cross-linked a Facebook posting from Loek van Kooten who had a less-than-positive perspective on his experience.

    Also in the comments, Emma responds to one reader's questions about Lilt.

    Times, they are a-changin!

    Good luck with your search for professional work!


    1. Thank you Tom for bringing this post to my attention, and to the attention of others who may be reading this blog.

      However, it is not surprising that the "success" was with ES > EN and not with the other language pairs mentioned by the other test case. I would expect that ES and PT would provide the most positive case studies, as is often the case with any Moses or SMT system. Most practitioners have a much harder time achieving similar results with languages outside the Romance language cluster.

      Also, I maintain that a user who understands something about corpus preparation, data quality, machine learning and other SMT basics is likely to have a more positive experience than one who has no idea about these things. A toolkit does not replace the need for some minimal amount of expertise.

      We are also now at the cusp of a shift from pure SMT to SMT + Deep Neural Net or pure Neural Net approaches which are showing much better outcomes than plain Phrase based SMT. This does not mean that the days of PBSMT are over, but it does suggest that we will see more improved MT output from those who solve some of the difficulties of working in the NMT paradigm.

      Also, while LILT is a new way to interact with MT and requires a translator to step away from familiar tools, I think there is growing evidence that a robust and dynamically learning MT engine developed by deep expertise is a very promising way forward for individual translators if not ALWAYS the most promising way. Especially if these solutions can preserve data confidentiality to some extent. Just because it is a cloud solution does not mean that all confidentiality is compromised. And given that many translators use Gmail to send files back and forth, it seems odd to question cloud solutions so vociferously.

      Thanks again for bringing this case study to light. I am sure many at the ABRATES conference would find it interesting.

  2. Kirti, I agree knowledge and skills are required. At ABRATES you discovered the tip of an iceberg: an abundance of expertise and professionalism among freelance translators whose fee schedules put them out of reach of most large agencies. It’s rewarding to work with them!

    Note that Emma does not declare a felling blow. Nor does she speculate about technologies that are years away from her personal workflow. She merely compares existing products feature for feature. Three products are emerging:

    * Lilt – a cloud-based CAT with an integrated engine based on Stanford's Phrasal SMT toolkit
    * SDL Language Cloud Custom Machine Translation Engines – a cloud-based engine presumably based on SDL's LanguageWeaver with an MT connector to Trados Studio (possibly others?).
    * Slate Desktop – a desktop translation engine based on the Moses toolkit with MT connectors to Trados Studio, memoQ, CafeTran, Wordfast, Deja Vu, OmegaT (others coming).

    All of these bypass the need for MT experts to customize high quality SMT engines. Maybe some engines could achieve better quality if you pay experts for further customization but improvements are not guaranteed. Instead, each of these products offers the best possible quality at a price point under $1,200 per year per customized engine.

    I like your quality range descriptions from "horror" to "astonishing." Every batch of SMT output includes segments spanning this range. In this new world, translators are the experts who make personal customized engines from their own TUs for their exclusive use. These personal customized engines generate the preponderance of astonishing results. Translators learn to triage the astonishing segments across a batch. Then, they can quickly choose a course of action for the balance.

    This is a business process model, not a technology. The model and its price point are untenable when you insert an MT expert into the workflow. Furthermore, the MT expert is clearly unnecessary when the translator is already achieving "astonishing quality" without the expert.

    We're in almost total disagreement (BTW, disagreement is not a bad thing) regarding PEMT as a best practice. PEMT is a compromise, not a best practice. So, optimizing “best practices” around a compromise yields a better compromise, not a best. PEMT dates back to 1966 and the ALPAC report, which defines MT as a technology to replace humans and states MT is only viable with human edits. I propose we've moved beyond 1966 and entered a new era where personal customized MT engines render PEMT irrelevant.

    Let's use Igor's EN-RU experience as an example (comments on Emma's blog). MT experts generally agree this is a difficult language pair “outside the Romance language cluster.” He created his engine with ~70,000 TUs that he personally translated. This number is woefully inadequate for a traditional customized engine shared across many translators, but his engine is designed for his exclusive benefit. He achieves "40% acceptable suggestions, among them 13% were very good." Of those, he spends less than 1 minute per correction (from our support site). For the remaining 60%, he reverts to his preferred CAT and termbase tools. This business process transcends PEMT and returns to traditional proofreading and correction (sorry, no fancy acronym).

    Emma stated, “My very brief foray into Lilt showed it offered similar quality to Slate and Language Cloud,...” If quality parity holds true over time and if Igor’s experience represents an average over time, then the second half of Emma’s comment becomes the focus of this new world of personal customized engines, “… without all the benefits that I get from working in a fully developed desktop environment.”

    In this new world, quality is simply assumed. Translators’ personal work and lifestyle preferences take priority. We'll see if a leader emerges as more users share their experiences. Patience is key.

    1. Tom

      Both Lilt & SDL Language Cloud are actually expert built systems to begin with. They start at a level that few do-it-yourself Moses systems will ever reach, precisely because experts are involved in building the foundation system that can then be further enhanced with much less effort by an individual translator. Thus it is reasonable that they cost more. There will be a few cases where a cold start may be better, but I expect that this will be the exception rather than the rule.

      While some DIY MT systems are successful and useful, it is my experience that most are not. Most do not even reach the level of free MT. Mostly because building good MT systems requires deep knowledge and real expertise, and it is a challenge that the best researchers have struggled with for many decades. They continue to struggle because it is a problem worth solving. The leverage to Google alone of 143B words/day is probably worth more in advertising revenue impact than the whole $35B/year "translation industry".

      I think as more translators explore the alternatives available to them, they will realize that the starting point for each user does matter. This is why so many translators do use Google MT as a dictionary or just as a way to accelerate the typing task. Even though they do not strictly do post-editing, it is changing the way they work.

      Cloud based computing has become the preferred model for all types of enterprise and personal computing needs for reasons too numerous to mention. The translation industry is still in transition on this but I would bet that it too heads that way. Mostly because it will be possible to do the same tasks more efficiently and effectively in a cloud architecture than a desktop one. We disagree on this point.

      I expect that many of the tools in use today are on the way to obsolescence. Facebook is already saying that SMT is done. This is the nature of technology product life cycles.

      Finally, it will be about whether a tool provides work/business leverage and productivity or not. If your tools do this relative to other options available, then they will be adopted. Otherwise ....

      Anyway, thank you for your detailed comments. It is always useful to see differing viewpoints on the same issues.

  3. You're welcome, Kirti. While there are many pedigree experts, it's clear that Slate Desktop and PTTools do not live up to your standards. Likewise, there are many strategies for cloud support ranging from Office365 running on the customer's host to Google Docs running on Google's hosts. It's good to know that these are your opinions and your truest expression as an independent spokesperson. Again, best of luck with your search for professional work!

  4. Tom, it is not the software tools that are the problem; rather, it is the naive expectations and general ignorance of users about how things work or not that I think is the more serious problem.