TAUS REVIEW of language business and technology
Reviews of Language Business and Technology in Europe, Asia and Africa. Columns by Nicholas Ostler, Lane Greene, Luigi Muzii and Jost Zetzsche. PLUS a Game Changer For The Translation Industry
October 2015 - No. V
Magazine with a Mission

How do we communicate in an ever more globalizing world? Will we all learn to speak the same language? A lingua franca, English, Chinese, Spanish? Or will we rely on translators to help us bridge the language divides? Language business and technology are core to the world economy and to the prevailing trend of globalization of business and governance. And yet, the language sector, its actors and innovations do not get much visibility in the media. Since 2005 TAUS has published numerous articles on translation automation and language business innovation on its web site. Now we are bundling them in TAUS Review, an online quarterly magazine.

TAUS Review is a magazine with a mission. We believe that a vibrant language and translation industry helps the world communicate better, become more prosperous and more peaceful. Communicating across hundreds – if not thousands – of languages requires adoption of technology. In the age of the Internet of Things and the internet of you, translation – in every language – becomes embedded in every app, on every screen, on every web site, in every thing. In TAUS Review reporters and columnists worldwide monitor how machines and humans work together to help the world communicate better. We tell the stories about the successes and the excitements, but also about the frustrations, the failures and shortcomings of technologies and innovative models. We are conscious of the pressure on the profession, but convinced that language and translation technologies lead to greater opportunities.

TAUS Review follows a simple and straightforward structure. In every issue we publish reports from four different continents – Africa, Americas, Asia and Europe – on new technologies, use cases and developments in language business and technology from these regions. In every issue we also publish perspectives from four different ‘personas’ – researcher, journalist, translator and language – by well-known writers from the language sector. This is complemented by features and conversations that are different in each issue.

The knowledge we share in TAUS Review is part of the ‘shared commons’ that TAUS develops as a foundation for the global language and translation market to lift itself to a high-tech sector. TAUS is a think tank and resource center for the global translation industry, offering access to best practices, shared translation data, metrics and tools for quality evaluation, training and research.
Colophon

TAUS Review is a free online magazine, published four times per year. TAUS members and non-members may distribute the magazine through their web sites and online media. Please write to editor@taus.net for the embed code. TAUS Review currently has about 7,000 readers globally.
Publisher & managing editor: Jaap van der Meer
Editor & content manager: Mick Rooney
Design, distribution & advertisements: Anne-Maj van der Meer
Enquiries about distribution and advertisements: review@taus.net

Disclaimer
The views or opinions expressed by the various authors in TAUS Review do not necessarily reflect the views or opinions of TAUS. While we try to ensure that the information provided is correct, we cannot guarantee the accuracy of the material. If you do notice any mistakes then please let us know.
Content

Leader
5. Leader by Mick Rooney

Reviews of language business & technologies
8. In Europe by Andrew Joscelyne
12. In Asia by Mike Tian-Jian Jiang
16. In Africa by Amlaku Eshetie

Columns
19. The Language Perspective by Nicholas Ostler
22. The Journalist’s Perspective by Lane Greene
25. The Translator’s Perspective by Jost Zetzsche
28. The Research Perspective by Luigi Muzii

Features
32. An Interview with Jaap van der Meer and Attila Görög by Toos Stoker
34. Contributors
36. Directory of Distributors
39. Industry Agenda
Leader
by Mick Rooney
In our last issue of TAUS Review we said the magazine had a mission and belief that a vibrant language and translation industry working together helps the world communicate better, become more prosperous and more peaceful. Part of that mission is also to provide a channel for our reporters and columnists across four continents to share their thoughts and experiences, whether it is about new technologies, use cases or developments in language business and technology.
In previous issues of TAUS Review we have often chosen particular themes like Quality, Innovation and Data for our reporters and columnists to write about. For this issue we decided to give them free rein. In our regional review from Europe, Andrew Joscelyne reflects on the landscape of European Commission project funding over the past four decades and wonders if the glory days of funding for “human language technology” will soon be at an end. He says that European Commission “information society” cash will be mostly channeled into research and innovation for Big Data, Robotics or Sustainable Energy among other areas. He considers the three possible reactions to this and what options are available.
Mike Tian-Jian Jiang reports from the annual ACL-IJCNLP conference in Asia, held during July, and reflects on the topics that arose from submitted papers, sessions and workshops. According to Mike, the key phrase “quality estimation” popped up frequently. Amlaku Eshetie, reporting from the African region, discusses some of the issues of the translation and localization industry in Ethiopia. One of these is the reality that Ethiopia does not have a proper translation agency. While there are translation services, they primarily serve the local needs of translation of documents and small communication texts, without any quality standardization or use of tools. In our persona perspectives, Lane Greene questions how well translation innovators are serving the world’s smaller languages. While technology companies like Google and Microsoft have put efforts into providing speakers of some small languages with free translation tools, other languages still become victims of the Language Last Mile, without supported services. Nicholas Ostler begins his persona perspective on a related theme and explains why he set up the Foundation for Endangered Languages in 1995. He felt that there was a tendency in the field of Natural Language Processing – if it attracted the support of industry – to pressurize the language used into being more predictable, restricted and regimented. While perhaps discouraging to creative researchers, Ostler believes that the use of language should be both colourful and unpredictable, and that
this regimentation is one major reason to be concerned for languages that are in danger of disappearing. Luigi Muzii asks Who’s Afraid of Frankenstein? His persona perspective deals with the fear of automation. While more and more people today believe that image recognition, voice recognition and translation software will together soon make translators and interpreters unnecessary, technology cannot properly address many cultural facets, though it will soon be capable of handling most translation duties.
Jost Zetzsche reflects on a commonplace perception that the translation industry is lagging behind other industries when it comes to technological breakthroughs. He warns that we should not be so parochial when it comes to innovation and development. Zetzsche offers three thoughts on the pace of language translation technology development in his persona perspective Are we really so far behind? In reading through the regional reports and persona perspectives submitted for this edition of TAUS Review, and as a new member of the TAUS team this month as a content strategist, I was impressed with the diversity of material and ideas. Coming from the world of publishing, I see so many comparisons between the two industries. But above all, the common thread is communication.
Translation — whether manual or machine — is one fundamental tool in the process of communication. But the beauty of all languages big and small is how we enable and share information and understanding.
Send your comments or questions to review@taus.net
An artificial intelligence approach to translation will revolutionize your business.
To realize a society in which everyone can interact freely across language barriers with the use of machine translation technology, and thereby contribute to invigoration and innovation in businesses. https://miraitranslate.com/en/
Review of language business & technologies in Europe by Andrew Joscelyne
Europe: From Funded Projects to a Language Tech Business Platform

As everyone knows, the European Union tries to take care of its language ecosystem. Millions of euros have been disbursed over the last four decades to help establish what is now morphing into a multilingual political, administrative and retail space – the Digital Single Market for 24 countries and 60 languages, official or otherwise. Most of the relevant funding has gone into applied research in natural language processing (NLP) and machine translation (MT), usually in the form of short-term projects teaming up academics with small/midsize tech businesses.
However, most people who have been involved in this process now sense that the strategic tide has turned. After the current Horizon2020 program, it looks as if European Commission “information society” cash will be mostly channeled into research and innovation for Big Data, Robotics or Sustainable Energy among other hot topics. The glory days of funding for “human language technology” (as it was first dubbed) may well be behind us. There are three possible reactions to this state of affairs. You can protest as energetically as possible against the predicted shift in funding. You could encourage national rather than European-level governments to sink money into language technology innovation, possibly via some new partnership structure so that national languages constantly interconnect with each other horizontally over a pan-European grid. Or you can kick-start a brand new, independent platform that leverages Europe’s language technology heritage into real-world business value. A quick look at each of these options.
1. Starting in March this year, over 3,600 European researchers and industry practitioners in NLP and related disciplines (speech technology, text mining, MT, etc.) signed an Open Letter to the European Commission from the European Language (Technology) Community entitled Europe’s Digital Single Market must be multilingual! Here is the final paragraph: We, the undersigned stakeholders – researchers, developers, SMEs, market leaders and opinion makers, and individuals – ask the European Commission to address the multilingual challenge in the Digital Single Market strategy and pledge to work together to provide a solution for overcoming language barriers, thereby making a truly integrated Digital Single Market a reality.
It is hard to gauge whether this community is punching above its weight. After all, there are over 500 million people living in the EU. Probably your average European would agree with the general thrust of this letter, although the very notion of a “solution” to overcoming language barriers will raise a smile: there are already hundreds of solutions, and there may not be any single one that will ever be universally satisfactory, due to the eternally shifting sands of human communicative praxis. Word has it, though, that the EC has heard this call for action so there may well be some response to this challenge from those evaluating the EU’s overall innovation strategy. We can only hope that it goes beyond the vision expressed in an EC roadmap ten years ago, when the European Information Society Technologies Advisory Group (ISTAG) published a draft report on the Grand Challenges in the Evolution of the Information Society. One of the futures predicted for 2010 (five years ago for us!) was devoted to the “serious linguistic challenges” of a very large Union, and the crystal-ball story it invented was that of a Multilingual Companion – a small portable device that “would rapidly and accurately render translations in text and speech in any other European language desired. It could be used to take voice dictation, immediately generating the text in any number of languages, thereby streaming the connection between humans and computers. The companion would be extremely useful for disseminating the minutes of meetings to many countries simultaneously, to each in its own language.”
It may sound like the Star Trek universal translator that is constantly referenced by journalists writing stories about translation technology, but the proposal was correct about personal devices. Apple launched the iPhone in 2007 and we all still half-believe that speech-to-speech translation will be possible on our virtual reality glasses/wearable/smartphone within the next [choose a number] years. Operational MT for the masses has always been a five year plan embedded within yet another five year plan. And does anyone remember a multimillion euro project to beat Google as a search engine called Quaero? They only pulled the plug on the plan in 2013 after millions in funding.

2. What if individual governments took up the initiative to maintain an R&D agenda for their own national languages, exemplifying the subsidiarity principle whereby Member States rather than the EC should take charge of activities of purely national concern? This would be theoretically possible in numerous European countries because many already fund such programs. Yet they would never be nearly powerful or global enough to stimulate a quantum leap in language technology. One example of what can be achieved is Latvia, a country of 2.1 million people that has developed a flourishing MT culture and even built its own Latvian-Russian/English government translation system called Hugo for its citizens. This is partly because Latvia found itself in the unusual situation of having
its national tongue statused as an ‘endangered language’, threatened numerically by a strategic big language (Russian). Hugo has even been nominated for the World Summit Award in the category Government and Open Data. But no other governments seem to have encouraged the development of effective MT systems to deal with their very local problems. Ireland, for example, could pinpoint English as a dominant, threatening presence and adopt the Latvian solution, even though Irish is not the “official” tongue. It also has a flourishing language technology innovation culture at the CNGL. France funded MT research from the 1960s and could easily have transformed (but didn’t) one of the many (rule-based) engines that were developed during the 1980s and 90s into a national MT “treasure.” As could Germany, Italy and others, which all benefited from funding under the EUROTRA MT project back in the 1990s. Spain has a fractal EU problem of its own – inter-translating between several in-country languages for administrative and cultural purposes among others. But is there a national “solution” to this obvious communication need? Several other countries have more than one official language and might in principle adapt some sort of human/MT system to automate certain translation requirements.
One can suspect that a number of (or all?) Member States must have developed/acquired automated translation systems to handle some of their security and intelligence requirements; yet these remain below the radar of public knowledge. Overall, in the current economic climate, it seems unlikely that many of Europe’s governments will find the spare euros needed to maintain work on national language translation technology, let alone speech recognition systems, knowledge processors and so on. Google or Bing will usually do the trick for the time being.

3. So how about something new, practical and market-driven from the grassroots of EU language technology? You’ve guessed right: those first two options were just idle sparring partners. The most interesting news for the language and technology community this year was the June announcement of the LTI Cloud (LTI stands for Language Technology Innovate, an industry association for the language technology community, based in Brussels. Disclosure: I work with LT Innovate). This new venture, due to roll out in beta in February 2016, will be a platform for anyone with language technology
components to upload them in the form of Software as a Service to a potentially large user community. The idea, largely envisioned by Jochen Hummel, chairman of LT Innovate and serial language technology entrepreneur, is that startups, IT departments, system integrators, software companies, and any business large or small can access language technology and plug ‘n’ play with the components to build solutions to their “language” problems. As has often been observed, language technology is largely an enabler: it helps organizations do most of their core business more efficiently. By helping them evaluate, integrate and innovate with a much larger range of shrink-wrapped language technologies than they usually have access to, the Cloud should be able to speed up adoption, build useful data about use and quality, and generally stimulate practical, hands-on interest in the benefits of different language technologies in ways that old-style EU or even national projects failed to achieve in the past.
The kind of products and technologies on offer via the Cloud will cover the entire range of NLP and speech technologies, from parsers to SMT systems, anonymization to semantic search, ontologies to report generators, language detection to speech recognition and synthesis. Among other organizations likely to promote their gear on LTI Cloud will be the well-known GATE general architecture for text engineering from the University of Sheffield in the UK.
The LTI Cloud will be operated by LT Innovate as a non-profit undertaking. It will enable entrepreneurs to run free trials, test and then buy the technology from the owner. Technology providers who pitch their products on the Cloud will maintain their own installations. And there will be an optional “private cloud” space for providers who wish to handle their clients under their own terms. In essence, then, the LTI Cloud will be a constantly up-to-date catalogue of viable language technology services for everyone. Will it become known as “cloud-sourcing”? Coda: What will probably happen is that language technology as a necessary enabler of the famous Digital Single Market will receive some sort of “innovation” funding from the EC after a round of serious reflection; national governments will continue to dribble a little funding into their universities to keep their national languages up-to-scratch; and the LTI Cloud will hopefully open up a new chapter in language tech for Europe. Among other things, this means promoting technology that can handle strategic languages such as Russian, Arabic, Chinese, South American Spanish, and Swahili among others, not just the EU’s oh-so-precious single market languages. So not quite business as usual…
Send your comments or questions to europe@taus.net
Review of language business & technologies in Asia by Mike Tian-Jian Jiang
A slanting viewpoint of translation in ACL-IJCNLP 2015

The annual meeting of the Association for Computational Linguistics, ACL in short, is one of the top conferences in natural language processing. IJCNLP stands for International Joint Conference on Natural Language Processing and it is one of the most important activities of the Asian language technology research community, held by the Asian Federation of Natural Language Processing. They just had a joint conference, along with the co-located event CoNLL, the SIGNLL (the ACL Special Interest Group on Natural Language Learning) Conference on Computational Natural Language Learning, this July, and I quote, “For the first time, the annual meeting of the Association for Computational Linguistics (ACL) takes place in Mainland China,” so it may justify my Asia-specific perspectives.
Figure 1
Speaking of translation, it might be worth mentioning that ACL was originally named the Association for Machine Translation and Computational Linguistics (AMTCL) in 1962 and renamed to its current name six years later, so one may safely see (machine) translation remaining an influential aspect of it. Judging by the number of machine translation papers, it may not seem likely at first glance: after all, only about 29 out of 329 papers in total deal with machine translation. However, looking at the two best papers, the fact that one of them is titled “Improving Evaluation of Machine Translation Quality Estimation” couldn’t speak louder for its importance, and looking closer at “Quality Estimation,” one may take it for something probably useful to the translation industry, which is one topic this article would like to address soon. Before that, let’s take a step back to have an outlook.
Figure 2
You may already be aware of the recent trend of Deep Learning and predicting that it also buzzes in ACL-IJCNLP 2015. You are right. The first session is exactly about machine translation based on neural networks. Despite neural networks having been studied in this field for almost a decade, Deep Learning based on convolutional or recurrent neural networks is still relatively immature, not to mention neural machine translation and similar approaches. While it’s refreshing to see a new paradigm other than phrase-based machine translation emerging, researchers have encountered limitations. A major one is that neural machine translation has to trade off complexity against the size of the target vocabulary. If you have ever played with the demo mentioned in my last article, you may have noticed that it worked well with sentences but didn’t work at all with single words (Figure 1). So the authors of the above system, and other pioneers, have been pushing the frontier a bit further in “On Using Very Large Target Vocabulary for Neural Machine Translation” (Figure 2) and “Addressing the Rare Word Problem in Neural Machine Translation.” Given time, neural machine translation may have a breakthrough, especially when some of these researchers are working with BOLT (DARPA, IBM), Matecat (EU), Cosmat (SYSTRAN, INRIA), etc.
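To see why that trade-off bites, note that every decoding step in a neural translation system scores and normalises over the entire target vocabulary, so the cost of the output layer grows linearly with vocabulary size. The NumPy timing below is purely my own illustration (the hidden and vocabulary sizes are arbitrary, and none of this code comes from the papers above); it just makes the scaling visible:

```python
import time
import numpy as np

def decoder_step_seconds(hidden_size, vocab_size, reps=10):
    """Time the output projection + softmax for a single decoder step."""
    rng = np.random.RandomState(0)
    h = rng.randn(hidden_size).astype(np.float32)               # decoder state
    W = rng.randn(vocab_size, hidden_size).astype(np.float32)   # output projection, one row per word
    start = time.perf_counter()
    for _ in range(reps):
        logits = W @ h                 # cost grows linearly with vocab_size
        e = np.exp(logits - logits.max())
        probs = e / e.sum()            # normalisation touches every target word
    return (time.perf_counter() - start) / reps

for vocab in (10_000, 50_000, 200_000):
    t = decoder_step_seconds(hidden_size=256, vocab_size=vocab)
    print(f"vocab {vocab:>7,}: {t * 1e3:6.2f} ms per generated token")
```

Papers such as the two cited above attack exactly this bottleneck, for example by approximating the softmax with a sampled subset of the vocabulary or by handling rare words outside it.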
As for the outlook, it’s hard to let so many other interesting papers slip away, but in order to keep this article compact, the best I can offer for the time being is to highlight some titles, in the hope that readers will find insights by and for themselves:
• “Statistical Machine Translation Features with Multitask Tensor Networks”
• “Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents”
• “Efficient Top-Down BTG Parsing for Machine Translation Preordering”
• “Online Multitask Learning for Machine Translation Quality Estimation”
• “A Context-Aware Topic Model for Statistical Machine Translation”
• “Evaluating Machine Translation Systems with Second Language Proficiency Tests”
• “Representation Based Translation Evaluation Metrics”
• “Exploring the Planet of the APEs: a Comparative Study of State-of-the-art Methods for MT Automatic Post-Editing”
• “MT Quality Estimation for Computer-assisted Translation: Does it Really Help?”
• “Context-Dependent Translation Selection Using Convolutional Neural Network”
• “What’s in a Domain? Analyzing Genre and Topic Differences in Statistical Machine Translation”
• “Improving Pivot Translation by Remembering the Pivot”

Since the key phrase “quality estimation” popped up frequently, it’s about time to define it.
According to the Workshop on Statistical Machine Translation (WMT), quality estimation is “for estimating (predicting) the quality (based on several quantities in terms of post-editing) of machine translation output at run-time, without relying on reference translations.” (Words in parentheses are mine.) The two critical conditions here are “no reference translations” and “in terms of post-editing.” Unlike typical automatic evaluation metrics for machine translation, quality estimation works like any other predictive modeling (or data science, if one prefers): whenever a machine translation output is produced, how confident are we in recommending it to the post-editors, or even the customers? Thanks to years of WMT shared tasks, we have just about sufficient data to observe and learn from. For example, what would the correlation between machine translation quality and post-editing time be? Readers can find all the criteria at this link.
Imagine that there will be an oracle identifying the best machine translation result for you, so the burden and frustration to post-editors and customers can be minimized. Too good to be true? Here’s where the best paper “Improving Evaluation of Machine Translation Quality Estimation” kicks in. As its introduction says, “this also produces conflicting results and raises the question which method of evaluation really identifies the system(s) or method(s) that best predicts translation quality.” WMT14 even faced a serious issue in that English-to-Spanish had so many tied matches that it resulted in 22 winning systems.
It feels like being back to square one. Fortunately, the paper figured out that the previous evaluation metrics are somewhat flawed, and pointed out that a more appropriate choice would be the Pearson correlation. Since there are so many lessons to be learnt from the paper, readers are encouraged to read it for themselves. This article will just provide a small (crazy?) idea as the final words. An estimation method that does not rely on reference translations doesn’t have to limit itself to predicting the quality of machine translation. What if the target is crowd-translation? For example, Gengo has been doing quality assurance for a while and gathered some (open) data, yet that alone is probably not enough to put customers at ease. Could it be more trustworthy to apply quality estimation to crowd-translations, and then notify proofreaders when the prediction raises a red flag? I couldn’t speak for the customers, but I know I’m sold.
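To make the quality estimation workflow concrete — reference-free features of a (source, MT output) pair go in, a predicted post-editing effort score comes out, and the estimator itself is judged by the Pearson correlation the best paper recommends — here is a minimal, entirely hypothetical sketch. The features and the synthetic data are stand-ins of my own invention, not material from WMT or any of the papers above:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import RandomForestRegressor

def qe_features(source, mt_output):
    """Toy reference-free features of a (source, MT output) pair."""
    src, mt = source.split(), mt_output.split()
    return [
        len(src),                                   # source length
        len(mt),                                    # output length
        len(mt) / max(len(src), 1),                 # length ratio
        sum(len(t) for t in mt) / max(len(mt), 1),  # mean target token length
    ]

print(qe_features("the cat sat on the mat", "le chat est assis sur le tapis"))

# Synthetic stand-in for WMT-style QE data: one feature vector per segment,
# plus an observed post-editing effort score (e.g. HTER) for that segment.
rng = np.random.RandomState(0)
X = rng.rand(250, 4)
y = 0.6 * X[:, 2] + 0.1 * rng.randn(250)             # fake effort scores
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)                           # learn features -> effort
pred = model.predict(X_test)                          # no references needed

# Evaluate the estimator as the best paper recommends: Pearson correlation
# between predicted and observed effort, rather than a ranking-based metric.
r, _ = pearsonr(pred, y_test)
print(f"Pearson r between predicted and observed effort: {r:.2f}")
```

A crowd-translation variant would only change the label: the same regression could predict reviewer effort or error counts for human translations, flagging segments for proofreading when the prediction crosses a threshold.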
Send your comments or questions to asia@taus.net
ACCURACY IS IMPORTANT. WE BELIEVE IT’S VITAL.
We combine the expertise of professionals with cutting edge technology to deliver translations that take quality to new heights.
www.lingo24.com
Because there’s accuracy. And there’s Accuracy.
Review of language business & technologies in Africa by Amlaku Eshetie
Another Critical Challenge for Translation and Localization Agencies to Emerge and Sprout in Ethiopia – Banking System!

My readers will be familiar enough with the status and condition of the translation and localization industry in Ethiopia from my previous three articles in TAUS Review. I was planning to invite fellow African translators from the pool of freelance translators I have: from Kenya, South Africa and Cameroon. However, though all of them had given me promises, none of them turned their promises into practicalities.
As a result, I have to write this fourth article, too, myself, and I will again inform you of some other issues of the translation and localization industry in Ethiopia. Compared to neighbouring African countries such as Egypt and Kenya, Ethiopia houses no proper translation agency. There are translation offices here, and they serve the local needs of translation of documents and small communication texts, but these offices could not be referred to as agencies. None of them holds any quality standardization, nor do they use tools for translation.
Most translation and localization agencies elsewhere in the world qualify for ISO and other organizations’ quality standards, have translation tools (CAT tools), apply standard process management practices, recruit trained management, translation and LQA linguists, and so forth.
On the contrary, I bet none of the translation “shops” or “offices” in Addis Ababa and some large cities and towns has any of the qualities required at a translation (and localization) agency level. This does not mean that there is no market or demand for translation and localization in Ethiopia. I discussed in one of my previous articles that there actually is a market, but there are many bottlenecks for the industry to grow in Ethiopia. Some of them, such as the lack of trained and experienced professionals and poor internet facilities, have already been discussed. So the not-yet-discussed point, which is the subject of this article, is the banking facility. Banking facilities and procedures are at the heart of all sectors and the economy of a nation at large. Where there is a developed banking system, there will be enhanced economic transactions, and vice versa. This is even more sensitive in the localization and translation industry, where a large proportion of the role players are freelancers or virtual officers, and payments are made online and to everywhere.

Banking is a critical bottleneck that prevents translation and localization companies from opening their branches in Addis Ababa, the capital of Africa.

However, when we look at the prospects, we can be optimistic. The speed of the globalization process and the growth of the internet, as well as the pressure from the Diaspora and the international community, will gradually influence the government systems and, like other sectors, the banking sector will also open up. Ultimately, not only translation and localization service companies but also other IT and software companies will consider branching out to the Horn, to Addis Ababa.
Send your comments or questions to africa@taus.net
Check out the TAUS Quality Dashboard and sign up for a free subscription
The Language Perspective
by Nicholas Ostler
The Driving Beat of Regimented Language First an admission: one of the reasons why I set up the Foundation for Endangered Languages in 1995-6 was a reaction against my consultancy work, over the previous fifteen years, in Natural Language Processing. It seemed to me then that the tendency of most of the work in that field – especially if it attracted the support of industry – was to put pressure on the language used to be more predictable, restricted and controlled - in a word, regimented. If only people could be persuaded to use a finite (and small-ish) vocabulary, with simply constructed (and short-ish) sentences, on pre-defined (and dullish) topics of discourse, automatic processing had a chance of reliably getting the point of what was said, and so might attract serious investment. That was the story, and it may be true, but if so, it can only be discouraging to more creative researchers, and I looked for the opposite: use of language that was colourful and unpredictable. Well, I thought, isn’t that a major reason to be concerned for languages that are in danger of disappearing? We need to save these languages for the unique treasures they contain, and the unexpected depths and dimensions they can show us of human potential.
It would be wrong to control the exuberance of these languages; to do so would defeat the point in struggling for their survival. Khoisan languages without their clicks, Hopi without its untensed system of verb aspects, Djirbal without its anti-passive use of ergativity, would all be easier to process and translate, but would lose their distinctive, if tantalizing, charms.
In ancient Greece and Rome, there was an unresolved debate on language: were rules or exceptions its true essence? As the contemporary philosophers put it, did analogy or anomaly rule? In the 1980s, machine translation, based on explicit programming of the syntactic and semantic rules detected by linguists, seemed defeated by the number of exceptions and variety in natural languages. Something of the same despair at the inadequacy of rules made primary school teachers in the English-language world give up on phonics, the philosophy of learning to read through phonetic rules, and fall back on “look and say”, which endeavoured to teach each new written word to learners as an unanalysed hieroglyph.
Statistical analysis – which looked for emergent generalizations about text structure – stepped up to fill the gap, looking for regularities in correlation that would emerge from translations already done, but might then be reused on new translations. It was a new way of stating faith in language’s analogical principle, but at a different level. Others turned to “controlled language”, deliberately writing texts with machine translation in mind, and so policing them in advance for the application of known, valid rules. This tended to have a rather deadening effect on the readability of the texts (whether in the original or in translation), suggesting something important was missing. Well, it was important to human beings, and who else were texts written for? Rules, it seemed, were inseparable from the use of language, but at what level? As constitutive rules which made possible the expression of any meaning at all, or rather as prescriptive rules, mainly to distinguish good style from bad? Is the requirement that a sentence should consist of a subject and predicate (or a topic and a comment) a condition on making sense, or a convention more like a spelling rule? Language loves law, but where does it fit in Gerry Mooney’s joke, comparing Newton’s discovery of gravity to speed limits on the highway: “Gravity: it’s not just a good idea – it’s the law!” <www.thegravityposter.com> Is grammar a nice idea, or intrinsic to language?
Besides the regularities of language in use, to make sense, and impress the educated, there is another source of law in language – e.g. what the German linguists of the nineteenth century called Lautgesetz or “sound law”: the principles which govern language change, particularly in pronunciation. The “Great Vowel Shift” of the 13th to 17th centuries systematically changed the sounds of the tense (aka ‘long’) vowels in English: this was a natural (and presumably unconscious) change, but the spelling of words remained unaffected.
This meant that English pronunciation rules are now, ex post facto, radically different from those of most of the other European countries which got their writing from Latin. So ate, eat, mite, oat, out, which had been pronounced much as the same strings might be in French or Italian, were now to be read as [eyt, iyt, mayt, owt, awt]. (German and Dutch suffered a similar vowel transformation, but adjusted their spelling: hence their actual pronunciation rules did not need to change so much.) It is the systematic nature of these apparently meaningless variations in language over time that makes us see them as laws.
Such natural changes in language work against prescriptive norms. Why do we have to use special rules to spell “nation”, not the straightforward phonetics which would suggest “neyshun”? Now that any sense of case-marking is disappearing in English, how to defend “The King and he saw the Queen and me” – that is, to know the traditional use of “I” vs “me”, and “he” vs “him”? Ordinary people (not pedants, or linguists) seem to be intimidated or humiliated by prescriptive rules, but they cannot help following the rules of language change which happen to prevail in their era – something they do without any effort. And yet, and yet… There is another form of external order imposed on language which people seem to find not frightening but invigorating. This is rhythm and rhyme in poetry and song. It achieves different degrees of elaboration in different ages, but modern pop and rap music shows that it is alive and kicking
everywhere in the 21st century. Curiously enough, this regimentation is something which we see as no threat to languages, but essential to their survival. And this we shall be celebrating in FEL’s conference in New Orleans: The Music of Endangered Languages.
Send your comments or questions to language@taus.net
The Journalist’s Perspective by Lane Greene
Boldly going to where small languages live

How well are translation innovators serving the world’s smaller languages? Languages and their speakers are distributed along a classic power-law distribution. A few big languages make up most of the world’s speakers—according to the Ethnologue, just 8 languages comprise 40% of the world’s population of first-language speakers, and 82 (including the big 8) languages account for 80% of the population. 82 might sound like a lot, but that is just 1.3% of the world’s languages. The other 7,020 languages account for the remaining 20% of the world’s people.
Perhaps more important to the incentives of a technology company is another statistic that is even more lopsided. One 2008 calculation listed languages by the percentage of global GDP represented by their speakers. This was a well-meaning attempt to steer language learners to where they could get the most bang for their study-hour buck: want to learn Japanese? Its speakers account for 7% of world GDP – a very big chunk.
But the depressing bit is how starkly these numbers show the world’s economic activity is skewed: the top 10 languages accounted for 77% of global GDP. That is good for people learning a big language, but should surely be cause for concern for speakers of the world’s other 6,000-odd languages, who must split the remaining 23% of the economic pie between them.
Big companies like Microsoft and Google have put impressive efforts into languages that don’t
leap out as ones of global importance, offering free translations in languages that many nonexperts or non-natives will never even have heard of—the likes of Cebuano (spoken in the Philippines), Chichewa (central and southern Africa) or Sundanese (Indonesia). These are not small languages; in fact, they have many millions of speakers; by the standards of truly endangered languages, they are huge. But they are not global languages, and it is commendable that the tech giants have given their speakers free translation tools. So what about truly small languages? Lori Thicke of Translators Without Borders calls them victims of the Language Last Mile: just as the telephone company will put a line in a reasonably sized town, but not necessarily spend the money to bury a cable extending to the houses lying a mile out of town, the companies offering language services have devoted impressive effort to not just big international languages, but big regional languages. But this still leaves thousands of languages at the end of that last mile without services. Thicke explains why these languages matter in compelling terms: when disaster strikes, it strikes speakers of small languages hardest.
They live in poor countries. They are often scattered: large, co-extensive populations tend to speak bigger languages. They are more rural than urban, so if they suddenly find themselves in urgent need, they are far from the roads, airports, ports and railways that will bring relief. And though many such people are multilingual, their second language is more likely to be another local language than one of the big European heavy-hitters. Thicke’s Translators Without Borders does the angels’ work of getting language services to such people: they have translated information about the signs and the spread of Ebola into regional West African languages, for example, sending local preventative and treatment knowledge skyrocketing. But only so many people can work with groups like Translators Without Borders. Could the tech companies act like a “force multiplier”, helping over-stretched aid-workers help more speakers of small languages faster by putting automated translation tools in their hands?
The challenges are considerable. The most obvious is that most such languages are not written, whereas the data for building machine translation engines is of course text. This would seem on its face to be a fatal problem: no text, no MT.
But imagine where the resources of a huge tech giant might be put to solving such problems. These are not solutions available today, but a potential future. Those languages that have been studied, and for which a grammar has been written, have usually been described in the universal International Phonetic Alphabet. If speech-recognition technology could be tailored to output an IPA transcription, it could be matched, and then there would be a text to work with. An MT engine then needs parallel texts to train on. This would be another difficult step, but eager linguists studying small languages could help solve the problem by eliciting a set of crucial sentences (“Is anyone here in the village vomiting?”, etc) from a native speaker, with the linguist translating them into English. This would be a fairly scanty data-set, even assuming the most willing linguists. But disaster-relief is the kind of limited-domain work for which MT engines do their best; “Is anyone in the village vomiting?” is not the kind of “The pen is in the box” ambiguity (is it a writing pen or an animal pen) with which MT struggles.
Even relatively limited data could potentially prime an MT engine to be useful here. At the heart of the engine would be a challenge: getting the source- and output-text matched well enough to be productive (since the voice of a panicked disaster survivor is not likely to be the easiest data to work with) but not so fuzzily as to produce disastrously misleading translations. (The speech-recognition engine missing a single negative particle would result in a translation meaning the opposite of what it is supposed to mean.) Finally, some sort of speech synthesis would be needed: when the aid worker speaks English or French, some sort of recognizable version of the target language will need to come out of a gadget. This is perhaps the most tractable problem. Rendering IPA into sound would be relatively straightforward. Even if reproducing natural speech-rhythms is not, a robotic voice would be better than none at all.

Such a product is well off in the future. But none of the steps to get there requires the wormholes or teleportation devices of a science-fiction movie: all of these imaginings are based on existing technologies and knowledge. Microsoft has added Klingon to Bing Translate; it would be great to see it or another tech giant boldly go where no company has gone before in automated translation. In the next column, I hope to report on some of the journeys that have already begun.
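As an aside on the matching step described above — accept a match that is close enough, refuse one that is too fuzzy — here is a toy sketch. Everything in it (the IPA-like strings, the glosses, the threshold) is invented for illustration and does not describe any existing system:

```python
from difflib import SequenceMatcher

# Tiny elicited parallel set: IPA-like transcriptions of crucial phrases
# paired with English glosses. All entries are invented for illustration.
ELICITED = {
    "ɪz ɛnɪwʌn hɪr vɑmɪtɪŋ": "Is anyone here vomiting?",
    "wi nid klin wɔtər": "We need clean water.",
    "ðə brɪdʒ ɪz daʊn": "The bridge is down.",
}

def translate_utterance(ipa_text, threshold=0.75):
    """Fuzzy-match a transcribed utterance against the elicited set.

    Returns (gloss, score) for the best match, or (None, score) when the
    best match is too fuzzy to trust -- no output is safer than a
    misleading one.
    """
    best_gloss, best_score = None, 0.0
    for ipa, gloss in ELICITED.items():
        score = SequenceMatcher(None, ipa_text, ipa).ratio()
        if score > best_score:
            best_gloss, best_score = gloss, score
    if best_score < threshold:
        return None, best_score
    return best_gloss, best_score

# A noisy transcription (panicked speaker, imperfect recogniser) still matches:
print(translate_utterance("ɪz ɛnɪwʌn hɪə vɒmɪtɪŋ"))
# Something outside the elicited domain is refused rather than mistranslated:
print(translate_utterance("ðə kaʊz ɑr mɪsɪŋ"))
```

The point is only the trade-off described above: match loosely enough to survive noisy input, but refuse to answer when confidence is low.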
Send your comments or questions to journalist@taus.net
The Translator’s Perspective by Jost Zetzsche
Are we really so far behind?

It’s become rather commonplace to mention that the translation industry is still looking for technological breakthroughs and that it’s woefully behind compared to other industries. The more that’s said, the more I get—yes, I’ll admit it—a little defensive.
Really? Are we so parochial when it comes to innovation and development? I think we might instead be a little confused when we say this in too sweeping a manner. There is the business of translation. I’m not a business expert, but I can see that many in the industry and maybe even the industry as a whole (if there is such a thing as a “translation industry”) are still looking for better business models. It doesn’t really surprise me that this is a difficult process—after all, there are various stakeholders with very different stakes to represent—but in a sense it’s probably good and healthy that we’re continuing to negotiate best business practices and processes. But again, I’m not an expert in this. I’m more of an expert in the other part of the equation: translation technology as it relates to the actual act of translation. While there is certainly some overlap between that and business-related technology, there is plenty that’s not related (and maybe that’s part of the problem?). Here are three thoughts on the pace of language translation technology development: I’ve mentioned before that we—and with “we” I’m referring to translators—shot ourselves in the foot when we left the development of nascent translation technologies to other stakeholders. The unfortunate result was that the technology did not turn out to be what it could have been with our input and participation. Given our past experience, it’s essential now to find out how much of that is still going on with current translation technology. While it’s hard to put
any authoritative figures on it, it seems clear that there has been a real sea change among translators. This doesn’t necessarily mean that every new technology is being enthusiastically embraced—after all, not every technology is good just because it’s new. But new technology is no longer being rejected outright just because it’s new or just because it’s technological. Case in point: This November (2015) the ATA, the American Translators Association, and AMTA, the Association for Machine Translation in the Americas, are co-locating their conferences for the third time. And just like the previous two times, this will include a good amount of cross-fertilization (and surprisingly little conflict!). The second thought is one that might sound precocious to some: Language translation is a hard nut to crack for technology. So hard, in fact, that it might never be completely “cracked open.” There is no shortage of new startups in the translation technology
space, but only a few are really pushing translation technology forward. And the same is true for existing technology vendors. There are a lot of them, and they all come out with new sets of features for every release of their software. But the ratio between “catch-up” features designed to bring a product on par with competitors and really forward-driving technologies is heavily in favor of playing catch-up. One reason for that is the end user’s mindset—we insist on having a feature set that is just as good as the “other” tool. But the other and probably more relevant reason is that it’s just hard and expensive to come up with new features, especially when there is some language-specific development involved.
Here is my third thought, though. Despite all this, we have made a lot of progress—even though that progress might not be so obvious to the outsider, nor yet have made it into the balance sheet. Here are some examples:
• We know now that machine translation will play a role in many of our professional lives in the near future. It’s a transformation that is widely seen as positive as an additional tool for enhancing our productivity (as opposed to simply editing machine translation output).
• We long ago entered the age of collaboration by sharing TMs and termbases when working for technology-savvy clients, but it’s now easier than ever to organize virtual workgroups with our peers, especially with the rapid increase in cloud computing.
• With the advent of sub-segmentation, the concept of translation memory has changed. Many of us used to throw everything into one mammoth translation memory. While some still do, most of us have turned our backs on that model to embrace translation memories of higher quality that in turn deliver higher-quality sub-segments.
• Sub-segmentation and the corresponding techniques of AutoWrite, AutoComplete, writing with the Muse, or whatever the tool of your choice might call it, have also changed our approach to terminology. It’s become more important because we get it delivered right to our cursor as we type. This comes at a time when term handling and maintenance has finally become less cumbersome in most tools, and when external tools are becoming more fine-tuned to building up terminology resources and using them productively.
• Quality assurance for both machine-translated and translator-translated texts has become a lot more hands-on and integrated with tools like the TAUS Quality Dashboard or translate5.
• Web-based translation environments have become ubiquitous. While most of us agree that they still lack some of the productivity features we are used to from our desktop translation environments, they have made huge progress. Now even the most ardent desktop tool developers are working on a browser-based interface to complement or replace the older technology.

Surveying this makes me proud and happy—and excited about what’s still to come. From my perspective, it’s safe to say that we’re better than our slightly dusty reputation.
Send your comments or questions to translator@taus.net
A thousand different workflows. One Solution. The Business Management System for the Translation Industry
www.plunet.com
Innovation and continuous improvement is embedded in our corporate culture • A leader in translation and interpretation services and your guarantee for quality at the most competitive rates
• Flexible solutions to accommodate your needs
• A committed and unparalleled customer service
Toll Free (US) 1-877-255-0717 www.trustedtranslations.com
sales@trustedtranslations.com
The Research Perspective by Luigi Muzii
Who’s Afraid of Frankenstein?

Asimov’s robots, Clarke’s HAL 9000, and Dick’s androids were somewhat scary in enlivening the Prometheus myth and outlining a looming apocalypse ending the sacred and eternal human race, climaxing in The Matrix trilogy.

Unlike Mary Shelley’s creature, mature humanoids like those in Her and Ex Machina always have language proficiency. After all, neuroscientists have been debunking Noam Chomsky’s theory of a universal grammar for some time now, increasingly considering language a technology. Inversely, most linguists, especially translators, still firmly believe in the language instinct. To them, fictional humanoids with their language proficiency are the embodiment of the impending menace of technological
singularity, when “The AIs are going to look back on us the same way we look at fossil skeletons on the plains of Africa. An upright ape living in dust with crude language and tools, all set for extinction.” This future in Her and Ex Machina is nearer than foreseeable because we have learnt to exploit big data. Says Ex Machina director Alex Garland, “If somebody like Google or Apple announced tomorrow that they had made [an AI humanoid], we would all be surprised, but we wouldn’t be that surprised.”
In fact, the Guardian reports that Apple’s self-driving car project is further along than many suspected, with the company scouting for test locations. Tesla, BMW, Mercedes, Volkswagen and several other carmakers are pouring big money into self-driving cars. Yet Nicholas Carr seems to be the only one who is scared.
Incidentally, Carr reminds us that modern automation originated in 1914, when Lawrence Sperry successfully demonstrated the first autopilot. Many may also be unaware that the basic component of Sperry’s invention, virtualized, is part of our daily companion, the smartphone. The same device allows Atlas, the latest Boston Dynamics humanoid, to stay balanced, even on one foot. Boston Dynamics is a division of Google X, inventor of the smart contact lens. Together with NASA and the Universities Space Research Association, Google is also involved in quantum computing. Fearing automation is no foolishness since, according to 2013 Oxford Martin School research, translation is one of the first ten jobs that will be automated. In fact, more and more people today believe that, together, image recognition, voice recognition and translation software will soon make translators and interpreters unnecessary. These technologies cannot properly address many cultural facets,
but will very soon be capable of handling most translation duties. The main reason behind translation automation is business. In the part of the world where the money is, most people are not fluent in another language. For example, Barbara Beskind recently solicited “a device where you speak into it in one language and it comes out in another” to help people from other countries who care for the elderly to speak better English.
Incidentally, the U.S. Army’s FAST is testing a tool to improve the ability of troops in foreign languages, particularly in French dialects, which is becoming critical in Africa, a future frontier for technology in the next 10-15 years. In contrast, there is no imminent doom for human multi-linguists as, according to the US Department of Labor, “It is seldom, if ever, sufficient to use machine translation without having a human who is trained in translation available to review and correct the translation to ensure that it is conveying the intended message.” Indeed, sorters will be replaced before translators and interpreters by automated inspection technology. Wait! Isn’t translation quality assessment still entirely based on manual inspection? While translation scholars and industry players are still deeply involved in devising new ways to teach an old dog new tricks — possibly applying for improbable patents — others use big data analytics to detect indicators of deception and
poor-quality information from text. Obviously, not all that glitters is gold. 3D printing, for example, is definitely not the miracle we were promised, even by eminent theorists. The problem with it is design: anyone can take a picture or write a text and print it on a 2D printer, but shaping something as simple as a whistle requires software costing hundreds of euros. Training could take months and the making could take days, mostly to get a poor result. The game is just not worth the candle. Not yet.
Why, then, be alarmed if today’s journalists use Google Translate? If they know how to handle input and output correctly without harming the facts, it can only be helpful.
Undeniably, predictions are tricky and chancy, and even acclaimed economists are not good at them. Granted, Ray Kurzweil’s predictions have mostly proved true, and the Naples-based newspaper “Il Mattino” recently republished a 1962 article from a Sicilian magazine titled “Nel 2000 i telefoni faranno tutto loro” (In 2000 Phones Will Do Everything), in which three AT&T experts predicted that we would read newspapers and carry out banking transactions through the telephone network. Why, then, be alarmed if today’s journalists use Google Translate? If they know how to handle input and output correctly without harming the facts, it can only be helpful. Conversely, the journalist who reproached a football coach for allegedly using Google Translate to greet his followers should have known better: perhaps the coach was simply trusting his still-uncertain Spanish. From the dancing mice experiments, we have learnt that we must adapt to reductions in
mental workload caused by automation, but the answer from translation scholars, to say nothing of industry associations and self-appointed pundits, has been ludicrous. A recent study from a European think tank showed that, despite the shortage of science and engineering graduates, enrolling in humanities courses can still be a better investment: some subjects are more demanding, and the number of years to graduation varies by country and subject. In some cases, the lower return is more the result of high costs than of low benefits. This could explain the steady enrolment in language-related courses, also outside Europe. The study indirectly decrees the failure of the Bologna Process, calling for a deep redesign of university courses. This applies especially to translation courses, far beyond the wishful thinking of the EMT and the pointless convolutions of academics who, after decades of detachment from reality and in a frantic attempt to catch up, have fallen into the opposite error: shaping their courses to meet the needs of the labor market while ignoring its frenzy. Indeed, another Oxford Martin School study argues that the speed of innovation might cause secular stagnation and pleads for a shift towards inclusive growth. Still, according to Erik Brynjolfsson, Andrew McAfee and Forrester Research, the future of jobs is not so gloomy, perhaps because 60% of the best jobs of the next ten years have not been invented yet.
Send your comments or questions to research@taus.net
A Game Changer For The Translation Industry by Toos Stoker
An Interview about the Quality Dashboard

The TAUS Quality Dashboard is an industry-shared platform in which evaluation and productivity data is visualized in a flexible reporting environment. In the Dashboard you can create customized reports or filter data to be reflected in the charts. Both internal and external benchmarking are supported, offering the possibility to monitor your own development and compare results to industry averages.

I sat down with Jaap van der Meer, TAUS director, and Attila Görög, DQF product manager, and asked them about the Quality Dashboard and their plans to develop the platform.

You just released the Quality Dashboard. What’s the story behind it?

Attila: The Quality Dashboard is a relatively recent idea. We started working on it in the first quarter of this year. But the fundamentals, the Dynamic Quality Framework (DQF), were of course already there.

Jaap: When we started talking about translation quality with our TAUS enterprise members in 2010, we learned that most of them, if not all of them, struggled with the concept. If you asked a localization manager, for example, which part of their job had given them the most headaches, they said translation quality. No one really seemed to be able to define the cause of the problem. Nor were they able to explain to their vendors and translators what kind of quality they were looking for. That is when we realized that the industry had been struggling with a static approach to quality for a long time. And that there was a need for a more dynamic approach.

Attila: DQF goes back to the idea that absolute translation quality does not exist. Absolute quality would mean that of a hundred different translations of the same text, just one is correct. In reality multiple, different translations can be equally good for purpose. Also, new technologies have called for new ways of measuring quality. Translators used to work from scratch. Now they are using CAT tools and post-editing MT output. These new content types and contexts require different quality levels.

Jaap: We released DQF in 2011 and people using the tools and the TAUS knowledge base are satisfied. The integration of DQF into translation tools through an API is the next step forward.

How does it work?
Attila: Once you’ve downloaded and installed the DQF plugin, you can connect with the DQF tools from your TMS or CAT tool environment. In this way you can track your own productivity, efficiency and quality while in translation production mode. In the reporting environment of DQF, the Quality Dashboard, you can view and benchmark your results.

Jaap: The Quality Dashboard is a collaborative platform. This has multiple advantages. First of all, it allows you not only to measure scores but also to benchmark your scores against industry averages. This means you can report objective and meaningful results to whoever needs them. Another big advantage is the shared investment. Especially for small and medium enterprises, it becomes a lot more efficient and affordable to add quality measurement to their workflow.

Attila: Of course we would like to have as many people as possible using the plugin. Therefore we have decided to offer the API under an open source license. We’re very encouraged by the rate of adoption so far. The SDL Trados Studio plugin is already in use, as
is the SDL WorldServer plugin. Plugins for TMS, Memsource, XTM Cloud, Lingotek and Unbabel are currently being developed.

Jaap: We want to make sure that we represent the industry’s best practices. Therefore we set up LinkedIn user groups for all the different translation technologies. Our aim is to collectively move the Quality Dashboard forward as the quality standard for the global translation industry.
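To make the plugin-to-dashboard flow more concrete, here is a minimal sketch of the kind of call a CAT tool integration might make to report segment-level data. The endpoint URL, payload fields and authentication scheme shown here are illustrative assumptions made for the example, not the documented DQF API; an actual integration would follow the TAUS developer documentation.

```python
# Illustrative sketch only: the endpoint, payload fields and auth header are
# assumptions for the sake of the example, not the documented TAUS DQF API.
import requests

DASHBOARD_URL = "https://dqf.example.com/api/segments"  # hypothetical endpoint
API_KEY = "your-api-key"  # a real integration would use a key issued per user or plugin

def report_segment(source, target, edit_seconds, mt_suggestion=None):
    """Send one segment's productivity data to the dashboard-style endpoint."""
    payload = {
        "source_text": source,
        "target_text": target,
        "time_to_edit_seconds": edit_seconds,
        "mt_suggestion": mt_suggestion,
        # Distinguishing post-editing from translation from scratch is what
        # makes benchmarking against industry averages meaningful.
        "technology": "MT+PE" if mt_suggestion else "HT",
    }
    response = requests.post(
        DASHBOARD_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    try:
        report_segment(
            source="Klik op Opslaan om door te gaan.",
            target="Click Save to continue.",
            edit_seconds=14,
            mt_suggestion="Click on Save to continue.",
        )
    except requests.RequestException as exc:
        print(f"Could not reach the (hypothetical) dashboard endpoint: {exc}")
```

In a real integration a call of this kind would be made by the CAT tool plugin in the background, per segment or per batch, so translators never have to report anything manually.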
We believe the translation industry is just scratching the surface.
NEXT STEPS
In the current version of the Quality Dashboard you can filter data by language pair, time, industry type, content type and translation technology. In the next couple of weeks the team will add more advanced reporting features, like project-based reporting, which enables you to see the productivity for a single, specific project. Last but not least, reports and filtering options for efficiency will be released.
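As a rough illustration of this kind of slicing, the sketch below aggregates a hypothetical per-segment export by language pair and content type. The column names and demo data are assumptions made for the example, not the Dashboard’s actual export format.

```python
# Illustrative sketch only: the column names below are assumptions,
# not the Quality Dashboard's real export schema.
import csv
import io
from collections import defaultdict

def words_per_hour_by_group(rows):
    """Average throughput (words/hour) grouped by (language pair, content type)."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [total words, total hours]
    for row in rows:
        key = (row["language_pair"], row["content_type"])
        totals[key][0] += int(row["word_count"])
        totals[key][1] += float(row["edit_seconds"]) / 3600.0
    return {key: words / hours for key, (words, hours) in totals.items() if hours > 0}

if __name__ == "__main__":
    # A tiny stand-in for a per-segment export downloaded from the Dashboard.
    demo_csv = io.StringIO(
        "language_pair,content_type,word_count,edit_seconds\n"
        "en-de,UI strings,12,45\n"
        "en-de,UI strings,8,20\n"
        "en-fr,support articles,150,600\n"
    )
    results = words_per_hour_by_group(csv.DictReader(demo_csv))
    for (pair, content), wph in sorted(results.items()):
        print(f"{pair}  {content}: {wph:.0f} words/hour")
```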
What are you most proud of?

Jaap: Our transition from a think tank to a platform for shared services. When we started, TAUS was mostly about ideas. We were a think tank. With our members, we talked about translation automation, collaboration and innovation. When the industry started adopting these ideas we felt that it was time to convert our thoughts into something tangible. I’m proud that we’ve been able to make that transition. We are offering an online, software-based service on our website that helps people in the translation industry do a better job every day.

Attila: I’m also proud of the fact that with the Quality Dashboard we are able to serve not only big companies, but also individual translators. We’ve always wanted to be there for the whole industry and I now feel we have reached that point.

What do you consider your ultimate goal for the Quality Dashboard?

Attila: To become a game changer for the industry.

Jaap: Yes, become a game changer for the industry, enabling better translation. This is the big vision. We believe the translation industry is just scratching the surface. If we can solve the quality issue, the translation industry will get a huge tailwind and grow faster than before.

Send your comments or questions to review@taus.net
Toos Stoker, Digital Marketing Manager
Toos studied Language Diversity of Africa, Asia and Native America at the University of Leiden. During and after her studies she worked as a copywriter and content marketer. Now she is responsible for TAUS’s online marketing activities.
Contributors
Reviews

Mike Tian-Jian Jiang
Mike was the core developer of GOING (Natural Input Method, http://iasl.iis.sinica.edu.tw/goingime.htm), one of the most famous intelligent Chinese phonetic input method products. He was also one of the core committers of OpenVanilla, one of the most active text input method and processing platforms. He has over 12, 10 and 8 years of experience in C++, Java and C#, respectively, and is also familiar with Lucene and Lemur/Indri. His most important skill set is natural language processing, especially Chinese word segmentation based on pattern generation/matching, n-gram statistical language modeling with SRILM, and conditional random fields with CRF++ or Wapiti. Specialties: Natural Language Processing, especially pattern analysis and statistical language modeling; Information Retrieval, especially tuning Lucene and Lemur/Indri; Text Entry (Input Method).

Andrew Joscelyne
Andrew Joscelyne has been reporting on language technology in Europe for well over 20 years now. He has also been a market watcher for European Commission support programs devoted to mapping language technology progress and needs. Andrew has been especially interested in the changing translation industry, and began working with TAUS from its beginnings as a part of the communication team. Today he sees language technologies (and languages themselves) as a collection of silos – translation, spoken interaction, text analytics, semantics, NLP and so on. Tomorrow, these will converge and interpenetrate, releasing new energies and possibilities for human communication.

Brian McConnell
Brian McConnell is the Head of Localization for Insightly, the leading small business CRM service for Google Apps. He is also the publisher of Translation Reports, a buyers guide for translation and localization technology and services, as well as a frequent contributor to TAUS Review. Specialties: Telecommunications system and software design with emphasis on IVR, wireless and multi-modal communications; translation and localization technology.

Amlaku Eshetie
Amlaku earned a BA degree in Foreign Languages & Literature (English & French) in 1997, and an MA in Teaching English as a Foreign Language (TEFL) in 2005, both at Addis Ababa University, Ethiopia. He had been a teacher of English at various levels until he switched to translation and localisation in 2009. Currently, Amlaku is the founder and manager of KHAABBA International Training and Language Services, at which he has built a large client base for services such as localisation, translation, editing & proofreading, interpretation, voiceovers and copywriting.
Perspectives

Jost Zetzsche
Jost Zetzsche is a certified English-to-German technical translator, a translation technology consultant, and a widely published author on various aspects of translation. Originally from Hamburg, Germany, he earned a Ph.D. in the field of Chinese translation history and linguistics. His computer guide for translators, A Translator’s Tool Box for the 21st Century, is now in its eleventh edition and his technical newsletter for translators goes out to more than 10,000 translation professionals. In 2012, Penguin published his co-authored Found in Translation, a book about translation and interpretation for the general public. His Twitter handle is @jeromobot.
Luigi Muzii
Luigi Muzii has been working in the language industry for more than 30 years as a translator, localizer, technical writer, author, trainer, university teacher of terminology and localization, and consultant. He has authored books on technical writing and translation quality systems, and is a regular speaker at conferences.
Nicholas Ostler
Nicholas Ostler is the author of three books on language history: Empires of the Word (2005), Ad Infinitum (on Latin, 2007), and The Last Lingua Franca (2010). He is also Chairman of the Foundation for Endangered Languages, a global charitable organization registered in England and Wales. A research associate at the School of Oriental and African Studies, University of London, he has also been a visiting professor at Hitotsubashi University in Tokyo and L.N. Gumilev University in Astana, Kazakhstan. He holds an M.A. from Oxford University in Latin, Greek, philosophy and economics, and a 1979 Ph.D. in linguistics from M.I.T. He is an academician in the Russian Academy of Linguistics.

Lane Greene
Lane Greene is a business and finance correspondent for The Economist based in Berlin, and he also writes frequently about language for the newspaper and online. His book on the politics of language around the world, You Are What You Speak, was published by Random House in Spring 2011. He contributed a chapter on culture to the Economist book “Megachange”, and his writing has also appeared in many other publications. He is an outside advisor to Freedom House, and from 2005 to 2009 was an adjunct assistant professor in the Center for Global Affairs at New York University.
Directory of Distributors

Appen: Appen is an award-winning, global leader in language, search and social technology. Appen helps leading technology companies expand into new global markets.
BrauerTraining: Training a new generation of translators & interpreters for the Digital Age using a web-based platform + cafeteria-style modular workshops.
Capita TI: Capita TI offers translation and interpreting services in more than 150 languages to ensure that your marketing messages are heard - in any language.
Cloudwords: Cloudwords accelerates content globalization at scale, dramatically reducing the cost, complexity and turnaround time required for localization.
Concorde: Concorde is the largest LSP in the Netherlands. We believe in the empowering benefits of technology in multilingual services.
CPSL: Multilingual language provider for global strategies: translation, localization, interpreting, transcription, voice over & subtitling.
Crestec Europe B.V.: We provide complete technical documentation services in any language and format in a wide range of subjects. Whatever your needs are, we have the solution for you!
Global Textware: Expertise in many disciplines. From small quick turnaround jobs to complex translation. All you need to communicate about in any language.
HCR: HCR works in conjunction with language partners to deploy software products and linguistic services globally in core industries such as IT, Automotive and more.
Hunnect Ltd.: Hunnect Ltd. is an MLV with innovative thinking and a clear approach to translation automation and training post-editors. www.hunnect.hu
Iconic Translation Machines: Machine Translation with Subject Matter Expertise. We help companies adopt MT technology.
iDisc: Established in 1987, iDISC is an ISO-9001 and EN-15038 certified language and software company based in Spain, Argentina, Mexico and Brazil.
InterTranslations: Intertranslations LLC is based in Athens, London and Nicosia, offering translation and localization services in all languages.
IOLAR: Founded in 1998, Iolar employs 40 highly-skilled linguists and engineers specialised in translation of highly demanding documentation and software localisation.
Jensen Localisation: Localization services for the IT, Health Care, Tourism and Automotive industries in European languages (mostly Nordic, Dutch and Spanish).
KantanMT.com: KantanMT.com is a leading SaaS-based statistical machine translation platform that enables users to develop and manage customized MT engines in the cloud.
Kawamura International: Based in Tokyo, KI provides language services to companies around the world, including MT and PE solutions to accelerate global business growth.
KHAABBA International Training and Language Services: KHAABBA is an LSP company for African languages based in Ethiopia.
Larsen Globalization Ltd: Larsen Globalization Ltd is a recruitment company dedicated to the localization industry since 2000, with offices in Europe, the US and Japan.
Lingo24: Lingo24 delivers a range of professional language services, using technologies to help our clients & linguists work more effectively.
Linguistic Systems: LSI provides foreign language translation services in over 115 languages and unlimited subject matter. Contact us at 877-654-5006 or www.linguist.com
Lionbridge: Lionbridge is the largest translation company and #1 localization provider in marketing services in the world, ensuring global success for over 800 leading brands.
MateCat: MateCat is a free web CAT tool for LSPs and translators. Use it to translate your projects or to outsource to over 120,000 professional translators in one click.
Memsource Cloud: An API-enabled translation platform that includes vendor management, translation memory, integrated machine translation, and a translator’s workbench.
Mirai Translate: Mirai Translate will custom-build a translation A.I. that makes innovation happen for your business and creates an exciting “MIRAI (future)”.
Moravia: Flexible thinking. Reliable delivery. Under this motto, Moravia delivers multilingual language services for the world’s brand leaders.
Morningside Translation: We’re a leading translation services company partnering with the Am Law 100 and Fortune 500 companies around the globe.
MorphoLogic Localisation: MorphoLogic Localisation is the developer of Globalese, an SMT system that helps increase translation productivity, decrease costs and shorten delivery times.
Pactera: Pactera is a leading Globalization Services provider, partnering with our clients to offer localization, in-market solutions and speech recognition services.
Plunet: Plunet GmbH develops and markets the business and translation management solution Plunet BusinessManager for professional LSPs and translation departments.
Rockant Consulting & Training: We provide consulting, training and managed services that transform your career from “localization guy/girl” to a strategic adviser to management.
Safaba Translation Solutions, Inc.: A technology leader providing automated translation solutions that deliver superior quality and simplify the path to global presence unlike any other solution.
SeproTec: SeproTec is a Multilingual Service Provider with 25 years of experience, ranked among the Top 40 Language Service Companies in the world.
Sovee: Sovee is a premier provider of translation and video solutions. The Sovee Smart Engine “learns” translation preferences in 6800 languages.
sQuid: sQuid helps companies integrate and exploit translation technologies in their workflows and maximize the use of their language data.
STP Nordic Translation: STP is a technology-focused Regional Language Vendor specialising in English, French, German and the Nordic languages. See www.stptrans.com.
SYSTRAN: SYSTRAN is the market’s historic provider of language translation software solutions for global corporations, public agencies and LSPs.
tauyou language technology: Machine translation and natural language processing solutions for the translation industry.
text&form: text&form is an LSP with expertise in software & multimedia localization, technical translation, terminology management and SAP consulting.
Tilde: Tilde develops custom MT systems and online terminology services, with special expertise in the Nordic, Baltic, Russian, and CEE languages.
TraductaNET: Traductanet is a linguistic service company specialising in translation, software and website localisation, terminology management and interpreting.
Trusted Translations: Internationally recognized leader in multilingual translation & interpretation services. Committed to providing clients with the highest quality service.
UTH International: UTH International is an innovative professional provider of globalization solutions and industry information, serving customers with advanced technologies.
Welocalize: Welocalize offers innovative translation & localization solutions, helping global brands grow & reach audiences around the world.
Win & Winnow: Provider of translation, multimedia and desktop publishing services founded in 2004. We are one of the top ten language services providers in Latin America.
XTRF: XTRF is a platform for project management, quoting, invoicing, sales and quality management, integrated with CAT, accounting and CRM tools.
Get your insights, tools, metrics, data, benchmarking, contacts and knowledge from a neutral and independent industry organization. Join TAUS!
TAUS is a think tank and resource center for the global translation industry. Open to all translation buyers and providers, from individual translators, language service providers & buyers to governments and NGOs.
taus.net
Industry Agenda
Upcoming TAUS Events

TAUS Roundtable
6 October, 2015, Washington, DC (USA)

TAUS Annual Conference
12-13 October, 2015, San Jose, CA (USA)

TAUS QE Summit San Jose
14 October, 2015, San Jose, CA (USA), hosted by eBay. Check out the 2016 agenda!

Upcoming TAUS Webinars

TAUS Translation Technology Showcase: StyleScorer & Source Content Profiler
7 October, 2015, 5 PM CET

CrowdIn & PhraseApp
4 November, 2015, 5 PM CET

TAUS Translation Quality Webinar: Ephemeral translations
18 November, 2015, 5 PM CET

Translation Automation Users Call: Enterprise Translation Management Strategies
29 October, 2015, 5 PM CET

TAUS Post-editing Webinar: Spanish to English module
20 October, 2015, 5 PM CET
Industry Events

LocWorld Silicon Valley
14-16 October, 2015, Santa Clara, CA (USA)

Translating and the Computer 37
26-27 Nov, 2015, London (United Kingdom)

ICT 2015 Innovate, Connect, Transform
20-22 Oct, 2015, Lisbon (Portugal)

MT Summit XV
30 Oct - 3 Nov, 2015, Miami, FL (USA)

tcworld conference 2015
10-12 Nov, 2015, Stuttgart (Germany)
Do you want to have your event listed here? Write to editor@taus.net for information.
memoQ.com
“memoQ is a very intuitive, user-friendly, easy to learn, and easy to integrate translation tool. In addition, the level of support provided by their team is excellent.”
Birgitte Bohnstedt, FLSmidth A/S
Why risk it? Your data has never been so important, so why risk its loss? Secure Localization Management
• Secure integration
• Secure translation management system
• Capita IL3 data center hosting
• Secure machine translation solutions
• ISO 9001/27001 certified
www.capitatranslationinterpreting.com