Idiom Treatment Experiments in Machine Translation

Author(s): Dimitra Anastasiou

In 1975, Searle stated that one should speak idiomatically unless there is some good reason not to do so. In 1988, Fillmore, Kay, and O’Connor defined an idiomatic expression or construction as something that a language user could fail to know while knowing everything else in the language. Our language is rich in conversational phrases, idioms, metaphors, and general expressions used in a metaphorical sense. These idiomatic expressions pose a particular challenge for Machine Translation (MT), because for the most part they cannot be translated literally, only according to their sense. The present book shows how idiomatic expressions can be recognized and correctly translated with the help of a bilingual idiom dictionary (English-German), a monolingual (German) corpus, and morphosyntactic rules. The work focuses on the field of Example-based Machine Translation (EBMT). It first presents a theory of idiomatic expressions and their syntactic and semantic properties; the practical part of the book then describes how the hybrid EBMT system METIS-II is able to correctly process idiomatic expressions. A comparison of METIS-II with three commercial systems shows that idioms are not impossible to translate, as was predicted in 1952: “The only way for a machine to treat idioms is—not to have idioms!”

This book offers many examples of idiomatic phrases and lays a foundation for how MT systems can process and translate idioms by means of simple linguistic resources.


ISBN-13: 978-1-4438-2515-3
ISBN-10: 1-4438-2515-8
Date of Publication: 01/12/2010
Pages / Size: 265 / A5
Price: £39.99


Dimitra Anastasiou is a post-doctoral researcher at the departments of Computer Science and Languages and Literary Studies at the University of Bremen, Germany. Her research focuses on multimodal and crosslingual environments, particularly on the improvement of dialogue systems in relation to assisted living environments. Previously, she worked at the Localisation Research Centre as part of the Centre for Next Generation Localisation project. There, her research focused on multilingual digital content development, standards (e.g. XLIFF), and metadata. She has supervised PhD and Master's students and lectured on “computational linguistics methods,” “localisation tools and technologies,” and “translation technology and Machine Translation.” She is further interested in idiom processing as well as crowdsourcing and open-source translation tools. Her publications include Proceedings of the 1st XLIFF International Symposium, edited by D. Anastasiou and L. Morado Vázquez (2010); Proceedings of ACL/IJCNLP Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications, edited by D. Anastasiou, C. Hashimoto, P. Nakov, and S. N. Kim (2009); “Localisation Standards and Metadata,” by D. Anastasiou and L. Morado Vázquez, in Proceedings of the 4th Metadata and Semantics Research Conference (2010); “Translating Vital Information: Localisation, Internationalisation, and Globalisation,” by D. Anastasiou and R. Schäler, in Journal Synthèses (2009); “Lokalisierung. Lokalisierungskonzept, Internationalisierung und Übersetzung, Software-Lokalisierung,” by D. Anastasiou, M. Lenker, and R. Schäler, in Zeitschrift der Gesellschaft für Sprache und Sprachen (2009); and “Identification of idioms by MT’s hybrid research system vs. three commercial systems,” by D. Anastasiou, in Proceedings of the European Machine Translation Conference (2008).