Translating transliterations

dc.contributor.author Tiedemann, Jörg
dc.contributor.author Nabende, Peter
dc.date.accessioned 2012-02-03T16:50:15Z
dc.date.available 2012-02-03T16:50:15Z
dc.date.issued 2009-08-03
dc.description Book Chapter en_US
dc.description.abstract Translating new entity names is important for improving performance in Natural Language Processing (NLP) applications such as Machine Translation (MT) and Cross-Language Information Retrieval (CLIR). Usually, transliteration is used to obtain phonetic equivalents in a target language for a given source language word. However, transliteration across different writing systems often results in different representations for a given source language entity name. In this paper, we address the problem of automatically translating transliterated entity names that originally come from a different writing system. These entity names are often spelled differently in languages that use the same writing system. We train and evaluate various models based on finite-state technology and Statistical Machine Translation (SMT) for character-based translation of the transliterated entity names. In particular, we evaluate the models on the translation of Russian person names between Dutch and English, and between English and French. In our experiments, the SMT models perform best, with consistent improvements over a baseline method of copying strings. en_US
dc.description.sponsorship Nuffic en_US
dc.identifier.citation Tiedemann, J. & Nabende, P. (2009). Translating transliterations. In Kizza, J. M., Lynch, K., Ravi, N., Aisbett, J., & Phoha, V. (eds.), Special topics in computing and ICT research: strengthening the role of ICT in development, Fountain Publishers, pp. 97-108. en_US
dc.identifier.isbn 978-9970-02-738-5
dc.identifier.uri http://hdl.handle.net/10570/381
dc.language.iso en en_US
dc.publisher Fountain Publishers, Kampala. en_US
dc.subject Machine transliteration en_US
dc.subject Machine translation en_US
dc.subject Weighted finite state transducers en_US
dc.subject Phrase-based statistical machine translation en_US
dc.subject Character-based machine translation en_US
dc.title Translating transliterations en_US
dc.type Book Chapter en_US
Files
Original bundle
Name: ICCIR2009_63peternabende.pdf
Size: 129.62 KB
Format: Adobe Portable Document Format
Description: Book chapter