Showing papers on "Computer-assisted translation" published in 2003

Proceedings Article•DOI•

Philipp Koehn¹, Franz Josef Och¹, Daniel Marcu¹•Institutions (1)

27 May 2003

TL;DR: The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translation.

Abstract: We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models out-perform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.
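
As an illustration of the "heuristic learning of phrase translations from word-based alignments" mentioned in the abstract, the following Python sketch extracts phrase pairs that are consistent with a given word alignment. The function name `extract_phrase_pairs`, the `max_len` parameter, and the toy sentence pair are assumptions made for this example. The sketch omits lexical weighting and the handling of unaligned boundary words, and it should not be read as the authors' implementation.

```python
def extract_phrase_pairs(src, tgt, alignment, max_len=3):
    """Return phrase pairs consistent with a word alignment.

    alignment is a set of (source_index, target_index) links.
    max_len is a hypothetical cap on phrase length in words.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any source word in [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency check: no word inside the target span may be
            # linked to a source word outside the source span.
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


# Toy example (invented data): a monotone one-to-one alignment.
src = ["das", "Haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
pairs = extract_phrase_pairs(src, tgt, alignment)

# A crossing alignment: single words swap, so ("a", "x") is not extracted.
crossed = extract_phrase_pairs(["a", "b"], ["x", "y"], {(0, 1), (1, 0)})
```

Raising `max_len` admits longer phrases; the abstract reports that phrases longer than three words did not have a strong impact on performance.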

3,778 citations

Patent•

Language translation system and method using specialized dictionaries

Robert E. Levin

14 Nov 2003

TL;DR: A system and method for translating electronic communications automatically selects and deploys specialized dictionaries based on context recognition and other factors. It can be used to translate electronic mail, instant messages, chat, SMS messages, electronic text and word processing files, Internet web pages, Internet search results, and other textual communications for a variety of device types, including wireless devices.

Abstract: A system and method for translation of electronic communications automatically selects and deploys specialized dictionaries based upon context recognition and other factors. Software tools can be employed for continual dictionary enhancement. The invention can accept speech and text inputs and can be used to translate electronic mail, instant messages, chat, SMS messages, electronic text and word processing files, Internet web pages, Internet search results, and other textual communications for a variety of device types, including wireless devices. In one embodiment, language pairs are automatically determined in real-time.

225 citations

Proceedings Article•

Creating corpora for speech-to-speech translation.

Genichiro Kikui, Eiichiro Sumita, Toshiyuki Takezawa, Seiichi Yamamoto

01 Jan 2003

TL;DR: Dialogue corpora are collected by letting two people talk, each in his/her native language, through a speech-to-speech translation system; to concentrate on the translation modules, the speech recognition modules are replaced with human typists.

Abstract: This paper presents three approaches to creating corpora that we are working on for speech-to-speech translation in the travel conversation task. The first approach is to collect sentences that bilingual travel experts consider useful for people going to/coming from another country. The resulting English-Japanese aligned corpora are collectively called the basic travel expression corpus (BTEC), which is now being translated into several other languages. The second approach tries to expand this corpus by generating many synonymous expressions for each sentence. Although we can create large corpora by the above two approaches relatively cheaply, they may be different from utterances in actual conversation. Thus, as the third approach, we are collecting dialogue corpora by letting two people talk, each in his/her native language, through a speech-to-speech translation system. To concentrate on translation modules, we have replaced speech recognition modules with human typists. We will report some of the characteristics of these corpora as well.

149 citations

Journal Article•DOI•

Embedding web-based statistical translation models in cross-language information retrieval

Wessel Kraaij, Jian-Yun Nie¹, Michel Simard¹•Institutions (1)

Université de Montréal¹

01 Sep 2003-Computational Linguistics

TL;DR: This article investigates the problem of automatically mining parallel texts from the Web and different ways of integrating the resulting translation models within the retrieval process; experiments show that Web-based translation models can surpass commercial MT systems in CLIR tasks.

Abstract: Although more and more language pairs are covered by machine translation (MT) services, there are still many pairs that lack translation resources. Cross-language information retrieval (CLIR) is an application that needs translation functionality of a relatively low level of sophistication, since current models for information retrieval (IR) are still based on a bag of words. The Web provides a vast resource for the automatic construction of parallel corpora that can be used to train statistical translation models automatically. The resulting translation models can be embedded in several ways in a retrieval model. In this article, we will investigate the problem of automatically mining parallel texts from the Web and different ways of integrating the translation models within the retrieval process. Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. These results open the perspective of constructing a fully automatic query translation device for CLIR at a very low cost.

128 citations

Proceedings Article•DOI•

Feature-Rich Statistical Translation of Noun Phrases

Philipp Koehn¹, Kevin Knight¹•Institutions (1)

University of Southern California¹

07 Jul 2003

TL;DR: A dedicated noun phrase translation subsystem is built that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features.

Abstract: We define noun phrase translation as a subtask of machine translation. This enables us to build a dedicated noun phrase translation subsystem that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features. We achieved 65.5% translation accuracy in a German-English translation task vs. 53.2% with IBM Model 4.

117 citations

Proceedings Article•DOI•

SMT decoder dissected: word reordering

Stephan Vogel¹•Institutions (1)

Carnegie Mellon University¹

26 Oct 2003

TL;DR: A decoder for statistical machine translation which allows controlled reordering of the words generated in the target language and the effect of the length of this reordering window on the search space and the translation quality is analyzed.

Abstract: We describe a decoder for statistical machine translation which allows controlled reordering of the words generated in the target language. After a general discussion of the structure of a decoder, a particular implementation is discussed which allows for word-to-word and phrase-to-phrase translation. Word reordering is used to improve the translation quality. We analyze the effect of the length of this reordering window on the search space and the translation quality. Results for Chinese-to-English and Arabic-to-English translation tasks are presented.
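
To give a feel for how a reordering window affects the size of the search space, the following Python sketch counts the orders in which source words can be covered when each step may pick only a word within a fixed window of the leftmost uncovered word. This is one common way to formulate a window constraint and is an assumption made here; the abstract does not specify the decoder's exact constraint, and `count_orderings` is a hypothetical name.

```python
from functools import lru_cache


def count_orderings(n, window):
    """Count coverage orders of n source words under a window constraint.

    At each step the decoder may pick any uncovered word fewer than
    `window` positions past the leftmost uncovered word (assumed rule).
    """

    @lru_cache(maxsize=None)
    def count(covered):
        if len(covered) == n:
            return 1
        # Leftmost source position that has not been translated yet.
        first = next(i for i in range(n) if i not in covered)
        total = 0
        for i in range(first, min(first + window, n)):
            if i not in covered:
                total += count(covered | frozenset([i]))
        return total

    return count(frozenset())


# window=1 forces monotone translation; window>=n allows all n! orders.
monotone = count_orderings(5, 1)
small_window = count_orderings(3, 2)
unrestricted = count_orderings(4, 4)
```

Under this formulation the count grows quickly with the window length, which is the trade-off between search space and translation quality that the abstract analyzes.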

80 citations

Patent•

Translation mediate system, translation mediate server and translation mediate method

Sukehiro Tatsuya¹•Institutions (1)

Oki Electric Industry¹

12 Mar 2003

TL;DR: The automatic translation result is sent to the client terminal (1A) so that the client requesting translation can select the best translator for the document.

Abstract: The dictionaries that each translator has made are registered in the translator dictionary database (11) of the translation mediate server (4), divided by the special field to which each translator belongs. The client requesting translation sends the document to be translated from, for example, client terminal (1A) to the translation mediate server (4). There, the automatic translating means (10) reads out the translation dictionary corresponding to the document's field from the translator dictionary database (11) and performs automatic translation of the document using this dictionary. The translation result is sent to client terminal (1A) so that the client requesting translation can select the best translator. The selected translator is informed of the translation request at his or her terminal, for example, translator terminal (2A).

75 citations

Proceedings Article•DOI•

Efficient search for interactive statistical machine translation

Franz Josef Och, Richard Zens, Hermann Ney

12 Apr 2003

TL;DR: A fully fledged translation system is used to ensure the quality of the proposed extensions of the interactive machine translation system and word hypotheses graphs are used as an efficient search space representation to achieve fast response times.

Abstract: The goal of interactive machine translation is to improve the productivity of human translators. An interactive machine translation system operates as follows: the automatic system proposes a translation. Now, the human user has two options: to accept the suggestion or to correct it. During the post-editing process, the human user is assisted by the interactive system in the following way: the system suggests an extension of the current translation prefix. Then, the user either accepts this extension (completely or partially) or ignores it. The two most important factors of such an interactive system are the quality of the proposed extensions and the response time. Here, we will use a fully fledged translation system to ensure the quality of the proposed extensions. To achieve fast response times, we will use word hypotheses graphs as an efficient search space representation. We will show results of our approach on the Verbmobil task and on the Canadian Hansards task.
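
The interaction loop described above (the user fixes a prefix, the system proposes an extension) can be sketched in Python as follows. Note the simplification: the paper searches word hypotheses graphs, whereas this sketch uses a plain scored n-best list as a stand-in. The function name `suggest_extension` and the example hypotheses are invented for illustration.

```python
def suggest_extension(prefix, hypotheses):
    """Propose a completion for the translation prefix typed so far.

    hypotheses: list of (score, translation) pairs, higher score is better.
    Returns the remainder of the best hypothesis that starts with the
    prefix, or None if no hypothesis is compatible with it.
    """
    matching = [(s, t) for s, t in hypotheses if t.startswith(prefix)]
    if not matching:
        return None
    _, best = max(matching, key=lambda st: st[0])
    return best[len(prefix):]


# Invented n-best list for a single source sentence.
hypotheses = [
    (0.6, "the house is small"),
    (0.3, "the home is little"),
    (0.1, "this house is small"),
]

first_proposal = suggest_extension("", hypotheses)
after_typing = suggest_extension("the hom", hypotheses)
no_match = suggest_extension("a", hypotheses)
```

An n-best list can run out of compatible hypotheses, as the last call shows; a graph over word hypotheses represents far more alternatives compactly, which is presumably why the authors use one.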

61 citations

Book Chapter•DOI•

A Hybrid Rule and Example-Based Method for Machine Translation

Francis Bond, Satoshi Shirai

01 Jan 2003

TL;DR: A new example-based method of machine translation in which the examples need not be direct translations, allowing the use of currently available sentence-aligned corpora as data.

Abstract: This paper introduces a new example-based method of machine translation in which the examples need not be direct translations. The system will weed out strange examples during translation, allowing the use of currently available sentence-aligned corpora as data. Rule-based modules are used where appropriate. A prototype Japanese-to-English system has been implemented that allows multiple users to share corpora.

60 citations

Syntax for Statistical Machine Translation

Franz Josef Och, Daniel Gildea, Sanjeev Khudanpur, Kenji Yamada, Alexander Fraser, Shankar Kumar, David A. Smith, Katherine Eng, Viren Jain, Zhen Jin, Dragomir R. Radev

01 Jan 2003

TL;DR: This workshop incrementally adds new features representing syntactic knowledge that deal with specific problems of the underlying baseline, and extends previous tree-based alignment models by allowing partial tree alignments when the two syntactic structures are not isomorphic.

Abstract: In recent evaluations of machine translation systems, statistical systems have outperformed classical approaches based on interpretation, transfer, and generation. Nonetheless, the output of statistical systems often contains obvious grammatical errors. This can be attributed to the fact that the syntactic well-formedness is only influenced by local n-gram language models and simple alignment models. We aim to integrate syntactic structure into statistical models to address this problem. In the workshop we start with a very strong baseline – the alignment template statistical machine translation system that obtained the best results in the 2002 and 2003 DARPA MT evaluations. This model is based on a log-linear modeling framework, which allows for the easy integration of many different knowledge sources (i.e. feature functions) into an overall model and to train the feature function combination weights discriminatively. During the workshop, we incrementally add new features representing syntactic knowledge that deal with specific problems of the underlying baseline. We want to investigate a broad range of possible feature functions, from very simple binary features to sophisticated tree-to-tree translation models. Simple feature functions test if a certain constituent occurs in the source and the target language parse tree. More sophisticated features are derived from an alignment model where whole sub-trees in source and target can be aligned node by node. We also plan to investigate features based on projection of parse trees from one language onto strings of another, a useful technique when parses are available for only one of the two languages. We extend previous tree-based alignment models by allowing partial tree alignments when the two syntactic structures are not isomorphic.
We work with the Chinese-English data from the recent evaluations, as large amounts of sentence-aligned training corpora, as well as multiple reference translations, are available. This will also allow us to compare results with the various systems participating in the evaluations. In addition, an annotated Chinese-English parallel tree-bank is available. We evaluate the improvement of our system using the BLEU metric. Using the additional feature functions developed during the workshop, the BLEU score improved from 31.6% for the baseline MT system to 33.2% using rescoring of a 1000-best list.
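
The log-linear framework and n-best rescoring described above can be illustrated with a short Python sketch: each hypothesis carries feature function values, the model score is their weighted sum, and rescoring picks the highest-scoring entry. The feature names (`lm`, `tm`, `syntax`), the weights, and the two hypotheses are invented for this example and do not come from the workshop.

```python
def rescore(nbest, weights):
    """Return the hypothesis with the highest log-linear score.

    nbest: list of (hypothesis, {feature_name: value}) pairs.
    weights: {feature_name: weight}; missing features get weight 0.
    """
    def score(features):
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())

    return max(nbest, key=lambda item: score(item[1]))[0]


# Invented 2-best list with a hypothetical syntactic feature.
nbest = [
    ("he go home", {"lm": -4.0, "tm": -1.0, "syntax": 0.0}),
    ("he goes home", {"lm": -3.5, "tm": -1.8, "syntax": 1.0}),
]

baseline_choice = rescore(nbest, {"lm": 1.0, "tm": 1.0})
with_syntax_choice = rescore(nbest, {"lm": 1.0, "tm": 1.0, "syntax": 0.5})
```

Adding a feature only changes the ranking if its weight is nonzero, which is why the weights are trained discriminatively rather than set by hand as they are here.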

60 citations

Noun phrase translation

Kevin Knight, Philipp Koehn

01 Jan 2003

Abstract: We define noun phrase translation as a subtask of statistical machine translation. This enables us to build a dedicated noun phrase translation subsystem that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features. We integrate such a system into a state-of-the-art statistical machine translation system with novel methods and show overall improvement in translation quality. We also carry out empirical linguistic studies on noun phrase translatability and the sources of translation errors.

Book Chapter•DOI•

What is Example-Based Machine Translation?

Davide Turcato, Fred Popowich

01 Jan 2003

TL;DR: It is shown that Example-Based Machine Translation, as long as it is linguistically principled, significantly overlaps with other linguistically principled approaches to Machine Translation.

Abstract: We maintain that the essential feature that characterizes a Machine Translation approach and sets it apart from other approaches is the kind of knowledge it uses. From this perspective, we argue that Example-Based Machine Translation is sometimes characterized in terms of nonessential features. We show that Example-Based Machine Translation, as long as it is linguistically principled, significantly overlaps with other linguistically principled approaches to Machine Translation. We make a proposal for translation knowledge bases that make such an overlap explicit. We relate our proposal to translation by analogy, which stands out as an inherently example-based technique.

14. Controlled language for authoring and translation

Eric Nyberg¹, Teruko Mitamura¹, Willem-Olaf Huijsen²•Institutions (2)

Carnegie Mellon University¹, Utrecht University²

28 May 2003

Book Chapter•DOI•

Combining Query Translation and Document Translation in Cross-Language Retrieval

Aitao Chen¹, Fredric C. Gey¹•Institutions (1)

University of California, Berkeley¹

21 Aug 2003

TL;DR: Monolingual, bilingual, and multilingual retrieval experiments on the CLEF 2003 test collection show that document translation-based retrieval is slightly better than query translation-based retrieval, and that combining the two performs better still.

Abstract: This paper describes monolingual, bilingual, and multilingual retrieval experiments using the CLEF 2003 test collection. The paper compares query translation-based multilingual retrieval with document translation-based multilingual retrieval where the documents are translated into the query language by translating the document words individually using machine translation systems or statistical translation lexicons derived from parallel texts. The multilingual retrieval results show that document translation-based retrieval is slightly better than the query translation-based retrieval on the CLEF 2003 test collection. Furthermore, combining query translation and document translation in multilingual retrieval achieves even better performance.

Patent•

System and method for real-time generation of software translation

Ines Antje Dahne-Steuber, Marcos Garcia

23 Sep 2003

TL;DR: A system and method for generating language-translated versions of software includes a parsing engine that scans original-language versions and detects textual strings or other expressions which may require translation for other countries or markets.

Abstract: A system and method for generating language-translated versions of software include a parsing engine to scan original-language versions of software, and detect textual string or other expressions which may require translation for other countries or markets. After testing for prior translation, those strings may be converted to appropriate expressions in other languages, and for instance stored in paired-memory or other format. Users may download the original version of the software, and then install run-time, language-specific resources to tailor the software to their market or country. The run-time, language-specific resources may be or include resource-only dynamic link libraries (dlls). In embodiments the target language into which translation may be made may be automatically detected using the regional settings of the user's machine, or otherwise. Because translation resources for various sets of languages may be generated before the release of the original code, software products may be deployed in various markets and countries at the same time as the original code. Staggered release of localized versions of a software product in one country after the other is therefore not necessary, and software maintenance is made more efficient.

Proceedings Article•

Domain Specific Speech Acts for Spoken Language Translation

Lori Levin, Chad Langley, Alon Lavie, Donna Gates, Dorcas Wallace, Kay Peterson

01 Jan 2003

TL;DR: It is shown that, although domain actions are domain specific, the approach scales up to large domains without an explosion of domain actions and can be coded with high inter-coder reliability across research sites.

Abstract: We describe a coding scheme for machine translation of spoken task-oriented dialogue. The coding scheme covers two levels of speaker intention: domain independent speech acts and domain dependent domain actions. Our database contains over 14,000 tagged sentences in English, Italian, and German. We argue that domain actions, and not speech acts, are the relevant discourse unit for improving translation quality. We also show that, although domain actions are domain specific, the approach scales up to large domains without an explosion of domain actions and can be coded with high inter-coder reliability across research sites. Furthermore, although the number of domain actions is on the order of ten times the number of speech acts, sparseness is not a problem for the training of classifiers for identifying the domain action. We describe our work on developing high accuracy speech act and domain action classifiers, which is the core of the source language analysis module of our NESPOLE machine translation system.

Posted Content•

Anusaaraka: Machine Translation in Stages

Akshar Bharati, Vineet Chaitanya, Amba Kulkarni, Rajeev Sangal

25 Jun 2003-arXiv: Computation and Language

TL;DR: Because the machine cannot at present interpret a general text with sufficient accuracy, let alone re-express it for a given audience, no system qualifies as fully-automatic general-purpose high-quality machine translation (FGH-MT).

Abstract: Fully-automatic general-purpose high-quality machine translation systems (FGH-MT) are extremely difficult to build. In fact, there is no system in the world for any pair of languages which qualifies to be called FGH-MT. The reasons are not far to seek. Translation is a creative process which involves interpretation of the given text by the translator. Translation would also vary depending on the audience and the purpose for which it is meant. This would explain the difficulty of building a machine translation system. Since the machine is not capable of interpreting a general text with sufficient accuracy automatically at present, let alone re-expressing it for a given audience, it fails to perform as FGH-MT. (The major difficulty that the machine faces in interpreting a given text is the lack of general world knowledge or common sense knowledge.)

Proceedings Article•DOI•

Translation spotting for translation memories

Michel Simard¹•Institutions (1)

Université de Montréal¹

31 May 2003

TL;DR: This article examines translation spotting, the task of identifying the target-language words that correspond to a given set of source-language words in a pair of text segments known to be mutual translations, within the context of a sub-sentential translation-memory system, i.e. a translation support tool capable of proposing translations for portions of a source-language sentence, extracted from an archive of existing translations.

Abstract: The term translation spotting (TS) refers to the task of identifying the target-language (TL) words that correspond to a given set of source-language (SL) words in a pair of text segments known to be mutual translations. This article examines this task within the context of a sub-sentential translation-memory system, i.e. a translation support tool capable of proposing translations for portions of a SL sentence, extracted from an archive of existing translations. Different methods are proposed, based on a statistical translation model. These methods take advantage of certain characteristics of the application, to produce TL segments submitted to constraints of contiguity and compositionality. Experiments show that imposing these constraints allows important gains in accuracy, with regard to the most probable alignments predicted by the model.
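
The task definition and the contiguity constraint can be illustrated with a small Python sketch. It assumes a word alignment is already available as a set of links, whereas the paper derives its answers from a statistical translation model; the function name `spot_translation` and the example sentence pair are invented for illustration.

```python
def spot_translation(tgt, alignment, query, contiguous=True):
    """Return the target words corresponding to the queried source positions.

    alignment: set of (source_index, target_index) links (assumed given).
    query: set of source positions whose translation is wanted.
    contiguous: if True, return the smallest unbroken target span that
    covers every linked target word.
    """
    hits = sorted({j for (i, j) in alignment if i in query})
    if not hits:
        return []
    if contiguous:
        hits = list(range(hits[0], hits[-1] + 1))
    return [tgt[j] for j in hits]


# Invented example: "je ne le vois pas" / "I do not see him".
tgt = ["I", "do", "not", "see", "him"]
alignment = {(0, 0), (1, 2), (4, 2), (2, 4), (3, 3)}

negation = spot_translation(tgt, alignment, {1, 4})      # "ne ... pas"
object_verb = spot_translation(tgt, alignment, {2, 3})   # "le vois"
scattered = spot_translation(tgt, alignment, {0, 3}, contiguous=False)
spanned = spot_translation(tgt, alignment, {0, 3})
```

Forcing a contiguous span yields a segment that can be proposed to a translator as a unit, at the cost of sometimes including words that were not linked to the query.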

Proceedings Article•

Speechalator: two-way speech-to-speech translation on a consumer PDA

Alex Waibel, Ahmed Badran, Alan W. Black, Robert E. Frederking, Donna Gates, Alon Lavie, Lori Levin, Kevin A. Lenzo, Laura Mayfield Tomokiyo, Jürgen Reichert, Tanja Schultz, Dorcas Wallace, Monika Woszczyna, Jing Zhang

01 Jan 2003

TL;DR: A working two-way speech-to-speech translation system that runs in near real-time on a consumer handheld computer that can translate from English to Arabic and Arabic to English in the domain of medical interviews is described.

Abstract: This paper describes a working two-way speech-to-speech translation system that runs in near real-time on a consumer handheld computer. It can translate from English to Arabic and Arabic to English in the domain of medical interviews. We describe the general architecture and frameworks within which we developed each of the components: HMM-based recognition, interlingua translation (both rule and statistically based), and unit selection synthesis.

Proceedings Article•DOI•

Chunk-Based Statistical Translation

Taro Watanabe, Eiichiro Sumita, Hiroshi G. Okuno¹•Institutions (1)

Kyoto University¹

07 Jul 2003

TL;DR: An alternative translation model based on text chunks is described within the framework of statistical machine translation; experiments on a broad-coverage Japanese-English travel corpus show improved performance.

Abstract: This paper describes an alternative translation model based on a text chunk under the framework of statistical machine translation. The translation model suggested here first performs chunking. Then, each word in a chunk is translated. Finally, translated chunks are reordered. Under this scenario of translation modeling, we have experimented on a broad-coverage Japanese-English traveling corpus and achieved improved performance.

Book•

Translation: The Interpretive Model

Marianne Lederer

31 Jul 2003

TL;DR: This chapter discusses translation into the foreign language and its cultural adaptation to the reader, as well as the role of language skills in this process.

Abstract: Introduction to English Translation Foreword Part I: The Theoretical Aspects of Translation Chapter 1: Translation through Interpretation 1.1. The three levels of translation 1.2. Interpreting 1.3. The oral and the written 1.4. The oral origins of the interpretive explanation of translation 1.5. What is interpretation? 1.5.1. Deverbalization 1.5.2. Sense 1.5.3. The immediate grasp of sense 1.5.4. Units of sense 1.6. The written form 1.7. Understanding 1.7.1. Understanding the linguistic component 1.7.2. Understanding what is implicit 1.7.3. Cognitive inputs 1.8. Expression 1.8.1. Reverbalization 1.8.2. The verification stage 1.8.3. Identical contents, equivalent forms Chapter 2: Equivalence and correspondence 2.1. Equivalence and correspondence 2.1.1. What is equivalence? 2.1.2. What is correspondence? 2.2. Translation by equivalence 2.2.1. Cognitive equivalence 2.2.2. Affective equivalence 2.2.3. The global nature of equivalence 2.2.4. Explicit or synecdoche 2.2.5. The spirit of a language and the creation of equivalents 2.2.6. How to evaluate equivalence? 2.3. Correspondences which are appropriate when translating texts 2.3.1. Words chosen deliberately 2.3.2. Enumerations 2.3.3. Technical terms 2.3.4. Polysemy and actualization 2.3.5. The various forms of translation by correspondence 2.4. Faithfulness and freedom Chapter 3: Language and Translation 3.1. Linguistics and translation 3.1.1. Structural linguistics 3.1.2. Generative linguistics 3.1.3. Communication and the interactionist approach 3.2. Langue, parole and text: some definitions 3.3. Macro-signs and hypotheses of senses 3.4. Interpretation 3.5. Two demonstrations of interpretation 3.5.1. Interpretation from the actor 3.5.2. Interpretation made explicit Part II: The Practice of Translation Chapter 4: The Practical Problems of Translation 4.1. A few problems observed in practice 4.1.1. The absence of deverbalization 4.1.2. Deverbalization, a methodological issue 4.1.3. The translation unit 4.1.4. Faithfulness 4.1.5. The transfer of culture Chapter 5: Translation and the Teaching of Languages 5.1. The natural tendency of all learners 5.2. Comparative studies and the teaching of translation 5.3. The awkward position of translation 5.4. Translation into the foreign language (thème) and translation into the mother tongue (version) 5.4.1. Translation into the foreign language (thème) 5.4.2. Translation into the mother tongue (version) 5.5. How to improve the language skills of the would-be-translator 5.5.1. The language skills course 5.5.2. The self-study brochure 5.6. The teaching of translation Chapter 6: Translation into the Foreign Language 6.1. Into which language should one translate? 6.2. The limits of translation into the foreign language 6.3. Acceptability in translation 6.3.1. The complementarity between the specialist reader and the foreign language translation 6.3.2. Foreign language translation and its cultural adaptation to the reader 6.3.3. The general public and translation into a foreign language Chapter 7: Machine Translation versus Human Translation 7.1. An historical overview of machine translation 7.2. Machine translation today 7.2.1. Fully automatic machine translation 7.2.2. Human intervention 7.3. How the machine understands languages 7.3.1. Lexical data 7.3.2. Transformational rules 7.3.3. Parsing 7.4. Comparing humans and machines 7.4.1. The differences 7.4.2. The similarities 7.4.3. Real world knowledge and contextual knowledge 7.5. Machines move closer to humans 7.5.1. Knowledge bases 7.5.2. Neural networks 7.6. Machine-aided human translation Afterword Appendix 1 Cannery Row Appendix 2 The Woman behind the Woman

Journal Article•DOI•

Machine Translation and Global English

Rita Raley

25 Nov 2003-Yale Journal of Criticism

TL;DR: This essay argues for a homology between machine translation and global English: both exist in the technocratic mode and abide by the principle of instrumental rationality. It also holds that Basic language, with its privileging of communicability and immediate legibility, is the precondition for the global network of programming languages.

Abstract: This essay critiques Warren Weaver's articulation of machine translation as a problem of cryptography and his analogizing of the treatment of language within the context of machine translation to C. K. Ogden and I. A. Richards's Basic English project. Basic language, with its privileging of communicability and immediate legibility, is the precondition for the global network of programming languages. Focusing on the underlying principles of machine translation, functionality, and performativity, this essay argues for a homology between machine translation and global English: both exist in the technocratic mode and abide by the principle of instrumental rationality.

Dissertation•DOI•

Metrics for Evaluating Translation Memory Software

Francie Gow

01 Jan 2003

TL;DR: Translation memory tools help human translators recycle portions of their previous work by storing previously translated material, with segments of the source texts linked to their equivalents in the target texts.

Acknowledgments: First and foremost, I would like to thank my thesis supervisor Dr. Lynne Bowker for her prompt and insightful feedback, her talent for transforming the overwhelming into the manageable and her perpetual good nature. Working with you has truly been one of the highlights of this whole adventure! Thanks to Dr. Ingrid Meyer for steering me toward graduate work in the first place and for awakening my interest in translation technology. Thanks also to Lucie Langlois, who helped me define my project in its earliest stages and who has been providing me with interesting work experience and useful contacts ever since. Association of Translators and Interpreters of Ontario (ATIO) generously provided me with financial support. The Translation Bureau has been very supportive of my research by providing me with access to software, hardware and technical support. I am especially grateful for being granted access to the Central Archiving System, which allowed me to build the corpora I needed to test my methodology. Another benefit of my time at the Translation Bureau was the opportunity to work with André Guyon of IT Strategies, who was very generous with his knowledge of both translation memory and evaluation. Our exchanges of ideas in the later stages of the project stimulated me to solve problems creatively and certainly resulted in an improved methodology. Last but not least, many thanks to my friends and family who have been very supportive during the past two years. Thanks to Vanessa for walking this path ahead of me and letting me know some of what to expect. Thanks to my fellow translation students for walking this path alongside me. Thanks to my parents and sisters in Newfoundland for listening to my whoops of joy and wails of despair whenever I needed an ear. Finally, thanks to my "Ottawa family", Inge, Julien, Anne, Alphonse and their (our) friends, for keeping me well fed and (relatively) well adjusted all this time. Additional thanks go to Alphonse for translating my abstract into French. I am privileged to have you all in my life; thanks for the "memories"!

Abstract: Translation memory (TM) tools help human translators recycle portions of their previous work by storing previously translated material. This material is aligned, i.e. segments of the source texts are linked with their equivalents in the corresponding target texts. When a translator uses a TM …

Proceedings Article•DOI•

Speechalator: two-way speech-to-speech translation in your hand

Alex Waibel¹, Ahmed Badran¹, Alan W. Black¹, Robert E. Frederking¹, Donna Gates¹, Alon Lavie¹, Lori Levin¹, Kevin A. Lenzo, Laura Mayfield Tomokiyo, Juergen Reichert, Tanja Schultz¹, Dorcas Wallace¹, Monika Woszczyna, Jing Zhang•Institutions (1)

Carnegie Mellon University¹

27 May 2003

TL;DR: The development of the Speechalator software-based translation system required addressing a number of hard issues, including a new language for the team, close integration on a small device, computational efficiency on a limited platform, and scalable coverage for the domain.

Abstract: This demonstration involves two-way automatic speech-to-speech translation on a consumer off-the-shelf PDA. This work was done as part of the DARPA-funded Babylon project, investigating better speech-to-speech translation systems for communication in the field. The development of the Speechalator software-based translation system required addressing a number of hard issues, including a new language for the team (Egyptian Arabic), close integration on a small device, computational efficiency on a limited platform, and scalable coverage for the domain.

Proceedings Article•DOI•

A corpus-centered approach to spoken language translation

Eiichiro Sumita, Yasuhiro Akiba, Takao Doi, Andrew Finch, Kenji Imamura, Michael Paul, Mitsuo Shimohata, Taro Watanabe

12 Apr 2003

TL;DR: The latest performance of the components and features of the Corpus-Centered Computation (C3) project, which targets a translation technology suitable for spoken language translation, is reported.

Abstract: This paper reports the latest performance of components and features of a project named Corpus-Centered Computation (C3), which targets a translation technology suitable for spoken language translation. C3 places corpora at the center of the technology. Translation knowledge is extracted from corpora by both EBMT and SMT methods, translation quality is gauged by referring to corpora, the best translation among multiple-engine outputs is selected based on corpora and the corpora themselves are paraphrased or filtered by automated processes.

Patent•

Real-time generation of software translation

Ines Antje Dahne-Steuber, Marcos Garcia

23 Sep 2003

TL;DR: A parsing engine scans original-language versions of software and detects textual strings or other expressions that may require translation for other countries or markets. After testing for prior translation, those strings may be converted to appropriate expressions in other languages and, for instance, stored in paired-memory or other format.

Abstract: Generating language-translated versions of software includes using a parsing engine to scan original-language versions of software and detect textual strings or other expressions which may require translation for other countries or markets. After testing for prior translation, those strings may be converted to appropriate expressions in other languages and, for instance, stored in paired-memory or other format. Users may download the original version of the software, and then install run-time, language-specific resources to tailor the software to their market or country. The run-time, language-specific resources may be or include resource-only dynamic link libraries (DLLs). In embodiments, the target language into which translation may be made may be automatically detected using the regional settings of the user's machine, or otherwise. Because translation resources for various sets of languages may be generated before the release of the original code, software products may be deployed in various markets and countries at the same time as the original code. Staggered release of localized versions of a software product in one country after the other is therefore not necessary, and software maintenance is made more efficient.
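
The workflow this abstract outlines (scan for strings, test for prior translation, store pairs, pick a target language from regional settings) can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular expression, the `PAIRED_MEMORY` table, its sample entries and all function names are assumptions made up for illustration.

```python
import re

# Hypothetical paired memory: source string -> {language code: translation}.
# The entries are invented for this example.
PAIRED_MEMORY = {
    "Open file": {"de": "Datei öffnen", "es": "Abrir archivo"},
}

# Simplified pattern for double-quoted string literals (an assumption; a real
# parsing engine would understand the syntax of the source language).
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def extract_strings(source_code):
    """Return the distinct non-empty string literals in source order."""
    seen = []
    for match in STRING_LITERAL.finditer(source_code):
        text = match.group(1)
        if text and text not in seen:
            seen.append(text)
    return seen


def build_resources(source_code, language, memory=PAIRED_MEMORY):
    """Split extracted strings into already-translated and pending ones."""
    translated, pending = {}, []
    for text in extract_strings(source_code):
        target = memory.get(text, {}).get(language)
        if target is None:
            pending.append(text)      # no prior translation found
        else:
            translated[text] = target  # reuse the stored pair
    return translated, pending


def pick_language(regional_setting, available, default="en"):
    """Map a regional setting such as 'de_DE' to an available language code."""
    code = regional_setting.split("_")[0].lower()
    return code if code in available else default
```

Calling `build_resources(source, pick_language("de_DE", {"de", "es"}))` would reuse stored German strings and return the remainder as a list still awaiting translation.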

Proceedings Article•DOI•

What's new in statistical machine translation

Kevin Knight¹, Philipp Koehn¹•Institutions (1)

Information Sciences Institute¹

27 May 2003

TL;DR: A technical overview of the state of the art in statistical machine translation, a field in which several projects have appeared in North America, Europe, and Asia and whose literature is growing substantially.

Abstract: Automatic translation from one human language to another using computers, better known as machine translation (MT), is a long-standing goal of computer science. Accurate translation requires a great deal of knowledge about the usage and meaning of words, the structure of phrases, the meaning of sentences, and which real-life situations are plausible. For general-purpose translation, the amount of required knowledge is staggering, and it is not clear how to prioritize knowledge acquisition efforts. Recently, there has been a fair amount of research into extracting translation-relevant knowledge automatically from bilingual texts. In the early 1990s, IBM pioneered automatic bilingual-text analysis. A 1999 workshop at Johns Hopkins University saw a re-implementation of many of the core components of this work, aimed at attracting more researchers into the field. Over the past years, several statistical MT projects have appeared in North America, Europe, and Asia, and the literature is growing substantially. We will provide a technical overview of the state-of-the-art.

3. Translation memory systems

Harold L. Somers¹•Institutions (1)

University of Manchester¹

28 May 2003

Proceedings Article•DOI•

TotalRecall: A Bilingual Concordance for Computer Assisted Translation and Language Learning

Jian-Cheng Wu¹, Kevin C. Yeh¹, Thomas C. Chuang, Wen-Chi Shei¹, Jason S. Chang¹•Institutions (1)

National Tsing Hua University¹

07 Jul 2003

TL;DR: A Web-based English-Chinese concordance system developed to promote translation reuse and encourage authentic and idiomatic use in second language writing and to provide high-precision bilingual alignment on the sentence, phrase and word levels is described.

Abstract: This paper describes a Web-based English-Chinese concordance system, TotalRecall, developed to promote translation reuse and encourage authentic and idiomatic use in second language writing. We exploited and structured existing high-quality translations from the bilingual Sinorama Magazine to build the concordance of authentic text and translation. Novel approaches were taken to provide high-precision bilingual alignment on the sentence, phrase and word levels. A browser-based user interface (UI) was also developed for ease of access over the Internet. Users can search for a word, phrase or expression in English or Chinese. The Web-based user interface facilitates the recording of user actions to provide data for further research.
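
As a rough illustration of the bilingual lookup described in this abstract, the Python sketch below searches a sentence-aligned corpus in either language and records each query. It does not reproduce TotalRecall's alignment methods or interface. The sample sentence pairs, the `ALIGNED` list and the `concordance` function are invented for this example, and plain substring matching stands in for whatever retrieval the real system uses.

```python
# Hypothetical sentence-aligned corpus: (English sentence, Chinese sentence).
# These pairs are made up for illustration, not taken from Sinorama Magazine.
ALIGNED = [
    ("The library opens at nine.", "圖書館九點開門。"),
    ("She borrowed a book from the library.", "她從圖書館借了一本書。"),
    ("The market closes early.", "市場很早關門。"),
]


def concordance(query, pairs=ALIGNED, query_log=None):
    """Return every aligned pair whose English or Chinese side contains query.

    English matching is case-insensitive; Chinese matching is a plain
    substring test. If query_log is a list, the query is appended to it,
    mimicking the recording of user actions for later study.
    """
    if query_log is not None:
        query_log.append(query)
    lowered = query.lower()
    return [
        (en, zh)
        for en, zh in pairs
        if lowered in en.lower() or query in zh
    ]
```

A call such as `concordance("library")` returns each matching sentence together with its aligned translation, which is the basic behaviour a translator or language learner would rely on when checking usage.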

Journal Article•

Translation and Notes

LI Si-long

01 Jan 2003-Journal of Social Science of Jiamusi University

TL;DR: Three kinds of notes (within sentences or phrases, after sentences, and after texts) are discussed, and their uses in translation are illustrated with a large number of examples.

Abstract: Linguistic and cultural factors inevitably create impediments when translating from the source language to the target language, a process in which the "note" plays a typical role. After dividing notes into three kinds (within sentences or phrases, after sentences, and after texts), the paper discusses their uses in translation with a large number of examples.
