Showing papers on "Computer-assisted translation" published in 1999

Open Access

Proceedings Article•DOI•

Cross-language information retrieval based on parallel texts and automatic mining of parallel texts from the Web

Jian-Yun Nie¹, Michel Simard¹, Pierre Isabelle¹, Richard Durand¹•Institutions (1)

01 Aug 1999

TL;DR: Using a probabilistic translation model, the authors obtain CLIR performance close to that of an MT system, and they investigate automatically gathering parallel texts from the Web to construct a reasonable training corpus.

Abstract: This paper describes the use of a probabilistic translation model for cross-language IR (CLIR). The performance of this approach is compared with that using machine translation (MT). It is shown that using a probabilistic model, we are able to obtain performances close to those using an MT system. In addition, we also investigated the possibility of automatically gathering parallel texts from the Web in an attempt to construct a reasonable training corpus. The result is very encouraging. We showed that in several tests, such a training corpus is as good as a manually constructed one for CLIR purposes.
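
As a rough illustration of query translation with a probabilistic translation model, the sketch below expands each source-language query term into weighted target-language terms. It is a minimal sketch under stated assumptions, not the authors' system: the table `TRANSLATION_TABLE`, its probabilities, and the function `translate_query` are all hypothetical and invented for this example.

```python
from collections import defaultdict

# Hypothetical table of P(french_word | english_word).
# The values are invented for illustration; they do not come from the paper.
TRANSLATION_TABLE = {
    "drug": {"drogue": 0.6, "médicament": 0.4},
    "traffic": {"trafic": 0.7, "circulation": 0.3},
}


def translate_query(query_terms, table, top_n=3):
    """Return the top_n target-language terms with weights that sum to 1."""
    weights = defaultdict(float)
    for term in query_terms:
        # Terms missing from the table contribute nothing.
        for target, prob in table.get(term, {}).items():
            weights[target] += prob
    # Sort by descending weight, then alphabetically for a stable order.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    total = sum(weight for _, weight in ranked)
    if total == 0:
        return []
    return [(target, weight / total) for target, weight in ranked]
```

The weighted target terms could then be passed to a monolingual retrieval engine as a weighted query; how the paper combines the weights with retrieval scores is not stated in the abstract.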

332 citations

The World Wide Web as a Resource for Example-Based Machine Translation Tasks

Gregory Grefenstette

01 Jan 1999

TL;DR: This article illustrates this by showing that an Example-Based approach to lexical choice for machine translation can use the Web as an adequate and free resource.

Abstract: The WWW is two orders of magnitude larger than the largest corpora. Although noisy, web text presents language as it is used, and statistics derived from the Web can have practical uses in many NLP applications. For this reason, the WWW should be seen and studied as any other computationally available linguistic resource. In this article, we illustrate this by showing that an Example-Based approach to lexical choice for machine translation can use the Web as an adequate and free resource.

175 citations

Proceedings Article•DOI•

Should we Translate the Documents or the Queries in Cross-language Information Retrieval?

J. Scott McCarley¹•Institutions (1)

IBM¹

20 Jun 1999

TL;DR: This work investigates information retrieval between English and French, incorporating both translation directions into both document translation and query translation-based information retrieval, as well as into hybrid systems.

Abstract: Previous comparisons of document and query translation suffered difficulty due to differing quality of machine translation in these two opposite directions. We avoid this difficulty by training identical statistical translation models for both translation directions using the same training data. We investigate information retrieval between English and French, incorporating both translation directions into both document translation and query translation-based information retrieval, as well as into hybrid systems. We find that hybrids of document and query translation-based systems out-perform query translation systems, even human-quality query translation systems.

135 citations

Book•

Translation Engines: Techniques for Machine Translation

Arturo Trujillo

08 Oct 1999

TL;DR: This book covers machine-aided translation and machine translation, including computational linguistics techniques such as computational morphology and the two-level model, transfer and interlingua MT, and other approaches to MT.

Abstract (table of contents):
Part 1, Background. 1 Introduction: 1.1 Computers in Translation; 1.2 History of Machine Translation; 1.3 Strategies for Machine Translation; 1.4 Artificial Intelligence; 1.5 Conclusion. 2 Basic Terminology and Background: 2.1 Linguistics; 2.2 Formal Background; 2.3 Review of Prolog; 2.4 Conclusion.
Part 2, Machine-Aided Translation. 3 Text Processing: 3.1 Format Preservation; 3.2 Character Sets and Typography; 3.3 Input Methods; 3.4 Conclusion. 4 Translator's Workbench and Translation Aids: 4.1 Translator's Workbench; 4.2 Translation Memory; 4.3 Bilingual Alignment; 4.4 Subsentential Alignment; 4.5 Conclusion.
Part 3, Machine Translation. 5 Computational Linguistics Techniques: 5.1 Introduction; 5.2 Computational Morphology and the Two-level Model; 5.3 Syntactic Analysis; 5.4 Parsing; 5.5 Generation; 5.6 Conclusion. 6 Transfer Machine Translation: 6.1 Syntactic Transfer MT; 6.2 Semantic Transfer MT; 6.3 Lexicalist MT; 6.4 Conclusion. 7 Interlingua Machine Translation: 7.1 Lexical Conceptual Structure MT; 7.2 Knowledge-Based Machine Translation; 7.3 Conclusion. 8 Other Approaches to MT: 8.1 Example-Based Machine Translation; 8.2 Statistical Machine Translation; 8.3 Minimal Recursion Semantics; 8.4 Constraint Systems; 8.5 Conclusion.
Part 4, Common Issues. 9 Disambiguation: 9.1 POS Tagging; 9.2 Disambiguation of Syntactic Analysis; 9.3 Word Sense Disambiguation; 9.4 Transfer Disambiguation; 9.5 Conclusion. 10 Evaluation: 10.1 Evaluation Participants; 10.2 Evaluation Strategies; 10.3 Quality Measures; 10.4 Software Evaluation; 10.5 Software User Needs; 10.6 Conclusion. 11 Conclusion: 11.1 Trends; 11.2 Further Reading.
Appendix: Useful Resources.

119 citations

Understanding and enhancing translation by parallel text processing

Magnus Merkel

01 Jan 1999

Abstract: In recent years the fields of translation studies, natural language processing and corpus linguistics have come to share one object of study, namely parallel text corpora, and more specifically tra ...

118 citations

Proceedings Article•

ALE for speech: a translation prototype.

Gerald Penn¹, Bob Carpenter•Institutions (1)

University of Tübingen¹

01 Jan 1999

TL;DR: The Attribute Logic Engine (ALE) is described, along with enhancements that enable it to serve as a complete grammatical infrastructure for applications such as spoken language translation.

Abstract: In this paper, we describe The Attribute Logic Engine (ALE) and enhancements to it that enable it to serve as a complete grammatical infrastructure for applications such as spoken language translation. We indicate how ALE was expanded and combined with off-the-shelf speech components to develop an application that translates English speech to German speech. The translation operates by way of a semantic representation based on typed feature structures, with information on thematic roles (“who did what to whom”) and agreement information that can be used to guide search in less restricted domains and map expressions to more felicitous (but semantically equivalent) constructions in the target language than a more literal, surface-oriented method would admit.

114 citations

Patent•

Translation system and a multifunction computer, particularly for treating texts and translation on paper

D Agostini Giovanni

19 Feb 1999

TL;DR: An interactive paper translation using alternate dictionaries includes: a first storage for storing words and strings of more words with respective correct translations so that it forms a dictionary of words and sentences or sentence portions.

Abstract: An interactive paper translation using alternate dictionaries includes: a first storage for storing words and strings of more words with respective correct translations so that it forms a dictionary of words and sentences or sentence portions; a second receiver for receiving a text to be translated; a third storage for storing the translated text in the second screen field; and a fourth searcher for searching in progression for the words of the text to be translated. The method compares translated words with the words of the first storage to obtain a progressive translation and forms a completely automatic translation or an interactive translation or vice versa, before beginning the translation. During the option of interactive translation, there are further displays and windows. The method may also involve a scanner integrated with OCR for direct loading of the sheets to be translated.

52 citations

Controlled Language for Multilingual Machine Translation

Teruko Mitamura¹•Institutions (1)

Carnegie Mellon University¹

13 Sep 1999

TL;DR: An overview of the issues in designing a controlled language, the implementation of a controlled language checker, and the deployment of KANT Controlled English for multilingual machine translation is presented.

Abstract: In this paper, we present an overview of the issues in designing a controlled language, the implementation of a controlled language checker, and the deployment of KANT Controlled English for multilingual machine translation. We also discuss some success criteria for introducing controlled language. Finally, future vision of KANT controlled language development is discussed.

51 citations

Inducing Translation Templates for Example-Based Machine Translation

Michael Carl¹•Institutions (1)

Saarland University¹

13 Sep 1999

TL;DR: An example-based machine translation (EBMT) system which relies on various knowledge resources is described, and the possibilities and limits of the translation template induction process are investigated.

Abstract: This paper describes an example-based machine translation (EBMT) system which relies on various knowledge resources. Morphologic analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples serve the decomposition of the text to be translated and determine the transfer of lexical values into the target language. Translation templates determine the word order of the target language and the type of phrases (e.g. noun phrase, prepositional phrase, ...) to be generated in the target language. An induction mechanism generalizes translation templates from translation examples. The paper outlines the basic idea underlying the EBMT system and investigates the possibilities and limits of the translation template induction process.

51 citations

The development and use of machine translation systems and computer-based translation tools

John Hutchins

01 Jan 1999

TL;DR: This survey of the present demand and use of computer-based translation software concentrates on systems designed for the production of translations of publishable quality, including developments in controlled language systems, translator workstations, and localisation.

Abstract: This survey of the present demand and use of computer-based translation software concentrates on systems designed for the production of translations of publishable quality, including developments in controlled language systems, translator workstations, and localisation; but it covers also the developments of software for non-translators, in particular for use with Web pages and other Internet applications, and it looks at future needs and systems under development. The final section compares the types of translations that can be met most appropriately by human and by machine (and computer-aided) translation respectively.

47 citations

Patent•

Translation supporting apparatus and method and computer-readable recording medium, wherein a translation example useful for the translation task is searched out from within a translation example database

Shinichi Ando¹, Kiyoshi Yamabana¹, Sato Kenji¹•Institutions (1)

NEC¹

27 Dec 1999

TL;DR: A translation supporting apparatus is disclosed that searches a translation example database for examples useful to a translation task; the database stores character strings of a first language and their second-language translations in units of a document.

Abstract: A translation supporting apparatus which searches out a translation example useful for a translation task from within a translation example database is disclosed. The translation example database stores character strings of a first language and translation results of a second language corresponding to the character strings in a unit of a document. A retrieval request inputting apparatus inputs a translation target sentence. A similarity retrieval apparatus determines, for each translation example, a similarity to the translation target sentence, a similarity to a translation example context which is another translation example having such a predetermined relationship that it is included in the same document and is present within one sentence before or after the translation example, a similarity to a retrieval request context which is another translation target character string having such a predetermined relationship that it is included in the same document as the translation target character string and is present within the range of one sentence before or after the translation target character string, and a similarity between the translation example context and the retrieval request context, and integrates the four similarities. A similar example outputting apparatus refers to the integrated similarities and outputs those translation examples similar to the translation target character string.
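
The integration of four similarities described above can be sketched as a weighted sum. This is only an illustrative reading of the abstract, not the patented method: the pairing of the four comparisons, the character-level measure from Python's `difflib`, the weights, and the names `sim` and `integrated_similarity` are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def sim(a, b):
    """Character-level similarity in [0, 1]; a stand-in for the patent's measure."""
    return SequenceMatcher(None, a, b).ratio()


def integrated_similarity(example, example_ctx, request, request_ctx,
                          weights=(0.4, 0.2, 0.2, 0.2)):
    """Combine four similarities into one score (weights are hypothetical)."""
    scores = (
        sim(request, example),          # target sentence vs. example
        sim(request, example_ctx),      # target sentence vs. example context
        sim(request_ctx, example),      # request context vs. example
        sim(request_ctx, example_ctx),  # request context vs. example context
    )
    return sum(w * s for w, s in zip(weights, scores))


def best_example(examples, request, request_ctx):
    """Pick the (text, context) pair with the highest integrated score."""
    return max(
        examples,
        key=lambda ex: integrated_similarity(ex[0], ex[1], request, request_ctx),
    )
```

Here each example is a `(text, context)` pair, where the context stands for a neighbouring sentence from the same document.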

Proceedings Article•

Preliminary experience with the use of the UQBT binary translation framework

Cristina Cifuentes, Mike Van Emmerik, David Ung, Doug Simon, Trent Waddington (+1 more)

01 Jan 1999

Example-Based Machine Translation: An Adaptation-Guided Retrieval Approach

Brona Collins

01 Sep 1999

TL;DR: This thesis advances the state of the art in example-based machine translation by proposing techniques for predicting the adaptation requirements of a retrieval episode, and a new EBMT scheme is proposed in which the cases encode knowledge about their own reusability, determined by cross-linguistic mappings.

Abstract: Example-Based Machine Translation. Bróna Collins. Supervisor: Pádraig Cunningham. Translation can be viewed as a problem-solving process where a source language text is transformed into its target language equivalent. A machine translation system, solving the problem from first-principles, requires more knowledge than has ever been successfully encoded in any system. An alternative approach is to reuse past translation experience encoded in a set of exemplars, or cases. A case which is similar to the input problem will be retrieved and a solution produced by adapting its target language component. This thesis advances the state of the art in example-based machine translation by proposing techniques for predicting the adaptation requirements of a retrieval episode. An Adaptation-Guided Retrieval policy increases the efficiency of the retriever, which will now search for adaptable cases, and relieves the knowledge-acquisition bottleneck of the adaptation component. A flexible case-storage scheme also allows all knowledge required for adaptation to be deduced from the case-base itself. The first part of the thesis contrasts such a CBR-motivated approach with current EBMT systems which are either data-intensive or knowledge-intensive. A new EBMT scheme is proposed in which the cases encode knowledge about their own reusability, determined by cross-linguistic mappings. The information allows cases to be generalised carefully, to the degree that is necessitated by the data. Linguistic and translational divergences (the obstacles to reusability) are investigated in the domain of software-manual translation, and on this basis, a suitable case representation scheme is proposed. The second and third parts of the thesis describe the on-line and off-line processes of an EBMT system in which the case-base is the only knowledge source. Cases are deduced from texts automatically, and at run-time, the matching and retrieval tasks exploit the adaptability information in the cases in order to maximise coverage without compromising on accuracy. The multi-tiered case representation scheme allows adaptation at the sub-sentential and word levels, when necessary. The general performance of the system is shown to degrade gracefully and to improve as the case-base size increases.

Super-function based machine translation

Ren Fuji

01 Jan 1999

TL;DR: Safety apparatus intended to safeguard the operation of a machine and to discourage overloading of an electric motor powering the machine.

Abstract: Safety apparatus intended to safeguard the operation of a machine and to discourage overloading of an electric motor powering the machine. The machine is one which requires the operator to wear an eye or face protector and incorporated in the protector is switch means connected by an electric circuit to the machine motor. Only when the protector is in normal position of use will the machine operate or the work be properly illuminated by a floodlight also forming part of the circuit. A heat sensitive switch in the circuit aids in preventing the operator from overloading the machine motor.

Example-based machine translation based on the synchronous SSTC annotation schema

Mosleh Hmoud Al-Adhaileh¹, Tang Enya Kong•Institutions (1)

Universiti Sains Malaysia¹

13 Sep 1999

TL;DR: An example-based machine translation (EBMT) system for English-Malay translation is described; it relies solely on example translations kept in a Bilingual Knowledge Bank (BKB).

Abstract: In this paper, we describe an Example-Based Machine Translation (EBMT) system for English-Malay translation. Our approach is an example-based approach which relies solely on example translations kept in a Bilingual Knowledge Bank (BKB). In our approach, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is used to annotate both the source and target sentences of a translation pair. Each SSTC describes a sentence, a representation tree as well as the correspondences between substrings in the sentence and subtrees in the representation tree. With both the source and target SSTCs established, a translation example in the BKB can then be represented effectively in terms of a pair of synchronous SSTCs. In the process of translation, we first try to build the representation tree for the source sentence (English) based on the example-based parsing algorithm as presented in [1]. By referring to the resultant source parse tree, we then proceed to synthesize the target sentence (Malay) based on the target SSTCs as pointed to by the synchronous SSTCs which encode the relationship between source and target SSTCs.

Book•

Incremental Speech Translation

Jan W. Amtrup¹•Institutions (1)

New Mexico State University¹

01 Jan 1999

TL;DR: This book covers graph theory and natural language processing, unification-based formalisms for translation in natural language processing, and the structure and implementation of MILC.

Abstract: Graph Theory and Natural Language Processing.- Unification-Based Formalisms for Translation in Natural Language Processing.- MILC: Structure and Implementation.- Experiments and Results.- Conclusion and Outlook.

Computer Assisted Translation System- An Indian Perspective

Hemant Darbari

13 Sep 1999

TL;DR: C-DAC took up this challenge, as it felt that India, being a multi-lingual and multi-cultural country with a population of approximately 950 million people and 18 constitutionally recognized languages, needs a translation system for instant transfer of information and knowledge.

Abstract: Work in the area of Machine Translation has been going on for several decades and it was only during the early 90s that a promising translation technology began to emerge with advanced research in the field of Artificial Intelligence and Computational Linguistics. This held the promise of successfully developing usable Machine Translation Systems in certain well-defined domains. C-DAC took up this challenge, as we felt that India, being a multi-lingual and multi-cultural country with a population of approximately 950 million people and 18 constitutionally recognized languages, needs a translation system for instant transfer of information and knowledge. The other groups who are working in this area of English to Hindi Translation are National Center for Software Technology (NCST), who are working on translation of News Stories, and Electronics Research & Development Center of India (ER & DCI), who have developed the Machine Assisted Translation System for the Health Domain. A major project on Indian Languages to Indian Languages Translation (Anusaaraka) is also under development at University of Hyderabad.

Patent•

Automatic translation of text files during assembly of a computer system

Robert G. Nadon, John C. Nunn

30 Nov 1999

TL;DR: A translation script operates to select a translation routine from a set of available translation routines, the selection being based on the nature of the text file, the operating system, and the desired language translation.

Abstract: A method of providing a desired language version of textual portions of a source code program for a computer system. During the system assembly process, a system description record (SDR) is read that identifies the operating system, including the desired language version thereof, and other software programs. A text file corresponding to at least one of the programs is read and a native-language version of the program is installed on the computer system. A translation script operates to select a translation routine from a set of available translation routines, the selection being based on the nature of the text file, the operating system, and the desired language translation. The translation routine locates native-language text strings in the text file and substitutes the desired language translations of those strings. The translation process takes place substantially concurrently with installation of the program in the computer system.
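
The routine-selection and string-substitution steps described above can be sketched as a small dispatch table. This is a simplified illustration rather than the patented process: it selects a routine by file type only (the patent also considers the operating system and target language), and the routine names, the file types `"ini"` and `"script"`, and the glossary are hypothetical.

```python
import re


def translate_ini(text, glossary):
    """Translate only the value side of key=value lines."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and value in glossary:
            line = f"{key}={glossary[value]}"
        out.append(line)
    return "\n".join(out)


def translate_quoted(text, glossary):
    """Translate double-quoted strings, leaving unknown strings unchanged."""
    def substitute(match):
        original = match.group(1)
        return '"' + glossary.get(original, original) + '"'
    return re.sub(r'"([^"]*)"', substitute, text)


# Hypothetical mapping from the nature of the text file to a routine.
ROUTINES = {"ini": translate_ini, "script": translate_quoted}


def translate_file(text, file_type, glossary):
    """Select a routine for the file type and apply it to the text."""
    return ROUTINES[file_type](text, glossary)
```

Keeping one routine per file format means each routine only needs to know where translatable strings can appear in that format, so keys, commands, and other non-text content are left untouched.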

A Valency Dictionary Architecture for Machine Translation

Francis Bond

01 Jan 1999

TL;DR: This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly.

Abstract: This research is aimed at developing a valency dictionary architecture to comprehensively list the full range of alternations associated with a given predicate sense, both efficiently and robustly. The architecture is designed to incorporate all information available in current on-line resources, as well as additional features such as argument status, grammatical relations, and an augmented case-role representation. Words are divided into senses, which are distinguished on semantic grounds, depending on the core lexical meaning of the verb. Each sense may have one or more alternations, thus keeping the number of senses manageable, while allowing for systematic variation in the lexical realization. Individual syntactic case frames are indexed back to the basic semantic argument component of the given predicate sense.

Controlled Languages for Machine Translation: State of the Art

Hiroyuki Kaji

13 Sep 1999

TL;DR: A controlled language brings out the maximum performance of machine translation systems at the cost of the burden on source text authors, so the controlled language approach is suitable for translation for dissemination of information.

Abstract: A controlled language is a subset of a natural language with artificially restricted vocabulary, grammar, and style. Texts written in a controlled language are usually less complex and less ambiguous than those written in an uncontrolled language. The use of a controlled language therefore produces better results in machine translation. On the other hand, a controlled language reduces the power of expression and decreases the writing speed. In short, a controlled language brings out the maximum performance of machine translation systems at the cost of the burden on source text authors. So the controlled language approach is suitable for translation for dissemination of information. And a controlled language becomes more beneficial when texts are translated into multiple target languages. We should note the distinction between a controlled language and a sublanguage, which are sometimes confused. The term ‘sublanguage’, which means literally a subset of a language, is used when focus is put on a language used in a specific domain (for example, weather forecasting) rather than on the whole of a language. ‘Sublanguage’ does not imply artificially imposed restrictions. We should also mention pre-editing. Pre-editing is a form of human assistance in machine translation. It includes not only rewriting a source text but also inserting special symbols or tags within the text. Pre-editing is not always done by the authors of source texts, but the controlled language is originally expected to be used by the authors themselves.
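
To make the idea of artificially restricted vocabulary and style concrete, here is a minimal sketch of a controlled-language check. It is an illustration only: the function `check_sentence`, the word-length limit, and the tokenization rule are assumptions for this example and do not describe any particular controlled language or checker.

```python
import re


def check_sentence(sentence, vocabulary, max_words=20):
    """Return a list of rule violations for one sentence (empty if compliant)."""
    # Naive tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", sentence.lower())
    issues = [f"unapproved word: {w}" for w in words if w not in vocabulary]
    if len(words) > max_words:
        issues.append(f"sentence too long: {len(words)} words")
    return issues
```

A real checker would also enforce grammar restrictions, which need syntactic analysis and are beyond this sketch.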

Journal Article•DOI•

Semantic processing performance of Internet machine translation systems

Paul A. Watters¹, Malti Patel¹•Institutions (1)

Macquarie University¹

01 May 1999-Internet Research

TL;DR: An iterative paradigm is used to examine errors associated with interlingual divergence in meaning arising from the automated machine translation of English proverbs, and the need for Web‐based translation systems that have an explicit cross‐linguistic representation of meaning for successful intercultural communication is discussed.

Abstract: The Internet has the potential to facilitate understanding across cultures and languages by removing the physical barriers to intercultural communication. One possible contributor to this development has been the recent release of freely‐available automated direct machine translation systems, such as AltaVista with SYSTRAN, which translates from English to five other European languages (French, German, Italian, Spanish and Portuguese), and vice versa. However, concerns have recently been raised over the performance of these systems, and the potential for confusion that can be created when the intended meaning of sentences is not correctly translated (i.e. semantic processing errors). In this paper, we use an iterative paradigm to examine errors associated with interlingual divergence in meaning arising from the automated machine translation of English proverbs. The need for the development of Web‐based translation systems, which have an explicit cross‐linguistic representation of meaning for successful intercultural communication, is discussed.

A Building Blocks Approach to Translation Memory

Kevin McTait¹, Arturo Trujillo¹, Maeve Olohan•Institutions (1)

University of Manchester¹

01 Jan 1999

TL;DR: A building blocks approach (a term borrowed from the theoretical framework discussed in Lange et al (1997)), is advantageous in that it extracts fragments of text, from a traditional TM database, that more closely represent those with which a human translator works.

Abstract: Traditional Translation Memory systems that find the best match between a SL input sentence and SL sentences in a database of previously translated sentences are not ideal. Studies in the cognitive processes underlying human translation reveal that translators very rarely process SL text at the level of the sentence. The units with which translators work are usually much smaller, i.e. a word, syntactic unit, clause or group of meaningful words. A building blocks approach (a term borrowed from the theoretical framework discussed in Lange et al (1997)) is advantageous in that it extracts fragments of text, from a traditional TM database, that more closely represent those with which a human translator works. The text fragments are combined with the intention of producing TL translations that are more accurate, thus requiring less post-editing on the part of the translator.
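
The contrast between sentence-level and fragment-level matching can be sketched with a greedy longest-match lookup over a fragment memory. The abstract does not specify how fragments are extracted or combined, so this is only an assumed illustration: the function `translate_with_fragments`, the greedy strategy, and the sample memory are hypothetical.

```python
def translate_with_fragments(sentence, fragment_memory):
    """Cover the sentence left to right with the longest known fragments."""
    words = sentence.split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate fragment starting at position i first.
        for j in range(len(words), i, -1):
            fragment = " ".join(words[i:j])
            if fragment in fragment_memory:
                output.append(fragment_memory[fragment])
                i = j
                break
        else:
            # No fragment matched: mark the word for the human translator.
            output.append(f"<{words[i]}>")
            i += 1
    return " ".join(output)
```

Words left in angle brackets show where post-editing is still needed, which is the cost that fragment-level reuse aims to reduce.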

Example-Based Machine Translation of Part-Of-Speech Tagged Sentences by Recursive Division

Tantely Andriamanankasina, Kenji Araki, Koji Tochinai

13 Sep 1999

TL;DR: A translation method which recursively divides a sentence and translates each part separately and an analogy-based word-level alignment method which predicts word correspondences between source and translation sentences of new translation examples are evaluated.

Abstract: Example-Based Machine Translation can be applied to languages for which resources such as dictionaries and reliable syntactic analyzers are hardly available, because it can learn from new translation examples. However, difficulties still remain in translating sentences which are not fully covered by the matching sentence. To solve that problem, we present in this paper a translation method which recursively divides a sentence and translates each part separately. In addition, we evaluate an analogy-based word-level alignment method which predicts word correspondences between source and translation sentences of new translation examples. The translation method was implemented in a French-Japanese machine translation system, and spoken language texts were used as examples. Promising translation results were obtained and the effectiveness of the alignment method in the translation was confirmed.

Proceedings Article•DOI•

Machine learning of language translation rules

J. Tenni, A. Lehtola, C. Bounsaythip, K. Jaaranen

12 Oct 1999

TL;DR: Learning methods presented here enable a supervised, human-assisted learning of generalised translation rules, thus making it faster and easier to adapt the machine translation system to new languages.

Abstract: The purpose of this paper is to present learning methods for creating language translation rules from multilingual text samples. The languages concerned are controlled languages, i.e., they are domain specific sublanguages with ambiguities eliminated by restricting the vocabulary and syntax. Learning methods presented here enable a supervised, human-assisted learning of generalised translation rules, thus making it faster and easier to adapt our machine translation system to new languages.

Rapid development of translation tools

Jan W. Amtrup, Karine Megerdoomian, Rémi Zajac

13 Sep 1999

TL;DR: The technology the authors demonstrate has first been applied to Persian-English machine translation within the Shiraz project and is currently extended to cover languages such as Arabic, Japanese, Korean and others.

Abstract: The Computing Research Laboratory is currently developing technologies that allow rapid deployment of automatic translation capabilities. These technologies are designed to handle low-density languages for which resources, be that human informants or data in electronically readable form, are scarce. All tools are built in an incremental fashion, such that some simple tools (a bilingual dictionary or a glosser) can be delivered early in the development to support initial analysis tasks. More complex applications can be fielded in successive functional versions. The technology we demonstrate has first been applied to Persian-English machine translation within the Shiraz project and is currently extended to cover languages such as Arabic, Japanese, Korean and others.

Automatic domain recognition for machine translation

Elke D. Lange, Jin Yang

13 Sep 1999

TL;DR: Results of the implementation show that the approach of using terminology categorization already existing in the machine translation system is very promising.

Abstract: This paper describes an ongoing project which has the goal of improving machine translation quality by increasing knowledge about the text to be translated. A basic piece of such knowledge is the domain or subject field of the text. When this is known, it is possible to improve meaning selection appropriate to that domain. Our current effort consists in automating both recognition of the text’s domain and the assignment of domain-specific translations. Results of our implementation show that the approach of using terminology categorization already existing in the machine translation system is very promising.
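
The idea of recognizing a text's domain from terminology that is already categorized by domain can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon `TERM_DOMAINS`, the function names `recognize_domain` and `choose_translation`, and the simple term-counting rule are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical terminology lexicon: each term carries a domain tag.
# These entries are illustrative and are not taken from the paper.
TERM_DOMAINS = {
    "router": "computing",
    "bandwidth": "computing",
    "server": "computing",
    "dosage": "medicine",
    "patient": "medicine",
    "diagnosis": "medicine",
}


def recognize_domain(text, term_domains=TERM_DOMAINS):
    """Return the domain whose tagged terms occur most often, or None."""
    # Crude tokenization: split on whitespace, strip punctuation, lowercase.
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    counts = Counter(term_domains[t] for t in tokens if t in term_domains)
    if not counts:
        return None  # no categorized terminology found in the text
    return counts.most_common(1)[0][0]


def choose_translation(candidates, domain, default):
    """Pick the domain-specific translation if one exists, else the default."""
    return candidates.get(domain, default)


domain = recognize_domain("The patient received a second dosage after diagnosis.")
# Placeholder sense labels stand in for real target-language translations.
sense = choose_translation({"medicine": "MEDICAL_SENSE"}, domain, "GENERAL_SENSE")
```

Counting tagged terms is the simplest possible scoring rule; a real system would presumably weight terms and handle ties, which this sketch does not attempt.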

Interactive translation of conversational speech

Alex Waibel¹•Institutions (1)

Carnegie Mellon University¹

01 Jan 1999

TL;DR: The JANUS-II system uses paraphrasing and interactive error correction to boost performance; it accepts spontaneous speech in English, German, or Spanish and produces output in German, English, Spanish, Japanese, and Korean.

Abstract: We present JANUS-II, a large-scale system effort aimed at interactive spoken language translation. JANUS-II now accepts spontaneous conversational speech in a limited domain in English, German or Spanish and produces output in German, English, Spanish, Japanese and Korean. The challenges of coarticulated, disfluent, ill-formed speech are manifold, and have required advances in acoustic modeling, dictionary learning, language modeling, semantic parsing and generation to achieve acceptable performance. A semantic interlingua that represents the intended meaning of an input sentence facilitates the generation of culturally and contextually appropriate translation in the presence of irrelevant or erroneous information. Application of statistical, contextual, prosodic and discourse constraints permits a progressively narrowing search for the most plausible interpretation of an utterance. During translation, JANUS-II produces paraphrases that are used for interactive correction of translation errors. Beyond our continuing efforts to improve robustness and accuracy, we have also begun to study possible forms of deployment. Several system prototypes have been implemented to explore translation needs in different settings: speech translation in one-on-one video conferencing, as a portable mobile interpreter, or as a passive simultaneous conversation translator. We will discuss their usability and performance.

Book•

The automatic translation of idioms: machine translation vs. translation memory systems

Martin Volk

01 Jun 1999

TL;DR: In this paper, the authors investigate the requirements for automatically recognizing idioms and check whether idiom recognition is possible within current translation systems, i.e. machine translation and translation memory systems.

Abstract: Translating idioms is one of the most difficult tasks for human translators and translation machines alike. The main problems consist in recognizing an idiom and in distinguishing idiomatic from non-idiomatic usage. Recognition is difficult since many idioms can be modified and others can be discontinuously spread over a clause. But with the help of systematic idiom collections and special rules the recognition of an idiom candidate is always possible. The distinction between idiomatic and non-idiomatic usage is more problematic. Sometimes this can be done by means of special words that are only used in an idiom. But in general this distinction is a question of semantics and pragmatics and therefore beyond the abilities of current translation systems. In this paper we investigate the requirements for automatically recognizing idioms and we check whether idiom recognition is possible within current translation systems, i.e. machine translation and translation memory systems. This is of current interest since the developers of translation systems have started to include huge idiom collections in their products.

Programming Language Translation

Lili Qiu

11 May 1999

TL;DR: The first half of the paper describes the details of automating most of the translation from C to C++, as well as the difficulties encountered; the second half covers the experience of manually porting Java programs to C++ and identifies some of the issues and challenges in automating that translation process.

Abstract: As programming languages become more and more diversified, there is an increasing demand to translate programs written in one high-level language into another. Such translation can help us reuse existing code more effectively, especially when the translation can be automated. However, due to many subtle distinctions between different languages, usually only a subset of the translation can be automated. The first half of the paper describes the details of automating most of the translation from C to C++, as well as the difficulties encountered. The second half of the paper discusses the experience of manually porting Java programs to C++, and identifies some of the issues and challenges in automating this translation process. Through the discussions, it is evident that translation is heavily language-specific. Comprehensive knowledge about the languages and their subtle distinctions is essential. On the other hand, designing tools that allow high-level specification of translation rules and effectively incorporate human interaction is a generic approach to any language translation problem, and an interesting research problem to explore.

Journal Article•

The Influence of Information Society on Translation Teaching

Mu Lei

01 Jan 1999-Shanghai Journal of Translators For Science and Technology

TL;DR: This paper analyses the influence of information society on translation teaching from the following aspects: the traditional translation criterion of “be faithful to the original author” being challenged, diversification of translation object, and change in translation ways and means.

Abstract: The information boom and the Internet bring opportunities and challenges to translation teaching. This paper analyses the influence of the information society on translation teaching from the following aspects: 1. The traditional translation criterion of “be faithful to the original author” being challenged; 2. Diversification of the translation object; 3. The change in translation ways and means; 4. “Machine translation systems” and translation teaching; 5. The development of electronic textbooks. Therefore, as teachers of translation, we should continuously study new knowledge and new technology as well as use computers proficiently.
