Tiago Luís

Researcher at INESC-ID

Publications - 12
Citations - 1141

Tiago Luís is an academic researcher from INESC-ID. The author has contributed to research in topics: Machine translation & Phrase. The author has an h-index of 6 and has co-authored 11 publications receiving 1071 citations.

Papers
Proceedings Article

Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation

TL;DR: A model for constructing vector representations of words by composing characters using bidirectional LSTMs that requires only a single vector per character type and a fixed set of parameters for the compositional model, which yields state-of-the-art results in language modeling and part-of-speech tagging.
Posted Content

Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation

Abstract: We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs. Relative to traditional word representation models that have independent vectors for each word type, our model requires only a single vector per character type and a fixed set of parameters for the compositional model. Despite the compactness of this model and, more importantly, the arbitrary nature of the form-function relationship in language, our "composed" word representations yield state-of-the-art results in language modeling and part-of-speech tagging. Benefits over traditional baselines are particularly pronounced in morphologically rich languages (e.g., Turkish).
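
The composition idea in this abstract can be illustrated with a small, untrained sketch in pure Python. It is not the authors' implementation: the class names (`LSTM`, `CharToWord`), the dimensions, the random initialisation, and the choice to combine the two final LSTM states through a linear projection are all assumptions made for illustration. What it does show is the property the abstract emphasises: the only per-symbol parameters are one vector per character, so any string over the alphabet, including an unseen word, gets a representation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTM:
    """Plain LSTM run over a sequence; returns the final hidden state."""

    def __init__(self, rng, input_dim, hidden_dim):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [x; h_prev].
        self.w = {k: rand_matrix(rng, hidden_dim, input_dim + hidden_dim)
                  for k in "ifoc"}
        self.b = {k: [0.0] * hidden_dim for k in "ifoc"}

    def run(self, inputs):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in inputs:
            xh = x + h  # list concatenation: [x; h_prev]
            pre = {k: [p + b for p, b in zip(matvec(self.w[k], xh), self.b[k])]
                   for k in "ifoc"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cj) for o, cj in zip(o_gate, c)]
        return h


class CharToWord:
    """Hypothetical composer: characters -> bidirectional LSTM -> word vector."""

    def __init__(self, alphabet, char_dim=8, hidden_dim=16, word_dim=12, seed=0):
        rng = random.Random(seed)
        # The only per-symbol parameters: one vector per character type.
        self.char_vecs = {ch: [rng.uniform(-0.1, 0.1) for _ in range(char_dim)]
                          for ch in alphabet}
        self.unk = [0.0] * char_dim  # fallback for characters outside the alphabet
        self.fwd = LSTM(rng, char_dim, hidden_dim)
        self.bwd = LSTM(rng, char_dim, hidden_dim)
        # Assumed combination step: project both final states and add them.
        self.proj_f = rand_matrix(rng, word_dim, hidden_dim)
        self.proj_b = rand_matrix(rng, word_dim, hidden_dim)
        self.bias = [0.0] * word_dim

    def embed(self, word):
        chars = [self.char_vecs.get(ch, self.unk) for ch in word]
        h_f = self.fwd.run(chars)        # reads left to right
        h_b = self.bwd.run(chars[::-1])  # reads right to left
        return [a + b + c for a, b, c in zip(matvec(self.proj_f, h_f),
                                             matvec(self.proj_b, h_b),
                                             self.bias)]


model = CharToWord("abcdefghijklmnopqrstuvwxyz")
vec_cat = model.embed("cat")
vec_unseen = model.embed("zyxwv")  # never "seen", still gets a vector
```

Because the weights here are random, the vectors carry no linguistic meaning; in the setting the abstract describes, the character vectors and LSTM parameters would be learned jointly with a downstream task such as language modeling or part-of-speech tagging.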
Journal Article

A linguistically motivated taxonomy for Machine Translation error analysis

TL;DR: This paper significantly extends previous error taxonomies so that translation errors associated with Romance language specificities can be accommodated and carries out an extensive analysis of the errors generated by four different systems.

Towards a General and Extensible Phrase-Extraction Algorithm

TL;DR: This paper presents a general and extensible phrase extraction algorithm, where several control points are highlighted, which allows the simulation of previous approaches and proposes alternative heuristics, showing their impact on the final translation results.