Xavier Carreras

Researcher at Xerox

Publications -  78
Citations -  4270

Xavier Carreras is an academic researcher from Xerox. The author has contributed to research in topics: Parsing & Dependency (UML). The author has an h-index of 29 and has co-authored 76 publications receiving 4154 citations. Previous affiliations of Xavier Carreras include Polytechnic University of Catalonia & Massachusetts Institute of Technology.

Papers
Proceedings Article

Simple Semi-supervised Dependency Parsing

TL;DR: This work focuses on the problem of lexical representation, introducing features that incorporate word clusters derived from a large unannotated corpus, and shows that the cluster-based features yield substantial gains in performance across a wide range of conditions.
Posted Content

Boosting Trees for Anti-Spam Email Filtering

TL;DR: The boosting-based methods clearly outperform the baseline learning algorithms on the PU1 corpus, achieving very high levels of the F1 measure and obtaining better "high-precision" classifiers, which is a very important issue when misclassification costs are considered.
Proceedings Article

Introduction to the CoNLL-2004 Shared Task: Semantic Role Labeling

TL;DR: The specification and goal of the task are introduced, the data sets and evaluation methods are described, and a general overview of the systems that contributed to the task is presented, providing a comparative description.
Proceedings Article

FreeLing: An Open-Source Suite of Language Analyzers

TL;DR: This work presents a suite of analysis tools built on an object-based architecture that enables the quick and easy integration of basic language analyzers into any NLP application. The suite is distributed under the Lesser General Public License (LGPL) (Free Software Foundation, 1999).
Proceedings Article

Experiments with a Higher-Order Projective Dependency Parser

TL;DR: In the multilingual exercise of the CoNLL-2007 shared task (Nivre et al., 2007), the system obtains the best accuracy for English, and the second best accuracies for Basque and Czech.