Showing papers by "Kai-Wei Chang published in 2013"


Proceedings Article
01 Oct 2013
TL;DR: It is demonstrated that by integrating multiple relations from both homogeneous and heterogeneous information sources, MRLSA achieves state-of-the-art performance on existing benchmark datasets for two relations, antonymy and is-a.
Abstract: We present Multi-Relational Latent Semantic Analysis (MRLSA), which generalizes Latent Semantic Analysis (LSA). MRLSA provides an elegant approach to combining multiple relations between words by constructing a 3-way tensor. Similar to LSA, a low-rank approximation of the tensor is derived using a tensor decomposition. Each word in the vocabulary is thus represented by a vector in the latent semantic space, and each relation is captured by a latent square matrix. The degree to which two words exhibit a specific relation can then be measured through simple linear algebraic operations. We demonstrate that by integrating multiple relations from both homogeneous and heterogeneous information sources, MRLSA achieves state-of-the-art performance on existing benchmark datasets for two relations, antonymy and is-a.
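The scoring mechanism the abstract describes can be sketched in a few lines. This is a toy illustration of the general idea (word vectors from a low-rank factorization, one small square matrix per relation, bilinear scoring), not the paper's exact decomposition; the vocabulary, the random tensor, and the rank are invented for the example:

```python
import numpy as np

# Toy vocabulary and two relations (e.g., synonymy and antonymy).
vocab = ["hot", "cold", "warm", "cool"]
V, R, k = len(vocab), 2, 2

rng = np.random.default_rng(0)
# X[:, :, r] is the word-by-word matrix for relation r
# (e.g., counts of the pair occurring under that relation).
X = rng.random((V, V, R))

# Unfold the tensor along the word mode (stack relation slices
# side by side), then take a rank-k SVD to get word vectors.
unfolded = X.reshape(V, V * R)
U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
W = U[:, :k]                      # each row: one word's latent vector

# Each relation is captured by a small k-by-k matrix obtained by
# projecting its slice into the latent space.
M = [W.T @ X[:, :, r] @ W for r in range(R)]

def score(w1, r, w2):
    """Degree to which (w1, w2) stand in relation r: a bilinear form."""
    i, j = vocab.index(w1), vocab.index(w2)
    return float(W[i] @ M[r] @ W[j])

print(score("hot", 1, "cold"))
```

The key property the sketch preserves is that all relations share one set of word vectors, so evidence from one relation shapes the representation used by the others.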

70 citations


Proceedings Article
01 Aug 2013
TL;DR: The University of Illinois system that participated in the CoNLL-2013 shared task consists of five components and targets five types of common grammatical mistakes made by English as a Second Language writers.
Abstract: The CoNLL-2013 shared task focuses on correcting grammatical errors in essays written by non-native learners of English. In this paper, we describe the University of Illinois system that participated in the shared task. The system consists of five components and targets five types of common grammatical mistakes made by English as a Second Language writers. We describe our underlying approach, which relates to our previous work, and describe the novel aspects of the system in more detail. Out of 17 participating teams, our system is ranked first based on both the original annotation and the revised annotation.

58 citations


Proceedings Article
01 Oct 2013
TL;DR: The authors proposed the Latent Left-Linking model (L3M), a linguistically motivated latent structured prediction approach to coreference resolution, which admits efficient inference and can be augmented with knowledge-based constraints; they also present a fast stochastic-gradient-based learning technique.
Abstract: Coreference resolution is a well-known clustering task in Natural Language Processing. In this paper, we describe the Latent Left-Linking model (L3M), a novel, principled, and linguistically motivated latent structured prediction approach to coreference resolution. We show that L3M admits efficient inference and can be augmented with knowledge-based constraints; we also present a fast stochastic-gradient-based learning technique. Experiments on ACE and OntoNotes data show that L3M and its constrained version, CL3M, are more accurate than several state-of-the-art approaches as well as some structured prediction models proposed in the literature.
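The left-linking intuition, in which each mention either starts a new cluster or links to an antecedent on its left, can be illustrated with a greedy sketch. The pairwise scorer here is a made-up head-word heuristic standing in for the model's learned scoring function, and the greedy decision rule is a simplification of the paper's latent inference:

```python
# Greedy best-left-link inference: each mention either links to its
# highest-scoring antecedent to the left or starts a new cluster
# (linking to a dummy "null" antecedent, modeled by the threshold).
def left_link_clusters(mentions, score, threshold=0.0):
    cluster_of = {}          # mention index -> cluster id
    clusters = []
    for i, m in enumerate(mentions):
        best_j, best_s = None, threshold
        for j in range(i):   # only consider items to the left
            s = score(mentions[j], m)
            if s > best_s:
                best_j, best_s = j, s
        if best_j is None:   # no antecedent beats the null link
            cluster_of[i] = len(clusters)
            clusters.append([m])
        else:                # join the antecedent's cluster
            cluster_of[i] = cluster_of[best_j]
            clusters[cluster_of[i]].append(m)
    return clusters

# Toy pairwise score: mentions corefer if they share a head word.
sim = lambda a, b: 1.0 if a.split()[-1] == b.split()[-1] else -1.0
clusters = left_link_clusters(["Barack Obama", "the president", "Obama"], sim)
print(clusters)
```

Because each mention makes only one (latent) link decision, the clustering over n mentions decomposes into n independent argmax problems, which is the source of the efficient inference the abstract mentions.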

55 citations


Book ChapterDOI
23 Sep 2013
TL;DR: This paper proposes a new learning algorithm for structural SVMs called DEMIDCD that extends the dual coordinate descent approach by decoupling the model update and inference phases into different threads and proves that the algorithm not only converges but also fully utilizes all available processors to speed up learning.
Abstract: Many problems in natural language processing and computer vision can be framed as structured prediction problems. Structural support vector machines (SVMs) are a popular approach for training structured predictors, where learning is framed as an optimization problem. Most structural SVM solvers alternate between a model update phase and an inference phase (which predicts structures for all training examples). As structures become more complex, inference becomes a bottleneck and thus slows down learning considerably. In this paper, we propose a new learning algorithm for structural SVMs called DEMIDCD that extends the dual coordinate descent approach by decoupling the model update and inference phases into different threads. We take advantage of multicore hardware to parallelize learning with minimal synchronization between the model update and the inference phases. We prove that our algorithm not only converges but also fully utilizes all available processors to speed up learning, and validate our approach on two real-world NLP problems: part-of-speech tagging and relation extraction. In both cases, we show that our algorithm utilizes all available processors to speed up learning and achieves competitive performance. For example, it achieves a relative duality gap of 1% on a POS tagging problem in 192 seconds using 16 threads, while a standard implementation of a multi-threaded dual coordinate descent algorithm with the same number of threads requires more than 600 seconds to reach a solution of the same quality.
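The decoupling idea, where inference workers produce predictions from a possibly stale snapshot of the weights while a separate learner thread consumes them and updates the model, can be sketched with standard threads and a queue. The toy one-weight regression and perceptron-style update below are stand-ins for illustration only, not the paper's dual coordinate descent step:

```python
import queue
import threading

# Toy data: inputs x with "gold structures" y = max(0, x - 2).
examples = [(x, max(0, x - 2)) for x in range(20)]
w = [0.0]                 # shared model: a single weight
work = queue.Queue()      # inference results flow to the learner here

def infer(x):
    # Placeholder "inference": a prediction from the current weights.
    # In the real algorithm this is the expensive structured argmax.
    return w[0] * x

def inference_worker(chunk):
    # Predict using whatever weights are current (possibly stale)
    # and hand the result to the learner without blocking on it.
    for x, y in chunk:
        work.put((x, y, infer(x)))

def learner(n_items, lr=0.001):
    # Consume predictions and update the model concurrently.
    for _ in range(n_items):
        x, y, y_hat = work.get()
        w[0] += lr * (y - y_hat) * x   # simple error-driven step

workers = [threading.Thread(target=inference_worker, args=(examples[i::2],))
           for i in range(2)]
t_learn = threading.Thread(target=learner, args=(len(examples),))
for t in workers:
    t.start()
t_learn.start()
for t in workers:
    t.join()
t_learn.join()
print(round(w[0], 3))
```

The point of the pattern is that model updates no longer wait for inference over all examples to finish; the queue is the only synchronization between the two phases.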

16 citations


Book ChapterDOI
23 Sep 2013
TL;DR: An approximate semi-supervised learning method that uses piecewise training for estimating the model weights and a dual decomposition approach for solving the inference problem of finding the labels of unlabeled data subject to domain-specific constraints is proposed.
Abstract: Semi-supervised learning has been widely studied in the literature. However, most previous works assume that the output structure is simple enough to allow the direct use of tractable inference/learning algorithms (e.g., binary label or linear chain). Therefore, these methods cannot be applied to problems with complex structure. In this paper, we propose an approximate semi-supervised learning method that uses piecewise training for estimating the model weights and a dual decomposition approach for solving the inference problem of finding the labels of unlabeled data subject to domain-specific constraints. This allows us to extend semi-supervised learning to general structured prediction problems. As an example, we apply this approach to the problem of multi-label classification (a fully connected pairwise Markov random field). Experimental results on benchmark data show that, in spite of using approximations, the approach is effective and yields good improvements in generalization performance over the plain supervised method. In addition, we demonstrate that our inference engine can be applied to other semi-supervised learning frameworks, and extends them to solve problems with complex structure.
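The dual decomposition component can be illustrated on the smallest possible case: two tractable subproblems that share one binary variable, with subgradient updates on a Lagrange multiplier driving their two copies of the variable into agreement. The scores below are invented for the example; the actual system decomposes a fully connected pairwise Markov random field under domain-specific constraints:

```python
# Two subproblems share a binary variable y. Subproblem A gives y=1 a
# score of +2; subproblem B gives y=1 a score of -1 (it prefers y=0).
# The joint optimum is y=1 (total score 1 beats 0).
theta_a, theta_b = 2.0, -1.0
lam = 0.0                      # Lagrange multiplier on y_a == y_b

for step in range(50):
    # Each subproblem is maximized independently given lam;
    # in a real MRF each would be a tractable structured argmax.
    y_a = 1 if theta_a + lam > 0 else 0
    y_b = 1 if theta_b - lam > 0 else 0
    if y_a == y_b:             # the copies agree: a consistent solution
        break
    lam -= 0.1 * (y_a - y_b)   # subgradient step on the dual

print(y_a, y_b)
```

Here the multiplier drifts until the penalty makes B's preference flip, so both copies settle on y=1; decomposing a hard joint problem into easy pieces coupled only through such multipliers is what makes constrained inference over the unlabeled data tractable.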

11 citations


Journal Article
TL;DR: The University of Illinois (UI CCG) submission to the 2013 TAC KBP English Entity Linking and Slot Filler Validation tasks is described; two separate systems were developed.
Abstract: In this paper, we describe the University of Illinois (UI CCG) submission to the 2013 TAC KBP English Entity Linking (EL) and Slot Filler Validation (SFV) tasks. We developed two separate systems. Our Entity Linking system integrates an improved version of the Illinois Wikifier with additional functionality to identify and cluster entity mentions that do not correspond to entries in the reference knowledge base. Our Slot Filler Validation system follows an entailment formulation that evaluates each candidate answer based on the evidence present in the source document it refers to.

9 citations


21 Apr 2013
TL;DR: A latent variable structured prediction model, called the Latent Left-linking Model (L3M), for discriminative supervised clustering of items that follow a streaming order is presented, and it is shown that L3M outperforms several existing structured prediction-based techniques for coreference as well as several state-of-the-art, albeit ad hoc, approaches.
Abstract: We present a latent variable structured prediction model, called the Latent Left-linking Model (L3M), for discriminative supervised clustering of items that follow a streaming order. L3M admits efficient inference, and we present a learning framework for L3M that smoothly interpolates between latent structural SVMs and hidden variable CRFs. We present a fast stochastic gradient-based learning technique for L3M. We apply L3M to coreference resolution, which is a well-known clustering task in Natural Language Processing, and experimentally show that L3M outperforms several existing structured prediction-based techniques for coreference as well as several state-of-the-art, albeit ad hoc, approaches.