
Showing papers by "Ashish Vaswani published in 2016"


Proceedings ArticleDOI
01 Jun 2016
TL;DR: This paper presents new state-of-the-art performance on CCG supertagging and parsing and demonstrates that while feed-forward architectures can compete with bidirectional LSTMs on POS tagging, models that encode the complete sentence are necessary for the long-range syntactic information encoded in supertags.

Abstract: In this paper we present new state-of-the-art performance on CCG supertagging and parsing. Our model outperforms existing approaches by an absolute gain of 1.5%. We analyze the performance of several neural models and demonstrate that while feed-forward architectures can compete with bidirectional LSTMs on POS tagging, models that encode the complete sentence are necessary for the long-range syntactic information encoded in supertags.

92 citations
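
For a concrete picture of the bidirectional-LSTM taggers compared in this abstract, here is a minimal sketch in PyTorch. The framework choice, layer sizes, and tag-set size (400 supertags) are illustrative assumptions, not the authors' implementation.

```python
# Minimal bidirectional-LSTM sequence tagger sketch (illustrative, not the authors' code).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The bidirectional LSTM encodes the complete sentence in both directions,
        # giving each position access to long-range context.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> tag scores: (batch, seq_len, num_tags)
        states, _ = self.lstm(self.embed(token_ids))
        return self.out(states)

# Toy usage: 3 sentences of length 7, vocabulary of 1000 words, 400 supertags.
model = BiLSTMTagger(vocab_size=1000, embed_dim=50, hidden_dim=128, num_tags=400)
scores = model(torch.randint(0, 1000, (3, 7)))
print(scores.shape)  # torch.Size([3, 7, 400])
```

A feed-forward baseline of the kind discussed in the abstract would replace the LSTM with a classifier over a fixed window of neighboring words, which is why it lacks the sentence-level context that supertags require.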


Proceedings ArticleDOI
28 Sep 2016
TL;DR: This paper presents the first results for neuralizing an unsupervised Hidden Markov Model (HMM) and evaluates the approach on tag induction, where it outperforms existing generative models and is competitive with the state of the art.

Abstract: In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state of the art, though with a simpler model that is easily extended to include additional context.

49 citations
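
As a rough illustration of what "neuralizing" an HMM can mean, the sketch below parameterizes the emission distribution of an unsupervised HMM with word and tag embeddings and trains it by maximizing the marginal likelihood computed with the forward algorithm. The embedding sizes and the specific parameterization are assumptions for illustration, not the paper's exact architecture.

```python
# Sketch of a neuralized HMM for unsupervised tag induction (illustrative assumptions only).
import torch
import torch.nn as nn

class NeuralHMM(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=64):
        super().__init__()
        # Transition scores between hidden tags, plus embeddings that produce
        # emission scores for every (tag, word) pair.
        self.transition = nn.Parameter(torch.randn(num_tags, num_tags))
        self.start = nn.Parameter(torch.randn(num_tags))
        self.tag_embed = nn.Parameter(torch.randn(num_tags, embed_dim))
        self.word_embed = nn.Embedding(vocab_size, embed_dim)

    def log_emissions(self):
        # (num_tags, vocab_size) log p(word | tag) from tag and word embeddings.
        scores = self.tag_embed @ self.word_embed.weight.t()
        return torch.log_softmax(scores, dim=-1)

    def log_likelihood(self, sentence):
        # Forward algorithm: marginalize over all tag sequences for one sentence.
        emit = self.log_emissions()
        trans = torch.log_softmax(self.transition, dim=-1)
        alpha = torch.log_softmax(self.start, dim=-1) + emit[:, sentence[0]]
        for word in sentence[1:]:
            alpha = torch.logsumexp(alpha.unsqueeze(1) + trans, dim=0) + emit[:, word]
        return torch.logsumexp(alpha, dim=0)

# Unsupervised training maximizes the marginal likelihood of the corpus.
model = NeuralHMM(vocab_size=1000, num_tags=45)
loss = -model.log_likelihood(torch.tensor([3, 17, 42, 7]))
loss.backward()
```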


Proceedings ArticleDOI
01 Jun 2016
TL;DR: The authors' NCE-trained language models achieve significantly lower perplexity on the One Billion Word Benchmark language modeling challenge, and contain one sixth of the parameters in the best single model in Chelba et al. (2013).
Abstract: We present a simple algorithm to efficiently train language models with noise-contrastive estimation (NCE) on graphics processing units (GPUs). Our NCE-trained language models achieve significantly lower perplexity on the One Billion Word Benchmark language modeling challenge, and contain one sixth of the parameters in the best single model in Chelba et al. (2013). When incorporated into a strong Arabic-English machine translation system, they give a strong boost in translation quality. We release a toolkit so that others may also train large-scale, large-vocabulary LSTM language models with NCE, parallelizing computation across multiple GPUs.

48 citations
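
The core idea of NCE training, as described in the abstract, is to avoid the full softmax over a large vocabulary by discriminating each observed word from k samples drawn from a noise distribution. The sketch below shows one common form of that objective in PyTorch; the function name, arguments, and shapes are illustrative assumptions, not the released toolkit's API.

```python
# Hedged sketch of the noise-contrastive estimation (NCE) objective for language modeling.
import torch
import torch.nn.functional as F

def nce_loss(true_scores, noise_scores, true_noise_logprob, noise_noise_logprob, k):
    """Discriminate the observed next word from k noise samples, avoiding the full softmax.

    true_scores:         (batch,)    unnormalized model log-scores of the data words
    noise_scores:        (batch, k)  unnormalized model log-scores of the noise samples
    true_noise_logprob:  (batch,)    log q(data word) under the noise distribution
    noise_noise_logprob: (batch, k)  log q(noise word) under the noise distribution
    """
    log_k = torch.log(torch.tensor(float(k)))
    # Logit of P(word came from data) = model score minus log(k * q(word)).
    data_logits = true_scores - (true_noise_logprob + log_k)
    noise_logits = noise_scores - (noise_noise_logprob + log_k)
    loss_data = F.binary_cross_entropy_with_logits(data_logits, torch.ones_like(data_logits))
    loss_noise = F.binary_cross_entropy_with_logits(noise_logits, torch.zeros_like(noise_logits))
    return loss_data + k * loss_noise

# Toy usage: batch of 4 positions, k = 10 noise samples, uniform noise over ~1000 words.
true_s = torch.randn(4, requires_grad=True)
noise_s = torch.randn(4, 10, requires_grad=True)
loss = nce_loss(true_s, noise_s, torch.full((4,), -6.9), torch.full((4, 10), -6.9), k=10)
loss.backward()
```

Because the objective only touches the data word and the k noise samples, its cost is independent of the vocabulary size, which is what makes large-vocabulary training on GPUs practical.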


Posted Content
TL;DR: The first results for neuralizing an Unsupervised Hidden Markov Model are presented; the approach outperforms existing generative models and is competitive with the state of the art, though with a simpler model easily extended to include additional context.

Abstract: In this work, we present the first results for neuralizing an Unsupervised Hidden Markov Model. We evaluate our approach on tag induction. Our approach outperforms existing generative models and is competitive with the state of the art, though with a simpler model that is easily extended to include additional context.

41 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: This paper tackles a challenging name tagging problem in an emergent setting, where the tagger must be completed within a few hours for a new incident language (IL) using very few resources, and proposes a new expectation-driven learning framework that rapidly acquires, categorizes, structures, and zooms in on IL-specific expectations.

Abstract: In this paper we tackle a challenging name tagging problem in an emergent setting: the tagger needs to be completed within a few hours for a new incident language (IL) using very few resources. Inspired by observing how human annotators attack this challenge, we propose a new expectation-driven learning framework. In this framework we rapidly acquire, categorize, structure and zoom in on IL-specific expectations (rules, features, patterns, gazetteers, etc.) from various non-traditional sources: consulting and encoding linguistic knowledge from native speakers, mining and projecting patterns from both mono-lingual and cross-lingual corpora, and typing based on cross-lingual entity linking. We also propose a cost-aware combination approach to compose expectations. Experiments on seven low-resource languages demonstrate the effectiveness and generality of this framework: we are able to set up a name tagger for a new IL within two hours and achieve 33.8%-65.1% F-score.

32 citations
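
To make the notion of "expectations" slightly more concrete, the toy sketch below composes two such resources (a gazetteer and a surface pattern) into a rule-based tagger. The entries, pattern, and combination logic are invented for illustration and do not reproduce the paper's resources or its cost-aware combination approach.

```python
# Toy sketch: composing simple "expectations" (gazetteer + pattern) into a name tagger.
import re

GAZETTEER = {"kathmandu": "LOCATION", "unicef": "ORGANIZATION"}   # hypothetical entries
TITLE_PATTERN = re.compile(r"^(mr|mrs|dr)\.?$", re.IGNORECASE)    # a title precedes a PERSON

def tag_names(tokens):
    tags = ["O"] * len(tokens)
    for i, token in enumerate(tokens):
        if token.lower() in GAZETTEER:                       # expectation from a gazetteer
            tags[i] = GAZETTEER[token.lower()]
        elif i > 0 and TITLE_PATTERN.match(tokens[i - 1]):   # expectation from a pattern
            tags[i] = "PERSON"
    return list(zip(tokens, tags))

print(tag_names("Dr. Rana visited Kathmandu with UNICEF staff".split()))
```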


Journal ArticleDOI
TL;DR: This paper proposes a new approach to approximate structured inference for transition-based parsing that produces scores suitable for global scoring using local models, via the introduction of error states in local training.

Abstract: Transition-based approaches based on local classification are attractive for dependency parsing due to their simplicity and speed, despite producing results slightly below the state of the art. In this paper, we propose a new approach to approximate structured inference for transition-based parsing that produces scores suitable for global scoring using local models. This is accomplished by introducing error states in local training, which add information about incorrect derivation paths that is typically left out entirely in locally trained models. Using neural networks for our local classifiers, our approach produces the highest accuracy for transition-based dependency parsing in English.

15 citations
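
For context on what a transition-based parser with local classification looks like, here is a minimal arc-standard sketch with a placeholder local scorer. It is illustrative only: the error-state training and global scoring described in the abstract are noted in comments but not implemented.

```python
# Minimal arc-standard transition system for dependency parsing (greedy decoding,
# placeholder scorer; not the paper's model).
import random

SHIFT, LEFT_ARC, RIGHT_ARC = "SHIFT", "LEFT_ARC", "RIGHT_ARC"

def legal_actions(stack, buffer):
    actions = []
    if buffer:
        actions.append(SHIFT)
    if len(stack) >= 2:
        actions.extend([LEFT_ARC, RIGHT_ARC])
    return actions

def score(action, stack, buffer):
    # Stand-in for a neural local classifier. A locally trained model with error
    # states would also learn to score configurations reached via incorrect actions,
    # making its scores usable for global scoring over whole derivations.
    return random.random()

def parse(words):
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        action = max(legal_actions(stack, buffer), key=lambda a: score(a, stack, buffer))
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:            # word below the top takes the top as its head
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        else:                               # RIGHT_ARC: top takes the word below as its head
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs  # list of (head index, dependent index) pairs

print(parse("economic news had little effect".split()))
```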