Empirical Methods in Natural Language Processing

About: Empirical Methods in Natural Language Processing (EMNLP) is an academic conference. It publishes mainly in the areas of machine translation and language modeling. Over its lifetime, 7,948 publications have appeared at the conference, receiving 400,513 citations.

Topics: Machine translation, Language model, Parsing

GloVe: Global Vectors for Word Representation
Proceedings ArticleDOI: 10.3115/V1/D14-1162
01 Oct 2014
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.

Topics: Word2vec (64%), Word embedding (56%), Sparse matrix (54%)

23,307 Citations
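The vector-arithmetic regularity the abstract describes (e.g. king - man + woman ≈ queen) can be illustrated with a toy sketch; the 3-dimensional vectors below are hand-picked so the analogy holds and stand in for learned GloVe embeddings:

```python
import math

# Toy 3-d word vectors chosen by hand so the gender/royalty analogy
# holds; real GloVe vectors are learned from a word-word
# co-occurrence matrix, not written down like this.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.2, 0.9, 0.1],
    "woman": [0.2, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """Return the vocabulary word closest to b - a + c."""
    target = [vb - va + vc for va, vb, vc in
              zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "king", "woman"))  # queen
```

The same nearest-neighbour query over real embeddings is what the 75% word-analogy figure in the abstract measures.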

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
Open access · Proceedings ArticleDOI: 10.3115/V1/D14-1179
01 Jan 2014
Abstract: In this paper, we propose a novel neural network model called RNN Encoder–Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder–Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

Topics: Encoder (54%), Recurrent neural network (52%), Artificial neural network (52%)

14,140 Citations
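A minimal sketch of the encoder half of this architecture, assuming a plain tanh RNN cell in place of the gated unit the paper actually proposes; weights and dimensions are arbitrary toy values. The point is that inputs of any length are compressed into one fixed-length hidden vector:

```python
import math
import random

random.seed(0)
H = 4  # hidden size; a toy dimension, not the paper's

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]

W_in = rand_matrix(H, 3)   # toy 3-d input vectors
W_hh = rand_matrix(H, H)

def step(h, x):
    # h_t = tanh(W_in x_t + W_hh h_{t-1}): a plain RNN cell;
    # the paper uses a gated hidden unit (an early GRU) instead.
    return [math.tanh(sum(W_in[i][j] * x[j] for j in range(len(x)))
                      + sum(W_hh[i][j] * h[j] for j in range(H)))
            for i in range(H)]

def encode(sequence):
    h = [0.0] * H
    for x in sequence:
        h = step(h, x)
    return h  # fixed-length summary of a variable-length input

short = encode([[1, 0, 0], [0, 1, 0]])   # 2-symbol input
long_ = encode([[1, 0, 0]] * 7)          # 7-symbol input
print(len(short), len(long_))  # 4 4
```

The decoder then runs a second RNN conditioned on this vector; training maximizes the conditional probability of the target sequence given the source.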

Convolutional Neural Networks for Sentence Classification
Open access · Proceedings ArticleDOI: 10.3115/V1/D14-1181
Yoon Kim (1 institution)
25 Aug 2014
Abstract: We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.

7,176 Citations
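The core convolution-plus-max-over-time-pooling step of this architecture can be sketched as follows; the word vectors and filter weights here are made-up toy values rather than pre-trained embeddings:

```python
# A width-2 filter slides over consecutive word vectors, and the
# maximum activation becomes one feature (max-over-time pooling).
# A full model applies many filters of several widths, then a
# softmax layer on the pooled features.
sentence = [[0.1, 0.3], [0.7, 0.2], [0.4, 0.9]]  # 3 words, 2-d vectors
filt = [[0.5, -0.2], [0.1, 0.8]]                 # one width-2 filter

def conv_max(words, f):
    width = len(f)
    dim = len(words[0])
    activations = []
    for i in range(len(words) - width + 1):
        window = words[i:i + width]
        activations.append(sum(f[k][d] * window[k][d]
                               for k in range(width)
                               for d in range(dim)))
    return max(activations)  # max-over-time pooling

feature = conv_max(sentence, filt)
print(round(feature, 2))  # 1.07
```

Keeping the word vectors fixed ("static") versus fine-tuning them is exactly the design axis the paper varies across its benchmarks.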

Effective Approaches to Attention-based Neural Machine Translation
Open access · Proceedings ArticleDOI: 10.18653/V1/D15-1166
17 Aug 2015
Abstract: An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. We demonstrate the effectiveness of both approaches on the WMT translation tasks between English and German in both directions. With local attention, we achieve a significant gain of 5.0 BLEU points over non-attentional systems that already incorporate known techniques such as dropout. Our ensemble model using different attention architectures yields a new state-of-the-art result in the WMT’15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.

6,374 Citations
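The global-attention variant can be sketched under the simplest of the paper's scoring functions (the dot product): score every source hidden state against the current target state, softmax the scores, and form the context vector as the weighted sum. The hidden states below are toy values:

```python
import math

# Toy encoder hidden states (one per source word) and the decoder
# state at the current target position.
source_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_state = [1.0, 0.0]

def softmax(xs):
    m = max(xs)                       # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Dot-product alignment scores, one per source position.
scores = [sum(s * t for s, t in zip(src, target_state))
          for src in source_states]
weights = softmax(scores)

# Context vector: attention-weighted average of the source states.
context = [sum(w * src[d] for w, src in zip(weights, source_states))
           for d in range(len(target_state))]
print(weights)
```

The local variant differs only in restricting the sum to a window of source positions around a predicted alignment point rather than attending to all of them.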

Thumbs up? Sentiment Classification using Machine Learning Techniques
Open access · Proceedings ArticleDOI: 10.3115/1118693.1118704
06 Jul 2002
Abstract: We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.

Topics: Sentiment analysis (67%), Multiclass classification (61%), Relevance vector machine (59%)

6,353 Citations
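The unigram Naive Bayes baseline among the three methods the abstract names can be sketched in a few lines; the four training "reviews" below are invented examples, not the paper's movie-review dataset:

```python
import math
from collections import Counter

# Tiny labelled corpus (made up for illustration).
train = [
    ("great acting and a great plot", "pos"),
    ("a wonderful touching film", "pos"),
    ("boring plot and awful acting", "neg"),
    ("a dull tedious mess", "neg"),
]

counts = {"pos": Counter(), "neg": Counter()}  # word counts per class
docs = Counter()                               # documents per class
for text, label in train:
    docs[label] += 1
    counts[label].update(text.split())

vocab = set(w for c in counts.values() for w in c)

def classify(text):
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        # log prior + sum of log likelihoods, add-one smoothed
        score = math.log(docs[label] / len(train))
        for w in text.split():
            score += math.log((counts[label][w] + 1)
                              / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("a great film"))  # pos
```

On real reviews this bag-of-words setup already beats the human-produced word-list baselines the paper describes, yet trails its accuracy on topic classification, which is the paper's central observation.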

Top Attributes

Conference's top 5 most impactful authors

Noah A. Smith: 53 papers, 3.9K citations
Mirella Lapata: 40 papers, 4.5K citations
Christopher D. Manning: 32 papers, 39.9K citations
Yejin Choi: 31 papers, 3.6K citations
Mohit Bansal: 29 papers, 1.4K citations

Network Information
Related Conferences (5)
Meeting of the Association for Computational Linguistics: 11.7K papers, 590K citations (98% related)
International Joint Conference on Natural Language Processing: 1.6K papers, 58.9K citations (96% related)
Conference on Computational Natural Language Learning: 908 papers, 36.4K citations (96% related)
International Conference on Computational Linguistics: 7.6K papers, 169K citations (95% related)