Conference

North American Chapter of the Association for Computational Linguistics

About: North American Chapter of the Association for Computational Linguistics is an academic conference. The conference publishes majorly in the area(s): SemEval & Task (project management). Over the lifetime, 5404 publications have been published by the conference receiving 297017 citations.

...read moreread less

Topics: SemEval, Task (project management), Computer science, Machine translation, Parsing ...read more

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

[...]

Jacob Devlin¹, Ming-Wei Chang¹, Kenton Lee¹, Kristina Toutanova¹•Institutions (1)

Google¹

11 Oct 2018

TL;DR: BERT as mentioned in this paper pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

...read moreread less

Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models (Peters et al., 2018a; Radford et al., 2018), BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5 (7.7 point absolute improvement), MultiNLI accuracy to 86.7% (4.6% absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

...read moreread less

24,672 citations

Proceedings Article•DOI•

Deep contextualized word representations

[...]

Matthew E. Peters¹, Mark Neumann¹, Mohit Iyyer², Matt Gardner¹, Christopher Clark¹, Kenton Lee³, Luke Zettlemoyer⁴ - Show less +3 more•Institutions (4)

Allen Institute for Artificial Intelligence¹, University of Massachusetts Amherst², Google³, University of Washington⁴

15 Feb 2018

TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).

...read moreread less

Abstract: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals.

...read moreread less

7,412 citations

Proceedings Article•DOI•

Hierarchical Attention Networks for Document Classification

[...]

Zichao Yang¹, Diyi Yang¹, Chris Dyer¹, Xiaodong He², Alexander J. Smola¹, Eduard Hovy¹ - Show less +2 more•Institutions (2)

Carnegie Mellon University¹, Microsoft²

13 Jun 2016

TL;DR: Experiments conducted on six large scale text classification tasks demonstrate that the proposed architecture outperform previous methods by a substantial margin.

...read moreread less

Abstract: We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it has a hierarchical structure that mirrors the hierarchical structure of documents; (ii) it has two levels of attention mechanisms applied at the wordand sentence-level, enabling it to attend differentially to more and less important content when constructing the document representation. Experiments conducted on six large scale text classification tasks demonstrate that the proposed architecture outperform previous methods by a substantial margin. Visualization of the attention layers illustrates that the model selects qualitatively informative words and sentences.

...read moreread less

4,282 citations

Proceedings Article•DOI•

Neural Architectures for Named Entity Recognition

[...]

Guillaume Lample¹, Miguel Ballesteros², Sandeep Subramanian³, Kazuya Kawakami⁴, Chris Dyer³ - Show less +1 more•Institutions (4)

Facebook¹, Pompeu Fabra University², Carnegie Mellon University³, Google⁴

04 Mar 2016

TL;DR: Comunicacio presentada a la 2016 Conference of the North American Chapter of the Association for Computational Linguistics, celebrada a San Diego (CA, EUA) els dies 12 a 17 of juny 2016.

...read moreread less

Abstract: Comunicacio presentada a la 2016 Conference of the North American Chapter of the Association for Computational Linguistics, celebrada a San Diego (CA, EUA) els dies 12 a 17 de juny 2016.

...read moreread less

3,960 citations

Proceedings Article•DOI•

Statistical phrase-based translation

[...]

Philipp Koehn¹, Franz Josef Och¹, Daniel Marcu¹•Institutions (1)

University of Southern California¹

27 May 2003

TL;DR: The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translation.

...read moreread less

Abstract: We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models out-perform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.

...read moreread less

3,778 citations

Collapse

Performance

Metrics

5,404

Papers

297,017

Citations

No. of papers from the Conference in previous years
Year	Papers
2023	1
2022	300
2021	612
2019	767
2018	622
2016	534