Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition
Leon Derczynski, Eric Nichols, Marieke van Erp, Nut Limsopatham +3 more
- pp 140-147
TLDR
The task aims to define emerging and rare entities, to provide datasets for detecting them on that basis, and to evaluate how well participating entries detect and classify novel and emerging named entities in noisy text.
Citations
Proceedings Article
Challenges of protecting confidentiality in social media data and their ethical import
TL;DR: An open-source package developed to pseudonymize personal and confidential information contained in unstructured, noisy social media data to facilitate compliance with EU data protection obligations and the upholding of research ethics principles.
Journal Article
Geographic Named Entity Recognition by Employing Natural Language Processing and an Improved BERT Model
TL;DR: Wang et al. describe a Chinese toponym identification model based on a hybrid neural network, created with linguistic inconsistencies in mind. It adds a number of improvements to a standard bidirectional recurrent neural network model to help with location detection in social media messages.
Proceedings Article
UC3M-PUCPR at SemEval-2022 Task 11: An Ensemble Method of Transformer-based Models for Complex Named Entity Recognition
Elisa Terumi Rubel Schneider, Renzo M. Rivera Zavala, Paloma Martínez, Claudia Moro, Emerson Cabrera Paraiso +4 more
TL;DR: The preliminary results suggest that ensembles of contextualized language models can improve, even if modestly, the results of extracting information from unstructured data.
Effectively Leveraging BERT for Legal Document Classification
TL;DR: The authors investigate how to handle long documents and how important it is to pre-train on documents from the same domain as the target task, comparing models pre-trained on legal and other domains.
DreamDrug - A crowdsourced NER dataset for detecting drugs in darknet markets.
TL;DR: In this paper, the authors present a crowdsourced dataset for detecting mentions of drugs in noisy user-generated item listings from darknet markets, which contains nearly 15,000 manually annotated drug entities.
References
Proceedings Article
The Stanford CoreNLP Natural Language Processing Toolkit
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, David McClosky +5 more
TL;DR: The design and use of the Stanford CoreNLP toolkit is described, an extensible pipeline that provides core natural language analysis. It is suggested that the toolkit's wide use follows from a simple, approachable design, straightforward interfaces, the inclusion of robust and good-quality analysis components, and not requiring a large amount of associated baggage.
Book
Naming and Necessity
TL;DR: In this book, the author makes a connection between the mind-body problem and the so-called "identity thesis" in analytic philosophy, which has wide-ranging implications for other problems in philosophy that traditionally might be thought far removed.
Proceedings Article
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition
TL;DR: This paper describes the CoNLL-2003 shared task on language-independent named entity recognition (NER), giving background on the data sets and the evaluation method, along with a general overview of the systems that participated in the task and their performance.
Posted Content
NLTK: The Natural Language Toolkit
Edward Loper, Steven Bird +1 more
TL;DR: NLTK, the Natural Language Toolkit, is a suite of open source program modules, tutorials and problem sets, providing ready-to-use computational linguistics courseware that covers symbolic and statistical natural language processing.