Author

Isabelle Augenstein

Bio: Isabelle Augenstein is an academic researcher from the University of Copenhagen. The author has contributed to research in topics including Computer science and Task (project management). The author has an h-index of 30 and has co-authored 147 publications receiving 3,446 citations. Previous affiliations of Isabelle Augenstein include the University of Sheffield and Heidelberg University.


Papers
Posted Content
TL;DR: This paper uses conditional LSTM encoding to classify the attitude expressed in a text towards a target such as Hillary Clinton as "positive", "negative", or "neutral", achieving state-of-the-art performance when weak supervision is added.
Abstract: Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton to be "positive", "negative" or "neutral". Previous work has assumed that either the target is mentioned in the text or that training data for every target is given. This paper considers the more challenging version of this task, where targets are not always mentioned and no training data is available for the test targets. We experiment with conditional LSTM encoding, which builds a representation of the tweet that is dependent on the target, and demonstrate that it outperforms encoding the tweet and the target independently. Performance is improved further when the conditional model is augmented with bidirectional encoding. We evaluate our approach on the SemEval 2016 Task 6 Twitter Stance Detection corpus achieving performance second best only to a system trained on semi-automatically labelled tweets for the test target. When such weak supervision is added, our approach achieves state-of-the-art results.

248 citations
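The conditional encoding this abstract describes is compact enough to sketch: one LSTM reads the target, and its final hidden and cell state initialize a second LSTM that reads the tweet, so the tweet representation depends on the target. Below is a minimal PyTorch sketch; the layer sizes, class count, and variable names are illustrative assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

class ConditionalEncoder(nn.Module):
    """Conditional encoding sketch: the target LSTM's final state
    initializes the tweet LSTM, conditioning the tweet on the target."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.target_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.tweet_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)  # positive / negative / neutral

    def forward(self, target_ids, tweet_ids):
        # Encode the target and keep its final (hidden, cell) state.
        _, state = self.target_lstm(self.embed(target_ids))
        # Reuse that state to initialize the tweet encoder.
        _, (h_n, _) = self.tweet_lstm(self.embed(tweet_ids), state)
        return self.classifier(h_n[-1])  # logits over the three stance labels

# Toy usage with dummy token ids (batch of 2).
model = ConditionalEncoder(vocab_size=1000)
target = torch.randint(0, 1000, (2, 5))   # short target phrase
tweet = torch.randint(0, 1000, (2, 20))   # tweet tokens
logits = model(target, tweet)             # shape (2, 3)
```

The bidirectional variant the abstract reports further gains from would replace each LSTM with a bidirectional one and concatenate the final states of both directions.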

Posted Content
TL;DR: The authors present their stance detection system, which claimed third place in Stage 1 of the Fake News Challenge, and propose it as the 'simple but tough-to-beat baseline' for the Fake News Challenge stance detection task.
Abstract: Identifying public misinformation is a complicated and challenging task. An important part of checking the veracity of a specific claim is to evaluate the stance different news sources take towards the assertion. Automatic stance evaluation, i.e. stance detection, would arguably facilitate the process of fact checking. In this paper, we present our stance detection system which claimed third place in Stage 1 of the Fake News Challenge. Despite our straightforward approach, our system performs at a competitive level with the complex ensembles of the top two winning teams. We therefore propose our system as the 'simple but tough-to-beat baseline' for the Fake News Challenge stance detection task.

217 citations
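The abstract calls the approach straightforward but does not spell out the features, so the following is only a loose sketch of a simple feature-based stance baseline in that spirit, not the authors' actual system: TF-IDF vectors of the headline and body plus their cosine similarity, fed to a small MLP. The scikit-learn code, feature dimensions, and toy data below are all assumptions for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

# Toy headline/body pairs standing in for Fake News Challenge data.
headlines = ["Scientists confirm new planet", "Scientists confirm new planet"]
bodies = ["A new planet was confirmed by astronomers today.",
          "The claim about a new planet has been thoroughly debunked."]
labels = ["agree", "disagree"]

# Shared TF-IDF vocabulary over headlines and bodies.
vec = TfidfVectorizer(max_features=5000)
vec.fit(headlines + bodies)
H = vec.transform(headlines).toarray()
B = vec.transform(bodies).toarray()

# Cosine similarity between headline and body as an extra feature.
sim = (H * B).sum(axis=1, keepdims=True) / (
    np.linalg.norm(H, axis=1, keepdims=True)
    * np.linalg.norm(B, axis=1, keepdims=True) + 1e-9)

X = np.hstack([H, B, sim])
clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500).fit(X, labels)
print(clf.predict(X))  # ['agree' 'disagree'] on the training pairs
```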

Proceedings ArticleDOI
01 Aug 2017
TL;DR: The authors describe a new SemEval task on extracting keyphrases, and relations between them, from scientific documents, which is crucial for understanding which publications describe which processes, tasks, and materials.
Abstract: We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials. Although this was a new task, we had a total of 26 submissions across 3 evaluation scenarios. We expect the task and the findings reported in this paper to be relevant for researchers working on understanding scientific content, as well as the broader knowledge base population and information extraction communities.

216 citations
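To make the extraction half of the task concrete, a naive candidate generator over noun chunks is a common starting point. The spaCy-based sketch below is a generic illustration only, not an official baseline from the shared task.

```python
import spacy

# Requires the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def candidate_keyphrases(text, max_words=4):
    """Return de-duplicated noun-chunk candidates up to max_words long."""
    doc = nlp(text)
    seen, candidates = set(), []
    for chunk in doc.noun_chunks:
        phrase = chunk.text.lower().strip()
        if len(phrase.split()) <= max_words and phrase not in seen:
            seen.add(phrase)
            candidates.append(phrase)
    return candidates

print(candidate_keyphrases(
    "We apply conditional random fields to keyphrase extraction "
    "from scientific articles."))
```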

Proceedings ArticleDOI
01 Nov 2016
TL;DR: The authors use conditional LSTM encoding, which builds a target-dependent representation of a tweet, to classify the attitude the tweet expresses towards a target, in the challenging setting where targets are not always mentioned and no training data is available for the test targets.
Abstract: Stance detection is the task of classifying the attitude expressed in a text towards a target such as “Climate Change is a Real Concern” to be “positive”, “negative” or “neutral”. Previous work has assumed that either the target is mentioned in the text or that training data for every target is given. This paper considers the more challenging version of this task, where targets are not always mentioned and no training data is available for the test targets. We experiment with conditional LSTM encoding, which builds a representation of the tweet that is dependent on the target, and demonstrate that it outperforms the independent encoding of tweet and target. Performance improves even further when the conditional model is augmented with bidirectional encoding. The method is evaluated on the SemEval 2016 Task 6 Twitter Stance Detection corpus and achieves performance second best only to a system trained on semi-automatically labelled tweets for the test target. When such weak supervision is added, our approach achieves state-of-the-art results.

213 citations

Proceedings ArticleDOI
01 Nov 2016
TL;DR: The authors release emoji2vec, pre-trained embeddings for all Unicode emoji, learned from their descriptions in the Unicode emoji standard, which can be readily used in downstream social natural language processing applications alongside word2vec.
Abstract: Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings. There currently exist several publicly-available, pre-trained sets of word embeddings, but they contain few or no emoji representations even as emoji usage in social media has increased. In this paper we release emoji2vec, pre-trained embeddings for all Unicode emoji which are learned from their description in the Unicode emoji standard. The resulting emoji embeddings can be readily used in downstream social natural language processing applications alongside word2vec. We demonstrate, for the downstream task of sentiment analysis, that emoji embeddings learned from short descriptions outperform a skip-gram model trained on a large collection of tweets, while avoiding the need for contexts in which emoji need to appear frequently in order to estimate a representation.

191 citations
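The training recipe the abstract outlines can be sketched: sum the vectors of the words in an emoji's Unicode description (frozen word2vec vectors in the paper), then learn an emoji embedding whose dot product with that sum is high for its own description and low for others. The sketch below substitutes random vectors for real word2vec embeddings and uses a tiny two-emoji dataset; it illustrates the idea and is not the released training code.

```python
import torch
import torch.nn as nn

# Stand-ins for frozen 300-d word2vec vectors of description words.
words = ["face", "with", "tears", "of", "joy", "red", "heart"]
word_vecs = {w: torch.randn(300) for w in words}
descriptions = {"😂": ["face", "with", "tears", "of", "joy"],
                "❤": ["red", "heart"]}

emoji_list = list(descriptions)
emoji_emb = nn.Embedding(len(emoji_list), 300)  # the parameters being learned
opt = torch.optim.Adam(emoji_emb.parameters(), lr=0.01)
bce = nn.BCEWithLogitsLoss()

def desc_vector(emoji):
    # An emoji's description vector is the sum of its word vectors.
    return torch.stack([word_vecs[w] for w in descriptions[emoji]]).sum(0)

for _ in range(100):
    for i in range(len(emoji_list)):
        for j in range(len(emoji_list)):
            # Positive pair: an emoji with its own description; others negative.
            logit = emoji_emb.weight[i] @ desc_vector(emoji_list[j])
            loss = bce(logit, torch.tensor(1.0 if i == j else 0.0))
            opt.zero_grad()
            loss.backward()
            opt.step()
```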


Cited by
Proceedings ArticleDOI
01 Nov 2018
TL;DR: The authors present GLUE (gluebenchmark.com), a benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models.
Abstract: Human ability to understand language is general, flexible, and robust. In contrast, most NLU models above the word level are designed for a specific task and struggle with out-of-domain data. If we aspire to develop models with understanding beyond the detection of superficial correspondences between inputs and outputs, then it is critical to develop a unified model that can execute a range of linguistic tasks across different domains. To facilitate research in this direction, we present the General Language Understanding Evaluation (GLUE, gluebenchmark.com): a benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models. For some benchmark tasks, training data is plentiful, but for others it is limited or does not match the genre of the test set. GLUE thus favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge-transfer across tasks. While none of the datasets in GLUE were created from scratch for the benchmark, four of them feature privately-held test data, which is used to ensure that the benchmark is used fairly. We evaluate baselines that use ELMo (Peters et al., 2018), a powerful transfer learning technique, as well as state-of-the-art sentence representation models. The best models still achieve fairly low absolute scores. Analysis with our diagnostic dataset yields similarly weak performance over all phenomena tested, with some exceptions.

3,225 citations
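As a practical aside that postdates the paper itself, the individual GLUE tasks can now be loaded programmatically, for example via the Hugging Face datasets library; test-set labels remain withheld, with scoring done through the gluebenchmark.com platform.

```python
from datasets import load_dataset

# Load one of the nine GLUE tasks; each ships with train/validation/test splits.
sst2 = load_dataset("glue", "sst2")
print(sst2["train"][0])  # e.g. {'sentence': ..., 'label': 0, 'idx': 0}
print(sst2)              # DatasetDict listing the available splits
```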

Journal ArticleDOI

3,181 citations

Posted Content
TL;DR: This article gives a general overview of multi-task learning (MTL), particularly in deep neural networks, and seeks to help ML practitioners apply MTL by shedding light on how it works and providing guidelines for choosing appropriate auxiliary tasks.
Abstract: Multi-task learning (MTL) has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery. This article aims to give a general overview of MTL, particularly in deep neural networks. It introduces the two most common methods for MTL in Deep Learning, gives an overview of the literature, and discusses recent advances. In particular, it seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks.

2,202 citations
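The most common deep-learning MTL method the article introduces, hard parameter sharing, amounts to a shared trunk with one output head per task. The PyTorch sketch below uses arbitrary layer sizes and two toy classification tasks.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task."""

    def __init__(self, in_dim=128, hidden=64, task_classes=(3, 2)):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, c) for c in task_classes)

    def forward(self, x, task_id):
        return self.heads[task_id](self.shared(x))

model = HardSharingMTL()
x = torch.randn(4, 128)
# Summing per-task losses trains the shared trunk on both tasks;
# the auxiliary task acts as a regularizer for the main one.
loss = (F.cross_entropy(model(x, 0), torch.randint(0, 3, (4,)))
        + F.cross_entropy(model(x, 1), torch.randint(0, 2, (4,))))
loss.backward()
```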