Author

Christos Baziotis

Other affiliations: Athens University of Economics and Business, National Technical University of Athens

Bio: Christos Baziotis is an academic researcher at the University of Edinburgh. The author has contributed to research on topics including language models and Word2vec. The author has an h-index of 11 and has co-authored 23 publications that have received 753 citations. Previous affiliations of Christos Baziotis include Athens University of Economics and Business and National Technical University of Athens.

Topics: Language model, Word2vec, SemEval, Transfer of learning, Emotion classification

Papers

Proceedings Article•DOI•

DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis.

Christos Baziotis¹, Nikos Pelekis², Christos Doulkeridis²•Institutions (2)

National Technical University of Athens¹, University of Piraeus²

01 Aug 2017

TL;DR: Two deep-learning systems that competed at SemEval-2017 Task 4 “Sentiment Analysis in Twitter” are presented, which use Long Short-Term Memory networks augmented with two kinds of attention mechanisms, on top of word embeddings pre-trained on a big collection of Twitter messages.

Abstract: In this paper we present two deep-learning systems that competed at SemEval-2017 Task 4 “Sentiment Analysis in Twitter”. We participated in all subtasks for English tweets, involving message-level and topic-based sentiment polarity classification and quantification. We use Long Short-Term Memory (LSTM) networks augmented with two kinds of attention mechanisms, on top of word embeddings pre-trained on a big collection of Twitter messages. Also, we present a text processing tool suitable for social network messages, which performs tokenization, word normalization, segmentation and spell correction. Moreover, our approach uses no hand-crafted features or sentiment lexicons. We ranked 1st (tie) in Subtask A, and achieved very competitive results in the rest of the Subtasks. Both the word embeddings and our text processing tool are available to the research community.

449 citations

Proceedings Article•DOI•

NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning.

Christos Baziotis¹, Athanasiou Nikolaos, Alexandra Chronopoulou², Athanasia Kolovou³, Georgios Paraskevopoulos⁴, Nikolaos Ellinas⁴, Shrikanth S. Narayanan⁵, Alexandros Potamianos⁴•Institutions (5)

Athens University of Economics and Business¹, University of Illinois at Urbana–Champaign², National and Kapodistrian University of Athens³, National Technical University of Athens⁴, University of Southern California⁵

01 Jun 2018

TL;DR: This paper proposes a Bi-LSTM architecture equipped with a multi-layer self-attention mechanism, which improves model performance, makes it possible to identify salient words in tweets, and gives insight into the models, making them more interpretable.

Abstract: In this paper we present deep-learning models that were submitted to the SemEval-2018 Task 1 competition: “Affect in Tweets”. We participated in all subtasks for English tweets. We propose a Bi-LSTM architecture equipped with a multi-layer self-attention mechanism. The attention mechanism improves the model performance and allows us to identify salient words in tweets, as well as gain insight into the models, making them more interpretable. Our model utilizes a set of word2vec word embeddings trained on a large collection of 550 million Twitter messages, augmented by a set of word affective features. Due to the limited amount of task-specific training data, we opted for a transfer learning approach by pretraining the Bi-LSTMs on the dataset of SemEval-2017 Task 4A. The proposed approach ranked 1st in Subtask E “Multi-Label Emotion Classification”, 2nd in Subtask A “Emotion Intensity Regression” and achieved competitive results in other subtasks.

98 citations

Proceedings Article•DOI•

An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models

Alexandra Chronopoulou¹, Christos Baziotis¹, Alexandros Potamianos¹,²•Institutions (2)

National Technical University of Athens¹, University of Southern California²

01 Jun 2019

TL;DR: This paper combines the task-specific optimization function with an auxiliary language model objective that is adjusted during training, which preserves the language regularities captured by language models while enabling sufficient adaptation for solving the target task.

Abstract: A growing number of state-of-the-art transfer learning methods employ language models pretrained on large generic corpora. In this paper we present a conceptually simple and effective transfer learning approach that addresses the problem of catastrophic forgetting. Specifically, we combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during the training process. This preserves language regularities captured by language models, while enabling sufficient adaptation for solving the target task. Our method does not require pretraining or finetuning separate components of the network and we train our models end-to-end in a single step. We present results on a variety of challenging affective and text classification tasks, surpassing well established transfer learning methods with greater level of complexity.
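
The idea of combining a task objective with an auxiliary language model objective that is adjusted during training can be illustrated with a minimal sketch. The abstract does not specify how the auxiliary term is adjusted, so the exponential decay schedule, the function names, and the default values below are assumptions made purely for illustration, not the authors' implementation.

```python
import math


def auxiliary_weight(step: int, initial_weight: float = 0.2, decay_rate: float = 0.01) -> float:
    """Hypothetical schedule: shrink the language-model weight exponentially with the step."""
    return initial_weight * math.exp(-decay_rate * step)


def combined_loss(task_loss: float, lm_loss: float, step: int) -> float:
    """Task loss plus the auxiliary language-model loss, scaled by the current weight."""
    return task_loss + auxiliary_weight(step) * lm_loss


if __name__ == "__main__":
    # The auxiliary term matters most early on and fades as training proceeds.
    for step in (0, 100, 500):
        print(step, combined_loss(task_loss=1.0, lm_loss=2.0, step=step))
```

Under this assumed schedule the language model term acts as a regularizer early in training and contributes less as the model adapts to the target task.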

89 citations

Proceedings Article•DOI•

SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression

Christos Baziotis¹, Ion Androutsopoulos², Ioannis Konstas, Alexandros Potamianos¹•Institutions (2)

National Technical University of Athens¹, Athens University of Economics and Business²

01 Apr 2019

TL;DR: The proposed sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables, achieves promising results in unsupervised sentence compression on benchmark datasets.

Abstract: Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compression, where the first and last sequences are the input and reconstructed sentences, respectively, while the middle sequence is the compressed sentence. Constraining the length of the latent word sequences forces the model to distill important information from the input. A pretrained language model, acting as a prior over the latent sequences, encourages the compressed sentences to be human-readable. Continuous relaxations enable us to sample from categorical distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. The proposed model does not require parallel text-summary pairs, achieving promising results in unsupervised sentence compression on benchmark datasets.
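
To make "continuous relaxations" of categorical sampling concrete, here is a small sketch of one widely used relaxation, the Gumbel-softmax. The abstract does not name the specific relaxation used, so the choice of Gumbel-softmax, the function name, and the parameters are assumptions for illustration only.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed (soft) one-hot sample over the categories given by `logits`."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    random.seed(0)
    # Lower temperatures push the output closer to a discrete one-hot vector.
    print(gumbel_softmax([1.0, 2.0, 3.0], temperature=1.0))
    print(gumbel_softmax([1.0, 2.0, 3.0], temperature=0.1))
```

Because the output is a smooth function of the logits, gradients can flow through the sampling step, which is the property that lets such models avoid reinforcement learning.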

76 citations

Posted Content•

An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models

Alexandra Chronopoulou¹, Christos Baziotis¹, Alexandros Potamianos¹•Institutions (1)

National Technical University of Athens¹

27 Feb 2019, arXiv: Computation and Language

TL;DR: The authors combine the task-specific optimization function with an auxiliary language model objective, which is adjusted during the training process to preserve language regularities captured by language models, while enabling sufficient adaptation for solving the target task.

41 citations

Cited by

Proceedings Article•DOI•

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

Suchin Gururangan¹, Ana Marasović¹,², Swabha Swayamdipta¹, Kyle Lo¹, Iz Beltagy¹, Doug Downey¹, Noah A. Smith¹,²•Institutions (2)

Allen Institute for Artificial Intelligence¹, University of Washington²

23 Apr 2020

TL;DR: It is consistently found that multi-phase adaptive pretraining offers large gains in task performance, and it is shown that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable.

Abstract: Language models pretrained on text from a wide variety of sources form the foundation of today’s NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task’s unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining. Finally, we show that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable. Overall, we consistently find that multi-phase adaptive pretraining offers large gains in task performance.

1,532 citations

Proceedings Article•DOI•

SemEval-2017 Task 4: Sentiment Analysis in Twitter

Sara Rosenthal¹, Noura Farra¹, Preslav Nakov²•Institutions (2)

Columbia University¹, Qatar Computing Research Institute²

01 Aug 2017

TL;DR: SemEval-2017 Task 4 reruns the subtasks of SemEval-2016 Task 4 (overall tweet sentiment, topic-based sentiment classification on two-point and five-point scales, and quantification), introduces Arabic as a new language for all subtasks, and makes available profile information for the Twitter users who posted the target tweets.

Abstract: This paper describes the fifth year of the Sentiment Analysis in Twitter task. SemEval-2017 Task 4 continues with a rerun of the subtasks of SemEval-2016 Task 4, which include identifying the overall sentiment of the tweet, sentiment towards a topic with classification on a two-point and on a five-point ordinal scale, and quantification of the distribution of sentiment towards a topic across a number of tweets: again on a two-point and on a five-point ordinal scale. Compared to 2016, we made two changes: (i) we introduced a new language, Arabic, for all subtasks, and (ii) we made available information from the profiles of the Twitter users who posted the target tweets. The task continues to be very popular, with a total of 48 teams participating this year.

1,107 citations

Journal Article•DOI•

Pre-trained Models for Natural Language Processing: A Survey

Xipeng Qiu¹, Tianxiang Sun¹, Yige Xu¹, Yunfan Shao¹, Ning Dai¹, Xuanjing Huang¹•Institutions (1)

Fudan University¹

18 Mar 2020, Science China Technological Sciences

TL;DR: This survey provides a comprehensive review of pre-trained models (PTMs) for natural language processing (NLP): it categorizes existing PTMs using a taxonomy from four perspectives, describes how to adapt their knowledge to downstream tasks, and outlines potential directions for future research.

Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

755 citations

Proceedings Article•DOI•

SemEval-2016 Task 4: Sentiment Analysis in Twitter

Preslav Nakov¹, Alan Ritter², Sara Rosenthal³, Fabrizio Sebastiani⁴, Veselin Stoyanov⁵•Institutions (5)

Qatar Foundation¹, Ohio State University², Columbia University³, Qatar Computing Research Institute⁴, Facebook⁵

01 Jun 2016

TL;DR: This paper discusses the fourth year of the Sentiment Analysis in Twitter task. SemEval-2016 Task 4 comprises five subtasks: two are reruns from prior years, and three new ones, which depart significantly from previous editions, focus on two variants of the basic sentiment classification in Twitter task.

Abstract: This paper discusses the fourth year of the “Sentiment Analysis in Twitter Task”. SemEval-2016 Task 4 comprises five subtasks, three of which represent a significant departure from previous editions. The first two subtasks are reruns from prior years and ask to predict the overall sentiment, and the sentiment towards a topic in a tweet. The three new subtasks focus on two variants of the basic “sentiment classification in Twitter” task. The first variant adopts a five-point scale, which confers an ordinal character to the classification task. The second variant focuses on the correct estimation of the prevalence of each class of interest, a task which has been called quantification in the supervised learning literature. The task continues to be very popular, attracting a total of 43 teams.

702 citations

Proceedings Article•DOI•

VIBE: Video Inference for Human Body Pose and Shape Estimation

Muhammed Kocabas¹, Nikos Athanasiou¹, Michael J. Black¹•Institutions (1)

Max Planck Society¹

14 Jun 2020

TL;DR: This work defines a novel temporal network architecture with a self-attention mechanism and shows that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels.

Abstract: Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training. To address this problem, we propose "Video Inference for Body Pose and Shape Estimation" (VIBE), which makes use of an existing large-scale motion capture dataset (AMASS) together with unpaired, in-the-wild, 2D keypoint annotations. Our key novelty is an adversarial learning framework that leverages AMASS to discriminate between real human motions and those produced by our temporal pose and shape regression networks. We define a novel temporal network architecture with a self-attention mechanism and show that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels. We perform extensive experimentation to analyze the importance of motion and demonstrate the effectiveness of VIBE on challenging 3D pose estimation datasets, achieving state-of-the-art performance. Code and pretrained models are available at https://github.com/mkocabas/VIBE

687 citations
