Home
/
Authors
/
Jennifer Foster

Author

Jennifer Foster

Other affiliations: University of North Texas, Trinity College, Dublin, University College Dublin

Bio: Jennifer Foster is an academic researcher from Dublin City University. The author has contributed to research in topics: Parsing & Treebank. The author has an hindex of 24, co-authored 116 publications receiving 2448 citations. Previous affiliations of Jennifer Foster include University of North Texas & Trinity College, Dublin.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2005
2004

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Code Mixing: A Challenge for Language Identification in the Language of Social Media

[...]

Utsab Barman¹, Amitava Das¹, Joachim Wagner¹, Jennifer Foster²•Institutions (2)

Dublin City University¹, University of North Texas²

01 Oct 2014

TL;DR: A new dataset is described, which contains Facebook posts and comments that exhibit code mixing between Bengali, English and Hindi, and it is found that the dictionary-based approach is surpassed by supervised classification and sequence labelling, and that it is important to take contextual clues into consideration.

...read moreread less

Abstract: In social media communication, multilingual speakers often switch between languages, and, in such an environment, automatic language identification becomes both a necessary and challenging task. In this paper, we describe our work in progress on the problem of automatic language identification for the language of social media. We describe a new dataset that we are in the process of creating, which contains Facebook posts and comments that exhibit code mixing between Bengali, English and Hindi. We also present some preliminary word-level language identification experiments using this dataset. Different techniques are employed, including a simple unsupervised dictionary-based approach, supervised word-level classification with and without contextual clues, and sequence labelling using Conditional Random Fields. We find that the dictionary-based approach is surpassed by supervised classification and sequence labelling, and that it is important to take contextual clues into consideration.

...read moreread less

273 citations

Proceedings Article•DOI•

DCU: Aspect-based Polarity Classification for SemEval Task 4

[...]

Joachim Wagner¹, Piyush Arora¹, Santiago Cortes¹, Utsab Barman¹, Dasha Bogdanova¹, Jennifer Foster¹, Lamia Tounsi¹ - Show less +3 more•Institutions (1)

Dublin City University¹

24 Aug 2014

TL;DR: The DCU team submitted one constrained run for the restaurant domain and one for the laptop domain for sub-task B (aspect term polarity prediction), ranking highest out of 36 systems on the restaurant test set and joint highest on the laptop test set.

...read moreread less

Abstract: We describe the work carried out by DCU on the Aspect Based Sentiment Analysis task at SemEval 2014. Our team submitted one constrained run for the restaurant domain and one for the laptop domain for sub-task B (aspect term polarity prediction), ranking highest out of 36 systems on the restaurant test set and joint highest out of 32 systems on the laptop test set.

...read moreread less

216 citations

Proceedings Article•

Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages

[...]

Djamé Seddah, Reut Tsarfaty, Sandra Kübler, Marie Candito, Jinho D. Choi, Richárd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola Galletebeitia, Yoav Goldberg¹, Spence Green², Nizar Habash, Marco Kuhlmann, Wolfgang Maier, Joakim Nivre, Adam Przepiórkowski, Ryan M. Roth, Wolfgang Seeker, Yannick Versley, Veronika Vincze, Marcin Woliński³, Alina Wróblewska, Éric Villemonte de la Clergerie - Show less +19 more•Institutions (3)

Bar-Ilan University¹, Stanford University², Polish Academy of Sciences³

18 Oct 2013

TL;DR: This paper presents and analyzes parsing results obtained by the task participants, and provides an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.

...read moreread less

Abstract: This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types. We present and analyze parsing results obtained by the task participants, and then provide an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.

...read moreread less

155 citations

Proceedings Article•

Statistical Parsing of Morphologically Rich Languages (SPMRL) What, How and Whither

[...]

Reut Tsarfaty¹, Djamé Seddah², Yoav Goldberg³, Sandra Kuebler⁴, Yannick Versley⁵, Marie Candito⁶, Jennifer Foster⁷, Ines Rehbein, Lamia Tounsi⁷ - Show less +5 more•Institutions (7)

Uppsala University¹, Paris-Sorbonne University², Ben-Gurion University of the Negev³, Indiana University⁴, University of Tübingen⁵, French Institute for Research in Computer Science and Automation⁶, Dublin City University⁷

05 Jun 2010

TL;DR: This paper synthesizes the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages and suggests itself as a source of directions for future investigations.

...read moreread less

Abstract: The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations.

...read moreread less

132 citations

Proceedings Article•

#hardtoparse: POS tagging and parsing the twitterverse

[...]

Jennifer Foster¹, Özlem Çetinoǧlu¹, Joachim Wagner¹, Joseph Le Roux², Stephen Hogan¹, Joakim Nivre³, Deirdre Hogan¹, Josef van Genabith¹ - Show less +4 more•Institutions (3)

Dublin City University¹, Centre national de la recherche scientifique², Uppsala University³

01 Jan 2011

TL;DR: Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement and is analysed by examining in detail the effect of the retraining on individual dependency types.

...read moreread less

Abstract: We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labelled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.

...read moreread less

125 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Phd by thesis

[...]

Richard Lathe¹•Institutions (1)

French Institute of Health and Medical Research¹

01 Apr 1988-Nature

TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.

...read moreread less

Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly of highto low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated muddominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debrisflow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

...read moreread less

9,929 citations

Journal Article•DOI•

Neural Network Acceptability Judgments

[...]

Alex Warstadt¹, Amanpreet Singh, Samuel R. Bowman¹•Institutions (1)

New York University¹

30 Sep 2019-Transactions of the Association for Computational Linguistics

TL;DR: This article used a corpus of 10,657 English sentences labeled as grammatical or ungrammatical from published linguistics literature to test the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence.

...read moreread less

Abstract: This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence. We introduce the Corpus of Linguistic Acceptability (CoLA), a set of 10,657 English sentences labeled as grammatical or ungrammatical from published linguistics literature. As baselines, we train several recurrent neural network models on acceptability classification, and find that our models outperform unsupervised models by Lau et al. (2016) on CoLA. Error-analysis on specific grammatical phenomena reveals that both Lau et al.’s models and ours learn systematic generalizations like subject-verb-object order. However, all models we test perform far below human level on a wide range of grammatical constructions.

...read moreread less

903 citations

Proceedings Article•

Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters

[...]

Olutobi Owoputi, Brendan O'Connor¹, Chris Dyer¹, Kevin Gimpel², Nathan Schneider, Noah A. Smith¹ - Show less +2 more•Institutions (2)

Carnegie Mellon University¹, Toyota Technological Institute at Chicago²

01 Jun 2013

TL;DR: This work systematically evaluates the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy on Twitter and achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks.

...read moreread less

Abstract: We consider the problem of part-of-speech tagging for informal, online conversational text. We systematically evaluate the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy. With these features, our system achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks; Twitter tagging is improved from 90% to 93% accuracy (more than 3% absolute). Qualitative analysis of these word clusters yields insights about NLP and linguistic phenomena in this genre. Additionally, we contribute the first POS annotation guidelines for such text and release a new dataset of English language tweets annotated using these guidelines. Tagging software, annotation guidelines, and large-scale word clusters are available at: http://www.ark.cs.cmu.edu/TweetNLP This paper describes release 0.3 of the “CMU Twitter Part-of-Speech Tagger” and annotated data. [This paper is forthcoming in Proceedings of NAACL 2013; Atlanta, GA, USA.]

...read moreread less

780 citations

Journal Article•DOI•

Language and the Internet

[...]

Jean Aitchison

01 Sep 2002-Literary and Linguistic Computing

764 citations

Proceedings Article•DOI•

Aspect Level Sentiment Classification with Deep Memory Network

[...]

Duyu Tang¹, Bing Qin², Ting Liu²•Institutions (2)

Microsoft¹, Harbin Institute of Technology²

01 Nov 2016

TL;DR: The authors proposed a deep memory network for aspect level sentiment classification, which explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect, such importance degree and text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory.

...read moreread less

Abstract: We introduce a deep memory network for aspect level sentiment classification. Unlike feature-based SVM and sequential neural models such as LSTM, this approach explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect. Such importance degree and text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on laptop and restaurant datasets demonstrate that our approach performs comparable to state-of-art feature based SVM system, and substantially better than LSTM and attention-based LSTM architectures. On both datasets we show that multiple computational layers could improve the performance. Moreover, our approach is also fast. The deep memory network with 9 layers is 15 times faster than LSTM with a CPU implementation.

...read moreread less

731 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse