Home
/
Authors
/
Gülşen Eryiğit

Author

Gülşen Eryiğit

Bio: Gülşen Eryiğit is an academic researcher from Istanbul Technical University. The author has contributed to research in topics: Treebank & Turkish. The author has an hindex of 18, co-authored 59 publications receiving 2485 citations.

Topics: Treebank, Turkish, Dependency (UML), Parsing, Syntax (programming languages) ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2008
2007
2006
2005
2004

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

SemEval-2016 task 5 : aspect based sentiment analysis

[...]

Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos¹, Suresh Manandhar², Mohammad AL-Smadi³, Mahmoud Al-Ayyoub³, Yanyan Zhao⁴, Bing Qin⁴, Orphée De Clercq⁵, Veronique Hoste⁶, Marianna Apidianaki, Xavier Tannier, Natalia V. Loukachevitch⁷, Evgeniy V. Kotelnikov, Núria Bel⁸, Salud María Jiménez-Zafra⁹, Gülşen Eryiğit¹⁰ - Show less +14 more•Institutions (10)

Athens University of Economics and Business¹, University of York², Jordan University of Science and Technology³, Harbin Institute of Technology⁴, Ghent University⁵, Hogeschool Gent⁶, Moscow State University⁷, Pompeu Fabra University⁸, University of Jaén⁹, Istanbul Technical University¹⁰

01 Jan 2016

TL;DR: This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015, which attracted 245 submissions from 29 teams and provided 19 training and 20 testing datasets.

...read moreread less

Abstract: This paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams.

...read moreread less

1,139 citations

Journal Article•DOI•

MaltParser: A language-independent system for data-driven dependency parsing

[...]

Joakim Nivre¹, Johan Hall, Jens Nilsson, Atanas Chanev², Gülşen Eryiğit³, Sandra Kübler⁴, Svetoslav Marinov⁵, Erwin Marsi⁶ - Show less +4 more•Institutions (6)

Uppsala University¹, University of Trento², Istanbul Technical University³, University of Tübingen⁴, University of Skövde⁵, Tilburg University⁶

01 Jan 2005-Natural Language Engineering

TL;DR: Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.

...read moreread less

Abstract: Parsing unrestricted text is useful for many language technology applications but requires parsing methods that are both robust and efficient. MaltParser is a language-independent system for data-driven dependency parsing that can be used to induce a parser for a new language from a treebank sample in a simple yet flexible manner. Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.

...read moreread less

801 citations

Journal Article•DOI•

Multiword expression processing: A survey

[...]

Mathieu Constant¹, Gülşen Eryiğit², Johanna Monti³, Lonneke van der Plas⁴, Carlos Ramisch⁵, Mike Rosner⁴, Amalia Todirascu⁶ - Show less +3 more•Institutions (6)

University of Lorraine¹, Istanbul Technical University², University of Naples Federico II³, University of Malta⁴, Aix-Marseille University⁵, University of Strasbourg⁶

01 Dec 2017-Computational Linguistics

TL;DR: A shared understanding of what is meant by “MWE processing” is offered, distinguishing the subtasks of MWE discovery and identification, and the interactions between MWE processing and two use cases: Parsing and machine translation are elucidated.

...read moreread less

Abstract: Multiword expressions MWEs are a class of linguistic forms spanning conventional word boundaries that are both idiosyncratic and pervasive across different languages. The structure of linguistic processing that depends on the clear distinction between words and phrases has to be re-thought to accommodate MWEs. The issue of MWE handling is crucial for NLP applications, where it raises a number of challenges. The emergence of solutions in the absence of guiding principles motivates this survey, whose aim is not only to provide a focused review of MWE processing, but also to clarify the nature of interactions between MWE processing and downstream applications. We propose a conceptual framework within which challenges and research contributions can be positioned. It offers a shared understanding of what is meant by "MWE processing," distinguishing the subtasks of MWE discovery and identification. It also elucidates the interactions between MWE processing and two use cases: Parsing and machine translation. Many of the approaches in the literature can be differentiated according to how MWE processing is timed with respect to underlying use cases. We discuss how such orchestration choices affect the scope of MWE-aware systems. For each of the two MWE processing subtasks and for each of the two use cases, we conclude on open issues and research perspectives.

...read moreread less

158 citations

Journal Article•DOI•

Dependency parsing of turkish

[...]

Gülşen Eryiğit¹, Joakim Nivre², Kemal Oflazer³•Institutions (3)

Istanbul Technical University¹, Uppsala University², Carnegie Mellon University³

01 Dec 2008-Computational Linguistics

TL;DR: An investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type, shows that morphological structure plays an essential role in finding syntactic relations in such a language.

...read moreread less

Abstract: The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.

...read moreread less

102 citations

Universal Dependencies 2.2

[...]

Joakim Nivre, Mitchell Abrams¹, Željko Agić², Lars Ahrenberg +261 more•Institutions (28)

01 Jul 2018

61 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

LIBSVM: A library for support vector machines

[...]

Chih-Chung Chang¹, Chih-Jen Lin¹•Institutions (1)

National Taiwan University¹

06 May 2011-ACM Transactions on Intelligent Systems and Technology

TL;DR: Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.

...read moreread less

Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.

...read moreread less

40,826 citations

Proceedings Article•

Open information extraction from the web

[...]

Michele Banko¹, Michael Cafarella¹, Stephen Soderland¹, Matt Broadhead¹, Oren Etzioni¹ - Show less +1 more•Institutions (1)

University of Washington¹

06 Jan 2007

TL;DR: Open Information Extraction (OIE) as mentioned in this paper is a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input.

...read moreread less

Abstract: Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time of seminars from a set of announcements). Shifting to a new domain requires the user to name the target relations and to manually create new extraction rules or hand-tag new training examples. This manual labor scales linearly with the number of target relations. This paper introduces Open IE (OIE), a new extraction paradigm where the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input. The paper also introduces TEXTRUNNER, a fully implemented, highly scalable OIE system where the tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries. We report on experiments over a 9,000,000 Web page corpus that compare TEXTRUNNER with KNOWITALL, a state-of-the-art Web IE system. TEXTRUNNER achieves an error reduction of 33% on a comparable set of extractions. Furthermore, in the amount of time it takes KNOWITALL to perform extraction for a handful of pre-specified relations, TEXTRUNNER extracts a far broader set of facts reflecting orders of magnitude more relations, discovered on the fly. We report statistics on TEXTRUNNER's 11,000,000 highest probability tuples, and show that they contain over 1,000,000 concrete facts and over 6,500,000 more abstract assertions.

...read moreread less

1,574 citations

Proceedings Article•

Parallel Data, Tools and Interfaces in OPUS

[...]

J"org Tiedemann¹•Institutions (1)

Uppsala University¹

01 May 2012

TL;DR: New data sets and their features, additional annotation tools and models provided from the website and essential interfaces and on-line services included in the OPUS project are reported.

...read moreread less

Abstract: This paper presents the current status of OPUS, a growing language resource of parallel corpora and related tools. The focus in OPUS is to provide freely available data sets in various formats together with basic annotation to be useful for applications in computational linguistics, translation studies and cross-linguistic corpus studies. In this paper, we report about new data sets and their features, additional annotation tools and models provided from the website and essential interfaces and on-line services included in the project.

...read moreread less

1,559 citations

Proceedings Article•DOI•

SemEval-2016 task 5 : aspect based sentiment analysis

[...]

01 Jan 2016

...read moreread less

1,139 citations

Proceedings Article•DOI•

SemEval-2017 Task 4: Sentiment Analysis in Twitter

[...]

Sara Rosenthal¹, Noura Farra¹, Preslav Nakov²•Institutions (2)

Columbia University¹, Qatar Computing Research Institute²

01 Aug 2017

TL;DR: Crowdourcing on Amazon Mechanical Turk was used to label a large Twitter training dataset along with additional test sets of Twitter and SMS messages for both subtasks, which included two subtasks: A, an expression-level subtask, and B, a message level subtask.

...read moreread less

Abstract: This paper describes the fifth year of the Sentiment Analysis in Twitter task. SemEval-2017 Task 4 continues with a rerun of the subtasks of SemEval-2016 Task 4, which include identifying the overall sentiment of the tweet, sentiment towards a topic with classification on a two-point and on a five-point ordinal scale, and quantification of the distribution of sentiment towards a topic across a number of tweets: again on a two-point and on a five-point ordinal scale. Compared to 2016, we made two changes: (i) we introduced a new language, Arabic, for all subtasks, and (ii) we made available information from the profiles of the Twitter users who posted the target tweets. The task continues to be very popular, with a total of 48 teams participating this year.

...read moreread less

1,107 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse