Author

Fabio Crestani

Other affiliations: University UCINF, University of Glasgow, Leiden University ...

Bio: Fabio Crestani is an academic researcher from the University of Lugano. The author has contributed to research on relevance and ranking in information retrieval, has an h-index of 40, and has co-authored 365 publications that have received 6,237 citations. Previous affiliations of Fabio Crestani include University UCINF and the University of Glasgow.

[Chart: papers published per year, 1993 to 2023]

Papers

Journal Article•DOI•

Like It or Not: A Survey of Twitter Sentiment Analysis Methods

Anastasia Giachanou¹, Fabio Crestani¹•Institutions (1)

University of Lugano¹

30 Jun 2016-ACM Computing Surveys

TL;DR: This survey reviews and categorizes the algorithms proposed for sentiment analysis in Twitter and discusses related fields that have recently attracted increasing attention, including Twitter opinion retrieval, tracking sentiments over time, irony detection, emotion detection, and tweet sentiment quantification.

Abstract: Sentiment analysis in Twitter is a field that has recently attracted research interest. Twitter is one of the most popular microblog platforms on which users can publish their thoughts and opinions. Sentiment analysis in Twitter tackles the problem of analyzing the tweets in terms of the opinion they express. This survey provides an overview of the topic by investigating and briefly describing the algorithms that have been proposed for sentiment analysis in Twitter. The presented studies are categorized according to the approach they follow. In addition, we discuss fields related to sentiment analysis in Twitter including Twitter opinion retrieval, tracking sentiments over time, irony detection, emotion detection, and tweet sentiment quantification, tasks that have recently attracted increasing attention. Resources that have been used in the Twitter sentiment analysis literature are also briefly presented. The main contributions of this survey include the presentation of the proposed approaches for sentiment analysis in Twitter, their categorization according to the technique they use, and the discussion of recent research trends of the topic and its related fields.

406 citations

Journal Article•DOI•

“Is this document relevant?…probably”: a survey of probabilistic models in information retrieval

Fabio Crestani¹, Mounia Lalmas¹, Cornelis J. van Rijsbergen¹, Iain Campbell¹•Institutions (1)

University of Glasgow¹

01 Dec 1998-ACM Computing Surveys

TL;DR: This survey outlines the basic concepts of probabilistic approaches to information retrieval, presents the principles and assumptions on which they are based, and describes, classifies, and compares the various models proposed in the development of IR using a common formalism.

Abstract: This article surveys probabilistic approaches to modeling information retrieval. The basic concepts of probabilistic approaches to information retrieval are outlined and the principles and assumptions upon which the approaches are based are presented. The various models proposed in the development of IR are described, classified, and compared using a common formalism. New approaches that constitute the basis of future research are described.

244 citations

Book Chapter•DOI•

A Test Collection for Research on Depression and Language Use

David E. Losada¹, Fabio Crestani²•Institutions (2)

University of Santiago de Compostela¹, University of Lugano²

05 Sep 2016

TL;DR: This paper describes a publicly available test collection on depression and language use, proposes a novel early detection task, and defines an effectiveness measure for comparing early detection algorithms that takes into account both the accuracy of an algorithm's decisions and the delay in detecting positive cases.

Abstract: Several studies in the literature have shown that the words people use are indicative of their psychological states. In particular, depression was found to be associated with distinctive linguistic patterns. However, there is a lack of publicly available data for doing research on the interaction between language and depression. In this paper, we describe our first steps to fill this gap. We outline the methodology we have adopted to build and make publicly available a test collection on depression and language use. The resulting corpus includes a series of textual interactions written by different subjects. The new collection not only encourages research on differences in language between depressed and non-depressed individuals, but also on the evolution of the language use of depressed individuals. Further, we propose a novel early detection task and define a novel effectiveness measure to systematically compare early detection algorithms. This new measure takes into account both the accuracy of the decisions taken by the algorithm and the delay in detecting positive cases. We also present baseline results with novel detection methods that process users’ interactions in different ways.

199 citations

Proceedings Article•DOI•

Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

Mohammad Aliannejadi¹, Hamed Zamani², Fabio Crestani¹, W. Bruce Croft²•Institutions (2)

University of Lugano¹, University of Massachusetts Amherst²

18 Jul 2019

TL;DR: This paper formulates the task of asking clarifying questions in open-domain information-seeking conversational systems and proposes a retrieval framework with three components: question retrieval, question selection, and document retrieval. The question selection model takes into account the original query and previous question-answer interactions when selecting the next question.

Abstract: Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s). In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.

193 citations

Book Chapter•DOI•

eRISK 2017: CLEF Lab on Early Risk Prediction on the Internet: Experimental Foundations

David E. Losada¹, Fabio Crestani², Javier Parapar³•Institutions (3)

University of Santiago de Compostela¹, University of Lugano², University of A Coruña³

11 Sep 2017

TL;DR: This paper provides an overview of eRisk 2017, the main purpose of which was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection.

Abstract: This paper provides an overview of eRisk 2017. This was the first year that this lab was organized at CLEF. The main purpose of eRisk was to explore issues of evaluation methodology, effectiveness metrics and other processes related to early risk detection. Early detection technologies can be employed in different areas, particularly those related to health and safety. The first edition of eRisk included a pilot task on early risk detection of depression.

122 citations

Cited by

Pattern Recognition and Machine Learning

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.

Abstract: Chapter list: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Journal Article•

Data Mining Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Journal Article•DOI•

Machine learning in automated text categorization

Fabrizio Sebastiani

01 Mar 2002-ACM Computing Surveys

TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

Abstract: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

7,539 citations

Book•

Opinion Mining and Sentiment Analysis

Bo Pang¹, Lillian Lee²•Institutions (2)

Yahoo!¹, Cornell University²

08 Jul 2008

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.

Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,452 citations

Book•

The Basics of Social Research (Chinese edition: 社会研究方法基础)

Earl R. Babbie, 沢奇邱

01 Jan 2002

TL;DR: This textbook introduces social research in four parts: an introduction to inquiry, the structuring of inquiry, modes of observation, and the analysis of data, covering both quantitative and qualitative approaches.

Abstract: Part I: AN INTRODUCTION TO INQUIRY. 1. Human Inquiry and Science. 2. Paradigms, Theory, and Research. 3. The Ethics and Politics of Social Research. Part II: THE STRUCTURING OF INQUIRY: QUANTITATIVE AND QUALITATIVE. 4. Research Design. 5. Conceptualization, Operationalization, and Measurement. 6. Indexes, Scales, and Typologies. 7. The Logic of Sampling. Part III: MODES OF OBSERVATION: QUANTITATIVE AND QUALITATIVE. 8. Experiments. 9. Survey Research. 10. Qualitative Field Research. 11. Unobtrusive Research. 12. Evaluation Research. Part IV: ANALYSIS OF DATA: QUANTITATIVE AND QUALITATIVE. 13. Qualitative Data Analysis. 14. Quantitative Data Analysis. 15. Reading and Writing Social Research. Appendix A. Using the Library. Appendix B. Random Numbers. Appendix C. Distribution of Chi Square. Appendix D. Normal Curve Areas. Appendix E. Estimated Sampling Error.

2,884 citations
