Author

Ismail Sengor Altingovde

Bio: Ismail Sengor Altingovde is an academic researcher from Middle East Technical University. The author has contributed to research on topics including Web search queries and Web query classification. The author has an h-index of 18 and has co-authored 79 publications receiving 1039 citations. Previous affiliations of Ismail Sengor Altingovde include Bilkent University.

Papers published on a yearly basis (years shown): 2001, 2002, 2004, 2006–2021, 2023

Papers

Journal Article

Neural information retrieval: at the end of the early years

Kezban Dilek Onal¹,², Ye Zhang³, Ismail Sengor Altingovde¹, Md. Mustafizur Rahman³, Pinar Karagoz¹, Alexander Braylan³, Brandon Dang³, Heng-Lu Chang³, Henna Kim³, Quinten McNamara³, Aaron Angert⁴, Edward Banner⁵, Vivek Khetan³, Tyler McDonnell³, An Thanh Nguyen³, Dan Xu³, Byron C. Wallace⁵, Maarten de Rijke², Matthew Lease³

Middle East Technical University¹, University of Amsterdam², University of Texas at Austin³, IBM⁴, Northeastern University⁵

01 Jun 2018-Information Retrieval

TL;DR: The successes of neural IR thus far are highlighted, obstacles to its wider adoption are cataloged, and potentially promising directions for future research are suggested.

Abstract: A recent “third wave” of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed an explosive growth of research into NN-based approaches to information retrieval (IR). A significant body of work has now been created. In this paper, we survey the current landscape of Neural IR research, paying special attention to the use of learned distributed representations of textual units. We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research.

124 citations

Journal Article

Cost-Aware Strategies for Query Result Caching in Web Search Engines

Rifat Ozcan¹, Ismail Sengor Altingovde¹, Özgür Ulusoy¹

Bilkent University¹

01 May 2011-ACM Transactions on The Web

TL;DR: Simulation results using two large Web crawl datasets and a real query log reveal that the proposed approach improves overall system performance in terms of the average query execution time.

Abstract: Search engines and large-scale IR systems need to cache query results for efficiency and scalability purposes. Static and dynamic caching techniques (as well as their combinations) are employed to effectively cache query results. In this study, we propose cost-aware strategies for static and dynamic caching setups. Our research is motivated by two key observations: (i) query processing costs may vary significantly among different queries, and (ii) the processing cost of a query is not proportional to its popularity (i.e., frequency in the previous logs). The first observation implies that cache misses have different, that is, nonuniform, costs in this context. The latter observation implies that typical caching policies, based solely on query popularity, cannot always minimize the total cost. Therefore, we propose to explicitly incorporate the query costs into the caching policies. Simulation results using two large Web crawl datasets and a real query log reveal that the proposed approach improves overall system performance in terms of the average query execution time.

59 citations

Journal Article

Analyzing and Mining Comments and Comment Ratings on the Social Web

Stefan Siersdorfer, Sergiu Chelaru, Jose San Pedro¹, Ismail Sengor Altingovde², Wolfgang Nejdl

Telefónica¹, Middle East Technical University²

08 Jul 2014-ACM Transactions on The Web

TL;DR: An in-depth study of commenting and comment rating behavior on a sample of more than 10 million user comments on YouTube and Yahoo! News, which explores the applicability of machine learning and data mining to detect acceptance of comments by the community, comments likely to trigger discussions, controversial and polarizing content, and users exhibiting offensive commenting behavior.

Abstract: An analysis of the social video sharing platform YouTube and the news aggregator Yahoo! News reveals the presence of vast amounts of community feedback through comments for published videos and news stories, as well as through metaratings for these comments. This article presents an in-depth study of commenting and comment rating behavior on a sample of more than 10 million user comments on YouTube and Yahoo! News. In this study, comment ratings are considered first-class citizens. Their dependencies with textual content, thread structure of comments, and associated content (e.g., videos and their metadata) are analyzed to obtain a comprehensive understanding of the community commenting behavior. Furthermore, this article explores the applicability of machine learning and data mining to detect acceptance of comments by the community, comments likely to trigger discussions, controversial and polarizing content, and users exhibiting offensive commenting behavior. Results from this study have potential application in guiding the design of community-oriented online discussion platforms.

57 citations

Journal Article

Exploiting interclass rules for focused crawling

Ismail Sengor Altingovde¹, Özgür Ulusoy¹

Bilkent University¹

01 Nov 2004-IEEE Intelligent Systems

TL;DR: A rule-based Web-crawling approach that uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage and enhances the baseline crawler by supporting tunneling.

Abstract: Crawling the Web quickly and entirely is an expensive, unrealistic goal because of the required hardware and network resources. We started with a focused-crawling approach designed by Soumen Chakrabarti, Martin van den Berg, and Byron Dom, and we implemented the underlying philosophy of their approach to derive our baseline crawler. This crawler employs a canonical topic taxonomy to train a naive-Bayesian classifier, which then helps determine the relevancy of crawled pages. The crawler also relies on the assumption of topical locality to decide which URLs to visit next. Building on this crawler, we developed a rule-based crawler, which uses simple rules derived from interclass (topic) linkage patterns to decide its next move. This rule-based crawler also enhances the baseline crawler by supporting tunneling. A focused crawler gathers relevant Web pages on a particular topic. This rule-based Web-crawling approach uses linkage statistics among topics to improve a baseline focused crawler's harvest rate and coverage.

56 citations

Journal Article

Efficiency and effectiveness of query processing in cluster-based retrieval

Fazli Can¹, Ismail Sengor Altingovde², Engin Demir²

Miami University¹, Bilkent University²

01 Dec 2004-Information Systems

TL;DR: This study provides CBR efficiency and effectiveness experiments using the largest corpus in an environment that employs no user interaction or user behavior assumption for clustering and confirms that the approach is scalable and system performance improves with increasing database size.

42 citations

Cited by

Journal Article

Data Mining: Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Book

Information retrieval

C. J. Van Rijsbergen

01 Jan 1975

TL;DR: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval, which I think is one of the most interesting and active areas of research in information retrieval.

Abstract: The major change in the second edition of this book is the addition of a new chapter on probabilistic retrieval. This chapter has been included because I think this is one of the most interesting and active areas of research in information retrieval. There are still many problems to be solved so I hope that this particular chapter will be of some help to those who want to advance the state of knowledge in this area. All the other chapters have been updated by including some of the more recent work on the topics covered. In preparing this new edition I have benefited from discussions with Bruce Croft, … The material of this book is aimed at advanced undergraduate information (or computer) science students, postgraduate library science students, and research workers in the field of IR. Some of the chapters, particularly Chapter 6, make simple use of a little advanced mathematics. However, the necessary mathematical tools can be easily mastered from numerous mathematical texts that now exist and, in any case, references have been given where the mathematics occur. I had to face the problem of balancing clarity of exposition with density of references. I was tempted to give large numbers of references but was afraid they would have destroyed the continuity of the text. I have tried to steer a middle course and not compete with the Annual Review of Information Science and Technology. Normally one is encouraged to cite only works that have been published in some readily accessible form, such as a book or periodical. Unfortunately, much of the interesting work in IR is contained in technical reports and Ph.D. theses. For example, most of the work done on the SMART system at Cornell is available only in reports. Luckily many of these are now available through the National Technical Information Service (U.S.) and University Microfilms (U.K.). I have not avoided using these sources although if the same material is accessible more readily in some other form I have given it preference. I should like to acknowledge my considerable debt to many people and institutions that have helped me. Let me say first that they are responsible for many of the ideas in this book but that only I wish to be held responsible. My greatest debt is to Karen Sparck Jones who taught me to research information retrieval as an experimental science. Nick Jardine and Robin …

822 citations

Proceedings Article

Short text classification in twitter to improve information filtering

Bharath Sriram¹, Dave Fuhry¹, Engin Demir¹, Hakan Ferhatosmanoglu¹, Murat Demirbas²

Ohio State University¹, University at Buffalo²

19 Jul 2010

TL;DR: A small set of domain-specific features extracted from the author's profile and text is proposed for classifying short text messages into a predefined set of generic classes such as News, Events, Opinions, Deals, and Private Messages.

Abstract: In microblogging services such as Twitter, the users may become overwhelmed by the raw data. One solution to this problem is the classification of short text messages. As short texts do not provide sufficient word occurrences, traditional classification methods such as "Bag-Of-Words" have limitations. To address this problem, we propose to use a small set of domain-specific features extracted from the author's profile and text. The proposed approach effectively classifies the text to a predefined set of generic classes such as News, Events, Opinions, Deals, and Private Messages.

782 citations

Journal Article

Latent Dirichlet allocation (LDA) and topic modeling: models, applications, a survey

Hamed Jelodar¹, Yongli Wang¹, Chi Yuan¹, Xia Feng¹, Xiahui Jiang¹, Yanchao Li¹, Liang Zhao¹

Nanjing University of Science and Technology¹

01 Jun 2019-Multimedia Tools and Applications

TL;DR: In this article, the authors investigated scholarly articles (from 2003 to 2016) highly related to topic modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling.

Abstract: Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and applied it in various fields such as software engineering, political science, medical and linguistic science, etc. There are various methods for topic modeling; Latent Dirichlet Allocation (LDA) is one of the most popular in this field. Researchers have proposed various models based on the LDA in topic modeling. According to previous work, this paper will be very useful and valuable for introducing LDA approaches in topic modeling. In this paper, we investigated scholarly articles (from 2003 to 2016) highly related to topic modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling. In addition, we summarize challenges and introduce famous tools and datasets in topic modeling based on LDA.

608 citations

Posted Content

Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey

Hamed Jelodar¹, Yongli Wang¹, Chi Yuan¹, Xia Feng¹, Xiahui Jiang¹, Yanchao Li¹, Liang Zhao¹

Nanjing University of Science and Technology¹

12 Nov 2017-arXiv: Information Retrieval

TL;DR: In this article, the authors investigated the research development, current trends and intellectual structure of topic modeling based on Latent Dirichlet Allocation (LDA), and summarized challenges and introduced famous tools and datasets in topic modelling based on LDA.

Abstract: Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and applied it in various fields such as software engineering, political science, medical and linguistic science, etc. There are various methods for topic modeling, of which Latent Dirichlet allocation (LDA) is one of the most popular in this field. Researchers have proposed various models based on the LDA in topic modeling. According to previous work, this paper can be very useful and valuable for introducing LDA approaches in topic modeling. In this paper, we investigated scholarly articles (from 2003 to 2016) highly related to topic modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling. Also, we summarize challenges and introduce famous tools and datasets in topic modeling based on LDA.

546 citations
