
European Conference on Information Retrieval 

About: The European Conference on Information Retrieval (ECIR) is an academic conference. It publishes mainly in the areas of computer science and ranking (information retrieval). Over its lifetime, the conference has published 2,006 papers, which have received 37,931 citations.


Papers
Book Chapter
Cyril Goutte, Eric Gaussier
21 Mar 2005
TL;DR: A probabilistic setting is used to obtain posterior distributions on the standard performance indicators (precision, recall and F-score) rather than point estimates; the framework is applied to the case where different methods are run on different datasets from the same source.
Abstract: We address the problems of (1) assessing the confidence of the standard point estimates, precision, recall and F-score, and (2) comparing the results, in terms of precision, recall and F-score, obtained using two different methods. To do so, we use a probabilistic setting which allows us to obtain posterior distributions on these performance indicators, rather than point estimates. This framework is applied to the case where different methods are run on different datasets from the same source, as well as the standard situation where competing results are obtained on the same data.
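The Bayesian idea in this abstract can be illustrated with a minimal sketch (not the authors' code): under a uniform Beta(1, 1) prior, the posterior over precision given TP true positives and FP false positives is Beta(TP + 1, FP + 1), and likewise Beta(TP + 1, FN + 1) for recall. The confusion counts below are invented for illustration.

```python
import random

# Hypothetical confusion counts for one method (illustration only)
TP, FP, FN = 80, 20, 40

# Standard point estimates
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f_score = 2 * precision * recall / (precision + recall)

# With a uniform Beta(1, 1) prior, the posterior over precision is
# Beta(TP + 1, FP + 1); sample from it to quantify uncertainty.
rng = random.Random(0)
prec_samples = sorted(rng.betavariate(TP + 1, FP + 1) for _ in range(10000))

# A 95% credible interval for precision, rather than a single point estimate
lo, hi = prec_samples[250], prec_samples[9750]
print(f"precision = {precision:.3f}, 95% credible interval ≈ ({lo:.3f}, {hi:.3f})")
```

With more data (larger counts) the interval tightens around the point estimate, which is exactly the confidence information a point estimate alone hides.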

1,402 citations

Book Chapter
18 Apr 2011
TL;DR: This paper empirically compares the content of Twitter with a traditional news medium, the New York Times, using unsupervised topic modeling, and reports findings useful for downstream IR or DM applications.
Abstract: Twitter as a new form of social media can potentially contain much useful information, but content analysis on Twitter has not been well studied. In particular, it is not clear whether as an information source Twitter can be simply regarded as a faster news feed that covers mostly the same information as traditional news media. In this paper we empirically compare the content of Twitter with a traditional news medium, the New York Times, using unsupervised topic modeling. We use a Twitter-LDA model to discover topics from a representative sample of Twitter. We then use text mining techniques to compare these Twitter topics with topics from the New York Times, taking into consideration topic categories and types. We also study the relation between the proportions of opinionated tweets and retweets and topic categories and types. Our comparisons yield interesting and useful findings for downstream IR or DM applications.

1,193 citations

Book Chapter
28 Mar 2010
TL;DR: The characteristics needed in an information retrieval (IR) test collection to facilitate the evaluation of integrated search, i.e. search across a range of different sources but with one search box and one ranked result list, are discussed, and a new test collection constructed for this purpose is described and analysed.
Abstract: The poster discusses the characteristics needed in an information retrieval (IR) test collection to facilitate the evaluation of integrated search, i.e. search across a range of different sources but with one search box and one ranked result list, and describes and analyses a new test collection constructed for this purpose. The test collection consists of approx. 18,000 monographic records, 160,000 papers and journal articles in PDF and 275,000 abstracts with a varied set of metadata and vocabularies from the physics domain, 65 topics based on real work tasks and corresponding graded relevance assessments. The test collection may be used for systems- as well as user-oriented evaluation.

1,039 citations

Book Chapter
Ryan McDonald
02 Apr 2007
TL;DR: This work defines a general framework for inference in summarization and presents three algorithms: a greedy approximate method, a dynamic programming approach based on solutions to the knapsack problem, and an exact algorithm that uses an Integer Linear Programming formulation of the problem.
Abstract: In this work we study the theoretical and empirical properties of various global inference algorithms for multi-document summarization. We start by defining a general framework for inference in summarization. We then present three algorithms: The first is a greedy approximate method, the second a dynamic programming approach based on solutions to the knapsack problem, and the third is an exact algorithm that uses an Integer Linear Programming formulation of the problem. We empirically evaluate all three algorithms and show that, relative to the exact solution, the dynamic programming algorithm provides near optimal results with preferable scaling properties.
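The dynamic programming approach mentioned above can be sketched as a 0/1 knapsack: each candidate sentence has a length (cost) and a relevance score (value), and the summarizer picks the subset maximizing total score within a length budget. This is an illustrative reimplementation, not McDonald's code; the sentence lengths and scores are invented.

```python
def knapsack_summary(sentences, budget):
    """Select sentences maximizing total relevance score subject to a
    total-length budget, via 0/1 knapsack dynamic programming.

    sentences: list of (length, score) pairs; budget: max total length.
    Returns (best_score, sorted indices of chosen sentences)."""
    n = len(sentences)
    # dp[i][b] = best score achievable using the first i sentences
    # within total length b
    dp = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = sentences[i - 1]
        for b in range(budget + 1):
            dp[i][b] = dp[i - 1][b]  # option 1: skip sentence i-1
            if length <= b:          # option 2: include sentence i-1
                dp[i][b] = max(dp[i][b], dp[i - 1][b - length] + score)
    # Trace back which sentences were selected
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if dp[i][b] != dp[i - 1][b]:
            chosen.append(i - 1)
            b -= sentences[i - 1][0]
    return dp[n][budget], sorted(chosen)

# Hypothetical (length, score) pairs and a budget of 15 words
best, chosen = knapsack_summary([(10, 3.0), (6, 2.5), (8, 2.0), (5, 1.5)], 15)
print(best, chosen)  # → 4.5 [1, 2]
```

The exact ILP formulation in the paper additionally handles constraints such as redundancy between sentences; this sketch covers only the plain length-budget case that the knapsack DP solves optimally.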

382 citations

Book Chapter
02 Apr 2007
TL;DR: This work formally evaluate and analyze the methods on a query-query similarity task using 363,822 queries from a web search log, and provides insights into the strengths and weaknesses of each method, including important tradeoffs between effectiveness and efficiency.
Abstract: Measuring the similarity between documents and queries has been extensively studied in information retrieval. However, there are a growing number of tasks that require computing the similarity between two very short segments of text. These tasks include query reformulation, sponsored search, and image retrieval. Standard text similarity measures perform poorly on such tasks because of data sparseness and the lack of context. In this work, we study this problem from an information retrieval perspective, focusing on text representations and similarity measures. We examine a range of similarity measures, including purely lexical measures, stemming, and language modeling-based measures. We formally evaluate and analyze the methods on a query-query similarity task using 363,822 queries from a web search log. Our analysis provides insights into the strengths and weaknesses of each method, including important tradeoffs between effectiveness and efficiency.
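The "purely lexical measures" the abstract refers to can be illustrated with two classic baselines for query-query similarity: Jaccard overlap of term sets and cosine similarity of term-frequency vectors. A minimal sketch (the example queries are invented, and this is not the paper's implementation):

```python
from collections import Counter
from math import sqrt

def jaccard(q1: str, q2: str) -> float:
    """Jaccard overlap of the two queries' term sets."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(q1: str, q2: str) -> float:
    """Cosine similarity of term-frequency vectors."""
    a, b = Counter(q1.lower().split()), Counter(q2.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

print(jaccard("cheap flights paris", "flights to paris"))  # 2 shared of 4 terms = 0.5
print(cosine("cheap flights paris", "flights to paris"))   # ≈ 0.667
```

Such measures fail on short texts precisely when related queries share no terms at all (e.g. "laptop" vs. "notebook computer"), which is the sparseness problem that motivates the richer representations studied in the paper.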

354 citations

Performance Metrics

No. of papers from the conference in previous years:

Year    Papers
2023    162
2022    110
2021    135
2020    145
2019    122
2018    85