Book Chapter

The linear combination data fusion method in information retrieval

The multiple linear regression technique is used to obtain optimum weights for all involved component systems. The linear combination method with such weights is shown to consistently outperform, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance level/performance square weighting schemes.
In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility: different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open problem. In this paper, we use the multiple linear regression technique to obtain optimum weights for all involved component systems. The weights are optimum in the least squares sense: they minimize the difference between the scores estimated for all documents by linear combination and the judged scores of those documents. Our experiments with four groups of runs submitted to TREC show that the linear combination method with such weights consistently outperforms, by large margins, the best component system and other major data fusion methods such as CombSum, CombMNZ, and the linear combination method with performance level/performance square weighting schemes.
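
The least squares weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the array layout (one row per document, one column per component system), the inclusion of an intercept term, and the convention that a system gives score 0 to a document it did not retrieve are all assumptions made for illustration. CombSum and CombMNZ are included only as points of comparison.

```python
import numpy as np

def fit_fusion_weights(scores, relevance):
    """Fit least squares weights for linear combination fusion.

    scores:    (n_documents, n_systems) normalized scores (assumed layout)
    relevance: (n_documents,) judged scores, e.g. 1 = relevant, 0 = not
    Returns (intercept, weights). The intercept is an assumption of this
    sketch; it shifts all fused scores equally and does not change ranking.
    """
    scores = np.asarray(scores, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    # Column of ones lets the regression fit an intercept.
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, relevance, rcond=None)
    return coef[0], coef[1:]

def linear_combination(scores, weights, intercept=0.0):
    """Fused score of each document: intercept + sum_i weight_i * score_i."""
    return intercept + np.asarray(scores, dtype=float) @ np.asarray(weights)

def comb_sum(scores):
    """CombSum: unweighted sum of the component scores."""
    return np.asarray(scores, dtype=float).sum(axis=1)

def comb_mnz(scores):
    """CombMNZ: CombSum times the number of systems with a non-zero score."""
    scores = np.asarray(scores, dtype=float)
    return scores.sum(axis=1) * (scores > 0).sum(axis=1)
```

In practice the weights would be fitted on a set of queries with relevance judgments and then applied to the scores returned for new queries; documents are ranked by the fused score.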


Journal Article

The improvement of spatial-temporal resolution of PM2.5 estimation based on micro-air quality sensors by using data fusion technique.

TL;DR: A new data fusion method, a multi-sensor space-time data fusion framework based on Optimum Linear Data Fusion theory and integrated with a multi-time-step Kriging method for spatial-temporal estimation, is proposed; it is able to improve the estimation of PM2.5 concentration in space and time.
Proceedings Article

Fusion in Information Retrieval: SIGIR 2018 Half-Day Tutorial

TL;DR: The goal of this half-day, intermediate-level tutorial is to provide a methodological view of the theoretical foundations of fusion approaches, the numerous fusion methods that have been devised, and the variety of applications to which fusion techniques have been applied.
Proceedings Article

Mixture model with multiple centralized retrieval algorithms for result merging in federated search

TL;DR: To deal with heterogeneous information sources, a mixture probabilistic model is proposed that uses some training data to learn more appropriate combination weights for different types of information sources.
Journal Article

Evaluating score normalization methods in data fusion

TL;DR: In this paper, the authors evaluate four linear score normalization methods, namely the fitting method, zero-one, Sum, and ZMUV, through extensive experiments and show that the fitting method and zero-one appear to be the two leading methods.
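
Three of the normalization methods named above can be sketched as follows, using their commonly cited definitions: zero-one shifts the minimum to 0 and scales the maximum to 1, Sum shifts the minimum to 0 and scales the total to 1, and ZMUV shifts the mean to 0 and scales the variance to 1. This is an illustrative sketch, not code from the cited paper. The function names and the handling of constant score lists are assumptions, and the fitting method is left out because its parameters are not given here.

```python
import numpy as np

def zero_one(scores):
    """Zero-one: map the lowest score to 0 and the highest to 1."""
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    # Assumption: a constant score list is mapped to all zeros.
    return (s - s.min()) / span if span > 0 else np.zeros_like(s)

def sum_norm(scores):
    """Sum: shift the lowest score to 0, then scale so the scores sum to 1."""
    s = np.asarray(scores, dtype=float)
    shifted = s - s.min()
    total = shifted.sum()
    return shifted / total if total > 0 else np.zeros_like(s)

def zmuv(scores):
    """ZMUV: shift to zero mean and scale to unit variance."""
    s = np.asarray(scores, dtype=float)
    std = s.std()
    return (s - s.mean()) / std if std > 0 else np.zeros_like(s)
```

Each function is applied to the score list of one component system for one query, so that scores from different systems become comparable before they are fused.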
Journal Article

New Re-ranking Approach in Merging Search Results

TL;DR: Two methods of merging search results are compared: a) applying formulas to re-evaluate documents based on different combinations of returned order ranks, document titles, and snippets; b) the Top-Down Re-ranking algorithm (TDR), which gradually downloads top documents from each source, calculates their scores, and adds them to the final list.
Proceedings Article

Rank aggregation methods for the Web

TL;DR: A set of techniques for the rank aggregation problem is developed and its performance is compared to that of well-known methods, with the aim of designing rank aggregation techniques that can be used to combat spam in Web searches.
Proceedings Article

Combination of multiple searches

TL;DR: This paper describes one method that has been shown to increase performance by combining the similarity values from five different retrieval runs using both vector space and P-norm extended boolean retrieval methods.
Book Chapter

Developing a test collection for the evaluation of integrated search

TL;DR: The characteristics needed in an information retrieval (IR) test collection to facilitate the evaluation of integrated search, i.e. search across a range of different sources but with one search box and one ranked result list, are discussed, and a new test collection is described and analysed.
Proceedings Article

Models for metasearch

TL;DR: The experimental results show that metasearch algorithms based on the Borda and Bayesian models usually outperform the best input system and are competitive with, and often outperform, existing metasearch strategies.
Proceedings Article

Analyses of multiple evidence combination

Joon Ho Lee
TL;DR: This paper analyzes why improvements can be achieved with evidence combination, and proposes a combining method whose properties coincide with the rationale, and investigates the effect of using rank instead of similarity on retrieval effectiveness.