Patent

Information retrieval self-adaption data fusion method

22 May 2013
TL;DR: This patent proposes a self-adaptive data fusion method for information retrieval that weights each member retrieval system by both its performance and its difference from the other members, and that can ensure the effectiveness of the fused results even with a small data volume.
Abstract: The invention discloses a self-adaptive data fusion method for information retrieval. For a group of member retrieval systems L_i (1 <= i <= t), the method comprises the following steps: 1, calculating the degree of difference between the results of every pair of retrieval systems; 2, calculating a difference weight for each system L_i (1 <= i <= t) from the results of step 1; 3, calculating a performance weight for each system using the performance-square weighting scheme; 4, calculating the final weight of each system from the results of steps 2 and 3; and 5, fusing the retrieved results by the linear combination method, using the weights calculated in step 4. The weight-updating method thus takes into account both the performance of each member retrieval model and the differences among the member retrieval models. Updating the weights requires only a small amount of data, such as the results produced for each query. Even with a small data volume, the method can ensure the effectiveness of the fused results, which makes it suitable for data fusion in information retrieval.
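The five steps in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented formulas: the abstract does not say how the degree of difference is measured or how the two weights are combined, so the sketch assumes one minus the Jaccard overlap of the returned document sets for step 1, the average of the pairwise differences for step 2, and the product of the difference weight and the performance weight for step 4. All function names are hypothetical. At least two member systems are required.

```python
from itertools import combinations


def difference_degree(a, b):
    """Step 1 (assumed measure): 1 - Jaccard overlap of two result lists."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def difference_weights(results):
    """Step 2 (assumed): mean difference degree of each system vs. all others."""
    t = len(results)
    totals = [0.0] * t
    for i, j in combinations(range(t), 2):
        d = difference_degree(results[i], results[j])
        totals[i] += d
        totals[j] += d
    return [total / (t - 1) for total in totals]


def performance_square_weights(performance):
    """Step 3: performance-square weighting, weight = performance ** 2."""
    return [p * p for p in performance]


def final_weights(diff_w, perf_w):
    """Step 4 (assumed combination): product of the two weights."""
    return [d * p for d, p in zip(diff_w, perf_w)]


def linear_combination(scored_results, weights):
    """Step 5: fused score of a document = sum of weight * score over systems."""
    fused = {}
    for scores, w in zip(scored_results, weights):
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + w * s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Made-up example: three systems, normalized scores, and one performance
# value per system (for instance average precision on training queries).
scored = [
    {"d1": 1.0, "d2": 0.5},
    {"d2": 1.0, "d3": 0.5},
    {"d1": 1.0, "d3": 0.5},
]
performance = [0.4, 0.3, 0.2]

diff_w = difference_weights([list(s) for s in scored])
weights = final_weights(diff_w, performance_square_weights(performance))
ranking = [doc for doc, _ in linear_combination(scored, weights)]
print(ranking)  # ['d1', 'd2', 'd3']
```

Because every step needs only the result lists and one performance value per system, the weights can be recomputed after each query, which matches the abstract's point that updating requires little data.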
Citations
Patent
19 Feb 2014
TL;DR: A method for testing the combined properties of search engine evaluation indexes: a testing device selects two or more datasets from those provided by TREC (Text REtrieval Conference), calculates in turn the scores of the query results returned by the search engines on a dataset according to an evaluation index, pairs the scores of all the search engines on the dataset, and analyzes the pairs with a two-tailed t-test against a set threshold value.
Abstract: A method for testing the combined properties of search engine evaluation indexes comprises the following steps, all carried out by a testing device: selecting two or more datasets from those provided by TREC (Text REtrieval Conference); calculating in turn the scores of the query results returned by the search engines on a dataset according to an evaluation index; pairing the scores of all the search engines on the dataset; analyzing the pairs with a two-tailed t-test against a set threshold value to judge whether the difference in search quality between each pair of search engines is significant; and, once the t-test values for all pairs have been obtained, calculating the proportion of pairs with a significant difference among all pairs. With this method, the t-test is applied to measuring the stability and sensitivity of the evaluation indexes, and the evaluation index with the best overall characteristics can be identified by calculating a single value.

6 citations
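The pairing and t-test procedure described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions, not the patent's implementation: it treats the "set threshold value" as a critical value for the absolute paired t statistic, and the function names and scores are made up. It uses only the Python standard library.

```python
import math
from itertools import combinations


def paired_t(xs, ys):
    """Paired t statistic for two equally long lists of per-query scores."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        # All differences identical: no spread to test against.
        return 0.0 if mean == 0.0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def significant_fraction(scores_by_engine, critical_t):
    """Share of engine pairs whose |t| exceeds the two-tailed critical value."""
    pairs = list(combinations(sorted(scores_by_engine), 2))
    significant = sum(
        1
        for a, b in pairs
        if abs(paired_t(scores_by_engine[a], scores_by_engine[b])) > critical_t
    )
    return significant / len(pairs)


# Made-up per-query scores of three engines under one evaluation index.
scores = {
    "A": [0.90, 0.80, 0.85, 0.95],
    "B": [0.50, 0.45, 0.40, 0.60],
    "C": [0.52, 0.43, 0.41, 0.58],
}

# 3.182 is the two-tailed critical t value for alpha = 0.05 with 3 degrees
# of freedom (4 queries per engine).
fraction = significant_fraction(scores, critical_t=3.182)
```

Here engine A differs significantly from both B and C, while B and C do not differ significantly from each other, so two of the three pairs count as significant. Computing this proportion for each evaluation index gives one number per index that can then be compared.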

Patent
04 Jun 2014
TL;DR: An information retrieval data fusion method based on retrieval result diversification is proposed, which can improve the effectiveness and diversity of fused results and can be applied to fusion problems over different types of data, such as documents, pictures, and medical records.
Abstract: The invention discloses an information retrieval data fusion method based on retrieval result diversification. The method includes the following steps: suppose there are t information retrieval systems in total, each of which searches the same database for the same query, yielding t results; for each result, the number of times its documents occur in the other results is counted; the difference value of each retrieval result i (1 <= i <= t) serves as its difference weight; each result is evaluated with the performance index ERR-IA20, and the resulting performance value serves as the performance weight of the corresponding information retrieval system; the difference weight and the performance weight are combined to calculate the comprehensive weight of each information retrieval system; the procedure is repeated over a group of queries, and the final weight of each information retrieval system is the average of the values obtained over all the queries; the retrieval results are then fused by the linear combination method using the calculated final weights. The method can improve the effectiveness and diversity of the fused results and can be applied to fusion problems over different types of data, such as documents, pictures, and medical records.

1 citation

Patent
11 Mar 2015
TL;DR: A data integration method supporting the diversification of information retrieval results is proposed. The method is mainly based on a complementary weight allocation strategy built on sub-topic coverage.
Abstract: The invention discloses a data integration method supporting the diversification of information retrieval results. The method is mainly based on a complementary weight allocation strategy built on sub-topic coverage. The calculation of the complementary weight mainly comprises the following steps: providing t information retrieval systems, each of which retrieves a corresponding result r1, r2, ..., rt from the same database for a given query q; establishing a super result r on the basis of two results ri and rj; evaluating ri, rj and r with a performance index to obtain a performance value for each; calculating the degree of complementation of ri with respect to rj from these performance values; calculating the complementary weight ci of each result ri (1 <= i <= t); and using the complementary weight either directly as the weight for linear combination or as one part of that weight. The method takes novelty into account on top of diversification, quantifies how much each result complements the whole, and can be used to integrate various types of data, such as text and pictures.
References
Patent
31 Dec 2008
TL;DR: A network multimedia searching and inquiry method integrating individualization and collaboration is presented, which includes the following steps: (1) existing semantic information is used to automatically annotate the semantics of media objects; (2) a user sub-file containing user information and personal preferences is established, and the searching system sorts and optimizes the search results according to the intention of the user; (3) the weight of each key phrase in the user sub-file is dynamically adjusted according to the relevance feedback of the user so as to reflect the user's intention more accurately; (4)
Abstract: The invention discloses a network multimedia searching and inquiry method integrating individualization and collaboration, which includes the following steps: (1) existing semantic information is used to automatically annotate the semantics of media objects; (2) a user sub-file containing user information and personal preferences is established, and the searching system sorts and optimizes the search results according to the intention of the user; (3) the weight of each key phrase in the user sub-file is dynamically adjusted according to the relevance feedback of the user so as to reflect the user's intention more accurately; (4) a multi-layer sub-file model comprising the user sub-file, the group sub-file and the community sub-file is established, with an inheritance and sharing mechanism maintained between the layers, so as to seek common ground while allowing differences and to support mass storage; (5) multi-modal information is fused and analyzed for multimedia semantic understanding so as to realize cross-modal multimedia object searching. The invention can accurately learn the intention of users and realize highly accurate, individualized and cross-modal multimedia searching.

51 citations

Journal Article
Shengli Wu
01 Jan 2012
TL;DR: This paper uses the multiple linear regression technique with estimated relevance scores and judged scores to obtain suitable weights and shows that the linear combination method with such a weighting strategy steadily outperforms the best component system and other data fusion methods by large margins.
Abstract: In information retrieval, data fusion (also known as meta-search) has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility, since different weights can be assigned to different component systems so as to obtain better fusion results. The key issue is how to assign good weights to all the component retrieval systems involved. Surprisingly, research in this field is limited and it is still an open question. In this paper, we use the multiple linear regression technique with estimated relevance scores and judged scores to obtain suitable weights. Although the multiple linear regression technique is not new, the way of using it in this paper has never been attempted before for the data fusion problem in information retrieval. Our experiments with five groups of runs submitted to TREC show that the linear combination method with such a weighting strategy steadily outperforms the best component system and other data fusion methods including CombSum, CombMNZ, PosFuse, MAPFuse, SegFuse, and the linear combination method with performance level/performance square weighting schemes by large margins.

40 citations
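The weighting idea in the abstract above, fitting linear combination weights by multiple linear regression of judged scores on the component systems' estimated scores, can be sketched as follows. This is a generic ordinary least-squares fit written with the standard library only, not the paper's code: the intercept term, the helper names, and the training data are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def regression_weights(system_scores, judged):
    """Least-squares fit; returns [intercept, weight_1, ..., weight_k]."""
    rows = [[1.0] + list(row) for row in system_scores]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, judged)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic training data: one row per (query, document) pair holding the
# estimated scores from two component systems. The judged scores are
# generated from known coefficients so the fit can be checked.
system_scores = [(0.9, 0.2), (0.4, 0.8), (0.7, 0.7), (0.1, 0.3), (0.5, 0.1)]
judged = [0.1 + 0.6 * s1 + 0.3 * s2 for s1, s2 in system_scores]

beta = regression_weights(system_scores, judged)
weights = beta[1:]  # the intercept does not change the fused ranking
```

With real relevance judgments the fit is not exact, and the fitted weights are then used in the linear combination in place of hand-set or performance-based weights.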

Patent
09 Dec 2009
TL;DR: In this article, a system for information retrieval within a database of large size includes a first module for extracting the descriptors associated with each object in the database, and for constructing a table containing the objects and the value of a descriptor associated with an object.
Abstract: A system for information retrieval within a database of large size includes a first module for extracting the descriptors associated with each object in the database, and for constructing a table containing the objects and the value of a descriptor associated with an object. The system also includes a second module for applying a number of classification algorithms to each of the tables obtained from the first module; a third module for fusing the results obtained from the second module in order to determine, for each type of descriptor, a class number associated with an object; a fourth module for finding which column of a table is closest to the column obtained during the first fusion step, and for selecting the closest map contained in the table, or best map; and a fifth module for fusing the aggregate "best maps" and applying an algorithm that searches for the best map to be transmitted to a display means.

3 citations