
Showing papers on "Ranking (information retrieval)" published in 1995


Patent
17 Aug 1995
TL;DR: In this paper, a negotiated matching system includes a plurality of remote terminals associated with respective potential counterparties, a communications network for permitting communication between the remote terminals, and a matching station.
Abstract: A negotiated matching system includes a plurality of remote terminals associated with respective potential counterparties, a communications network for permitting communication between the remote terminals, and a matching station. Each user enters trading information and ranking information into his or her remote terminal. The matching station then uses the trading and ranking information from each user to identify transactions between counterparties that are mutually acceptable based on the ranking information, thereby matching potential counterparties to a transaction. Once a match occurs, the potential counterparties transmit negotiating messages to negotiate some or all terms of the transaction. Thus, the negotiated matching system first matches potential counterparties who are acceptable to each other based on trading and ranking information, and then enables the two counterparties to negotiate and finalize the terms of a transaction.

742 citations


Book ChapterDOI
06 Aug 1995
TL;DR: An algorithm for ranking spatial objects according to increasing distance from a query object is introduced and analyzed; it is well suited for k nearest neighbor queries and has the property that k need not be fixed in advance.
Abstract: An algorithm for ranking spatial objects according to increasing distance from a query object is introduced and analyzed. The algorithm makes use of a hierarchical spatial data structure. The intended application area is a database environment, where the spatial data structure serves as an index. The algorithm is incremental in the sense that objects are reported one by one, so that a query processor can use the algorithm in a pipelined fashion for complex queries involving proximity. It is well suited for k nearest neighbor queries, and has the property that k need not be fixed in advance.
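
A minimal sketch of the incremental ranking idea, in Python: a best-first traversal of a hierarchical index driven by a priority queue. The node interface (.children, .objects) and the two distance functions are hypothetical stand-ins, not the paper's actual data structure.

    import heapq

    def incremental_ranking(root, query, dist_node, dist_object):
        """Yield indexed objects in order of increasing distance from `query`.

        `root` is the top of a hierarchical spatial index; internal nodes expose
        .children, leaves expose .objects (a hypothetical interface). Node
        distances are assumed to lower-bound the distances of their contents.
        """
        heap = [(dist_node(query, root), 0, root)]      # (distance, tie-breaker, entry)
        counter = 1
        while heap:
            d, _, entry = heapq.heappop(heap)
            if hasattr(entry, "children"):              # internal index node
                for child in entry.children:
                    heapq.heappush(heap, (dist_node(query, child), counter, child))
                    counter += 1
            elif hasattr(entry, "objects"):             # leaf node
                for obj in entry.objects:
                    heapq.heappush(heap, (dist_object(query, obj), counter, obj))
                    counter += 1
            else:                                       # an actual spatial object
                yield entry, d                          # reported in distance order

Because objects are yielded one at a time, a caller can stop after any number of results, which is exactly the property that makes k nearest neighbor queries work without fixing k in advance.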

400 citations


Proceedings ArticleDOI
David D. Lewis1
01 Jul 1995
TL;DR: This work shows how to define what constitutes good effectiveness for binary text classification systems, tune the systems to achieve the highest possible effectiveness, and estimate how the effectiveness changes as new data is processed.
Abstract: Text retrieval systems typically produce a ranking of documents and let a user decide how far down that ranking to go. In contrast, programs that filter text streams, software that categorizes documents, agents which alert users, and many other IR systems must make decisions without human input or supervision. It is important to define what constitutes good effectiveness for these autonomous systems, tune the systems to achieve the highest possible effectiveness, and estimate how the effectiveness changes as new data is processed. We show how to do this for binary text classification systems, emphasizing that different goals for the system lead to different optimal behaviors. Optimizing and estimating effectiveness is greatly aided if classifiers that explicitly estimate the probability of class membership are used. Ranked retrieval is the information retrieval (IR) researcher's favorite tool for dealing with information overload. Ranked retrieval systems display documents in order of probability of relevance or some similar measure. Users see the best documents first, and decide how far down the ranking to go in examining the available information. The central role played by ranking in this approach has led researchers to evaluate IR systems primarily, often exclusively, on the quality of their rankings. (See, for instance, the TREC evaluations [1].) In some IR applications, however, ranking is not enough: A company provides an SDI (selective dissemination of information) service which filters newswire feeds. Relevant articles are faxed each morning to clients. Interaction between customer and system takes place infrequently. The cost of resources (tying up phone lines, fax machine paper, etc.) is a factor to consider in operating the system. A text categorization system assigns controlled vocabulary categories to incoming documents as they are stored in a text database. Cost cutting has eliminated manual checking of category assignments.
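
As a small illustration of the decision-theoretic point (made-up utility numbers, not Lewis's): if a classifier emits calibrated probabilities of class membership, the accept/reject threshold follows directly from the utilities the system's goal assigns to the four outcomes.

    def optimal_threshold(u_tp, u_fp, u_fn, u_tn):
        """Probability threshold above which accepting a document maximizes
        expected utility, given utilities for true/false positives/negatives."""
        # Accept when p*u_tp + (1-p)*u_fp >= p*u_fn + (1-p)*u_tn.
        return (u_tn - u_fp) / ((u_tn - u_fp) + (u_tp - u_fn))

    # Example: faxing an irrelevant article costs twice what a relevant one earns.
    t = optimal_threshold(u_tp=1.0, u_fp=-2.0, u_fn=0.0, u_tn=0.0)
    print(t)                      # 0.666...: only fairly confident documents are sent
    print(0.8 >= t, 0.5 >= t)     # True False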

397 citations


Patent
15 Sep 1995
TL;DR: In this article, a method is described for performing a search of a database in an information retrieval system in response to a query having at least one query word with a query word weight, applying the query word to the database, and selecting information from the information retrieval system in accordance with the query word.
Abstract: A method for performing a search of a database in an information retrieval system in response to a query having at least one query word with a query word weight and for applying the query word to the database and selecting information from the information retrieval system in accordance with the query word. A query word is selected and assigned a weight. The weight is adjusted depending on whether the query word is a proper noun or stop word. The adjusting can be an increase or a decrease in the weight. Information is selected from the information retrieval system in accordance with the adjusted weight.

342 citations


Journal ArticleDOI
Yiyu Yao1
TL;DR: A new measure of system performance is suggested based on the distance between user ranking and system ranking; it only uses the relative order of documents and therefore conforms to the valid use of an ordinal scale for measuring relevance.
Abstract: The notion of user preference is adopted for the representation, interpretation, and measurement of the relevance or usefulness of documents. User judgments on documents may be formally described by a weak order (i.e., user ranking) and measured using an ordinal scale. Within this framework, a new measure of system performance is suggested based on the distance between user ranking and system ranking. It only uses the relative order of documents and therefore conforms to the valid use of an ordinal scale measuring relevance. It is also applicable to multilevel relevance judgments and ranked system output. The appropriateness of the proposed measure is demonstrated through an axiomatic approach. The inherent relationships between the new measure and many existing measures provide further supporting evidence.
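
To make the "relative order only" idea concrete, here is a generic Kendall-style count of document pairs ordered differently by the user and by the system. It is a sketch in the same ordinal spirit, not Yao's exact axiomatically derived measure, which differs at least in normalization and tie handling.

    from itertools import combinations

    def order_disagreements(user_rank, system_rank):
        """Count document pairs whose relative order differs between two rankings.

        Both arguments map document ids to rank values (smaller = better);
        equal values express ties, so multilevel relevance judgments fit directly.
        """
        disagreements = 0
        for a, b in combinations(list(user_rank), 2):
            u = (user_rank[a] > user_rank[b]) - (user_rank[a] < user_rank[b])
            s = (system_rank[a] > system_rank[b]) - (system_rank[a] < system_rank[b])
            if u != s:
                disagreements += 1
        return disagreements

    user = {"d1": 1, "d2": 1, "d3": 2}        # d1 and d2 judged equally relevant
    system = {"d1": 1, "d2": 3, "d3": 2}
    print(order_disagreements(user, system))  # 2: pairs (d1,d2) and (d2,d3) disagree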

187 citations


Journal ArticleDOI
TL;DR: A “softening” of the hard Boolean scheme for information retrieval is presented, in which linguistic quantifiers are defined which capture the intrinsic vagueness of information needs.
Abstract: A “softening” of the hard Boolean scheme for information retrieval is presented. In this approach, information retrieval is seen as a multicriteria decision-making activity in which the criteria to be satisfied by the potential solutions, i.e., the archived documents, are the requirements expressed in the query. The retrieval function is then an overall decision function evaluating the degree to which each potential solution satisfies a query consisting of information requirements aggregated by operators. Linguistic quantifiers and a connector dealing with primary and optional criteria are defined and introduced in the query language in order to specify the aggregation criteria of the single query requirements. These criteria make it possible for users to express queries in a simple and self-explanatory manner. In particular, linguistic quantifiers are defined which capture the intrinsic vagueness of information needs. © 1995 John Wiley & Sons, Inc.
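
Linguistic quantifiers of this kind are commonly realized as ordered weighted averaging (OWA) over the per-requirement satisfaction degrees; the sketch below follows that general recipe with made-up quantifier shapes, and is not necessarily the authors' exact formulation.

    def quantifier_weights(n, q):
        """OWA weights induced by a monotone quantifier q: [0,1] -> [0,1]."""
        return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]

    def owa(satisfactions, q):
        """Aggregate per-requirement satisfaction degrees under quantifier q."""
        ordered = sorted(satisfactions, reverse=True)    # best-satisfied criteria first
        return sum(w * s for w, s in zip(quantifier_weights(len(ordered), q), ordered))

    most          = lambda x: max(0.0, min(1.0, 2 * x - 0.6))   # fuzzy "most"
    at_least_half = lambda x: min(1.0, 2 * x)

    doc = [0.9, 0.7, 0.2, 0.1]        # how well a document meets four query requirements
    print(owa(doc, most))             # stricter, conjunction-like score
    print(owa(doc, at_least_half))    # more permissive score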

137 citations


Journal ArticleDOI
Kui Lam Kwok1
TL;DR: How probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network is shown and performance of feedback improves substantially over no feedback, and further gains are obtained when queries are expanded with terms from the feedback documents.
Abstract: In this article we show how probabilistic information retrieval based on document components may be implemented as a feedforward (feedbackward) artificial neural network. The network supports adaptation of connection weights as well as the growing of new edges between queries and terms based on user relevance feedback data for training, and it reflects query modification and expansion in information retrieval. A learning rule is applied that can also be viewed as supporting sequential learning using a harmonic sequence learning rate. Experimental results with four standard small collections and a large Wall Street Journal collection (173,219 documents) show that performance of feedback improves substantially over no feedback, and further gains are obtained when queries are expanded with terms from the feedback documents. The effect is much more pronounced in small collections than in the large collection. Query expansion may be considered as a tool for both precision and recall enhancement. In particular, small query expansion levels of about 30 terms can achieve most of the gains at the low-recall high-precision region, while larger expansion levels continue to provide gains at the high-recall low-precision region of a precision-recall curve.
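
A rough, non-network illustration of the expansion step reported in the experiments: roughly 30 new terms drawn from the judged-relevant documents are added to the query. The frequency-based term selection and the fixed weight for new terms here are assumptions, not Kwok's learned weights.

    from collections import Counter

    def expand_query(query_terms, relevant_docs, n_expansion=30, new_term_weight=0.5):
        """Add the most frequent feedback-document terms to a weighted query.

        query_terms:   {term: weight} for the original query
        relevant_docs: iterable of token lists judged relevant by the user
        """
        counts = Counter(t for doc in relevant_docs for t in doc)
        expanded = dict(query_terms)
        added = 0
        for term, _ in counts.most_common():
            if added >= n_expansion:
                break
            if term not in expanded:
                expanded[term] = new_term_weight
                added += 1
        return expanded

    q = {"market": 1.0, "crash": 1.0}
    feedback = [["market", "stocks", "plunge", "dow"], ["crash", "dow", "panic"]]
    print(expand_query(q, feedback, n_expansion=3))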

109 citations


Proceedings ArticleDOI
01 Jul 1995
TL;DR: This study explores the highly effective approach of feeding back passages of large documents; a less expensive method which discards long documents is also reviewed and found to be effective if there are enough relevant documents.
Abstract: Relevance Feedback With Too Much Data. UMass Technical Report IR-57 (ref. TR95-6), James Allan. Modern text collections often contain large documents which span several subject areas. Such documents are problematic for relevance feedback since inappropriate terms can easily be chosen. This study explores the highly effective approach of feeding back passages of large documents. A less expensive method which discards long documents is also reviewed and found to be effective if there are enough relevant documents. A hybrid approach which feeds back short documents and passages of long documents may be the best compromise.
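
The hybrid strategy from the last sentence might be sketched as follows: short documents are fed back whole, while only the best-matching passage of each long document is used. The fixed-size passage splitter, the length cutoff, and the term-overlap scoring are placeholders, not the report's exact choices.

    def feedback_units(relevant_docs, query_terms, max_len=2000, passage_len=200):
        """Choose the text units to feed back: whole short documents, best
        passages of long ones."""
        def best_passage(tokens):
            passages = [tokens[i:i + passage_len]
                        for i in range(0, len(tokens), passage_len)]
            return max(passages, key=lambda p: sum(1 for t in p if t in query_terms))

        units = []
        for tokens in relevant_docs:
            if len(tokens) <= max_len:
                units.append(tokens)                 # short document: use as-is
            else:
                units.append(best_passage(tokens))   # long document: best passage only
        return units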

103 citations


Patent
11 Dec 1995
TL;DR: In this paper, the user can reorder the hit list by prioritizing the contribution of individual query elements to override the overall rank and by assigning additional weight(s) to those contributions.
Abstract: In an information retrieval system, a query issued by the user is analyzed by a query engine into query elements. After the query has been evaluated against the document collections, a resulting hit list is presented to the user, e.g., as a table. The presented hit list displays not only an overall rank of a document but also a contribution of each query element to the rank of the document. The user can reorder the hit list by prioritizing the contribution of individual query elements to override the overall rank and by assigning additional weight(s) to those contributions.
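
The reordering step could look roughly like the following, where each hit carries a per-element contribution breakdown and the user supplies extra weights for the query elements they care about; the field names and weights are illustrative, not taken from the patent.

    def reorder_hits(hits, element_weights):
        """Re-rank a hit list from per-query-element score contributions.

        hits:            list of dicts like {"doc": id, "contributions": {element: score}}
        element_weights: {element: extra weight}; unlisted elements keep weight 1.0
        """
        def adjusted_score(hit):
            return sum(score * element_weights.get(elem, 1.0)
                       for elem, score in hit["contributions"].items())
        return sorted(hits, key=adjusted_score, reverse=True)

    hits = [
        {"doc": "d1", "contributions": {"python": 0.4, "tutorial": 0.6}},
        {"doc": "d2", "contributions": {"python": 0.8, "tutorial": 0.1}},
    ]
    print([h["doc"] for h in reorder_hits(hits, {})])             # ['d1', 'd2']
    print([h["doc"] for h in reorder_hits(hits, {"python": 3})])  # ['d2', 'd1']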

88 citations


Proceedings Article
01 Jan 1995
TL;DR: To address the TREC-4 topics, this work used a precise query language that yields and combines arbitrary intervals of text rather than pre-defined units like words and documents; the ad-hoc results compare favourably with the median average precision.
Abstract: To address the TREC-4 topics, we used a precise query language that yields and combines arbitrary intervals of text rather than pre-defined units like words and documents. Each solution was scored in inverse proportion to the length of the shortest interval containing it. Each document was scored by the sum of the scores of solutions within it. Whenever the above strategy yielded fewer than 1000 documents, documents satisfying successively weaker queries were added with lower rank. Our results for the ad-hoc topics compare favourably with the median average precision for all groups.
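
The scoring rule itself is compact enough to state directly; the interval solver is abstracted away here, and `solutions` is assumed to be a list of (start, end) token positions of query solutions inside one document.

    def solution_score(interval):
        """Score a solution in inverse proportion to the length of the
        shortest interval containing it."""
        start, end = interval
        return 1.0 / (end - start + 1)

    def document_score(solutions):
        """Score a document by the sum of the scores of the solutions within it."""
        return sum(solution_score(iv) for iv in solutions)

    # Two tight matches outscore one sprawling match.
    print(document_score([(10, 12), (40, 41)]))   # 1/3 + 1/2 ≈ 0.83
    print(document_score([(10, 300)]))            # ≈ 0.003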

87 citations


Journal ArticleDOI
TL;DR: A series of experiments conducted using a specific implementation of an inference network based probabilistic retrieval model to study the retrieval effectiveness of combining manual and automatic index representations in queries and documents indicates that significant benefits in retrieval effectiveness can be obtained through combined representations.
Abstract: Results from research in information retrieval suggest that significant improvements in retrieval effectiveness could be obtained by combining results from multiple index representations and query strategies. Recently, an inference network based probabilistic retrieval model has been proposed, which views information retrieval as an evidential reasoning process in which multiple sources of evidence about document and query content are combined to estimate the relevance probabilities. In this paper we report a series of experiments we conducted using a specific implementation of this model to study the retrieval effectiveness of combining manual and automatic index representations in queries and documents. The results indicate that significant benefits in retrieval effectiveness can be obtained through combined representations.

Journal ArticleDOI
TL;DR: Probabilistic retrieval, based on BI assumptions and applied to simple subject descriptions of documents and queries, can retrieve all relevant documents and only relevant documents, when term relevance weights are computed accurately.
Abstract: Computing formulas for binary independent (BI) term relevance weights are evaluated as a function of query representations and retrieval expectations in the CF database. Query representations consist of the limited set of terms appearing in each query statement and the complete set of terms appearing in the database. Retrieval expectations include comprehensive searches, for which many relevant documents are sought, and specific searches, for which only a few documents have merit. Conventional computing equations, which are known to overestimate term relevance weights, are shown to produce mediocre results for all combinations of query representations and retrieval expectations. Modified computing equations, which do not overestimate relevance weights, produce essentially perfect retrieval results for both comprehensive and specific searches, when the query representation is complete. Probabilistic retrieval, based on BI assumptions and applied to simple subject descriptions of documents and queries, can retrieve all relevant documents and only relevant documents, when term relevance weights are computed accurately.
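
For orientation, the conventional binary-independence (Robertson/Sparck Jones) weight with the usual 0.5 correction is shown below; this is the baseline formulation the paper says overestimates weights, so treat it as the starting point rather than the paper's modified equations.

    import math

    def bi_relevance_weight(r, R, n, N):
        """Conventional binary-independence term relevance weight.

        r: relevant documents containing the term     R: relevant documents
        n: documents containing the term              N: documents in the collection
        """
        return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                        ((n - r + 0.5) * (R - r + 0.5)))

    # A term in 8 of 10 relevant documents but only 20 of 1000 overall is
    # strongly positive evidence of relevance.
    print(bi_relevance_weight(r=8, R=10, n=20, N=1000))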

Journal ArticleDOI
01 Apr 1995
TL;DR: The experience with query refinement has convinced us that the expansion of query fragments is essential in helping one use a large, dynamically changing, heterogeneous distributed information system.
Abstract: We have built an HTTP based resource discovery system called Discover that provides a single point of access to over 500 WAIS servers. Discover provides two key services: query refinement and query routing. Query refinement helps a user improve a query fragment to describe the user's interests more precisely. Once a query has been refined and describes a manageable result set, query routing automatically forwards the query to the WAIS servers that contain relevant documents. Abbreviated descriptions of WAIS sites called content labels are used by the query refinement and query routing algorithms. Our experimental results suggest that query refinement in conjunction with query routing provides an effective way to discover resources in a large universe of documents. Our experience with query refinement has convinced us that the expansion of query fragments is essential in helping one use a large, dynamically changing, heterogeneous distributed information system.
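
In outline, query routing over content labels can be as simple as the following sketch; the label representation and the overlap threshold are assumptions, not Discover's actual data structures or algorithm.

    def route_query(query_terms, content_labels, min_overlap=1):
        """Forward a query only to servers whose content labels overlap it.

        content_labels: {server_name: set of descriptor terms}
        """
        query = set(query_terms)
        return [server for server, label in content_labels.items()
                if len(query & label) >= min_overlap]

    labels = {
        "wais://cs.papers":   {"retrieval", "database", "algorithm"},
        "wais://med.library": {"clinical", "anatomy", "surgery"},
    }
    print(route_query(["retrieval", "ranking"], labels))   # ['wais://cs.papers']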

Proceedings ArticleDOI
01 Jul 1995
TL;DR: An approach called cooperative indexing is developed that provides a framework to achieve both scalability and full integration of IR and RDBMS technology; experimental findings validate the scheme and suggest alternatives to further improve performance.
Abstract: The full integration of information retrieval (IR) features into a database management system (DBMS) has long been recognized as both a significant goal and a challenging undertaking. By full integration we mean: i) support for document storage, indexing, retrieval, and update, ii) transaction semantics, thus all database operations on documents have the ACID properties of atomicity, consistency, isolation, and durability, iii) concurrent addition, update, and retrieval of documents, and iv) database query language extensions to provide ranking for document retrieval operations. It is also necessary for the integrated offering to exhibit scalable performance for document indexing and retrieval processes. To identify the implementation requirements imposed by the desired level of integration, we layered a representative IR application on Oracle Rdb and then conducted a number of database load and document retrieval experiments. The results of these experiments suggest that infrastructural extensions are necessary to obtain both the desired level of IR integration and scalable performance. With the insight gained from our initial experiments, we developed an approach, called cooperative indexing, that provides a framework to achieve both scalability and full integration of IR and RDBMS technology. Prototype implementations of system-level extensions to support cooperative indexing were evaluated with a modified version of Oracle Rdb. Our experimental findings validate the cooperative indexing scheme and suggest alternatives to further improve performance.

Journal ArticleDOI
01 May 1995
TL;DR: This work continues in the TREC 2 environment, performing both routing and ad-hoc experiments; the ad-hoc work combines global similarities, giving an overall indication of how a document matches a query, with local similarities identifying a smaller part of the document that matches the query.
Abstract: The Smart information retrieval project emphasizes completely automatic approaches to the understanding and retrieval of large quantities of text. We continue our work in the TREC 2 environment, performing both routing and ad-hoc experiments. The ad-hoc work extends our investigations into combining global similarities, giving an overall indication of how a document matches a query, with local similarities identifying a smaller part of the document that matches the query. The performance of the ad-hoc runs is good, but it is clear we are not yet taking full advantage of the available local information. Our routing experiments use conventional relevance feedback approaches to routing, but with a much greater degree of query expansion than was previously done. The length of a query vector is increased by a factor of 5 to 10 by adding terms found in previously seen relevant documents. This approach improves effectiveness by 30–40% over the original query.

Journal ArticleDOI
TL;DR: In this paper, an approach to represent documents as structured entities is proposed, and by considering that the documents' information content can be interpreted differently according to the user's needs, a mechanism is introduced in an information retrieval system to dynamically control the retrieval performance according to the user's specifications.

Proceedings Article
01 Jan 1995
TL;DR: This paper describes work done as part of the TREC-4 benchmarking exercise by a team from Dublin City University, including work on improving the efficiency of standard SMART-like query processing by applying various thresholding processes to the postings lists of an inverted file.
Abstract: In this paper we describe work done as part of the TREC-4 benchmarking exercise by a team from Dublin City University. In TREC-4 we had three activities, as follows. In work on improving the efficiency of standard SMART-like query processing, we applied various thresholding processes to the postings lists of an inverted file and limited the number of document score accumulators available during query processing; the first run we submitted for evaluation in TREC-4 used our best set of thresholding and accumulator parameters. The second run we submitted is based upon a query expansion using terms from WordNet. Essentially, for each original query term we determine its level of specificity or abstraction; for broad terms we add more specific terms, for specific original terms we add broader ones, and for ones in between we add both broader and narrower terms. When the query is expanded we then delete all the original query terms in order to add to the judged pool documents that our expansion would find but that would not have been found by other retrievals. This was run DCU952. The third run we submitted was for the Spanish data. We ran the entire document corpus through a POS tagger and indexed documents (and queries) by a combination of the base forms of non-stopwords plus their POS class. Retrieval is performed using SMART with extra weights for query and document terms depending on their POS class.
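
A rough sketch of the efficiency devices named in the first activity: postings are processed in decreasing weight order, each list is cut off once contributions fall below a threshold, and the number of score accumulators is capped. All parameter values and the exact cutoff rule are placeholders rather than the DCU settings.

    def limited_accumulator_search(query_terms, postings, max_accumulators=1000,
                                   weight_threshold=0.1):
        """Approximate ranked retrieval with thresholded postings lists and a
        bounded accumulator table.

        postings: {term: [(doc_id, weight), ...]}, each list sorted by
                  decreasing weight
        """
        accumulators = {}
        for term in query_terms:
            for doc_id, weight in postings.get(term, []):
                if weight < weight_threshold:
                    break                                 # remaining postings are weaker
                if doc_id in accumulators:
                    accumulators[doc_id] += weight
                elif len(accumulators) < max_accumulators:
                    accumulators[doc_id] = weight         # open a new accumulator if room
        return sorted(accumulators.items(), key=lambda kv: kv[1], reverse=True)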

Proceedings Article
01 Jan 1995
TL;DR: A study is presented in which 50 searchers, of varying degrees of experience in information retrieval (IR), each performed searches on two TREC-4 adhoc interactive track topics, using a simple interface to the INQUERY retrieval engine, in order to study the relationships between the users' models and experience of IR.
Abstract: We present results of a study in which 50 searchers, of varying degrees of experience in information retrieval (IR), each performed searches on two TREC-4 adhoc interactive track topics, using a simple interface to the INQUERY retrieval engine. The foci of our study were: the relationships between the users' models and experience of IR, and their performance in the TREC-4 adhoc task while using a best-match IR system with relevance feedback; the understanding, use and utility of relevance feedback and ranking in interactive IR; and the evaluation of interactive IR.

Proceedings ArticleDOI
22 May 1995
TL;DR: An indexing technique based on Hidden Markov Models that dramatically improves the search time in a database of handwritten words and provides means for controlling the matching quality of the search process via a time-based budget is proposed.
Abstract: The emergence of the pen as the main interface device for personal digital assistants and pen-computers has made handwritten text, and more generally ink, a first-class object. As for any other type of data, the need for retrieval is a prevailing one. Retrieval of handwritten text is more difficult than that of conventional data since it is necessary to identify a handwritten word given slightly different variations in its shape. The current way of addressing this is by using handwriting recognition, which is prone to errors and limits the expressiveness of ink. Alternatively, one can retrieve from the database handwritten words that are similar to a query handwritten word using techniques borrowed from pattern and speech recognition. In particular, Hidden Markov Models (HMM) can be used as representatives of the handwritten words in the database. However, using HMM techniques to match the input against every item in the database (sequential searching) is unacceptably slow and does not scale up for large ink databases. In this paper, an indexing technique based on HMMs is proposed. The new index is a variation of the trie data structure that uses HMMs and a new search algorithm to provide approximate matching. Each node in the tree contains handwritten letters, where each letter is represented by an HMM. Branching in the trie is based on the ranking of matches given by the HMMs. The new search algorithm is parametrized so that it provides means for controlling the matching quality of the search process via a time-based budget. The index dramatically improves the search time in a database of handwritten words. Due to the variety of platforms at which this work is aimed, ranging from personal digital assistants to desktop computers, we implemented both main-memory and disk-based systems. The implementations are reported in this paper, along with performance results that show the practicality of the technique under a variety of conditions.

Journal ArticleDOI
30 Apr 1995
TL;DR: This paper presents extensions to the freeWAIS 4 5 indexer and server, which allow access to document structures using the original WAIS protocol, and presents a WWW-WAIS gateway specially tailored for usage with free WAIS-sf, which transforms filled-out HTML forms to the new query syntax.
Abstract: The original WAIS implementation by Thinking Machines et al. treats documents as uniform bags of terms. Since most documents exhibit some internal structure, it is desirable to provide the user with means to exploit this structure in his queries. In this paper, we present extensions to the freeWAIS 4 5 indexer and server, which allow access to document structures using the original WAIS protocol. Major extensions include: arbitrary document formats, search in individual structure elements, stemming and phonetic search, support of 8-bit character sets, numeric concepts and operators, and combination of Boolean and linear retrieval. We also present a WWW-WAIS gateway specially tailored for usage with freeWAIS-sf [1], which transforms filled-out HTML forms to the new query syntax.


Patent
14 Jun 1995
TL;DR: In this article, a query vector is formed from the combination of word vectors associated with the words in the query, and the query vector can be divided into several factor clusters to form factor vectors, then compared to the document vectors to determine the ranking of the documents within the factor cluster.
Abstract: A method and apparatus accesses relevant documents based on a query (230). A thesaurus of word vectors (242) is formed for the words in the corpus of documents (240). The word vectors represent global lexical co-occurrence patterns and relationships between word neighbors. Document vectors (246), which are formed from the combination of word vectors, are in the same multi-dimensional space as the word vectors. A singular value decomposition is used to reduce the dimensionality of the document vectors. A query vector (232) is formed from the combination of word vectors associated with the words in the query. The query vector and document vectors are compared to determine the relevant documents. The query vector can be divided into several factor clusters to form factor vectors. The factor vectors are then compared to the document vectors to determine the ranking (252) of the documents within the factor cluster.
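
In outline, the retrieval step amounts to something like the sketch below: build the query vector by combining word vectors and rank documents by cosine similarity. The random vectors are stand-ins; the patent derives them from lexical co-occurrence statistics and reduces dimensionality with an SVD, and the factor-cluster refinement is omitted here.

    import numpy as np

    def combine(words, word_vectors):
        """Combine word vectors into a single query or document vector."""
        vecs = [word_vectors[w] for w in words if w in word_vectors]
        return np.sum(vecs, axis=0)

    def rank_documents(query_words, documents, word_vectors):
        """Rank documents by cosine similarity to the combined query vector."""
        q = combine(query_words, word_vectors)
        scores = {}
        for doc_id, doc_words in documents.items():
            d = combine(doc_words, word_vectors)
            denom = np.linalg.norm(q) * np.linalg.norm(d)
            scores[doc_id] = float(q @ d / denom) if denom else 0.0
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    rng = np.random.default_rng(0)
    word_vectors = {w: rng.normal(size=20) for w in
                    ["court", "ruling", "appeal", "stock", "market", "crash"]}
    docs = {"legal": ["court", "appeal", "ruling"], "finance": ["stock", "market", "crash"]}
    print(rank_documents(["court", "ruling"], docs, word_vectors))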

Journal ArticleDOI
TL;DR: CASE-DB is a real-time, single-user, relational prototype DBMS that permits the specification of strict time constraints for relational algebra queries and controls the risk of overspending the time quota at each step using a risk control technique.
Abstract: CASE-DB is a real-time, single-user, relational prototype DBMS that permits the specification of strict time constraints for relational algebra queries. Given a time constrained nonaggregate relational algebra query and a "fragment chain" for each relation involved in the query, CASE-DB initially obtains a response to a modified version of the query and then uses an "iterative query evaluation" technique to successively improve and evaluate the modified version of the query. CASE-DB controls the risk of overspending the time quota at each step using a "risk control technique".

Patent
07 Jun 1995
TL;DR: An associative text search and retrieval system uses one or more front-end processors to interact with a network having one or multiple user terminals connected to allow a user to provide information to the system and receive information from the system.
Abstract: An associative text search and retrieval system uses one or more front end processors to interact with a network having one or more user terminals connected thereto to allow a user to provide information to the system and receive information from the system. The system also includes storage for a plurality of text documents, and at least one processor, coupled to the front end processors and the document storage. The processor(s) search the text documents according to a search request provided by the user and provide to the front end processor a predetermined number of retrieved documents containing at least one term of the search request. The retrieved documents have higher ranks than documents not provided to the front end processor. The system includes a display for displaying a window of text of one of the retrieved documents, the window having the highest score of all possible windows of the retrieved document, the score varying according to the number of search terms in the window and the number of search terms in the window preceded by a different search term in the window.
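
A simplified version of the window-scoring rule in the last sentence: slide a fixed-width window over the document's tokens and prefer windows containing many search terms, with a bonus whenever a search term is immediately preceded by a different search term. The window width and the size of the bonus are made-up parameters, not the patent's.

    def window_score(window, search_terms):
        hits = sum(1 for t in window if t in search_terms)
        preceded = sum(1 for prev, t in zip(window, window[1:])
                       if t in search_terms and prev in search_terms and prev != t)
        return hits + preceded      # bonus for terms preceded by a different search term

    def best_window(tokens, search_terms, width=20):
        starts = range(max(1, len(tokens) - width + 1))
        return max((tokens[i:i + width] for i in starts),
                   key=lambda w: window_score(w, search_terms))

    tokens = "old systems rank slowly but new systems rank documents fast".split()
    print(best_window(tokens, {"rank", "documents"}, width=3))
    # -> ['systems', 'rank', 'documents']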


Proceedings Article
01 Nov 1995
TL;DR: NIST's Prise engine has been modified to handle multi-word phrases, differential term weighting schemes, automatic query expansion, index partitioning, and rank merging, as well as to deal with complex documents, as discussed by the authors.
Abstract: In this paper we report on the joint GE/NYU natural language information retrieval project as related to TREC-4. The main thrust of this project is to use natural language processing techniques to enhance the effectiveness of full-text document retrieval. During the course of the four TREC conferences, we have built a prototype IR system designed around a statistical full-text indexing and search backbone provided by NIST's Prise engine. The original Prise has been modified to allow handling of multi-word phrases, differential term weighting schemes, automatic query expansion, index partitioning and rank merging, as well as dealing with complex documents. Natural language processing is used to preprocess the documents in order to extract content-carrying terms, discover inter-term dependencies, build a conceptual hierarchy specific to the database domain, and process users' natural language requests into effective search queries. The overall architecture of the system is essentially the same as in TREC-3, as our efforts this year were directed at optimizing the performance of all components. A notable exception is the new massive query expansion module used in the routing experiments, which replaces a prototype extension used in the TREC-3 system. On the other hand, it has to be noted that the character and the level of difficulty of TREC queries has changed quite significantly since last year's evaluation. The new TREC-4 ad-hoc queries are far shorter, less focused, and they have the flavor of information requests rather than the search directives typical of earlier TRECs. This makes building good search queries a more sensitive task than before. We thus decided to introduce only a minimum number of changes to our indexing and search processes, and even rolled back some of the TREC-3 extensions which dealt with longer and somewhat redundant queries. Overall, our system performed quite well, and our position with respect to the best systems has improved steadily since the beginning of TREC. It should be noted that the most significant gain in performance seems to occur in precision near the top of the ranking, at 5, 10, 15, and 20 documents. Indeed, our unofficial manual runs performed after the TREC-4 conference show superior results in these categories, topping by a large margin the best manual scores by any system in the official evaluation.

Proceedings Article
15 Jul 1995
TL;DR: The degree of dynamic ranking induced by a simple genetic algorithm is highly correlated with the degree of static ranking that is inherent in the function, especially during the initial generations of search.
Abstract: We examine the role of hyperplane ranking during genetic search by developing a metric for measuring the degree of ranking that exists with respect to static hyperplane averages taken directly from the function, as well as the dynamic ranking of hyperplanes during genetic search. The metric applied to static rankings subsumes the concept of deception but the metric provides a more precise characterization of a function. We show that the degree of dynamic ranking induced by a simple genetic algorithm is highly correlated with the degree of static ranking that is inherent in the function, especially during the initial generations of search.


Patent
Hinrich Schuetze1
14 Jun 1995
TL;DR: In this article, a thesaurus of word vectors (242) is formed for the words in the corpus of documents (240) to represent global lexical co-occurrence patterns and relationships between word neighbors.
Abstract: (of EP0687987) A method and apparatus accesses relevant documents based on a query (230). A thesaurus of word vectors (242) is formed for the words in the corpus of documents (240). The word vectors represent global lexical co-occurrence patterns and relationships between word neighbors. Document vectors (246), which are formed from the combination of word vectors, are in the same multi-dimensional space as the word vectors. A singular value decomposition is used to reduce the dimensionality of the document vectors. A query vector (232) is formed from the combination of word vectors associated with the words in the query. The query vector and document vectors are compared to determine the relevant documents. The query vector can be divided into several factor clusters to form factor vectors. The factor vectors are then compared to the document vectors to determine the ranking (252) of the documents within the factor cluster.

27 Mar 1995
TL;DR: The behavioral aspects of various operators for AND and OR operations are analyzed, and important properties in terms of retrieval effectiveness are addressed, suggesting that the two properties of positive compensation and equal importance might help retrieval effectiveness.
Abstract: Many extended Boolean models, such as the fuzzy-set and p-norm models, have been proposed in the information retrieval literature to support a ranking facility for the Boolean retrieval system. They can be explained within the same framework, and each extended Boolean model is characterized by evaluation formulas for AND and OR operations. A variety of operators have also been developed in the area of fuzzy set theory for AND and OR operations, and can be used in extended Boolean models. In this paper we analyze the behavioral aspects of various operators for AND and OR operations, and address important properties in terms of retrieval effectiveness. Our analyses show that four properties, namely single operand dependency, negative compensation, double operand dependency and unequal importance, decrease retrieval effectiveness in some circumstances. This suggests that the two properties of positive compensation and equal importance might help retrieval effectiveness. We also provide experimental results that coincide with our analyses.
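
As one concrete instance of the evaluation formulas being compared, the p-norm model mentioned in the abstract defines AND and OR as below; operand weights are omitted for brevity.

    def pnorm_and(scores, p=2.0):
        """p-norm AND: 1 minus the p-mean of the distances from full satisfaction."""
        return 1.0 - (sum((1.0 - s) ** p for s in scores) / len(scores)) ** (1.0 / p)

    def pnorm_or(scores, p=2.0):
        """p-norm OR: the p-mean of the satisfaction degrees themselves."""
        return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)

    # As p grows the operators approach strict min/max Boolean behaviour;
    # at p = 1 both reduce to a plain average.
    print(pnorm_and([0.9, 0.2]), pnorm_or([0.9, 0.2]))   # ≈ 0.43 and ≈ 0.65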