
Showing papers on "Ranking (information retrieval)" published in 1988


Journal Article
01 May 1988
TL;DR: A series of experiments were run using the Cranfield test collection to discover techniques to select terms for lists of suggested terms gathered from feedback, nearest neighbors, and term variants of original query terms that would be effective for further retrieval.
Abstract: In an era of online retrieval, it is appropriate to offer guidance to users wishing to improve their initial queries. One form of such guidance could be short lists of suggested terms gathered from feedback, nearest neighbors, and term variants of original query terms. To verify this approach, a series of experiments were run using the Cranfield test collection to discover techniques to select terms for these lists that would be effective for further retrieval. The results show that significant improvement can be expected from this approach to query expansion.

206 citations


Journal Article
TL;DR: This article describes an effort to study whether the order of document presentation to judges influences the relevance scores assigned to those documents.
Abstract: Studies concerned with the evaluation of information systems have typically relied on judgments of relevance as the fundamental measure in determining system performance. In most cases, subjects are asked to assign a relevance score using some category rating scale (1–4, 1–11, or simply relevant/non-relevant) to each document in a set retrieved in response to some information need or query. While the extensive studies of relevance conducted in the 1960s indicated that relevance judgments are influenced by a range of variables, little attention has been paid to the possible effects of the order in which the stimuli are presented to judges. This effect of “stimulus order” has been found to exist in measuring variables in other fields (Stevens 1975, Gescheider 1985). Questioning possible “presentation order effects” is particularly appropriate in that systems are being developed and evaluated in information science which present documents in some systematic way (e.g., with the documents considered by the system to be most relevant presented first). This article describes an effort to study whether the order of document presentation to judges influences the relevance scores assigned to those documents. A query and set of documents with relevance judgments were available from a previous study. Subjects were randomly assigned one of two orders (one ranked high to low, the other low to high) of fifteen document descriptions. They were then asked to assign a score to each document description to match their judgment of relevance in relation to the stated information query. Both a category rating (1–7) and an open-ended magnitude estimation scaling procedure were tested, and it was found that the judgments were influenced by the order of document presentation. © 1988 John Wiley & Sons, Inc.

118 citations


Journal Article
01 Sep 1988
TL;DR: A number of algorithmic tools that have been found useful in the construction of parallel algorithms are described; among these are prefix computation, ranking, Euler tours, ear decomposition, and matrix calculations.
Abstract: We have described a number of algorithmic tools that have been found useful in the construction of parallel algorithms; among these are prefix computation, ranking, Euler tours, ear decomposition, and matrix calculations. We have also described some of the applications of these tools, and listed many other applications. These algorithms seem likely to be useful not only in their own right, but also as examples of ways to break up other problems into parts suitable for parallel solution.

99 citations
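
The "ranking" tool named in this abstract is not spelled out in the listing. Assuming it refers to list ranking (computing each node's distance to the end of a linked list), the following is a minimal sketch of the pointer-jumping idea, simulated sequentially. The function name `list_rank` and the convention that the tail points to itself are assumptions made for illustration, not details from the paper.

```python
def list_rank(succ):
    """Return, for each node, its distance to the tail of a linked list.

    succ[i] is the index of the node following i; the tail points to itself
    (an assumed convention). Each round doubles the distance covered by every
    pointer, so about log2(n) rounds suffice. On a parallel machine, the two
    list comprehensions inside the loop would each be one parallel step.
    """
    n = len(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    for _ in range(max(1, n).bit_length()):
        # Read only from the previous round's arrays, as a synchronous
        # parallel step would.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
    return rank


if __name__ == "__main__":
    # List order 0 -> 1 -> 2 -> 3, with node 3 as the tail.
    print(list_rank([1, 2, 3, 3]))
```

Building the new arrays from the old ones, rather than updating in place, mirrors the synchronous reads and writes that the parallel formulation relies on.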


Journal Article
TL;DR: In this paper techniques are described for implementing probabilistic ranking strategies with sequential and bit-sliced signature files and the limitations of these implementations with regard to their effectiveness are pointed out.
Abstract: Signature files provide an efficient access method for text in documents, but retrieval is usually limited to finding documents that contain a specified Boolean pattern of words. Effective retrieval requires that documents with similar meanings be found through a process of plausible inference. The simplest way of implementing this retrieval process is to rank documents in order of their probability of relevance. In this paper techniques are described for implementing probabilistic ranking strategies with sequential and bit-sliced signature files and the limitations of these implementations with regard to their effectiveness are pointed out. A detailed comparison is made between signature-based ranking techniques and ranking using term-based document representatives and inverted files. The comparison shows that term-based representations are at least competitive (in terms of efficiency) with signature files and, in some situations, superior.

64 citations
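
To make the idea of ranking over signature files concrete, here is a minimal sketch under stated assumptions: documents are encoded with superimposed coding (each term sets a few hashed bits in a fixed-width signature), and documents are scored by how many query terms appear to be present. The signature width, bits per term, hash choice, and the simple match-count score are all illustrative assumptions; the paper's probabilistic weighting is not reproduced here.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed for illustration)
BITS_PER_TERM = 3   # bits set per term (assumed for illustration)


def term_signature(term):
    """Map a term to a bit pattern with up to BITS_PER_TERM bits set."""
    sig = 0
    for k in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{k}:{term}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def document_signature(terms):
    """Superimpose (bitwise OR) the signatures of all terms in a document."""
    sig = 0
    for term in terms:
        sig |= term_signature(term)
    return sig


def rank_by_signature(query_terms, doc_signatures):
    """Rank documents by the number of query terms whose bits are all set.

    A match can be a false positive, because bits set by other terms may
    cover a query term's pattern; a true occurrence is never missed.
    """
    q_sigs = [term_signature(t) for t in query_terms]
    scored = []
    for doc_id, d_sig in doc_signatures.items():
        score = sum(1 for q in q_sigs if (q & d_sig) == q)
        scored.append((score, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, score) for score, doc_id in scored]


if __name__ == "__main__":
    docs = {
        "d1": document_signature(["signature", "file", "ranking"]),
        "d2": document_signature(["inverted", "file"]),
    }
    print(rank_by_signature(["signature", "ranking"], docs))
```

The sketch scans every signature sequentially. A bit-sliced layout would instead store one bit position per slice, so that only the slices for the query's bits need to be read.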


Journal Article
TL;DR: In this article, the authors present an MCDM method, known as Pragma, which provides the ranking frequencies of feasible actions based on the comparison of the partial profiles of each alternative with reference to all the possible pairs of criteria considered.

44 citations


Proceedings Article
01 May 1988
TL;DR: This paper describes some aspects of a project with the aim of developing a user-friendly interface to a classical Information Retrieval (IR) System in order to improve the effectiveness of retrieval.
Abstract: This paper describes some aspects of a project with the aim of developing a user-friendly interface to a classical Information Retrieval (IR) System in order to improve the effectiveness of retrieval. The character by character approach to IR has been abandoned in favor of an approach based on the meaning of both the queries and the texts containing the information to be sought. The concept space, locally derived from a thesaurus, is used to represent a query as well as documents retrieved in atomic concept units. Dependencies between the search terms are taken into account. The meanings of the query and the retrieved documents (results of Elementary Logical Conjuncts (ELCs)) are compared. The ranking method on the semantical level is used in connection with existing data of a classical IR system. The user enters queries without using complex Boolean expressions.

43 citations


Journal Article
TL;DR: It is proposed that an appropriate metric for gauging the performances of information retrieval systems is a measure of the (relative) total relevance that a user can obtain from a set of documents sequentially scanned and evaluated in an information retrieval environment.
Abstract: The article presents a model based on the notion of the total relevance of a set of documents. The concept of a total relevance function is subsequently derived from the notion of cumulated relevance implied in the traditional summation of relevance ratings over the documents in a collection or in retrieved sets of documents. The model is intended to make explicit the perceptual underpinnings of relevance assessments while allowing for the consideration of interdocument dependencies as perceived by the user. Within this framework, it is proposed that an appropriate metric for gauging the performances of information retrieval systems is a measure of the (relative) total relevance that a user can obtain from a set of documents sequentially scanned and evaluated in an information retrieval environment. Some implications of the model are noted.

29 citations


Journal Article
TL;DR: Implementation and performance details about SMART and SIRE illustrate how commercially available retrieval systems can be improved to use the fruits of these research efforts.
Abstract: During the last decade, studies with the SMART and SIRE systems have pioneered new techniques for improving the effectiveness of Boolean retrieval. Extended Boolean logic, automatic Boolean query construction, and Boolean feedback yield significant improvements according to a variety of experiments with SMART. Ranking of the output of Boolean queries has been shown to be of value with SIRE. Recent efforts have aimed at adapting SMART to allow large scale testing of these advanced retrieval methods. SIRE has been enhanced to include the p-norm scheme for extended Boolean query processing. Implementation and performance details about SMART and SIRE illustrate how commercially available retrieval systems can be improved to use the fruits of these research efforts.

27 citations
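
The p-norm scheme mentioned in this abstract can be sketched as follows. This uses the commonly stated form of the p-norm model for extended Boolean retrieval with equally weighted query terms; the function names are hypothetical, and nothing here reflects how SIRE or SMART actually implement it.

```python
def pnorm_or(weights, p):
    """p-norm similarity of a document to an OR query.

    weights: the document's term weights in [0, 1], one per query term.
    p >= 1: p = 1 gives a plain average; large p approaches a strict
    Boolean OR (the maximum weight).
    """
    n = len(weights)
    return (sum(w ** p for w in weights) / n) ** (1.0 / p)


def pnorm_and(weights, p):
    """p-norm similarity of a document to an AND query.

    p = 1 gives a plain average; large p approaches a strict Boolean AND
    (the minimum weight).
    """
    n = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / n) ** (1.0 / p)


if __name__ == "__main__":
    doc = [1.0, 0.0]  # document contains the first query term only
    for p in (1, 2, 50):
        print(p, round(pnorm_or(doc, p), 4), round(pnorm_and(doc, p), 4))
```

Because both functions return graded scores rather than true/false, the output of a Boolean query can be ranked, with `p` controlling how strictly the operators are interpreted.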


Journal Article
TL;DR: The presented approach has been developed by extending well-known probabilistic output ranking methods that are applicable in retrieval systems in which document representations as well as search request formulations are simply sets of index terms.
Abstract: Current operational information retrieval systems based on Boolean searching could be radically improved through the incorporation of a weighting mechanism for ranking output documents. However, a number of previous attempts to refine conventional Boolean retrieval systems along these lines have not been fully successful because of their inherent inconsistencies and ambiguities. A detailed account of the research aimed at obtaining a more rigorous methodology is given in this article. The presented approach has been developed by extending well-known probabilistic output ranking methods that are applicable in retrieval systems in which document representations as well as search request formulations are simply sets of index terms. A series of experiments, carried out in recent years to verify some of these methods, have particularly demonstrated the value of a systematic statistical use of relevance feedback information. It is therefore expected that the application of the extended probabilistic document ranking methodology in conventional Boolean systems will also prove to be useful, and that considerable improvements in retrieval performance of these systems will be obtained. A theoretical framework used to derive the proposed output ranking scheme is described in detail. A simple illustrative example is included, followed by a thorough discussion of the suggested approach.

21 citations
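
The abstract builds on probabilistic output ranking with relevance feedback but does not give formulas. As a rough illustration of that family of methods, the sketch below uses the well-known Robertson/Sparck Jones relevance weight with the usual 0.5 correction and ranks documents, represented as sets of index terms, by the sum of their matching term weights. This is a swapped-in standard technique, not the extended methodology the article proposes; the function names are hypothetical.

```python
import math


def rsj_weight(r, R, n, N):
    """Robertson/Sparck Jones relevance weight with 0.5 smoothing.

    r: relevant documents (from feedback) containing the term
    R: relevant documents known from feedback
    n: documents in the collection containing the term
    N: documents in the collection
    """
    odds_in_relevant = (r + 0.5) / (R - r + 0.5)
    odds_in_nonrelevant = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(odds_in_relevant / odds_in_nonrelevant)


def rank_documents(docs, weights):
    """Rank documents by the summed weights of the query terms they contain.

    docs: {doc_id: set of index terms}; weights: {query term: weight}.
    """
    scores = {
        doc_id: sum(weights.get(t, 0.0) for t in terms)
        for doc_id, terms in docs.items()
    }
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Hypothetical feedback counts: 10 judged-relevant documents out of 1000.
    weights = {
        "ranking": rsj_weight(r=8, R=10, n=20, N=1000),
        "boolean": rsj_weight(r=1, R=10, n=20, N=1000),
    }
    docs = {"a": {"ranking", "boolean"}, "b": {"boolean"}, "c": {"index"}}
    print(rank_documents(docs, weights))
```

With no feedback at all (`r = 0`, `R = 0`), the weight reduces to an inverse-document-frequency-like quantity, which is why the same scoring can be used before and after the user judges any documents.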


Journal Article
TL;DR: A search strategy is proposed for locating documents which are likely to be relevant to a given query and the implementation of an improved algorithm for the identification of the closest document set is presented with emphasis on computational efficiency.
Abstract: In this paper, we emphasize the need to model the inherent uncertainty associated with the information retrieval process. Within this context, a search strategy is proposed for locating documents which are likely to be relevant to a given query. A notion of closeness between document(s) and query is introduced and the implementation of an improved algorithm for the identification of the closest document set is presented with emphasis on computational efficiency.

13 citations


Journal Article
TL;DR: This paper presents a system (OFFICER) for the retrieval of office documents that is based on a model of plausible inference, which allows the specification of uncertain queries and combines uncertainties in the matching of queries and documents to produce an overall ranking for the documents.
Abstract: Office information systems are being used to describe and store documents with complex structure and multimedia content. Users of these systems can potentially make very complex specifications of the structure, layout and content of the documents they wish to retrieve. Although these complex queries could be more effective in identifying relevant documents, it is important that a well-defined model of retrieval is used, both as the basis for the retrieval strategies and the user interface. In this paper, we present a system (OFFICER) for the retrieval of office documents that is based on a model of plausible inference. The OFFICER query interface allows the specification of uncertain queries and combines uncertainties in the matching of queries and documents to produce an overall ranking for the documents.

Journal Article
TL;DR: This paper presents G.G.T.T., software that integrates several modules depending on the application: family formation, code optimization, ranking function, process-planning generation, cell workshop organization, and data querying and analysis.

01 Jan 1988
TL;DR: The closure under boolean operations of the classes in CH is studied and a characterization of the hierarchy in terms of nondeterministic and probabilistic machines with access to oracles is proved.
Abstract: The polynomial time counting hierarchy (CH) was introduced by Wagner in [16]. In this article we study certain properties of the hierarchy, continuing the work started in the mentioned paper. We study the closure under boolean operations of the classes in CH and prove a characterization of the hierarchy in terms of nondeterministic and probabilistic machines with access to oracles. We also translate the concept of lowness to the classes in CH, obtaining a way to characterize the sets that are low for PP using ranking functions. We find some other connections between ranking functions and low sets for PP, showing that if a class in the hierarchy is #P-rankable then it is low for PP.


Journal Article
TL;DR: This paper conducted a survey of superintendents in rural school districts in southeastern Nebraska to identify the most critical issues in managing and running small rural school systems, and found that the problem of finances was the most important issue in managing small rural schools.
Abstract: This paper describes the results of a survey of superintendents, who were asked to identify the most critical issues in managing and running small rural school districts. Thirty superintendents in southeastern Nebraska participated in the study. The problem of finances was the superintendents' greatest worry. They were also concerned about regional economic conditions, state regulations, salaries, and providing an adequate variety of classes. Superintendents feel that adequate financing of schools is based upon a positive perception of schools by community members, and that school boards are primarily interested in school curriculum. They are positive in their assessment of the overall quality of teachers they employ.

Book Chapter
Pierre Michaud
01 Jan 1988
TL;DR: In this article, the author considers the case where voters individually give their preferences on n candidates in the form of a (linear) ranking of these candidates; from the set of these rankings, a collective ranking must be deduced by means of an aggregation rule.
Abstract: The problem of collective choice from individual opinions is one of the central problems of decision theory. The most classic case is certainly the one where voters individually give their preferences on n candidates in the form of a (linear) ranking of these candidates. From the set of these different rankings, one has to deduce a collective (linear) ranking by means of an aggregation rule.
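
The abstract does not say which aggregation rule is studied. Purely as an illustration of what "deducing a collective ranking from individual linear rankings" means, the sketch below uses the Borda count, a simple and well-known rule chosen here as an assumption; it is not presented as the author's rule. The function name `borda_aggregate` is hypothetical.

```python
def borda_aggregate(rankings):
    """Aggregate linear rankings into one collective ranking (Borda count).

    rankings: list of lists, each ordering the same candidates from most to
    least preferred. A candidate in position i of an n-candidate ranking
    earns n - 1 - i points. Ties in total points are broken alphabetically
    (an arbitrary choice made for this sketch).
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return sorted(scores, key=lambda c: (-scores[c], c))


if __name__ == "__main__":
    voters = [
        ["a", "b", "c"],
        ["a", "c", "b"],
        ["b", "a", "c"],
    ]
    print(borda_aggregate(voters))
```

Other rules, such as those based on pairwise majority comparisons, can produce a different collective ranking from the same input, which is part of what makes the choice of aggregation rule a research question.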


Journal Article
TL;DR: A ranking function for mapping a set of non-regular trees to a set of positive integers is derived and, as a result, ranking and unranking algorithms for non-regular trees can be readily constructed.
Abstract: This paper is concerned with rooted, ordered and non-regular trees. A tree is said to be non-regular if its internal nodes need not all have the same number of sons. A ranking function for mapping a set of non-regular trees to a set of positive integers is derived. As a result, ranking and unranking algorithms for non-regular trees can be readily constructed. The unranking algorithm, of course, can be used for generating random non-regular trees.

Posted Content
TL;DR: In this article, a dynamic version of the principle of ranking and selection of projects by their benefit cost ratios (BCRs) is proposed, which permits a very simple and practical technique for adaptive sequential choice to provide incentives for self help, cost control, sharing and maintenance.
Abstract: This paper proposes a dynamic version of the principle of ranking and selection of projects by their benefit cost ratios (BCRs). It permits a very simple and practical technique for adaptive sequential choice to provide incentives for self help, cost control, sharing and maintenance. We discuss optimality by using Pontryagin's "maximum principle". This leads to a new criterion for judging the impartiality and optimality of any scheduling, queuing, benefit or work sharing scheme. Applications to many kinds of business enterprises, governmental or charitable organizations are indicated. For some applications we suggest the use of a new kind of coupon "money" to facilitate accounting for indivisible benefits and costs, and as a medium of exchange and store of "value".
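
For orientation, the static principle that this paper extends can be sketched as a greedy selection: rank projects by benefit cost ratio and fund them in that order while the budget allows. This is only the classical baseline under assumed inputs (a fixed budget and known benefits and costs); the paper's dynamic, adaptive version is not reproduced here, and `select_by_bcr` is a hypothetical name.

```python
def select_by_bcr(projects, budget):
    """Select projects in descending order of benefit cost ratio.

    projects: list of (name, benefit, cost) tuples with cost > 0.
    budget: total amount available.
    A project is skipped if it no longer fits in the remaining budget.
    """
    ranked = sorted(projects, key=lambda p: p[1] / p[2], reverse=True)
    chosen = []
    remaining = budget
    for name, benefit, cost in ranked:
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen


if __name__ == "__main__":
    # Hypothetical projects: (name, benefit, cost).
    projects = [("A", 10, 5), ("B", 6, 2), ("C", 4, 4)]
    print(select_by_bcr(projects, budget=7))
```

Greedy selection by ratio is simple to apply but is not guaranteed to maximize total benefit when projects are indivisible, which is one reason more careful treatments of optimality are of interest.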

Journal Article
Yahya Badran
TL;DR: In this article, a procedure for preference ranking of discrete multiattribute instances is presented, based on the observation that a difficult question can be answered in terms of answers to a sequence of easier ones.

Book Chapter
01 Jan 1988
TL;DR: This paper summarises an extended research programme to investigate the use of fragment-based measures of inter-molecular similarity in chemical information systems, with particular reference to structure-property correlation.
Abstract: This paper summarises an extended research programme to investigate the use of fragment-based measures of inter-molecular similarity in chemical information systems, with particular reference to structure-property correlation. Comparative studies are reported of structural similarity measures and of clustering methods for chemical structure databases. The methods are most appropriate when very sparse data matrices are available; in such cases, a very fast nearest neighbour searching algorithm can be used for the calculation of the requisite similarities.
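
A fragment-based similarity search of the kind summarised here can be sketched as follows, assuming each molecule is represented by the set of structural fragments it contains and that similarity is measured with the Tanimoto coefficient. The abstract does not name the coefficient, so that choice, the function names, and the fragment labels in the example are all assumptions for illustration. The fast nearest neighbour algorithm mentioned in the abstract is not reproduced; this sketch simply compares the target with every database entry.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fragment sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbours(target, database, k=3):
    """Return the k database molecules most similar to the target.

    target: set of fragments; database: {molecule name: set of fragments}.
    Results are (name, similarity) pairs in descending order of similarity,
    with ties broken by name.
    """
    scored = [(tanimoto(target, frags), name) for name, frags in database.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, sim) for sim, name in scored[:k]]


if __name__ == "__main__":
    # Hypothetical fragment labels, for illustration only.
    database = {
        "m1": {"OH", "CH3"},
        "m2": {"NH2"},
        "m3": {"OH", "CH3", "C=O"},
    }
    print(nearest_neighbours({"OH", "CH3", "C=O"}, database, k=2))
```

In structure-property correlation, the properties of the top-ranked neighbours would then be used to estimate the property of the target molecule.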

Journal Article
TL;DR: With this method not only a separation of relevant results from irrelevant ones is achieved, but also a ranking of the relevant results with respect to the degree of similarity to the search query.


01 Jan 1988
TL;DR: Research indicates that a relevance-based ranking scheme of this type could readily be incorporated into conventional retrieval systems without altering their underlying fundamental principles; as a result, the performance of these systems may be significantly improved at an acceptable cost.
Abstract: This paper reports on research aimed at developing practical methods for improving the performance of conventional Boolean information retrieval systems. More specifically, the objective of this research is to incorporate into these systems a mechanism for ranking the documents of a collection in descending order of their probabilities of usefulness to the user. There are several reasons why a ranking mechanism of this type may be expected to provide retrieval results superior to those of traditional Boolean search techniques which normally do not furnish users with any indication of document relevance. In particular, the so-called output overload problem, which occurs when the size of the document set retrieved in response to a given query is unmanageable, could practically be eliminated since the ranking information would assist the searcher in deciding when to end an examination of the system's output with the confidence of identifying most of the useful items that have been retrieved. Moreover, since it would no longer be necessary to inspect all of the output documents, a user query could then be constructed broad enough to allow more relevant items to be retrieved. Accordingly, such a system enhancement may result in an improvement in retrieval effectiveness, especially in an increase in recall. This paper presents and illustrates a theoretical framework, based on the Probability Ranking Principle, for implementing this enhancement. Research indicates that a relevance-based ranking scheme of this type could readily be incorporated into conventional retrieval systems without altering their underlying fundamental principles; some additional software in the form of a front-end is all that would be needed. As a result, the performance of these systems may be significantly improved at an acceptable cost.
A great deal of research on information retrieval has been motivated by the recognition that clues for estimating the degrees of usefulness, or relevance, of documents in a collection are rarely complete. More specifically, it is widely recognized that predicting document relevance usually involves uncertainties which are inherently probabilistic, and as a result retrieval systems are generally unable to precisely determine the extent to which a given document will satisfy the user's needs. Thus, designing retrieval systems which are capable of identifying all and only relevant documents does not seem feasible under available theoretical models. However, it is often suggested that they should be able to present the retrieved documents in descending order of their estimated


Book
01 Jan 1988
TL;DR: This book contains a KWIC concordance, a word index, an alphabetical word list, a reverse word list, and ranking frequency lists (a general list and one for each individual romance).
Abstract: Contents: Concordance in KWIC - Word Index - Alphabetical Word List - Reverse Word List - Ranking Frequency Lists (General Ranking Frequency List and Ranking Frequency List to each individual romance).

01 Jan 1988
TL;DR: A methodology for the design of document retrieval systems is presented and a composite retrieval model is proposed to process a user's information request in a weighted Phrase-Oriented Fixed-Level Expression (POFLE), which may apply more than Boolean operators.
Abstract: A methodology for the design of document retrieval systems is presented. First, a composite index term weighting model is developed based on term frequency statistics, including document frequency, relative frequency within document and relative frequency within collection, which can be adjusted by selecting various coefficients to fit into different indexing environments. Then, a composite retrieval model is proposed to process a user's information request in a weighted Phrase-Oriented Fixed-Level Expression (POFLE), which may apply more than Boolean operators, through two phases. The first phase searches for documents that are topically relevant to the information request by means of a descriptor matching mechanism, which incorporates a partial matching facility based on a structurally-restricted relationship imposed by the indexing model and is more general than the matching functions of the traditional Boolean model and vector space model. The second phase ranks these topically relevant documents, by means of two types of heuristic-based selection rules and a knowledge-based evaluation function, in descending order of a preference score which predicts the combined effect of user preference for quality, recency, fitness and reachability of documents.

Journal Article
TL;DR: A semiautomated user driven method for document mapping is described in this paper and should decrease the amount of time required to validate the information content of a query.
Abstract: An information retrieval system should provide references to the set of documents the user must evaluate in order to satisfy his/her information requirements. A major concern in this evaluation process is whether or not a document meets the user's needs. In many document retrieval systems there is no relevant information regarding the content of the documents. This makes it very difficult to evaluate if a document is relevant to the user's initial query. This suggests the need for a method to compare the documents on a word by word basis. Fully automated methods are too complex and difficult to generalize upon. A semiautomated user driven method for document mapping is described in this paper. It should decrease the amount of time required to validate the information content of a query.

01 Jan 1988
TL;DR: It is shown that a class of languages C is rankable in deterministic polynomial time if P = #P, where C is any of several classes of simple languages, including languages accepted by logtime-bounded nondeterministic Turing machines, by (uniform) families of unbounded fan-in circuits of constant depth and polynomial size, by two-way deterministic finite automata, and by multihead deterministic finite automata.
Abstract: It is shown that ranking languages accepted by one-way unambiguous auxiliary pushdown automata are in NC^(2). Negative results about ranking for several classes of simple languages are proved. It is shown that C is rankable in deterministic polynomial time if P = #P, where C is any of the following classes of languages: (1) languages accepted by logtime-bounded nondeterministic Turing machines; (2) languages accepted by (uniform) families of unbounded fan-in circuits of constant depth and polynomial size; (3) languages accepted by two-way deterministic finite automata; (4) languages accepted by multihead deterministic finite automata; (5) languages accepted by one-way nondeterministic logspace-bounded Turing machines; and (6) finitely ambiguous linear context-free languages.