scispace - formally typeset

Showing papers on "Ranking (information retrieval)" published in 1990


Proceedings ArticleDOI
01 May 1990
TL;DR: Experimental results show that when there is a single query object, searching in parameter space can be faster than searching in native space, if the data and query objects are large enough, and if sufficient redundancy is used for the query representation.
Abstract: Spatial queries can be evaluated in native space or in a parameter space. In the latter case, data objects are transformed into points and query objects are transformed into search regions. The requirement for different data and query representations may prevent the use of parameter-space searching in some applications. Native-space and parameter-space searching are compared in the context of a z order-based spatial access method. Experimental results show that when there is a single query object, searching in parameter space can be faster than searching in native space, if the data and query objects are large enough, and if sufficient redundancy is used for the query representation. The result is, however, less accurate than the native space result. When there are multiple query objects, native-space searching is better initially, but as the number of query objects increases, parameter space searching with low redundancy is superior. Native-space searching is much more accurate for multiple-object queries.

101 citations


Proceedings ArticleDOI
S. Chaudhuri
05 Feb 1990
TL;DR: The aim of this work is to capture this iterative process by extending the query model by defining extended queries, which express additional constraints on the answer set and designate some of the conditions in the relational query as flexible.
Abstract: The rigidity and limited expressiveness of relational queries often require that a query be iteratively modified. An initial query is posed, and once it is discovered that the answer does not meet the additional constraints, which are not expressed in the relational query, it is necessary to modify the query in a way such that those constraints are satisfied. The aim of this work is to capture this iterative process by extending the query model. Extended queries, which express additional constraints on the answer set and designate some of the conditions in the relational query as flexible, are defined. The query modification operators modify flexible constraints to satisfy an extended query. The query modification operation, generalization, is described. The conditions under which generalization is applicable are identified. Rules of generalization are proposed, and an algorithm for picking a minimal generalization is suggested.

79 citations


Journal ArticleDOI
TL;DR: A component theory of information retrieval using single content terms as components for queries and documents was reviewed and tested experimentally; with full feedback it performed substantially better than Croft's model because of the highly specific nature of document-focused feedback.
Abstract: A component theory of information retrieval using single content terms as components for queries and documents was reviewed and experimented with. The theory has the advantages of being able to (1) bootstrap itself, that is, define initial term weights naturally based on the fact that items are self-relevant; (2) make use of within-item term frequencies; (3) account for query-focused and document-focused indexing and retrieval strategies cooperatively; and (4) allow for component-specific feedback if such information is available. Retrieval results with four collections support the effectiveness of the first three aspects, except for predictive retrieval. At the initial indexing stage, the retrieval theory performed much more consistently across collections than Croft's model and provided results comparable to Salton's tf*idf approach. An inverse collection term frequency (ICTF) formula was also tested that performed much better than the inverse document frequency (IDF). With full feedback retrospective retrieval, the component theory performed substantially better than Croft's, because of the highly specific nature of document-focused feedback. Repetitive retrieval results with partial relevance feedback mirrored those for the retrospective. However, for the important case of predictive retrieval using residual ranking, results were not unequivocal.

64 citations


Journal ArticleDOI
TL;DR: This study is based on the notions of user preference and an acceptable ranking strategy that enables us to adopt a gradient descent algorithm to formulate the query vector by an inductive process and has the added advantages that it is applicable to both nonbinary document representation and a user preference relation inducing more than two classes.
Abstract: The subject of query formulation is analyzed within the framework of adaptive linear models. Our study is based on the notions of user preference and an acceptable ranking strategy. Such an approach enables us to adopt a gradient descent algorithm to formulate the query vector by an inductive process. We also present a critical analysis of the existing relevance feedback and the probabilistic approaches. It is shown that Rocchio's method is a special case of our linear model and the independence assumption may be stronger than required for a linear system. Our method has the added advantages that it is applicable to both nonbinary document representation and a user preference relation inducing more than two classes. © 1990 John Wiley & Sons, Inc.

50 citations
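
The inductive query formulation described in the abstract above can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic perceptron-style update, and the names `learn_query`, `rank_documents`, and the `(better, worse)` pair encoding of the user preference relation are assumptions made for illustration. Documents are nonbinary term-weight vectors, and the preference relation may induce more than two classes.

```python
def learn_query(preferences, dim, max_epochs=100):
    """Search for a query vector q with q.better > q.worse for every pair.

    preferences: list of (better, worse) document-vector pairs (hypothetical encoding).
    dim: number of index terms.
    """
    q = [0.0] * dim
    for _ in range(max_epochs):
        updated = False
        for better, worse in preferences:
            diff = [b - w for b, w in zip(better, worse)]
            score = sum(qi * di for qi, di in zip(q, diff))
            if score <= 0:
                # The pair is tied or ordered wrongly: move q toward the difference.
                q = [qi + di for qi, di in zip(q, diff)]
                updated = True
        if not updated:
            break  # every stated preference is satisfied
    return q


def rank_documents(q, docs):
    """Return document indices sorted by decreasing inner product with q."""
    def score(i):
        return sum(a * b for a, b in zip(q, docs[i]))
    return sorted(range(len(docs)), key=score, reverse=True)


docs = [[1, 0, 2], [0, 1, 0], [2, 1, 0]]
# Assumed user preference: doc 0 over doc 2 over doc 1 (three classes).
prefs = [(docs[0], docs[2]), (docs[2], docs[1]), (docs[0], docs[1])]
q = learn_query(prefs, dim=3)
order = rank_documents(q, docs)
```

If the preferences cannot be satisfied by any linear ranking, the loop simply stops after `max_epochs` and returns the last vector tried.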


Book ChapterDOI
01 Mar 1990
TL;DR: The retrieval process in multimedia document systems is inherently different from the retrieval process in traditional (record oriented) database systems.
Abstract: The retrieval process in multimedia document systems is inherently different from the retrieval process in traditional (record oriented) database systems. While the latter can be considered an exact process (records either satisfy the query or not), the former is not an exact process and the system must take into account the uncertainty factor (i.e. the answer is not only "true" or "false" but is often in between them).

47 citations


Proceedings Article
29 Jul 1990
TL;DR: This paper presents a domain independent strategy for ranking a set of plausible reuse candidates in the order of cost of modifying them to solve a new planning problem.
Abstract: Effective mapping and retrieval are important issues in successful deployment of plan reuse strategies. In this paper we present a domain independent strategy for ranking a set of plausible reuse candidates in the order of cost of modifying them to solve a new planning problem. The cost of modification is estimated by measuring the amount of disturbance caused to the validation structure of a reuse candidate if it were to be reused in the new problem situation. This strategy is more informed than the typical feature based retrieval strategies, and is more efficient than the methods which require partial knowledge of the nature of the plan for the new problem situation to guide the retrieval process. We discuss the implementation of this retrieval strategy in FRIAR, a framework for flexible reuse and modification in hierarchical planning.

42 citations


Journal ArticleDOI
TL;DR: This paper structurally characterizes the complexity of ranking and shows that even a type of approximate ranking, enumerative ranking, is hard unless P = P^{#P}.

41 citations


Journal ArticleDOI
TL;DR: Two programs are described, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases, showing promise as the basis for a sophisticated conceptual indexing system.
Abstract: In recent years researchers have become increasingly convinced that the performance of information retrieval systems can be greatly enhanced by the use of key phrases for automatic conceptual document indexing and retrieval. In this article we describe two programs, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases. The programs show promise as the basis for a sophisticated conceptual indexing system. The simpler program, INDEX, ranks phrases in such a way that frequently occurring phrases which contain several frequently occurring words are given a high ranking. INDEXD is an extension of INDEX which incorporates a dictionary for stemming, weighting of words and validation of syntax of output phrases. Sample output of both programs is included, and we discuss plans to combine INDEXD with linguistic and artificial intelligence techniques to provide a general conceptual phrase-indexing system that can incorporate expert knowledge about a given application area. © 1990 John Wiley & Sons, Inc.

39 citations


Journal ArticleDOI
TL;DR: An interface is described that allows the specification of uncertain queries and combines evidence about the relevance of documents to produce an overall ranking, and the results of an experiment address some of the cognitive aspects of such an interface.
Abstract: Documents such as reports, articles, memos, and forms have complex structure and layout, as well as multimedia content. Information systems that represent these aspects of documents must be able to handle queries more complex than in typical database or text retrieval systems. In this paper, we describe an interface that allows the specification of uncertain queries and combines evidence about the relevance of documents to produce an overall ranking. We also report the results of an experiment that addresses some of the cognitive aspects of such an interface.

30 citations


Proceedings ArticleDOI
01 Jul 1990
TL;DR: In this article, the authors consider the extension of ranking a set of elements in R to ranking a set of vectors in a p-dimensional space R^p. In the approach presented here, vector ranking reduces to ordering vectors according to a sorted list of vector distances.
Abstract: In this paper, we consider the extension of ranking a set of elements in R to ranking a set of vectors in a p-dimensional space R^p. In the approach presented here, vector ranking reduces to ordering vectors according to a sorted list of vector distances. A statistical analysis of this vector ranking is presented, and these vector ranking concepts are then used to develop ranked-order type estimators for multivariate image fields. We develop a class of vector filters which are efficient smoothers in additive noise and can be designed to have detail-preserving characteristics. A statistical analysis is developed for the class of filters and a number of simulations were performed in order to quantitatively evaluate their performance. These simulations involve the estimation of both stationary multivariate random signals and color images in additive noise.

19 citations
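
As a rough illustration of ordering vectors by a sorted list of distances, the sketch below ranks each vector in a window by the sum of its distances to all the others. The aggregate-distance criterion, the Euclidean metric, and the names `rank_vectors` and `vector_median` are assumptions chosen for the example; the abstract does not say which distance or aggregation the authors use.

```python
import math


def rank_vectors(vectors):
    """Order vectors by increasing sum of Euclidean distances to all vectors."""
    def aggregate_distance(v):
        return sum(math.dist(v, u) for u in vectors)
    return sorted(vectors, key=aggregate_distance)


def vector_median(window):
    """Lowest-ranked vector: the sample most central to the window."""
    return rank_vectors(window)[0]


# A window of RGB-like samples with one impulsive outlier.
window = [(10, 10, 10), (12, 11, 9), (11, 10, 10), (200, 0, 0), (10, 12, 11)]
ordered = rank_vectors(window)
center = vector_median(window)
```

Because the output is always one of the input vectors, a filter built this way cannot introduce values that were absent from the window, and outliers end up at the high end of the ordering.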


Book ChapterDOI
01 Jan 1990
TL;DR: In this article, a ranking of alternatives is constructed in the framework of fuzzy preference relations that are complementary to the unit (CU-relations).
Abstract: A ranking of alternatives is constructed in the framework of the fuzzy preference relations that are complementary to the unit (CU-relations).

Journal ArticleDOI
Li D. Xu
TL;DR: In this paper, the problem of employing linguistic variables to rank alternatives across a set of criteria is investigated, with emphasis placed on modelling the decision-maker's reasoning process, and a fuzzy mathematical model is employed to represent this sort of linguistic evaluation and synthesis, which can be employed in expert systems for making inferences on multi-criteria problems.
Abstract: The problem of employing linguistic variables to rank alternatives across a set of criteria is investigated, with emphasis placed on modelling the decision-maker's reasoning process. Given a set of alternatives, decision-makers often do not make conclusions immediately, instead evaluating them in the light of a given set of criteria and then synthesizing the knowledge obtained from the evaluation. Both evaluation and synthesis are usually expressed linguistically rather than numerically. A fuzzy mathematical model is employed to represent this sort of linguistic evaluation and synthesis. An example is presented to illustrate the basic idea and technique. This example is also used to compare the proposed technique with another existing technique. The model can be employed in expert systems for making inferences on multi-criteria problems.

Proceedings ArticleDOI
01 May 1990
TL;DR: Kaleidoscope is a cooperative query interface whose knowledge guides users to avoid most failures during query creation; its approach is applied to two linear-syntax languages at different levels of abstraction: SQL and a query language whose syntax and semantics cover a subset of wh-queries.
Abstract: Querying databases to obtain information requires the user's knowledge of query language and underlying data. However, because the knowledge in human long-term memory is imprecise, incomplete, and often incorrect, user queries are subject to various types of failure. These may include spelling mistakes, the violation of the syntax and semantics of a query language, and the misconception of the entities and relationships in a database. Kaleidoscope is a cooperative query interface whose knowledge guides users to avoid most failure during query creation. We call this type of cooperative behavior intraquery guidance. To enable this early, active engagement in the user's process of query creation, Kaleidoscope reduces the granularity of user-system interaction via a context-sensitive menu. The system generates valid query constituents as menu choices step-by-step by interpreting a language grammar, and the user creates a query following this menu guidance[2]. For instance, it takes four steps to create the following query: [Q1] Who/1 authored/2 'Al'/3 journal papers/(3+) in 'Postquery COOP'/4. At each of these steps, as the user selects one of the menu choices, the system updates its partial query status window. If a choice is unique, as in (3+), it is taken automatically. To guide the user's entry of values, the system provides a pop-up menu for each value domain. With Kaleidoscope's process of choice generation tightly controlled by the system's knowledge of query language and underlying data, users need not remember the query language and the underlying database structure but merely recognize or identify the constituents coming one after another that match their intended query. The system provides additional guidance for users to avoid creating semantically inconsistent queries. It informs the user of any derived predicates on the completion of a user-selected predicate.
To illustrate this, consider a partially constructed SQL query [Q2] SELECT * FROM professor p#1 WHERE p#1 dept = 'CS' AND p#1 salary

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a nonparametric measure of the efficacy of the ranking, called the net difference in ranks (NDR), which is the sum of the differences in ranks of the paired players in the observed contests that agree with the ranking.
Abstract: The ranking of paired contestants (players) after a series of contests is difficult when every player does not play every other player. In JASA in 1975, Mark Thompson presented a maximum likelihood solution based on the assumption that the probability of any one player defeating any other is a function only of the difference in their ranks. Here the linear approximation to that likelihood is shown to lead to a nonparametric measure of the efficacy of the ranking, called the net difference in ranks (NDR), which is the sum of the differences in ranks of the paired players in the observed contests that agree with the ranking minus the sum of the differences in ranks in the observed contests that disagree with the ranking (upsets). The subject is part of a large literature that has been consolidated by H.A. David in The Method of Paired Comparisons (1963, 1988). The method was introduced by the psychophysicist Fechner in 1860 and has been widely applied to sensory testing,
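
The net difference in ranks defined in this abstract can be computed directly from a proposed ranking and a list of contest results. The sketch below is an illustration only: the function name, the `(winner, loser)` encoding, and the convention that rank 1 is the best player are assumptions, and tied ranks are not given any special treatment.

```python
def net_difference_in_ranks(rank, contests):
    """Compute NDR for a proposed ranking.

    rank: dict mapping player -> rank position (1 = best; assumed convention).
    contests: iterable of (winner, loser) pairs.
    """
    ndr = 0
    for winner, loser in contests:
        # Positive when the better-ranked player won (agreement),
        # negative by the same rank gap when the result is an upset.
        ndr += rank[loser] - rank[winner]
    return ndr


rank = {"A": 1, "B": 2, "C": 3, "D": 4}
contests = [("A", "B"), ("A", "D"), ("C", "B"), ("B", "D")]
ndr = net_difference_in_ranks(rank, contests)  # agreements 1 + 3 + 2, upset 1
```

A ranking that agrees with more of the observed results, and with results between widely separated players in particular, gets a larger NDR.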

Journal ArticleDOI
TL;DR: Two partitioning methods for signature files are presented in order to implement the tf × idf ranking strategy efficiently and represent term frequencies without storing them explicitly.
Abstract: In this paper we present two partitioning methods for signature files in order to implement the tf × idf ranking strategy efficiently. The methods represent term frequencies without storing them explicitly. The first method partitions terms in a document based upon their term frequencies. The second one further partitions the terms vertically based upon their ordinal numbers in the dictionary. The latter allows partial retrieval of the signature files in response to a query. A fast weight computation method is also described. Detailed analysis of the new methods is given. Experimental runs are performed on the document collections made available with the SMART system.


Proceedings Article
07 Nov 1990
TL;DR: Preliminary data assessing the suitability of various default heuristic query network edge assignment functions is presented, suggesting that query networks using default assignment functions exhibit behavior consistent with that expected from an information retrieval aid.
Abstract: Query networks are specializations of Belief networks used in information retrieval. We hypothesize that query networks can be incorporated into medical information systems in at least two ways: First, the relative values of nodes in the query networks can be used to initiate searches based on query term-weights. Second, query models can incorporate reader feedback and can become simple task-specific user models. If large query networks are to be useful, one must find means to assign reasonable “default” values to those nodes and edges which are not explicitly defined by some other means. This paper presents preliminary data assessing the suitability of various default heuristic query network edge assignment functions. Early evidence suggests that query networks using default assignment functions exhibit behavior consistent with that expected from an information retrieval aid.

Journal ArticleDOI
TL;DR: In this article, the need to rank the relative seriousness of criminal conduct for the purpose of allocating statutory penalties is discussed, and a number of possible approaches are examined.
Abstract: Some ranking of the relative seriousness of criminal conduct for the purpose of allocating statutory penalties is needed. A number of different approaches are possible. This article examines those ...

Proceedings Article
01 Jan 1990
TL;DR: It is shown that ranking languages accepted by 1-way unambiguous auxiliary pushdown automata operating in polynomial time is in NC^2, and negative results about ranking for several classes of simple languages are proved.
Abstract: Ranking is the problem of computing for an input string its lexicographic index in a given (fixed) language. This paper concerns the complexity of ranking. We show that ranking languages accepted by 1-way unambiguous auxiliary pushdown automata operating in polynomial time is in NC^2. We also prove negative results about ranking for several classes of simple languages. C is rankable in deterministic polynomial time iff P = P^{#P}, where C is any of the following six classes of languages: (1) languages accepted by logtime-bounded nondeterministic Turing machines, (2) languages accepted by (uniform) families of unbounded fan-in circuits of constant depth and polynomial size, (3) languages accepted by 2-way deterministic pushdown automata, (4) languages accepted by multihead deterministic finite automata, (5) languages accepted by 1-way nondeterministic logspace-bounded Turing machines, and (6) finitely ambiguous linear context-free languages.

Proceedings ArticleDOI
02 Dec 1990
TL;DR: The paper proposes a fast parallel algorithm for finding approximate optimal node ranking of trees using O(log n) steps with n^2 processors on a CRCW PRAM and an efficient parallel algorithm using O(log^2 n) steps on an EREW model.
Abstract: Ranking a tree is defined as a mapping rho of the nodes to the set {1, 2, ...} such that if there is a path from u to v and rho(u) = rho(v) then there is a node w on the path from u to v such that rho(w) > rho(u). The highest number assigned to a node is called the rank number of the mapping. A mapping rho with the smallest rank number is called an optimal ranking. The best known serial algorithm takes O(n) time for the optimal node ranking. However, the problem of finding the optimal tree ranking appears to be highly sequential. It remains open whether it is in NC. The paper proposes a fast parallel algorithm for finding approximate optimal node ranking of trees using O(log n) steps with n^2 processors on a CRCW PRAM and an efficient parallel algorithm using O(log^2 n) steps with n processors on an EREW model.
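
The definition of a node ranking in this abstract can be checked mechanically. The following sketch is a simple sequential validity test, not the parallel algorithm the paper proposes; the adjacency-dictionary input and the name `is_valid_node_ranking` are assumptions for illustration. It runs a breadth-first search from every node, which takes quadratic time overall.

```python
from collections import deque


def is_valid_node_ranking(adj, rho):
    """Check that any two distinct nodes with equal rank are separated
    by a node of strictly higher rank on the tree path between them.

    adj: dict node -> list of neighbours (must describe a tree).
    rho: dict node -> positive integer rank.
    """
    for u in adj:
        # highest[x] is the largest rank on the unique path from u to x.
        highest = {u: rho[u]}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in highest:
                    highest[y] = max(highest[x], rho[y])
                    if rho[y] == rho[u] and highest[y] == rho[u]:
                        return False  # no higher-ranked separator on the path
                    queue.append(y)
    return True


# Path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
good = is_valid_node_ranking(path, {"a": 1, "b": 2, "c": 1, "d": 3})
bad = is_valid_node_ranking(path, {"a": 1, "b": 2, "c": 1, "d": 2})
```

In the second ranking, nodes b and d share rank 2 and the only node between them has rank 1, so the mapping is rejected. The rank number of a valid mapping is simply `max(rho.values())`.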


Journal ArticleDOI
TL;DR: In this article, a method is proposed to combine a relative and an absolute approach to performance appraisal; specifically, graphic rating and ranking are combined, and less leniency generally was found in QRS than in graphic rating.
Abstract: A method is proposed to combine a relative and an absolute approach to performance appraisal; specifically, graphic rating and ranking are combined. In two studies which examined this method, called the Quantitative Ranking Scale (QRS), less leniency generally was found in QRS than in graphic rating. The psychometric performance of the QRS was best when the distinction between the QRS and pure ranking was least. Nonetheless, the system would appear to hold promise as an alternative to graphic rating that could easily be adapted to other rating formats, such as behavioral anchoring or computerized rating.


Journal ArticleDOI
TL;DR: A general model for the selection of scientific journals based on ranking of the data sources is presented and the validity of the concept applied is supported by multiple testings of the model for journals in the field of chemistry.
Abstract: A general model for the selection of scientific journals based on ranking of the data sources is presented. The validity of the concept applied is supported by multiple testings of the model for journals in the field of chemistry. Several analyses, including the impact of input values on the formation of the model output, are performed.

Journal ArticleDOI
TL;DR: In a study of the rank distributions of scientific journals, two new parameters are defined that reflect collective properties of journals in a network where the journals are linked to each other through co-usage of user profiles for which they contain relevant papers.
Abstract: In a study of the rank distributions of scientific journals, two new parameters are defined that reflect collective properties of journals in a network where the journals are linked to each other through co-usage of user profiles for which they contain relevant papers. The first, Collectivity C, is a mere structure parameter, whereas Selective Collectivity N·C uses C of a journal as a weight factor for the number of hits N produced in a retrospective search in a data file. The corresponding rank distributions show, besides the expected reranking effect, considerable deviations from a distribution where ranking is done according to the parameter Selective Journal Productivity N.

01 Jan 1990
TL;DR: In 1990, a documented list of 33 Critical Issues Facing American Families was sent to a nationwide sample of 1,231 persons selected by State Extension Specialists serving the needs of families.
Abstract: In April 1990 a documented list of 33 Critical Issues Facing American Families was sent to a nationwide sample of 1,231 persons. This list identified and briefly overviewed various social concerns which affect the family as an institution. The sample was selected by State Extension Specialists serving the needs of families. Each specialist was asked to identify 20 people from their state. The sample included persons who were professional staff working for Extension, University teachers/researchers in a family-related discipline, public school educators, and other persons who had and had not used Extension Services in the recent past.

Book ChapterDOI
Samuel W. Bent
01 Jul 1990

01 Apr 1990
TL;DR: Determining the optimal logical form of a query in information retrieval, given the attributes to be used, can be expressed as a parametric hyperbolic 0–1 program and solved in O(n log n) time, where n is the number of elementary logical conjunctions of the attributes.
Abstract: Unconstrained hyperbolic 0–1 programming can be solved in linear time when the numerator and the denominator are linear and the latter is always positive. It is NP-hard, and finding an approximate solution with a value equal to a positive multiple of the optimal one is also NP-hard, if this last hypothesis does not hold. Determining the optimal logical form of a query in information retrieval, given the attributes to be used, can be expressed as a parametric hyperbolic 0–1 program and solved in O(n log n) time, where n is the number of elementary logical conjunctions of the attributes. This allows the optimal queries for the Van Rijsbergen synthetic criterion to be characterized.

Book ChapterDOI
M.M. Desu
01 Jan 1990