
Showing papers on "Ranking (information retrieval)" published in 1987


Proceedings ArticleDOI
Johann-Christoph Freytag
01 Dec 1987
TL;DR: This paper proposes a first step towards the design of a modular query optimizer, describing its operations by transformation rules that generate different QEPs from initial query specifications, as a contribution to the more general goal of an extensible database management system.
Abstract: The query optimizer is an important system component of a relational database management system (DBMS). It is the responsibility of this component to translate the user-submitted query - usually written in a non-procedural language - into an efficient query evaluation plan (QEP) which is then executed against the database. The research literature describes a wide variety of optimization strategies for different query languages and implementation environments. However, very little is known about how to design and structure the query optimization component to implement these strategies. This paper proposes a first step towards the design of a modular query optimizer. We describe its operations by transformation rules which generate different QEPs from initial query specifications. As we distinguish different aspects of the query optimization process, our hope is that the approach taken in this paper will contribute to the more general goal of a modular query optimizer as part of an extensible database management system.

159 citations


Patent
02 Oct 1987
TL;DR: A text compression method and apparatus are disclosed that enable overall compression ratios of more than six or eight to one for normal language text. Entries in the specialized dictionaries are categorized by a weighted frequency-of-use ranking, in which the product of a word's length in characters and its frequency of occurrence in the text is taken as the weighted figure of merit for placing words in the individual dictionaries.
Abstract: A text compression method and apparatus are disclosed that enable overall compression ratios of more than six or eight to one for normal language text. Plural multiple-word dictionaries that are specialized for the particular field of use are employed together with a header transmission format that identifies which dictionaries are to be used. In addition, entries in these dictionaries are categorized by a weighted frequency of use ranking in which the product of the word length in characters and the frequency of occurrence of that word in the text is taken as the weighted figure of merit for ranking words to be placed in the individual dictionaries.

98 citations
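
The weighted figure of merit described in the patent abstract above (word length in characters multiplied by frequency of occurrence) can be illustrated with a short Python sketch. The function name `rank_words_by_merit`, the tokenizing regular expression, and the sample text are assumptions made for illustration only; the abstract does not give the patent's actual dictionary-building procedure.

```python
import re
from collections import Counter


def rank_words_by_merit(text, top_n=5):
    """Rank words by (length in characters) * (frequency of occurrence).

    Hypothetical helper: only the figure of merit itself comes from the
    abstract; tokenization and tie-breaking are illustrative choices.
    """
    # Assumed tokenization: lowercase runs of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    merit = {word: len(word) * count for word, count in counts.items()}
    # Highest merit first; ties broken alphabetically for determinism.
    return sorted(merit.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


sample = "the database ranks the database entries and the entries"
ranking = rank_words_by_merit(sample, top_n=3)
print(ranking)
```

In this sample, "the" is the most frequent word but ranks below "database" and "entries", because longer words save more characters each time they are replaced by a dictionary code.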


Journal ArticleDOI
TL;DR: The main types of machine‐readable structure representation — fragmentation codes, linear notations and connection tables — are described, together with the retrieval algorithms which are used to provide structure and substructure search facilities.
Abstract: This paper describes the development and current state of the art of computerized systems for the storage and retrieval of chemical structure information. The main types of machine-readable structure representation (fragmentation codes, linear notations and connection tables) are described, together with the retrieval algorithms which are used to provide structure and substructure search facilities. Current research work in chemical structure retrieval includes the development of techniques for the representation and searching of the generic structures which occur in chemical patents, for searching files of three-dimensional structures, for ranking searches designed to identify compounds structurally similar to a given query compound, and the use of parallel computers to increase the efficiency of substructure searching. Chemical structure handling techniques are also applicable in a range of application areas, including chemical reaction indexing, computer-aided synthesis design and structure elucidation, and substructural analysis methods for the study of quantitative structure-activity relationships.

37 citations


Journal ArticleDOI
TL;DR: This article presents a conceptual model of the retrieval process of a document-retrieval system, which has been prototypically implemented in modular form to test system response to changes in model parameters.
Abstract: This article presents our conceptual model of the retrieval process of a document-retrieval system. The retrieval mechanism input is an unambiguous intermediate form of a user query generated by the language processor using the method described previously. Our retrieval mechanism uses a two-step procedure. In the first step a list of documents pertinent to the query is obtained from the document database, and then an evidence-combination scheme is used to compute the degree of support between the query and individual documents. The second step uses a ranking procedure to obtain a final degree of support for each document chosen, as a function of individual degrees of support associated with one or more parts of the query. The end result is a set of document citations presented to the user in ranked order in response to the information request. Numerical examples are given to illustrate various facets of the overall system, which has been prototypically implemented in modular form to test system response to changes in model parameters. © 1987 John Wiley & Sons, Inc.

31 citations


Proceedings ArticleDOI
01 Nov 1987
TL;DR: The interaction of suffixing algorithms and ranking techniques in retrieval performance, particularly in an online environment, was investigated and two modifications to ranking techniques were suggested: variable weighting of word variants and selective stemming depending on query length.
Abstract: The interaction of suffixing algorithms and ranking techniques in retrieval performance, particularly in an online environment, was investigated. Three general purpose suffixing algorithms were used for retrieval on the Cranfield 1400, Medlars, and CACM collections, and the results analysed with several standard evaluation measures. An examination of the retrieval performance using suffixing suggested two modifications to ranking techniques: variable weighting of word variants and selective stemming depending on query length. The experimental data are presented, and the limitations of suffixing in an online environment are discussed.

28 citations


Journal ArticleDOI
TL;DR: The ranking of a set of alternatives can change if a new criterion is introduced into the set of criteria, and it can also change if the importance of the criteria depends on the number of alternatives and on the strength of their ranking.

27 citations



Journal ArticleDOI
TL;DR: In considering the problem of ranking some of the greatest active sports records, three categories of sports records are examined—season records, career or multiple-year records, and daily or single game records.
Abstract: In considering the problem of ranking some of the greatest active sports records, we examine three categories of sports records—season records, career or multiple-year records, and daily or single game records. Within each category, we compare a number of outstanding records using the analytic hierarchy process. We discuss the hierarchies, data requirements, and evaluation results in detail along with a host of interesting comparison issues.

12 citations


Proceedings ArticleDOI
01 Nov 1987
TL;DR: The experimental results show that in this case no improvement over a simple coordination match function can be achieved, and models based on probabilistic indexing outperform the ranking procedures using search term weights.
Abstract: The effect of probabilistic search term weighting on the improvement of retrieval quality has been demonstrated in various experiments described in the literature. In this paper, we investigate the feasibility of this method for boolean retrieval with terms from a prescribed indexing vocabulary. This is a quite different test setting in comparison to other experiments where linear retrieval with free text terms was used. The experimental results show that in our case no improvement over a simple coordination match function can be achieved. On the other hand, models based on probabilistic indexing outperform the ranking procedures using search term weights.

11 citations
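
The "simple coordination match function" used as the baseline in the abstract above is commonly understood in information retrieval as scoring a document by the number of query terms it contains. The following Python sketch illustrates that general idea; the function names, the document identifiers, and the index terms are hypothetical and are not taken from the paper's experiments.

```python
def coordination_match(query_terms, doc_terms):
    """Score a document by how many distinct query terms it contains."""
    return len(set(query_terms) & set(doc_terms))


def rank_documents(query_terms, docs):
    """Return (doc_id, score) pairs, highest score first.

    `docs` maps a document id to its list of index terms (assumed layout).
    Ties are broken by document id so the output is deterministic.
    """
    scored = [
        (doc_id, coordination_match(query_terms, terms))
        for doc_id, terms in docs.items()
    ]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical collection indexed with a prescribed vocabulary.
docs = {
    "d1": ["ranking", "retrieval", "boolean"],
    "d2": ["retrieval"],
    "d3": ["hardware"],
}
result = rank_documents(["ranking", "retrieval"], docs)
print(result)
```

Every matching term counts equally here, which is the property that probabilistic search term weighting tries to improve on.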





01 Jan 1987
TL;DR: In this article, the authors identify key strategies for school effectiveness and how they are implemented, as perceived by administrators and teachers in selected exemplary private secondary schools.
Abstract: Identification of key strategies for school effectiveness and how they are implemented, as perceived by administrators and teachers in selected exemplary private secondary schools

Journal ArticleDOI
TL;DR: A general similarity measure between two efficiency measures is established and is analyzed in the case of monotonic marginal functions and, especially, when the set D is convex.
Abstract: The concepts of efficiency measures of multicategory information systems and similarity measures between two efficiency measures are introduced and developed. The relationship between two efficiency measures is defined by the corresponding set D, and the appropriate marginal functions are introduced. An earlier work of Ben-Bassat's dealing with the ranking of features according to feature selection criteria is generalized and refined. Consequently, a general similarity measure between two efficiency measures is established. It is analyzed in the case of monotonic marginal functions and, especially, when the set D is convex. Some important sufficient conditions for D to be convex are also given. Finally, two other similarity measures are defined and interpreted.


Book ChapterDOI
01 Jan 1987
TL;DR: A critical review of the progress made in the development of the risk ranking technique is presented, which takes into account the technical, economic and sociopolitical factors involved in determining the acceptability of a risk.
Abstract: This paper presents a critical review of the progress made in the development of the risk ranking technique. The aim of the development of the technique has been to produce a method of making a comprehensive assessment that takes into account the technical, economic and sociopolitical factors involved in determining the acceptability of a risk.




Proceedings ArticleDOI
K. L. Kwok
01 Nov 1987
TL;DR: An optimal query has been defined as one which will recover all the known relevant documents of a query in their best probability of relevance ranking; the definition is slightly modified here so that it also allows one to trace the query's evolution from the original to the optimal via the various feedback stages.
Abstract: An optimal query has been defined as one which will recover all the known relevant documents of a query in their best probability of relevance ranking. We have slightly modified the definition so that it also allows one to trace its evolution from the original to the optimal via the various feedback stages. Such a query can be constructed by modifying the original query with terms from the known relevant documents. It is pointed out that such a term addition strategy differs materially from other approaches that add terms based on term association with all query terms, and calculated from the whole document collection. The effect of viewing a document as constituted of components, and hence affecting the weighting and retrieval results of the optimal query, is also discussed.

Book ChapterDOI
01 Jan 1987
TL;DR: In this article, an ordering on the set of all rooted trees with a fixed number of vertices is defined that leads to fast ranking and unranking algorithms, with an application to the graceful labeling problem.
Abstract: We define an ordering on the set of all rooted trees of a fixed number of vertices that leads to fast ranking and unranking algorithms. An application to the graceful labeling problem is given, which shows how the method can eliminate repeated isomorphism testing.

Journal ArticleDOI
TL;DR: This study investigated the feasibility of improving the precision of retrieval in patent searches, through a comparison of the abstract patent information databases of VINITI and VNIIPI and an experimental comparison of techniques for reducing 'noise' (i.e. irrelevant information).

Journal ArticleDOI
TL;DR: This paper estimates the average cost of a range query in MAT based data organization and proves that the average performance can be improved by ranking the attributes in such a way that the average size of the filial sets decreases towards the lower levels of the tree structure.
Abstract: In a Multiple Attribute Tree (MAT) based data organization, the average case response to a specific range query depends on the structural properties of MAT. These structural properties depend very much on the interrelationships among the data elements. Efficiency in searching can be achieved by exploiting the data properties in the construction of MAT. The order or ranking of attributes is a key factor in deciding the profile of the MAT for given data. In this paper, we estimate the average cost of a range query in MAT based data organization. We then prove that the average performance can be improved by ranking the attributes in such a way that the average size of the filial sets decreases towards the lower levels of the tree structure.
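
The effect of attribute order on filial set sizes can be explored with a small Python sketch. It assumes a simplified reading of a MAT: level k of the tree corresponds to the k-th attribute in the chosen order, and a node's filial set is the set of distinct values of that attribute among records sharing the node's prefix. The function name, record layout, and sample data are hypothetical, and the sketch does not reproduce the paper's cost estimate.

```python
def average_filial_sizes(records, order):
    """Average filial set size at each level for a given attribute order.

    Assumed model: the nodes at level k are the distinct value prefixes of
    length k, so the average filial set size at level k is
    (number of nodes at level k) / (number of nodes at level k - 1).
    """
    sizes = []
    parents = 1  # the root is the single parent of all level-1 nodes
    for depth in range(1, len(order) + 1):
        prefixes = {tuple(rec[attr] for attr in order[:depth]) for rec in records}
        sizes.append(len(prefixes) / parents)
        parents = len(prefixes)
    return sizes


# Hypothetical records with two attributes.
records = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 2, "b": "x"},
    {"a": 3, "b": "x"},
]
sizes_ab = average_filial_sizes(records, ("a", "b"))
sizes_ba = average_filial_sizes(records, ("b", "a"))
print(sizes_ab, sizes_ba)
```

With this data, the order ("a", "b") gives average filial set sizes that shrink from the first level to the second, while ("b", "a") keeps them constant, so the first order is the one that matches the condition stated in the abstract.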


Proceedings ArticleDOI
01 Feb 1987
TL;DR: A special-purpose hardware device is proposed that, given an initial query, scans the disk-based Prolog database and returns to the main computer a subset of predicates relevant to the query, which may then be used to resolve the user query in the normal Prolog manner.
Abstract: PROBIB-2 is a prototype expert system for online bibliographic retrieval that provides enhanced retrieval capabilities through the application of logic programming. Implemented in Prolog, this system attempts to integrate the capabilities of a human search intermediary into a retrieval system that provides a deductive reasoning capability. Through a natural language interface, the user may retrieve information about the knowledge in the database as well as documents in response to a query. User profiles may be used to establish a query environment to enable the system to determine the information need of the user. A concern with using Prolog to perform an online search of a large database is that the response time would be unacceptable. In order to overcome this drawback a special-purpose hardware device is proposed that, given an initial query, scans the disk-based Prolog database and returns to the main computer a subset of predicates relevant to the query. This set of predicates may then be used to resolve the user query in the normal Prolog manner.