
Showing papers on "Document retrieval published in 1994"


Proceedings Article
01 Jan 1994
TL;DR: Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases, and expansion using terms from the top documents retrieved by a pilot search on topic terms was used.
Abstract: City submitted two runs each for the automatic ad hoc, very large collection track, automatic routing and Chinese track; and took part in the interactive and filtering tracks. The method used was: expansion using terms from the top documents retrieved by a pilot search on topic terms. Additional runs seem to show that we would have done better without expansion. Two runs using the method of city96al were also submitted for the Very Large Collection track. The training database and its relevant documents were partitioned into three parts. Working on a pool of terms extracted from the relevant documents for one partition, an iterative procedure added or removed terms and/or varied their weights. After each change in query content or term weights a score was calculated by using the current query to search a second portion of the training database and evaluating the results against the corresponding set of relevant documents. Methods were compared by evaluating queries predictively against the third training partition. Queries from different methods were then merged and the results evaluated in the same way. Two runs were submitted, one based on character searching and the other on words or phrases. Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases.
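The "Okapi-style weighting" the abstract refers to is the BM25 family of term weights. As a hedged illustration only (the parameter values k1 = 1.2 and b = 0.75 are conventional defaults, not necessarily the ones City used, and the exact formula varied across TREC submissions), a single term's contribution can be sketched as:

```python
import math

def bm25_weight(tf, df, N, dl, avdl, k1=1.2, b=0.75):
    """Okapi BM25-style weight of one term in one document.

    tf:   term frequency in the document
    df:   number of documents containing the term
    N:    total number of documents in the collection
    dl:   length of this document; avdl: average document length
    """
    # Inverse document frequency: rare terms weigh more.
    idf = math.log((N - df + 0.5) / (df + 0.5))
    # Saturating, length-normalized term-frequency component.
    tf_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avdl))
    return idf * tf_part
```

A document's score for a query is the sum of these weights over the query terms it contains; extending this cleanly to multi-word phrases is exactly the open question the abstract describes.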

2,459 citations


Proceedings Article
01 Jan 1994
TL;DR: This paper describes one method that has been shown to increase performance by combining the similarity values from five different retrieval runs using both vector space and P-norm extended boolean retrieval methods.
Abstract: The TREC-2 project at Virginia Tech focused on methods for combining the evidence from multiple retrieval runs to improve performance over any single retrieval method. This paper describes one such method that has been shown to increase performance by combining the similarity values from five different retrieval runs using both vector space and P-norm extended boolean retrieval methods.
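Combining similarity values from several runs is often done CombSUM-style: normalize each run's scores to a common range, then sum per document. This is a sketch of that general idea, not necessarily the exact normalization the Virginia Tech group used:

```python
def combine_runs(runs):
    """CombSUM-style fusion of several retrieval runs.

    runs: list of dicts mapping doc_id -> similarity value from one run.
    Each run's scores are min-max normalized to [0, 1], then summed,
    so no single run's scale dominates the fused ranking.
    """
    fused = {}
    for run in runs:
        lo, hi = min(run.values()), max(run.values())
        span = (hi - lo) or 1.0  # guard against a constant-score run
        for doc, score in run.items():
            fused[doc] = fused.get(doc, 0.0) + (score - lo) / span
    return fused
```

Documents ranked highly by several independent runs accumulate evidence and rise above documents favored by only one.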

1,106 citations


Journal ArticleDOI
TL;DR: Examination of the mathematical relationship between Precision and Recall shows that a quadratic Recall curve can resemble empirical Recall–Precision behavior if transformed into a tangent parabola.
Abstract: Empirical studies of retrieval performance have shown a tendency for Precision to decline as Recall increases. This article examines the nature of the relationship between Precision and Recall. The relationships between Recall and the number of documents retrieved, between Precision and the number of documents retrieved, and between Precision and Recall are described in the context of different assumptions about retrieval performance. It is demonstrated that a tradeoff between Recall and Precision is unavoidable whenever retrieval performance is consistently better than retrieval at random. More generally, for the Precision–Recall trade-off to be avoided as the total number of documents retrieved increases, retrieval performance must be equal to or better than overall retrieval performance up to that point. Examination of the mathematical relationship between Precision and Recall shows that a quadratic Recall curve can resemble empirical Recall–Precision behavior if transformed into a tangent parabola. With very large databases and/or systems with limited retrieval capabilities there can be advantages to retrieval in two stages: Initial retrieval emphasizing high Recall, followed by more detailed searching of the initially retrieved set, can be used to improve both Recall and Precision simultaneously. Even so, a tradeoff between Precision and Recall remains. © 1994 John Wiley & Sons, Inc.
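The trade-off the article analyzes is easy to see numerically: walking down a ranked list, Recall can only rise, while Precision tends to fall once non-relevant documents appear. A minimal illustration (toy data, not from the article):

```python
def precision_recall_at_cutoffs(relevant_flags, total_relevant):
    """Precision and Recall after each retrieved document.

    relevant_flags: ranked list of booleans, True if the document at that
    rank is relevant. total_relevant: relevant documents in the collection.
    Returns a list of (precision, recall) pairs, one per cutoff.
    """
    points, hits = [], 0
    for k, rel in enumerate(relevant_flags, start=1):
        hits += rel
        points.append((hits / k, hits / total_relevant))
    return points
```

For any retrieval better than random, the recall coordinate climbs toward 1 while the precision coordinate drifts down — the empirical Recall–Precision curve the article models.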

714 citations


Proceedings ArticleDOI
01 Aug 1994
TL;DR: The increasing lengths of documents in full-text collections encourage renewed interest in the ranking and retrieval of document passages, but questions about how passages are defined, how they can be ranked efficiently, and what is their proper role in long, structured documents are raised.
Abstract: The increasing lengths of documents in full-text collections encourage renewed interest in the ranking and retrieval of document passages. Past research showed that evidence from passages can improve retrieval results, but it also raised questions about how passages are defined, how they can be ranked efficiently, and what is their proper role in long, structured documents.

544 citations


Journal ArticleDOI
TL;DR: The results indicate that the criteria employed by users included tangible characteristics of documents (e.g., the information content of the document, the provision of references to other sources of information), subjective qualities (e.g., agreement with the information provided by the document), and situational factors (e.g., time constraints).
Abstract: The objective of this study was to describe the criteria mentioned by users evaluating the information within documents as it related to the users' information need situations. Data were collected by asking users in an academic environment to evaluate representations and the full text of documents that had been retrieved specifically for each user's information need situation. Users were asked to mark the portions of the document representations or of the full text of documents that indicated to the users whether they would or would not pursue the information within documents. An open-ended interview technique was then employed to discuss each marked portion with users. The interviews were audiotaped, the tapes transcribed, and the transcriptions were content analyzed in order to identify and describe evaluation criteria. The results indicate that the criteria employed by users included tangible characteristics of documents (e.g., the information content of the document, the provision of references to other sources of information), subjective qualities (e.g., agreement with the information provided by the document) and situational factors (e.g., the time constraints under which the user was working). The implications of this research for our understanding of the concept of relevance, and for the design and evaluation of information retrieval systems, are discussed. © 1994 John Wiley & Sons, Inc.

403 citations


Proceedings ArticleDOI
Ross Wilkinson1
01 Aug 1994
TL;DR: This work considers what information is needed to retrieve effectively and shows that knowledge of the structure of documents can lead to improved retrieval performance.
Abstract: Information systems usually retrieve whole documents as answers to queries. However, it may in some circumstances be more appropriate to retrieve parts of documents. We consider formulas for retrieving whole documents and parts of documents from a large structured document collection. We consider what information is needed to retrieve effectively and show that knowledge of the structure of documents can lead to improved retrieval performance.

286 citations


Proceedings ArticleDOI
24 May 1994
TL;DR: In this paper, the problem of incremental updates of inverted lists is addressed using a new dual-structure index that dynamically separates long and short inverted lists and optimizes retrieval, update, and storage of each type of list.
Abstract: With the proliferation of the world's “information highways” a renewed interest in efficient document indexing techniques has come about. In this paper, the problem of incremental updates of inverted lists is addressed using a new dual-structure index. The index dynamically separates long and short inverted lists and optimizes retrieval, update, and storage of each type of list. To study the behavior of the index, a space of engineering trade-offs which range from optimizing update time to optimizing query performance is described. We quantitatively explore this space by using actual data and hardware in combination with a simulation of an information retrieval system. We then describe the best algorithm for a variety of criteria.

200 citations


Proceedings Article
12 Sep 1994
TL;DR: This work describes the system and presents experimental results showing superior incremental indexing and competitive query processing performance, using a traditional inverted file index built on top of a persistent object store.
Abstract: Full-text information retrieval systems have traditionally been designed for archival environments. They often provide little or no support for adding new documents to an existing document collection, requiring instead that the entire collection be re-indexed. Modern applications, such as information filtering, operate in dynamic environments that require frequent additions to document collections. We provide this ability using a traditional inverted file index built on top of a persistent object store. The data management facilities of the persistent object store are used to produce efficient incremental update of the inverted lists. We describe our system and present experimental results showing superior incremental indexing and competitive query processing performance.
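The core idea — updating only the touched inverted lists when a document arrives, instead of re-indexing the whole collection — can be shown with an in-memory toy. This is only a sketch of incremental update; the paper's system sits on a persistent object store, which this example does not model:

```python
class IncrementalIndex:
    """Toy inverted file supporting incremental document additions."""

    def __init__(self):
        self.postings = {}  # term -> {doc_id: term frequency}

    def add_document(self, doc_id, tokens):
        # Only the inverted lists for this document's terms change;
        # the rest of the index is untouched (no global rebuild).
        for t in tokens:
            plist = self.postings.setdefault(t, {})
            plist[doc_id] = plist.get(doc_id, 0) + 1

    def lookup(self, term):
        return self.postings.get(term, {})
```

In the paper's setting, persistence and efficient growth of these per-term lists are delegated to the object store's data management facilities.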

149 citations


Posted Content
TL;DR: This article developed an automatic abstract generation system for Japanese expository writings based on rhetorical structure extraction, which first extracts the rhetorical structure, the compound of the rhetorical relations between sentences, and then cuts out less important parts in the extracted structure to generate an abstract of the desired length.
Abstract: We have developed an automatic abstract generation system for Japanese expository writings based on rhetorical structure extraction. The system first extracts the rhetorical structure, the compound of the rhetorical relations between sentences, and then cuts out less important parts in the extracted structure to generate an abstract of the desired length. Evaluation of the generated abstract showed that it contains at maximum 74% of the most important sentences of the original text. The system is now utilized as a text browser for a prototypical interactive document retrieval system.

146 citations


Proceedings Article
01 Jan 1994
TL;DR: This work combines a vector processing model for documents and queries, but using N-gram frequencies as the basis for the vector element values instead of more traditional term frequencies, which provides good retrieval performance on the TREC-1 andTREC-2 tests without the need for any kind of word stemming or stopword removal.
Abstract: N-gram based representations for documents have several distinct advantages for various document processing tasks. First, they provide a more robust representation in the face of grammatical and typographical errors in the documents. Secondly, N-gram representations require no linguistic preparations such as word-stemming or stopword removal. Thus they are ideal in situations requiring multi-language operations. Vector processing retrieval models also have some unique advantages for information retrieval tasks. In particular, they provide a simple, uniform representation for documents and queries, and an intuitively appealing document similarity measure. Also, modern vector space models have good retrieval performance characteristics. In this work, we combine these two ideas by using a vector processing model for documents and queries, but using N-gram frequencies as the basis for the vector element values instead of more traditional term frequencies. The resulting system provides good retrieval performance on the TREC-1 and TREC-2 tests without the need for any kind of word stemming or stopword removal. We also have begun testing the system on Spanish language documents.
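The combination the abstract describes — vector-space retrieval with character N-gram counts in place of term frequencies — is compact enough to sketch. A minimal version (trigrams and raw counts; the actual system's weighting may differ):

```python
import math
from collections import Counter

def ngram_vector(text, n=3):
    """Character n-gram frequency vector: no stemming, no stopword list,
    so it works unchanged across languages and tolerates typos."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[g] * v[g] for g in u if g in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0
```

Because a misspelled word still shares most of its n-grams with the correct form, the representation degrades gracefully under the grammatical and typographical errors the abstract mentions.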

133 citations


Proceedings ArticleDOI
01 Aug 1994
TL;DR: The model utilizes the technique of logistic regression to obtain equations which rank documents by probability of relevance as a function of document and query properties and is compared directly to the particular vector space model of retrieval which uses term-frequency/inverse-document-frequency weighting and the cosine similarity measure.
Abstract: This research evaluates a model for probabilistic text and document retrieval; the model utilizes the technique of logistic regression to obtain equations which rank documents by probability of relevance as a function of document and query properties. Since the model infers probability of relevance from statistical clues present in the texts of documents and queries, we call it logistic inference. By transforming the distribution of each statistical clue into its standardized distribution (one with mean μ = 0 and standard deviation σ = 1), the method allows one to apply logistic coefficients derived from a training collection to other document collections, with little loss of predictive power. The model is applied to three well-known information retrieval test collections, and the results are compared directly to the particular vector space model of retrieval which uses term-frequency/inverse-document-frequency (tfidf) weighting and the cosine similarity measure. In the comparison, the logistic inference method performs significantly better than (in two collections) or equally well as (in the third collection) the tfidf/cosine vector space model. The differences in performances of the two models were subjected to statistical tests to see if the differences are statistically significant or could have occurred by chance.
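The two moving parts of the logistic inference method — standardizing each statistical clue and passing a weighted sum of the standardized clues through the logistic function — can be sketched directly. The coefficients below are placeholders; in the paper they are fit on a training collection:

```python
import math

def standardize(values):
    """Transform one clue's distribution to mean 0, standard deviation 1,
    so coefficients learned on one collection transfer to another."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n) or 1.0
    return [(v - mu) / sigma for v in values]

def logistic_relevance(z_clues, coefficients, intercept):
    """Probability of relevance from standardized clues via logistic
    regression: P = 1 / (1 + exp(-(intercept + sum(c_i * z_i)))).
    """
    log_odds = intercept + sum(c * z for c, z in zip(coefficients, z_clues))
    return 1.0 / (1.0 + math.exp(-log_odds))
```

Documents are then ranked by this probability, which is what the paper compares against the tfidf/cosine vector-space baseline.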

Proceedings ArticleDOI
Kenji Ono1, Kazuo Sumita1, Seiji Miike1
05 Aug 1994
TL;DR: The system first extracts the rhetorical structure, the compound of the rhetorical relations between sentences, and then cuts out less important parts in the extracted structure to generate an abstract of the desired length.
Abstract: We have developed an automatic abstract generation system for Japanese expository writings based on rhetorical structure extraction. The system first extracts the rhetorical structure, the compound of the rhetorical relations between sentences, and then cuts out less important parts in the extracted structure to generate an abstract of the desired length. Evaluation of the generated abstract showed that it contains at maximum 74% of the most important sentences of the original text. The system is now utilized as a text browser for a prototypical interactive document retrieval system.

Proceedings ArticleDOI
Michael Persin1
01 Aug 1994
TL;DR: The experiments show that the proposed evaluation technique reduces both main memory usage and query evaluation time, based on early recognition of which documents are likely to be highly ranked, without degradation in retrieval effectiveness.
Abstract: Ranking techniques are effective for finding answers in document collections but the cost of evaluation of ranked queries can be unacceptably high. We propose an evaluation technique that reduces both main memory usage and query evaluation time, based on early recognition of which documents are likely to be highly ranked. Our experiments show that, for our test data, the proposed technique evaluates queries in 20% of the time and 2% of the memory taken by the standard inverted file implementation, without degradation in retrieval effectiveness.
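"Early recognition of which documents are likely to be highly ranked" is typically achieved by processing the rarest (highest-weight) query terms first and capping the number of score accumulators: once the table is full, common terms may only update existing candidates, never add new ones. A sketch of that general quit/continue idea, not the paper's exact algorithm:

```python
def ranked_query(index, query_terms, max_accumulators):
    """Bounded-accumulator ranked query evaluation.

    index: term -> {doc_id: weight}; shorter inverted lists (rarer,
    more discriminating terms) are processed first.
    """
    terms = sorted(query_terms, key=lambda t: len(index.get(t, {})))
    acc = {}
    for t in terms:
        for doc, w in index.get(t, {}).items():
            if doc in acc:
                acc[doc] += w                  # always update candidates
            elif len(acc) < max_accumulators:
                acc[doc] = w                   # admit only while space remains
    return sorted(acc, key=acc.get, reverse=True)
```

Memory is bounded by the accumulator cap, and long inverted lists for common terms do most of their work as cheap updates rather than insertions.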

Journal ArticleDOI
TL;DR: An algorithm which automatically extracts geopositional coordinate index terms from text to support georeferenced document indexing and retrieval is presented.
Abstract: In this paper we present an algorithm which automatically extracts geopositional coordinate index terms from text to support georeferenced document indexing and retrieval. Under this algorithm, words and phrases containing geographic place names or characteristics are extracted from a text document and used as input to database functions which use spatial reasoning to approximate statistically the geoposition being referenced in the text. We conclude with a discussion of preliminary results and future work.

Journal ArticleDOI
TL;DR: Analysis of users' verbal data shows that high precision does not always mean high quality to users because of different users' expectations, and four related measures of recall and precision are found to be significantly correlated with success.
Abstract: The appropriateness of evaluation criteria and measures has been a subject of debate and a vital concern in the information retrieval evaluation literature. A study was conducted to investigate the appropriateness of 20 measures for evaluating interactive information retrieval performance, representing four major evaluation criteria. Among the 20 measures studied were the two most well-known relevance-based measures of effectiveness, recall and precision. The user's judgment of information retrieval success was used as the devised criterion measure with which all other 20 measures were to be correlated. A sample of 40 end-users with individual information problems from an academic environment were observed, interacting with six professional intermediaries searching on their behalf in large operational systems. Quantitative data consisting of values for all measures studied and verbal data containing users' reasons for assigning certain values to selected measures were collected. Statistical analysis of the quantitative data showed that precision, one of the most important traditional measures of effectiveness, is not significantly correlated with the user's judgment of success. Users appear to be more concerned with absolute recall than with precision, although absolute recall was not directly tested in the study. Four related measures of recall and precision are found to be significantly correlated with success. Among these are user's satisfaction with completeness of search results and user's satisfaction with precision of the search. This article explores the possible explanations for this outcome through content analysis of users' verbal data. The analysis shows that high precision does not always mean high quality (relevancy, completeness, etc.) to users because of different users' expectations. The user's purpose in obtaining information is suggested to be the primary cause for the high concern for recall.
Implications for research and practice are discussed. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
Steve Putz1
01 Nov 1994
TL;DR: This paper describes some interactive World-Wide Web servers that produce information displays and documents dynamically rather than just providing access to static files.
Abstract: Most World-Wide Web information servers provide simple browsing access to collections of static text or hypertext files. This paper describes some interactive World-Wide Web servers that produce information displays and documents dynamically rather than just providing access to static files. The PARC Map Viewer uses a geographic database to create and display maps of any part of the world on demand. The Digital Tradition folk music server provides access to a large database of song lyrics and melodies. These applications take advantage of the multimedia capabilities of World-Wide Web to deliver graphical and audio content as well as formatted text. Hypertext links are used not only for navigation, but also for setting search and presentation parameters. In these applications the HTML format and the HTTP protocol are used like a user interface tool kit to provide not only document retrieval but a complete custom user interface specialized for the application.

Journal ArticleDOI
TL;DR: These experiments revealed that using roots and using stems as index terms give better retrieval results than using words.
Abstract: The Micro-AIRS System, a microcomputer system for Arabic Information Retrieval, was designed as an experimental system to investigate indexing and retrieval processes for Arabic bibliographic data. A series of experiments were performed using 29 queries against a base of 355 Arabic bibliographic records, covering computer and information science from the bibliographic databank at King Abdulaziz City for Science and Technology. These experiments revealed that using roots and using stems as index terms give better retrieval results than using words. The root performs as well as or better than the stem at low recall levels and definitely better at high recall levels. Several different binary similarity coefficients were tried: the cosine, Dice, and Jaccard coefficients.

Patent
15 Aug 1994
TL;DR: In this article, the results of a full-text, document search by a character string search processor are treated as vector patterns whose elements become a term match grade by use of a membership function of the term match frequency.
Abstract: The results of a full-text, document search by a character string search processor are treated as vector patterns whose elements become a term match grade by use of a membership function of the term match frequency. The closest pattern to the query pattern is found by the similarity between the query pattern and each of the filed sample patterns. The similarity is calculated by use of fuzzy-logic. The similarity is ranked in order of similarity magnitude, thereby reducing the search time. The search time can be shortened by categorizing the filed patterns by term set and similarity to a cluster center pattern. If the cluster center patterns are stored, the closest cluster address can be inferred by fuzzy logic inference from the match between the query document and the term set or the similarity of the query to the cluster center.

Journal ArticleDOI
TL;DR: Regardless of the method, user-centered indexing cannot be developed before searching behavior is understood better, and automated indexing, with its dynamic and flexible nature, is most fit to tailor indexing to requirements of individual users and requests.
Abstract: Two distinct approaches describe the process of indexing. The document-oriented approach claims that indexing summarizes or represents the content of a document. The user-oriented approach requires that indexing reflect the requests for which a document might be relevant. Most indexing, in practice as well as in theory, subscribes to both, but the document-oriented approach has enjoyed most visibility. While request-oriented indexing is a user-centered approach, it is very difficult to implement with human, a priori indexing. Automated indexing, with its dynamic and flexible nature, is most fit to tailor indexing to requirements of individual users and requests, yet most of current research in the area focuses on the development of global methods. Regardless of the method, user-centered indexing cannot be developed before searching behavior is understood better. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: An examination of some nonbook materials with respect to an aboutness model of indexing leads to the conclusion that there are instances that defy subject indexing.
Abstract: An examination of some nonbook materials with respect to an aboutness model of indexing leads to the conclusion that there are instances that defy subject indexing. These occur not so much because of the nature of the medium per se but because it is being used for nondocumentary purposes, or, where being used for such purposes, the subject referenced is nonlexical. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: The ability to ask the questions one needs to ask as the foundation of performance evaluation, and recall and discrimination as the basic quantitative performance measures for binary noninteractive retrieval systems are established.
Abstract: This article presents a logical analysis of the characteristics of indexing and their effects on retrieval performance. It establishes the ability to ask the questions one needs to ask as the foundation of performance evaluation, and recall and discrimination as the basic quantitative performance measures for binary noninteractive retrieval systems. It then defines the characteristics of indexing that affect retrieval—namely, indexing devices, viewpoint-based and importance-based indexing exhaustivity, indexing specificity, indexing correctness, and indexing consistency—and examines in detail their effects on retrieval. It concludes that retrieval performance depends chiefly on the match between indexing and the requirements of the individual query and on the adaptation of the query formulation to the characteristics of the retrieval system, and that the ensuing complexity must be considered in the design and testing of retrieval systems. © 1994 John Wiley & Sons, Inc.

Proceedings Article
01 Jan 1994
TL;DR: In the TREC experiments this year, a number of new techniques were introduced for both the ad-hoc retrieval and routing runs, and experiments with Spanish retrieval were carried out.
Abstract: The INQUERY retrieval and routing system, which is based on the Bayesian inference net retrieval model, has been described in a number of papers (5,4,10,11). In the TREC experiments this year, a number of new techniques were introduced for both the ad-hoc retrieval and routing runs. In addition, experiments with Spanish retrieval were carried out.

Journal ArticleDOI
TL;DR: If all the documents in a database have readily available precomputed nearest neighbors, a new search algorithm, which is called parallel neighborhood searching, is conveniently used and it is shown that this feedback-based method provides significant improvement in recall over traditional linear searching methods and even appears superior to traditional feedback methods in overall performance.
Abstract: We consider two kinds of queries that may be applied to a database. The first is a query written by a searcher to express an information need. The second is a request for documents most similar to a document already judged relevant by the searcher. We examine the effectiveness of these two procedures and show that in important cases the latter query type is more effective than the former. This provides a new view of the cluster hypothesis and a justification for document neighboring procedures (precomputation of closely related documents). If all the documents in a database have readily available precomputed nearest neighbors, a new search algorithm, which we call parallel neighborhood searching, is conveniently used. We show that this feedback-based method provides significant improvement in recall over traditional linear searching methods and even appears superior to traditional feedback methods in overall performance.
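"Parallel neighborhood searching" walks down the precomputed nearest-neighbor lists of the initially retrieved documents in parallel, one neighbor depth at a time, until enough documents have been gathered. A sketch of the idea under simple assumptions (ties and revisits are resolved naively here; the paper's procedure may differ in detail):

```python
def neighborhood_search(initial_ranking, neighbors, k):
    """Expand a pilot search with precomputed nearest neighbors.

    initial_ranking: doc ids retrieved by the original query, best first.
    neighbors: doc_id -> that document's nearest-neighbor list, best first.
    Returns up to k doc ids, pilot results first, then interleaved neighbors.
    """
    seen = list(initial_ranking)
    depth = 0
    while len(seen) < k:
        added = False
        for doc in initial_ranking:
            ns = neighbors.get(doc, [])
            if depth < len(ns) and ns[depth] not in seen:
                seen.append(ns[depth])
                added = True
                if len(seen) == k:
                    return seen
        depth += 1
        if not added and depth > max(
                (len(v) for v in neighbors.values()), default=0):
            break  # all neighbor lists exhausted
    return seen[:k]
```

Because the neighbor lists are precomputed, the expansion costs only lookups at query time, which is what makes the feedback-style improvement cheap.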

Patent
Jonathan Devito1, Harry T. Garland1, Ken Hunter1, Gerald A May1, Michael G. Roberts1 
06 May 1994
TL;DR: In this article, the results of a full-text document search are mapped through a text-image correspondence (TIC) table, which holds coordinates information for each phrase of the document set; a search phrase is identified from user-specified search criteria, located in the text data set, and its page coordinates are then found via the TIC table.
Abstract: A method and system for storing and selectively retrieving information, such as words, from a document set. The method includes generating an image data set representative of the information contained in the document set. The method also involves generating a text data set representative of a text portion of the information contained in the document set. A text-image correspondence (TIC) table is generated that includes data representative of coordinates information corresponding to each phrase of the document set. A search phrase is identified in response to user-specified search criteria and the search phrase is identified in the text image data set. Then, the TIC table is used to identify the coordinates information corresponding to the search phrase identified in the text data set. A display of the portion of the page containing the search phrase is generated using the coordinates information.

Journal ArticleDOI
TL;DR: An iterative model of retrieval evaluation is proposed, starting first with the use of topical relevance to ensure documents on the subject can be retrieved, followed by the use of situational relevance to show the user can interact positively with the system.
Abstract: The traditional notion of topical relevance has allowed much useful work to be done in the evaluation of retrieval systems, but has limitations for complete assessment of retrieval systems. While topical relevance can be effective in evaluating various indexing and retrieval approaches, it is ineffective for measuring the impact that systems have on users. An alternative is to use a more situational definition of relevance, which takes account of the impact of the system on the user. Both types of relevance are examined from the standpoint of the medical domain, concluding that each has its appropriate use. But in medicine there is increasing emphasis on outcomes-oriented research which, when applied to information science, requires that the impact of an information system on the activities which prompt its use be assessed. An iterative model of retrieval evaluation is proposed, starting first with the use of topical relevance to ensure documents on the subject can be retrieved. This is followed by the use of situational relevance to show the user can interact positively with the system. The final step is to study how the system impacts the user in the purpose for which the system was consulted, which can be done by methods such as protocol analysis and simulation. These diverse types of studies are necessary to increase our understanding of the nature of retrieval systems. © 1994 John Wiley & Sons, Inc.

Journal ArticleDOI
Valerie Cross1
01 Feb 1994
TL;DR: A general description of the main components of fuzzy information retrieval is given: document representation, query representation, computer-aided query formulation, document retrieval status, and performance measures.
Abstract: Over the past decade, information retrieval has emerged as an active research area in the application of fuzzy set theory. Fuzzy information retrieval utilizes fuzzy sets to represent documents, membership degrees for query term relevance, fuzzy logical operators to define queries, and fuzzy compatibility measures to assess the retrieval status value of a document. This paper presents an overview of fuzzy relational databases and fuzzy information retrieval. A general description of the main components of fuzzy information retrieval is given: document representation, query representation, computer-aided query formulation, document retrieval status, and performance measures. Examples of areas currently being researched are provided. The relation between fuzzy information retrieval and fuzzy relational databases is examined.

Proceedings ArticleDOI
13 Oct 1994
TL;DR: It is demonstrated that the use of syntactic compounds in the representation of database documents as well as in the user queries, coupled with an appropriate term weighting strategy, can considerably improve the effectiveness of retrospective search.
Abstract: We report on the results of a series of experiments with a prototype text retrieval system which uses relatively advanced natural language processing techniques in order to enhance the effectiveness of statistical document retrieval. In this paper we show that large-scale natural language processing (hundreds of millions of words and more) is not only required for a better retrieval, but it is also doable, given appropriate resources. In particular, we demonstrate that the use of syntactic compounds in the representation of database documents as well as in the user queries, coupled with an appropriate term weighting strategy, can considerably improve the effectiveness of retrospective search. The experiments reported here were conducted on the TIPSTER database in connection with the Text REtrieval Conference series (TREC).

Journal ArticleDOI
TL;DR: An OCR‐generated database and its corresponding 99.8% correct version are used to process a set of queries to determine the effect the degraded version will have on retrieval, and it is shown that the effect is insignificant.
Abstract: We report on the results of our experiments on query evaluation in the presence of noisy data. In particular, an OCR-generated database and its corresponding 99.8% correct version are used to process a set of queries to determine the effect the degraded version will have on retrieval. It is shown that, with the set of scientific documents we use in our testing, the effect is insignificant. We further improve the result by applying an automatic postprocessing system designed to correct the kinds of errors generated by recognition devices. © 1994 John Wiley & Sons, Inc.

Proceedings Article
12 Sep 1994
TL;DR: The integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (Op) is described, using the external function capability of the database system to encapsulate the text retrieval system as an external information source.
Abstract: We describe the integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (Op…). Our approach is a light-weight one, using the external function capability of the database system to encapsulate the text retrieval system as an external information source. Yet, we are able to provide a tight integration in the query language and processing; the user can access the text retrieval system using a standard database query language. The efficient and effective retrieval of structured text performed by the text retrieval system is seamlessly combined with the rich modeling and general-purpose querying capabilities of the database system, resulting in an integrated system with querying power beyond that of the underlying systems. The integrated system also provides uniform access to textual data in the text retrieval system and structured data in the database system, thereby achieving information fusion. We discuss the design and implementation of our prototype system, and address issues such as the proper framework for external integration, the modeling of complex categorization and structure hierarchies of documents (under automatic document schema imp…), and techniques to reduce the performance overhead of accessing an external source.
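The "external function" encapsulation the abstract describes can be shown schematically: the database evaluates a query whose text predicate is delegated to an external engine, and the results join back against structured records. All class and function names below are invented for illustration; they are not the actual TextMachine or database interfaces.

```python
# Schematic of the external-function encapsulation idea only; names are
# hypothetical, not the real TextMachine / OODB APIs.

class ExternalTextEngine:
    """Stands in for a structured-text retrieval system."""
    def __init__(self, corpus):
        self.corpus = corpus  # doc_id -> text

    def search(self, keyword):
        # Returns ids of documents whose text mentions the keyword.
        return [i for i, t in self.corpus.items() if keyword in t.lower()]

# "Database side": structured records keyed by the same doc ids.
records = {1: {"author": "Smith", "year": 1994},
           2: {"author": "Jones", "year": 1993}}

engine = ExternalTextEngine({1: "fuzzy document retrieval",
                             2: "query optimization"})

def query(keyword, min_year):
    """One query spanning both systems: the text predicate is evaluated by
    the external engine, the structured predicate by the database records."""
    return [records[i] for i in engine.search(keyword)
            if records[i]["year"] >= min_year]

print(query("retrieval", 1994))  # [{'author': 'Smith', 'year': 1994}]
```

The appeal of this design, as the abstract notes, is that neither system is modified: the text engine remains a black box behind a function call, yet the combined query language can express predicates over both text content and structured attributes.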

01 Jan 1994
TL;DR: This dissertation examines the use of adaptive methods to automatically improve the performance of ranked text retrieval systems and proposes and empirically validate general adaptive methods which improve the ability of a large class of retrieval systems to rank documents effectively.
Abstract: This dissertation examines the use of adaptive methods to automatically improve the performance of ranked text retrieval systems. The goal of a ranked retrieval system is to manage a large collection of text documents and to order documents for a user based on the estimated relevance of the documents to the user's information need (or query). The ordering enables the user to quickly find documents of interest. Ranked retrieval is a difficult problem because of the ambiguity of natural language, the large size of the collections, and the varying needs of users and varying collection characteristics. We propose and empirically validate general adaptive methods which improve the ability of a large class of retrieval systems to rank documents effectively. Our main adaptive method is to numerically optimize free parameters in a retrieval system by minimizing a non-metric criterion function. The criterion measures how well the system is ranking documents relative to a target ordering, defined by a set of training queries which include the users' desired document orderings. Thus, the system learns parameter settings which better enable it to rank relevant documents before irrelevant ones. The non-metric approach is interesting because it is a general adaptive method, an alternative to supervised methods for training neural networks in domains in which rank order or prioritization is important. A second adaptive method is also examined, which is applicable to a restricted class of retrieval systems but which permits an analytic solution. The adaptive methods are applied to a number of problems in text retrieval to validate their utility and practical efficiency.
The applications include: a dimensionality reduction of vector-based document representations to a vector space in which inter-document similarity more accurately predicts semantic association; the estimation of a similarity measure which better predicts the relevance of documents to queries; and the estimation of a high-performance neural network combination of multiple retrieval systems into a single overall system. The applications demonstrate that the approaches improve performance and adapt to varying retrieval environments. We also compare the methods to numerous alternative adaptive methods in the text retrieval literature, with very positive results.
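The core move in the dissertation, optimizing free parameters against a target ordering supplied by training queries, can be caricatured with a single parameter. The grid search and the pairwise-inversion criterion below are simplifications assumed for illustration; the dissertation optimizes many parameters numerically against a non-metric criterion.

```python
# Simplified one-parameter version of the idea: choose the mixing weight w
# (combining two base retrieval scores) that minimizes rank inversions
# relative to a target ordering defined by training data.

def ranking(scores_a, scores_b, w):
    """Rank document indices by a weighted combination of two base scores."""
    combined = [w * a + (1 - w) * b for a, b in zip(scores_a, scores_b)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)

def inversions(order, target):
    """Count document pairs ranked in the opposite order from the target."""
    pos = {d: r for r, d in enumerate(order)}
    tgt = {d: r for r, d in enumerate(target)}
    docs = list(pos)
    return sum(1 for i in docs for j in docs
               if pos[i] < pos[j] and tgt[i] > tgt[j])

# Two base systems score three documents; training data says the target
# order should be [0, 2, 1] -- neither system alone achieves it.
scores_a = [0.9, 0.8, 0.1]   # system A favours docs 0 and 1
scores_b = [0.2, 0.1, 0.9]   # system B favours doc 2
target = [0, 2, 1]

best_w = min((w / 10 for w in range(11)),
             key=lambda w: inversions(ranking(scores_a, scores_b, w), target))
print(best_w, ranking(scores_a, scores_b, best_w))  # 0.5 [0, 2, 1]
```

Only the mixed system reproduces the target ordering, which is the point of the dissertation's combination application: the learned parameters let several imperfect retrieval systems jointly rank relevant documents ahead of irrelevant ones.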