
Showing papers on "Ranking (information retrieval) published in 2000"


Journal ArticleDOI
TL;DR: A novel approach to balance objective and penalty functions stochastically, i.e., stochastic ranking, is introduced, and a new view on penalty function methods in terms of the dominance of penalty and objective functions is presented.
Abstract: Penalty functions are often used in constrained optimization. However, it is very difficult to strike the right balance between objective and penalty functions. This paper introduces a novel approach to balance objective and penalty functions stochastically, i.e., stochastic ranking, and presents a new view on penalty function methods in terms of the dominance of penalty and objective functions. Some of the pitfalls of naive penalty methods are discussed in these terms. The new ranking method is tested using a (μ, λ) evolution strategy on 13 benchmark problems. Our results show that suitable ranking alone (i.e., selection), without the introduction of complicated and specialized variation operators, is capable of improving the search performance significantly.

1,571 citations
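The stochastic ranking idea above can be sketched as a bubble-sort-style pass in which adjacent individuals are compared on the objective function with some probability even when constraints are violated; the function names below are illustrative, and `pf` stands in for the paper's comparison probability (a value around 0.45 is discussed there):

```python
import random

def stochastic_rank(population, objective, violation, pf=0.45):
    """Bubble-sort-style stochastic ranking: adjacent individuals are
    compared by the objective when both are feasible or, with probability
    pf, regardless of feasibility; otherwise they are compared by the
    constraint violation. Both objective and violation are minimised."""
    ranked = list(population)
    n = len(ranked)
    for _ in range(n):
        swapped = False
        for i in range(n - 1):
            a, b = ranked[i], ranked[i + 1]
            both_feasible = violation(a) == 0 and violation(b) == 0
            use_objective = both_feasible or random.random() < pf
            key = objective if use_objective else violation
            if key(a) > key(b):
                ranked[i], ranked[i + 1] = b, a
                swapped = True
        if not swapped:
            break
    return ranked
```

With all individuals feasible this reduces to sorting by the objective; with `pf=0` infeasible individuals are ordered purely by violation, which is the penalty-dominated extreme the paper contrasts against.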


Journal ArticleDOI
01 Jul 2000
TL;DR: The novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.
Abstract: This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR environments. The proposed methods are (1) a novel application of P-R curves and average precision computations based on separate recall bases for documents of different degrees of relevance, and (2) two novel measures computing the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. We then demonstrate the use of these evaluation methods in a case study on the effectiveness of query types, based on combinations of query structures and expansion, in retrieving documents of various degrees of relevance. The test was run with a best match retrieval system (InQuery) in a text database consisting of newspaper articles. The results indicate that the tested strong query structures are most effective in retrieving highly relevant documents. The differences between the query types are practically essential and statistically significant. More generally, the novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.

1,461 citations
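The two cumulated-gain measures described above can be sketched directly; this is a minimal reading of the paper's definitions, in which a gain at a rank at or beyond the logarithm base is discounted by the log of its rank:

```python
import math

def cumulated_gain(gains):
    """CG: running sum of graded relevance gains down the ranked list."""
    out, total = [], 0
    for g in gains:
        total += g
        out.append(total)
    return out

def discounted_cumulated_gain(gains, base=2):
    """DCG: like CG, but a gain at rank i >= base is divided by
    log_base(i), so relevance found late in the ranking counts less."""
    out, total = [], 0.0
    for rank, g in enumerate(gains, start=1):
        total += g if rank < base else g / math.log(rank, base)
        out.append(total)
    return out
```

For the graded gains [3, 2, 3, 0], CG is [3, 5, 8, 8], while DCG discounts the gain of 3 found at rank 3 by log2(3).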


Journal ArticleDOI
TL;DR: A new technique is proposed, called local context analysis, which selects expansion terms based on cooccurrence with the query terms within the top-ranked documents.
Abstract: Techniques for automatic query expansion have been extensively studied in information research as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. While global techniques rely on analysis of a whole collection to discover word relationships, local techniques emphasize analysis of the top-ranked documents retrieved for a query. While local techniques have been shown to be more effective than global techniques in general, existing local techniques are not robust and can seriously hurt retrieval when few of the retrieved documents are relevant. We propose a new technique, called local context analysis, which selects expansion terms based on cooccurrence with the query terms within the top-ranked documents. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.

613 citations
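A much-simplified sketch of the selection step: score each candidate term by its co-occurrence with the query terms inside the top-ranked documents. The published local context analysis formula also applies idf weighting and a product over query terms, which this toy version omits, and the function name is illustrative:

```python
from collections import Counter

def lca_expansion_terms(query_terms, top_docs, k=5):
    """Rank candidate expansion terms by co-occurrence with query terms
    within the top-ranked documents (each doc a whitespace-split string)."""
    query = set(query_terms)
    scores = Counter()
    for doc in top_docs:
        words = set(doc.split())
        overlap = len(words & query)
        if overlap == 0:
            continue  # doc shares no query terms, contributes nothing
        for w in words - query:
            scores[w] += overlap  # credit co-occurrence with query terms
    return [w for w, _ in scores.most_common(k)]
```

Terms that consistently appear alongside the query terms in the top-ranked set float to the top of the candidate list.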


Patent
26 Dec 2000
TL;DR: In this article, a system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results by translating a search engine's conventional alphanumeric index into a second index that is ambiguated in the same manner as the user's input; the ambiguous query is compared against this index, and the corresponding documents are provided to the user as search results.
Abstract: A system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results. In one implementation, a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner as the user's input. The user's ambiguous search query is compared to this ambiguated index, and the corresponding documents are provided to the user as search results.

300 citations


Journal ArticleDOI
TL;DR: A model for determining the weights of interacting criteria is presented. This is done on the basis of a partial ranking over a reference set of alternatives (prototypes), a partial ranking over the set of criteria, and a partial ranking over the set of interactions between pairs of criteria.

286 citations


Journal ArticleDOI
Wen-Syan Li, Divyakant Agrawal
01 Dec 2000
TL;DR: The notion of a multi-granularity information and processing structure is used to support efficient query expansion, which involves an indexing phase, a query processing phase, and a ranking phase.

Abstract: A method and apparatus for efficient query expansion using reduced-size indices and for progressive query processing. Queries are expanded conceptually, using words semantically similar and syntactically related to those specified by the user in the query, to reduce the chances of missing relevant documents. The notion of a multi-granularity information and processing structure is used to support efficient query expansion, which involves an indexing phase, a query processing phase, and a ranking phase. In the indexing phase, semantically similar words are grouped into a concept, which results in a substantial index size reduction due to the coarser granularity of semantic concepts. During query processing, the words in a query are mapped into their corresponding semantic concepts and syntactic extensions, resulting in a logical expansion of the original query while avoiding the processing overhead of expanding it literally. The initial query words can then be used to rank the documents in the answer set on the basis of exact, semantic, and syntactic matches, and also to perform progressive query processing.

260 citations


Proceedings ArticleDOI
01 Jul 2000
TL;DR: The results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.
Abstract: Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics; information quality is usually ignored. This leads to the problem that documents are retrieved without regard to their quality. We present an approach that combines similarity-based ranking with quality ranking in centralized and distributed search environments. Six quality metrics (currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness) were investigated. Search effectiveness was significantly improved when the currency, availability, information-to-noise ratio, and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significant improvement. In summary, the results show that incorporating quality metrics can generally improve search effectiveness in both centralized and distributed search environments.

252 citations
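The fusion step can be as simple as a weighted linear combination of the similarity score with the normalised quality metrics; the metric names and weights below are illustrative, not values taken from the paper:

```python
def combined_score(similarity, quality_metrics, weights):
    """Linear fusion of a similarity score with quality metrics;
    all values are assumed to be normalised to [0, 1]. Metrics
    without a weight are ignored."""
    score = similarity
    for name, value in quality_metrics.items():
        score += weights.get(name, 0.0) * value
    return score
```

For example, `combined_score(0.5, {"currency": 1.0}, {"currency": 0.2})` boosts a maximally current page by 0.2 on top of its similarity score.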


Proceedings ArticleDOI
01 Jul 2000
TL;DR: An experimental evaluation of link analysis algorithms for their potential to identify high quality items using a dataset of web documents rated for quality by human topic experts found link-based metrics did a good job of picking out high-quality items.
Abstract: For many topics, the World Wide Web contains hundreds or thousands of relevant documents of widely varying quality. Users face a daunting challenge in identifying a small subset of documents worthy of their attention. Link analysis algorithms have received much interest recently, in large part for their potential to identify high quality items. We report here on an experimental evaluation of this potential. We evaluated a number of link- and content-based algorithms using a dataset of web documents rated for quality by human topic experts. Link-based metrics did a good job of picking out high-quality items. Precision at 5 is about 0.75, and precision at 10 is about 0.55; this is in a dataset where 0.32 of all documents were of high quality. Surprisingly, a simple content-based metric performed nearly as well: ranking documents by the total number of pages on their containing site.

244 citations
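The reported figures are plain precision-at-cutoff values; given a list of binary quality judgements down the ranking, precision at k can be sketched as:

```python
def precision_at_k(judgements, k):
    """Fraction of the top-k ranked items judged high quality (1 vs 0)."""
    if k <= 0 or k > len(judgements):
        raise ValueError("k must be within the judged ranking")
    return sum(judgements[:k]) / k
```

A precision at 5 of 0.75 against a base rate of 0.32 means the top ranks are more than twice as dense in high-quality items as the collection overall.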


Journal ArticleDOI
TL;DR: The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking.
Abstract: This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing will be used in this paper to formulate a probabilistic justification for using tf.idf term weighting. The paper shows that the new probabilistic interpretation of tf.idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the TREC collection shows that the linguistically motivated weighting algorithm outperforms the popular BM25 weighting algorithm.

209 citations
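For reference, the classic tf.idf weight the paper sets out to justify probabilistically can be sketched as follows; this is the standard formulation, not the paper's linguistically motivated variant:

```python
import math

def tf_idf(term, doc, collection):
    """Classic tf.idf: term frequency in the document times the log of
    the inverse document frequency across the collection (each document
    is a list of tokens)."""
    tf = doc.count(term)
    df = sum(1 for d in collection if term in d)
    if df == 0:
        return 0.0  # term absent from the collection
    return tf * math.log(len(collection) / df)
```

A term occurring in every document gets weight zero, which is the usual justification problem: the paper's contribution is deriving such weighting from a statistical language model rather than asserting it.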


Journal ArticleDOI
TL;DR: In this paper, the authors investigated the application of a novel relevance ranking technique, cover density ranking, to the requirements of Web-based information retrieval, where a typical query consists of a few search terms and a typical result consists of pages indicating several potentially relevant documents.
Abstract: We investigate the application of a novel relevance ranking technique, cover density ranking, to the requirements of Web-based information retrieval, where a typical query consists of a few search terms and a typical result consists of a page indicating several potentially relevant documents. Traditional ranking methods for information retrieval, based on term and inverse document frequencies, have been found to work poorly in this context. Under the cover density measure, ranking is based on term proximity and cooccurrence. Experimental comparisons show performance that compares favorably with previous work.

203 citations
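One simple formulation of the cover density idea: enumerate the minimal spans (covers) of the document that contain every query term, then score short covers fully and long covers in inverse proportion to their length. The constant `k` and the exact scoring shape below are illustrative; the published method is similar in spirit:

```python
def minimal_covers(tokens, terms):
    """Enumerate minimal spans of `tokens` containing every query term;
    a span is minimal if no proper sub-span also contains them all."""
    covers, need = [], set(terms)
    for start in range(len(tokens)):
        if tokens[start] not in need:
            continue
        seen, end = set(), None
        for j in range(start, len(tokens)):
            if tokens[j] in need:
                seen.add(tokens[j])
                if seen == need:
                    end = j
                    break
        if end is None:
            break  # no later start can complete a cover either
        # minimal iff the start token does not recur before the end
        if tokens[start] not in tokens[start + 1:end + 1]:
            covers.append((start, end))
    return covers

def cover_density_score(tokens, terms, k=4):
    """Sum of per-cover scores: a cover of length <= k counts 1.0,
    a longer cover counts k / length (k is a tuning constant)."""
    return sum(min(1.0, k / (e - s + 1))
               for s, e in minimal_covers(tokens, terms))
```

Documents where the query terms appear close together, and often, accumulate a higher score, which is exactly the proximity-and-cooccurrence behaviour the abstract describes.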


Patent
10 Jul 2000
TL;DR: In this article, an automated method of creating or updating a database of resumes and related documents is proposed, in which documents are retrieved from a network in order of their relevance to a subject taxonomy, as tracked in a retrieval priority list.
Abstract: An automated method of creating or updating a database of resumes and related documents, the method comprising: a) entering at least one example document that is relevant to a subject taxonomy in a retrieval priority list and, if there is a plurality of example documents stored in the retrieval priority list, ranking the example documents according to their relevancy to the subject taxonomy; b) retrieving a document from a network of documents, where the document is the most relevant document to the subject taxonomy stored in the retrieval priority list; c) harvesting information from specified fields of the document; d) classifying the information into one or more classes according to specified categories of the subject taxonomy; e) storing the information into a database; f) determining whether the information contains links to other documents; g) ranking the links according to relevancy to the subject taxonomy, and storing the links in the retrieval priority list according to the relevancy; h) terminating the method, provided the method's stop criteria have been met; and i) repeating steps b) through h), provided the method's stop criteria have not been met.

Patent
31 May 2000
TL;DR: In this article, a query manager is used to monitor user choices and selections on a search result web page and provide alternative query expressions to further narrow and enhance the user's search.
Abstract: An invention for monitoring user choices and selections on a search result web page and providing alternative query expressions to further narrow and enhance the user's search. Monitoring and recording user choices and selections is achieved by a query manager. Query strings are then standardized. The search is performed on an Internet search engine, and each search result item in the result output set is then associated with a list of alternative standardized queries by an alternate query matching manager. Each search result item in the result output set that is associated with the alternate queries is then flagged. The resulting flagged list of alternative queries is displayed to the user by a page presentation manager using a graphical user interface for subsequent user selection. Upon selection of the graphical user interface for alternate query expressions, an alternate query selection manager retrieves and displays each alternate query to the user.

Patent
31 Jul 2000
TL;DR: An economic, scalable machine learning system and process perform document (concept) classification with high accuracy using large topic schemes, including large hierarchical topic schemes, as discussed by the authors.
Abstract: An economic, scalable machine learning system and process perform document (concept) classification (210) with high accuracy using large topic schemes, including large hierarchical topic schemes. One or more highly relevant classification topics is suggested for a given document (concept) to be classified (210). The invention includes training (200) and concept classification (210) processes. The invention also provides methods that may be used as part of the training and/or concept classification processes, including: a method of scoring (303) the relevance of features in training concepts, a method of ranking concepts based on relevance score, and a method of voting on topics associated with an input concept. In a preferred embodiment, the invention is applied to the legal (case law) domain, classifying legal concepts (rules of law) according to a proprietary legal topic classification scheme (a hierarchical scheme of areas of law).

Journal ArticleDOI
01 Jun 2000
TL;DR: It was discovered that the ability to maintain search context explicitly seems to affect the way people search, and an efficient implementation of this idea deployed on four search engines: AltaVista, Excite, Google and Hotbot is described.
Abstract: Experienced users who query search engines have a complex behavior. They explore many topics in parallel, experiment with query variations, consult multiple search engines, and gather information over many sessions. In the process they need to keep track of search context — namely useful queries and promising result links, which can be hard. We present an extension to search engines called SearchPad that makes it possible to keep track of ‘search context’ explicitly. We describe an efficient implementation of this idea deployed on four search engines: AltaVista, Excite, Google and Hotbot. Our design of SearchPad has several desirable properties: (i) portability across all major platforms and browsers; (ii) instant start requiring no code download or special actions on the part of the user; (iii) no server side storage; and (iv) no added client–server communication overhead. An added benefit is that it allows search services to collect valuable relevance information about the results shown to the user. In the context of each query SearchPad can log the actions taken by the user, and in particular record the links that were considered relevant by the user in the context of the query. The service was tested in a multi-platform environment with over 150 users for 4 months and found to be usable and helpful. We discovered that the ability to maintain search context explicitly seems to affect the way people search. Repeat SearchPad users looked at more search results than is typical on the Web, suggesting that availability of search context may partially compensate for non-relevant pages in the ranking.

Patent
29 Jun 2000
TL;DR: In this article, meta-descriptors are generated for multimedia information in a repository by extracting the descriptors from the multimedia information and clustering the metadata information based on the descriptor.
Abstract: Multimedia information retrieval is performed using meta-descriptors in addition to descriptors. A 'descriptor' is a representation of a feature, a 'feature' being a distinctive characteristic of multimedia information, while a 'meta-descriptor' is information about the descriptor. Meta-descriptors are generated for multimedia information in a repository (10, 12, 14, 16, 18, 20, 22, 24) by extracting the descriptors from the multimedia information (111), clustering the multimedia information based on the descriptors (112), assigning meta-descriptors to each cluster (113), and attaching the meta-descriptors to the multimedia information in the repository (114). The multimedia repository is queried by formulating a query using query-by-example (131), acquiring the descriptor/s and meta-descriptor/s for a repository multimedia item (132), generating a query descriptor/s if none of the same type has been previously generated (133, 134), comparing the descriptors of the repository multimedia item and the query multimedia item (135), and ranking and displaying the results (136, 137).

Journal ArticleDOI
TL;DR: A novel approach that automatically retrieves keywords and then uses genetic algorithms to adapt the keyword weights; the approach is faster and uses less memory than the PAT-tree based approach.
Abstract: This paper proposes a novel approach to automatically retrieve keywords and then uses genetic algorithms to adapt the keyword weights. One of the contributions of the paper is to combine the Bigram model (Chen, A., He, J., Xu, L., Gey, F. C., & Meggs, J. 1997. Chinese text retrieval without using a dictionary, ACM SIGIR'97, Philadelphia, PA, USA, pp. 42–49; Yang, Y.-Y., Chang, J.-S., & Chen, K.-J. 1993. Document automatic classification and ranking, Master thesis, Department of Computer Science, National Tsing Hua University) and the PAT-tree structure (Chien, L.-F., Huang, T.-I., & Chien, M.-C. 1997. PAT-tree-based keyword extraction for Chinese information retrieval, ACM SIGIR'97, Philadelphia, PA, USA, pp. 50–59) to retrieve keywords. The approach extracts bigrams from documents and uses the bigrams to construct a PAT-tree to retrieve keywords. The proposed approach can retrieve any type of keyword, such as technical keywords and person names. The effectiveness of the proposed approach is demonstrated by comparing the keywords it finds with those found by the PAT-tree based approach. This comparison reveals that our keyword retrieval approach is as accurate as the PAT-tree based approach, yet our approach is faster and uses less memory. The study then applies genetic algorithms to tune the weights of the retrieved keywords. Moreover, several documents obtained from web sites are tested and the experimental results are compared with those of other approaches, indicating that the proposed approach is highly promising for applications.

Patent
Wacholder, Faye
26 Dec 2000
TL;DR: In this paper, a "domain-general" method for representing the sense of a document includes the steps of extracting a list of simplex noun phrases representing candidate significant topics in the document, clustering the noun phrases by head, and ranking the noun phrases according to a significance measure.
Abstract: A "domain-general" method for representing the "sense" of a document includes the steps of extracting a list of simplex noun phrases representing candidate significant topics in the document, clustering the simplex noun phrases by head, and ranking the simplex noun phrases according to a significance measure to indicate the relative importance of the simplex noun phrases as significant topics of the document. Furthermore, the output can be filtered in a variety of ways, both for automatic processing and for presentation to users.

Patent
22 Sep 2000
TL;DR: In this article, a method for associating search results is presented, where an original list of search results are provided to a user in response to a first query, and the search results selected by the first user are recorded and associated with the first query.
Abstract: A method for associating search results is provided. According to the method, an original list of search results is provided to a first user in response to a first query, and the search results selected by the first user are recorded and associated with the first query. Additionally, a second query that is the same as or similar to the first query is received from a second user, and an alternate list of search results is provided to the second user. The alternate list lists those search results from the original list that have been associated with the first query due to selection by a user. Also provided is a system for providing search results that includes a search engine, a query database, and a controller. The search engine provides original lists of search results in response to queries, and the query database stores the search results selected by users in response to each of the queries. The controller provides an alternate list of search results in response to another query that is the same as or similar to one of the queries, with the alternate list of search results listing those search results from the original list that have been recorded in the query database as having been previously selected in response to the one query.

Proceedings ArticleDOI
30 Apr 2000
TL;DR: This paper proposes using sentence-rank-based and content-based measures for evaluating extract summaries, and compares these with recall-based evaluation measures.
Abstract: Summary evaluation measures produce a ranking of all possible extract summaries of a document. Recall-based evaluation measures, which depend on costly human-generated ground truth summaries, produce uncorrelated rankings when ground truth is varied. This paper proposes using sentence-rank-based and content-based measures for evaluating extract summaries, and compares these with recall-based evaluation measures. Content-based measures increase the correlation of rankings induced by synonymous ground truths, and exhibit other desirable properties.

Journal ArticleDOI
TL;DR: A user-centered investigation of interactive query expansion within the context of a relevance feedback system is presented, providing evidence for the effectiveness of interactive query expansion and highlighting the need for more research on it.
Abstract: A user-centered investigation of interactive query expansion within the context of a relevance feedback system is presented in this article. Data were collected from 25 searches using the INSPEC database. The data collection mechanisms included questionnaires, transaction logs, and relevance evaluations. The results discuss issues that relate to query expansion, retrieval effectiveness, the correspondence of the on-line-to-off-line relevance judgments, and the selection of terms for query expansion by users (interactive query expansion). The main conclusions drawn from the results of the study are that: (1) one-third of the terms presented to users in a list of candidate terms for query expansion was identified by the users as potentially useful for query expansion. (2) These terms were mainly judged as either variant expressions (synonyms) or alternative (related) terms to the initial query terms. However, a substantial portion of the selected terms were identified as representing new ideas. (3) The relationships identified between the five best terms selected by the users for query expansion and the initial query terms were that: (a) 34% of the query expansion terms have no relationship or other type of correspondence with a query term; (b) 66% of the remaining query expansion terms have a relationship to the query terms. These relationships were: narrower term (46%), broader term (3%), related term (17%). (4) The results provide evidence for the effectiveness of interactive query expansion. The initial search produced on average three highly relevant documents; the query expansion search produced on average nine further highly relevant documents. The conclusions highlight the need for more research on: interactive query expansion, the comparative evaluation of automatic vs. interactive query expansion, the study of weighted Web-based or Web-accessible retrieval systems in operational environments, and for user studies in searching ranked retrieval systems in general.

Journal ArticleDOI
01 Jun 2000
TL;DR: Q-Pilot is described, an automatic query routing system that attempts to dynamically route each user query to the appropriate specialized search engines, based on an off-line component that creates an approximate model of each specialized search engine's topic.
Abstract: General-purpose search engines such as AltaVista and Lycos are notorious for returning irrelevant results in response to user queries. Consequently, thousands of specialized, topic-specific search engines (from VacationSpot.com to KidsHealth.org) have proliferated on the Web. Typically, topic-specific engines return far better results for 'on topic' queries as compared with standard Web search engines. However, it is difficult for the casual user to identify the appropriate specialized engine for any given search. It is more natural for a user to issue queries at a particular Web site, and have these queries automatically routed to the appropriate search engine(s). This paper describes an automatic query routing system called Q-Pilot. Q-Pilot has an off-line component that creates an approximate model of each specialized search engine's topic. On line, Q-Pilot attempts to dynamically route each user query to the appropriate specialized search engines. In our experiments, Q-Pilot was able to identify the appropriate query category 70% of the time. In addition, Q-Pilot picked the best search engine for the query, as one of the top three picks out of its repository of 144 engines, about 40% of the time. This paper reports on Q-Pilot's architecture, the query expansion and clustering algorithms it relies on, and the results of our preliminary experiments.

Patent
28 Apr 2000
TL;DR: In this paper, a system for ranking search results obtained from an information retrieval system includes a search pre-processor (30), a search engine (20), and a search postprocessor (40).
Abstract: A system for ranking search results obtained from an information retrieval system includes a search pre-processor (30), a search engine (20) and a search post-processor (40). The search pre-processor (30) determines the context of the search query by comparing the terms in the search query with a predetermined user context profile. Preferably, the context profile is a user profile or a community profile, which includes a set of terms which have been rated by the user, community, or a recommender system. The search engine generates a search result comprising at least one item obtained from the information retrieval system. The search post-processor (40) ranks each item returned in the search result in accordance with the context of the search query.

Proceedings ArticleDOI
01 Jul 2000
TL;DR: Search effectiveness when using query-based Internet search, directory-based search and phrase-based query reformulation assisted search is compared by means of a controlled, user-based experimental study.
Abstract: This article compares search effectiveness when using query-based Internet search (via the Google search engine), directory-based search (via Yahoo) and phrase-based query reformulation assisted search (via the Hyperindex browser) by means of a controlled, user-based experimental study. The focus was to evaluate aspects of the search process. Cognitive load was measured using a secondary digit-monitoring task to quantify the effort of the user in various search states; independent relevance judgements were employed to gauge the quality of the documents accessed during the search process. Time was monitored in various search states. Results indicated that directory-based search does not offer increased relevance over query-based search (with or without query formulation assistance), and also takes longer. Query reformulation does significantly improve the relevance of the documents through which the user must trawl versus standard query-based Internet search. However, the improvement in document relevance comes at the cost of increased search time and increased cognitive load.

Journal ArticleDOI
TL;DR: An integrated visual thesaurus and results browser to support information retrieval was designed using a task model of information searching and found that while visual user interfaces for information searching might seem to be usable, they may not actually improve performance.
Abstract: An integrated visual thesaurus and results browser to support information retrieval was designed using a task model of information searching. The system provided a hierarchical thesaurus with a results cluster display representing similarity between retrieved documents and relevance ranking using a bullseye metaphor. Latent semantic indexing (LSI) was used as the retrieval engine and to calculate the similarity between documents. The design was tested with two information retrieval tasks. User behaviour, performance and attitude were recorded as well as usability problems. The system had few usability problems and users liked the visualizations, but recall performance was poor. The reasons for poor/good performance were investigated by examining user behaviour and search strategies. Better searchers used the visualizations more effectively and spent longer on the task, whereas poorer performances were attributable to poor motivation, difficulty in assessing article relevance and poor use of system visualizations. The bullseye browser display appeared to encourage limited evaluation of article relevance on titles, leading to poor performance. The bullseye display metaphor for article relevance was understood by users; however, they were confused by the concept of similarity searching expressed as visual clusters. The conclusions from the study are that while visual user interfaces for information searching might seem to be usable, they may not actually improve performance. Training and advisor facilities for effective search strategies need to be incorporated to enhance the effectiveness of visual user interfaces for information retrieval.

Proceedings ArticleDOI
01 Jul 2000
TL;DR: An information retrieval model developed to deal with hyperlinked environments that is based on belief networks and provides a framework for combining information extracted from the content of the documents with information derived from cross-references among the documents is presented.
Abstract: This work presents an information retrieval model developed to deal with hyperlinked environments. The model is based on belief networks and provides a framework for combining information extracted from the content of the documents with information derived from cross-references among the documents. The information extracted from the content of the documents is based on statistics regarding the keywords in the collection and is one of the bases for traditional information retrieval (IR) ranking algorithms. The information derived from cross-references among the documents is based on link references in a hyperlinked environment and has received increased attention lately due to the success of the Web. We discuss a set of strategies for combining these two sources of evidential information and experiment with them using a reference collection extracted from the Web. The results show that this type of combination can improve the retrieval performance without requiring any extra information from the users at query time. In our experiments, the improvements reach up to 59% in terms of average precision figures.
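The paper's actual model is a belief network; as a much simpler illustration of the idea of combining content evidence with link evidence, a linear mixture might look like this (the weights, scores, and link counts are hypothetical):

```python
def combined_score(content_score, in_links, max_in_links, alpha=0.7):
    """Blend a content-based score with normalized link evidence.

    alpha weighs content evidence; (1 - alpha) weighs link evidence.
    content_score is assumed already in [0, 1]; in_links is normalized
    by the collection maximum.
    """
    link_score = in_links / max_in_links if max_in_links else 0.0
    return alpha * content_score + (1 - alpha) * link_score

# Hypothetical documents: (content score, number of incoming links).
docs = {"d1": (0.9, 2), "d2": (0.6, 10), "d3": (0.2, 1)}
max_links = max(n for _, n in docs.values())
ranking = sorted(docs, key=lambda d: combined_score(*docs[d], max_links),
                 reverse=True)
```

Note how the heavily linked but less keyword-relevant document can overtake the best content match; a belief-network combination generalizes this kind of evidence pooling in a principled way.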

Book ChapterDOI
13 Sep 2000
TL;DR: The adjusted ratio of ratios ranking method takes into account not only accuracy but also the time performance of the candidate algorithms, and indicates that on average better results are obtained with zooming than without it.
Abstract: Given the wide variety of available classification algorithms and the volume of data today's organizations need to analyze, the selection of the right algorithm to use on a new problem is an important issue. In this paper we present a combination of techniques to address this problem. The first one, zooming, analyzes a given dataset and selects relevant (similar) datasets that were processed by the candidate algorithms in the past. This process is based on the concept of "distance", calculated on the basis of several dataset characteristics. The information about the performance of the candidate algorithms on the selected datasets is then processed by a second technique, a ranking method. Such a method uses performance information to generate advice in the form of a ranking, indicating which algorithms should be applied in which order. Here we propose the adjusted ratio of ratios ranking method. This method takes into account not only accuracy but also the time performance of the candidate algorithms. The generalization power of this ranking method is analyzed. For this purpose, an appropriate methodology is defined. The experimental results indicate that on average better results are obtained with zooming than without it.
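A hedged sketch of the adjusted ratio of ratios, following the general form of the method (the AccD constant, which sets how much accuracy the user will trade for a tenfold speedup, and the performance data are illustrative):

```python
import math

def arr(acc_p, time_p, acc_q, time_q, accd=0.1):
    """Adjusted ratio of ratios of algorithm p relative to algorithm q:
    the accuracy ratio discounted by the (log) time ratio."""
    return (acc_p / acc_q) / (1 + accd * math.log10(time_p / time_q))

def rank_algorithms(perf):
    """perf: {name: (accuracy, time)}. Rank by mean ARR against all rivals."""
    names = list(perf)
    score = {p: sum(arr(*perf[p], *perf[q]) for q in names if q != p)
                / (len(names) - 1)
             for p in names}
    return sorted(names, key=score.get, reverse=True)

# Hypothetical performance data: (accuracy, training time in seconds).
perf = {"c4.5": (0.85, 10.0), "nb": (0.80, 2.0), "knn": (0.84, 100.0)}
```

With these numbers the fast naive Bayes overtakes the slightly more accurate but far slower competitors; raising `accd` would penalize slow algorithms even more strongly.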

Patent
Reiner Kraft1, Joann Ruvolo1
30 Jun 2000
TL;DR: In this article, a question management system for an expert advice web site maintains a database of experts in different subject matter categories and ranking scores associated with each expert are continually updated based on the timeliness of answers provided by the experts and answer rating feedback received from the question poser.
Abstract: A question management system for an expert advice web site maintains a database of experts in different subject matter categories. Ranking scores associated with each expert are continually updated based on the timeliness of answers provided by the experts and answer rating feedback received from the question poser. According to another aspect of the invention, a method and a computer-readable medium are disclosed for carrying out the above method.
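The patent does not disclose a concrete formula; one plausible sketch of continually updating an expert's ranking score from answer timeliness and poser rating feedback (the weights, the deadline, and the blend are all hypothetical, not taken from the patent):

```python
def update_score(score, rating, hours_to_answer, weight=0.2, deadline=48.0):
    """Move an expert's ranking score toward a blend of answer quality
    and timeliness.

    rating: poser feedback in [0, 1]; answers later than `deadline`
    hours earn no timeliness credit. `weight` is an exponential-moving-
    average factor. All constants here are illustrative.
    """
    timeliness = max(0.0, 1.0 - hours_to_answer / deadline)
    observed = 0.7 * rating + 0.3 * timeliness
    return (1 - weight) * score + weight * observed

score = 0.5
score = update_score(score, rating=1.0, hours_to_answer=6.0)  # prompt, well-rated
```

A moving-average update like this lets the ranking react to recent behavior while damping the effect of any single answer.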

Patent
Sebastien Roy1
30 Aug 2000
TL;DR: In this paper, the authors present a method for computing the location and orientation of an object in 3D space, which comprises the steps of marking a plurality of feature points on a 3D model and corresponding feature points in a 2D query image; for all possible subsets of three two-dimensional feature points marked in step (a), computing the four possible three-dimensional rigid motion solutions of a set of three points in three-dimensional space such that after each of the four rigid motions, under a fixed perspective projection, the three three-dimensional points are mapped precisely to the three corresponding two-dimensional points.
Abstract: A method for computing the location and orientation of an object in three-dimensional space. The method comprises the steps of: (a) marking a plurality of feature points on a three-dimensional model and corresponding feature points on a two-dimensional query image; (b) for all possible subsets of three two-dimensional feature points marked in step (a), computing the four possible three-dimensional rigid motion solutions of a set of three points in three-dimensional space such that after each of the four rigid motions, under a fixed perspective projection, the three three-dimensional points are mapped precisely to the three corresponding two-dimensional points; (c) for each solution found in step (b), computing an error measure derived from the errors in the projections of all three-dimensional marked points in the three-dimensional model which were not among the three points used in the solution, but which did have corresponding marked points in the two-dimensional query image; (d) ranking the solutions from step (c) based on the computed error measure; and (e) selecting the best solution based on the ranking in step (d). Also provided is a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform the method steps of the present invention and a computer program product embodied in a computer-readable medium for carrying out the methods of the present invention.
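Steps (c)-(e) amount to ranking candidate rigid-motion solutions by the reprojection error of the held-out marked points; a sketch under a fixed pinhole perspective projection (all point data hypothetical):

```python
def project(point3d, focal=1.0):
    """Fixed perspective (pinhole) projection onto the image plane."""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

def reprojection_error(points3d, observed2d):
    """Sum of squared distances between the projections of the held-out
    3-D points under one candidate rigid motion and their marked 2-D points."""
    err = 0.0
    for p3, (u, v) in zip(points3d, observed2d):
        pu, pv = project(p3)
        err += (pu - u) ** 2 + (pv - v) ** 2
    return err

def best_solution(candidates, observed2d):
    """Steps (d)-(e): rank the candidate solutions by error, pick the best."""
    return min(candidates, key=lambda pts: reprojection_error(pts, observed2d))

# Hypothetical held-out 2-D marks and the held-out 3-D model points as
# moved by two candidate rigid motions.
observed = [(0.5, 0.5), (-0.25, 0.1)]
cand_a = [(1.0, 1.0, 2.0), (-0.5, 0.2, 2.0)]  # projects exactly onto observed
cand_b = [(1.0, 0.0, 2.0), (0.5, 0.5, 2.0)]
```

The key design point is that the three points used to generate a solution cannot discriminate among the four solutions; only the remaining correspondences can.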

Patent
Kobayashi Mei1, Kohichi Takeda1
11 Feb 2000
TL;DR: In this article, a method and a system are presented for sorting a specific collection of documents in various orderings and for defining new ranking metrics by composing multiple rankings, to provide a user with highly relevant search results.
Abstract: A method and a system are provided for sorting a specific collection of documents in various orderings and for defining new ranking metrics by composing multiple rankings, to provide a user with highly relevant search results. Collections of documents are sorted with multiple ranking metrics; a new collection of documents in the higher-ranking positions of each sorted collection is determined; and an arithmetical operation between these new collections is performed. A search result is determined by the documents in higher-ranking positions resulting from the arithmetical operation. Final search results are acquired by performing an arithmetical operation among specific (with fixed search results) collections of documents sorted in various orderings. The most suitable arrangement of search results can be specified by interactively combining such ranking metrics.
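The composition of rankings can be illustrated as set arithmetic over the higher-ranking positions of each sorted collection (the metrics, scores, and cutoff k are hypothetical):

```python
def top_k(docs, metric, k):
    """The documents in the k highest-ranking positions under one metric."""
    return set(sorted(docs, key=metric, reverse=True)[:k])

# Hypothetical collection: doc -> (keyword-match score, recency score).
docs = {"a": (0.9, 0.1), "b": (0.8, 0.9), "c": (0.3, 0.8), "d": (0.7, 0.2)}

by_keyword = top_k(docs, lambda d: docs[d][0], k=2)
by_recency = top_k(docs, lambda d: docs[d][1], k=2)

# One possible arithmetical operation: intersection keeps documents that
# occupy higher-ranking positions under both metrics.
result = by_keyword & by_recency
```

Other set operations (union, difference) give the other combinations the patent alludes to, and swapping metrics or k interactively re-shapes the final arrangement of results.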

Patent
11 Dec 2000
TL;DR: In this paper, a method and system for publishing a plurality of books for user access to information is presented, in which a user can remotely access the database, search desired content, and view an image of a portion of the book with the desired data.
Abstract: A method and system for publishing a plurality of books for user access to information includes selecting a plurality of books, converting each book from a publisher's digital form, e.g., by training a tool to detect characteristic features (such as layout, typeface, and hierarchical or organizational features such as chapter headings, captions, drawings and tables), and extracting text or data information of the book tagged with the features. This produces a searchable library database arranged, for example, as an xml database indexed by book structure such that a user may remotely, over the internet or other network, access the database, search desired content, and view an image of a portion of the book with the desired data. The system includes a user registration module to identify an authorized user, and may maintain a personal bookshelf for the user. A search engine may score search results based on their position in the hierarchy or other factors, determining degree of relevance of text or data information located by the search engine. The other factors may include position of located search data in the hierarchy, identification of search data in the user's personal library or in a prior search by the user, or degree of match of data identified in the search. An interface with a commercially available search engine may operate to adapt the search. When provided a search query by a user, it may search for an exact match and score hits for relevance, and in the event an exact match is not found, operate to expand the query and return hits in order of rank together with an indication of the expanded search. The user may thus ascertain a degree of likely relevance of returned text or data information. The relational database may include hyperlinks to section headings and related data passages, such that a user accessing a page of a book may immediately view related data and context of a page. 
The relational database is indexed by logical subunits of the book such that expanded searches for Boolean combinations or proximity of elements span page breaks of book text to identify all instances of the desired search data. The search engine may expand a search if all hits have low ranking, and may suppress hits of low ranking when the search produces hits of high ranking. In further embodiments, the search engine may search tables, drawings and formulae of the converted book file.
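The expand-when-all-hits-rank-low and suppress-low-when-high-exists behavior described above might be sketched as follows (the thresholds and the toy index are hypothetical, not from the patent):

```python
def search(query, engine, expand, low=0.3, high=0.7):
    """Run a query; expand it if every hit ranks low, and suppress
    low-ranking hits whenever high-ranking hits are present.

    engine(query) -> list of (doc, score); expand(query) -> broader query.
    Returns the ranked hits and a flag indicating whether the search
    was expanded.
    """
    hits = list(engine(query))
    expanded = False
    if hits and all(score < low for _, score in hits):
        hits = list(engine(expand(query)))
        expanded = True
    if any(score >= high for _, score in hits):
        hits = [(d, s) for d, s in hits if s >= low]
    return sorted(hits, key=lambda h: h[1], reverse=True), expanded

# Toy two-entry index; the expanded query simply drops the leading term.
index = {
    "quantum widget": [("p2", 0.2), ("p4", 0.1)],
    "widget": [("p3", 0.5), ("p5", 0.9), ("p6", 0.1)],
}
hits, expanded = search("quantum widget",
                        lambda q: index.get(q, []),
                        lambda q: q.split()[-1])
```

The returned `expanded` flag corresponds to the patent's indication to the user that an expanded search was performed, so the user can judge the likely relevance of the returned hits.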