
Showing papers by "Google" published in 2000


Patent
04 Dec 2000
TL;DR: In this paper, a search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment Internet searches, re-rank search results and provide recommendations for objects based on an initial subject-matter query.
Abstract: A search and recommendation system employs the preferences and profiles of individual users and groups within a community of users, as well as information derived from categorically organized content pointers, to augment Internet searches, re-rank search results, and provide recommendations for objects based on an initial subject-matter query. The search and recommendation system operates in the context of a content pointer manager, which stores individual users' content pointers (some of which may be published or shared for group use) on a centralized content pointer database connected to the Internet. The shared content pointer manager is implemented as a distributed program, portions of which operate on users' terminals and other portions of which operate on the centralized content pointer database. A user's content pointers are organized in accordance with a local topical categorical hierarchy. The hierarchical organization is used to define a relevance context within which returned objects are evaluated and ordered.

496 citations


Patent
26 Dec 2000
TL;DR: A system allows a user to submit an ambiguous search query and receive potentially disambiguated search results: a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner as the user's input, the query is compared against that index, and the corresponding documents are returned as search results.
Abstract: A system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results. In one implementation, a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner as which the user's input is ambiguated. The user's ambiguous search query is compared to this ambiguated index, and the corresponding documents are provided to the user as search results.

300 citations
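
The index translation described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the ambiguity comes from a standard telephone keypad, and the names `ambiguate`, `build_ambiguated_index`, and `search`, as well as the toy index, are hypothetical.

```python
from collections import defaultdict

# Assumption: input is ambiguated by a standard telephone keypad.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def ambiguate(term):
    """Map a term to its keypad digit string; other characters pass through."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in term.lower())


def build_ambiguated_index(index):
    """Translate a term -> doc-ids index into a digit-string -> doc-ids index."""
    ambiguated = defaultdict(set)
    for term, doc_ids in index.items():
        ambiguated[ambiguate(term)].update(doc_ids)
    return dict(ambiguated)


def search(ambiguated_index, digits):
    """Return the sorted ids of documents whose terms match the digit string."""
    return sorted(ambiguated_index.get(digits, set()))


# Hypothetical toy index: term -> set of document ids.
conventional_index = {"good": {1, 2}, "home": {3}, "gone": {2, 4}, "cat": {5}}
ambiguated_index = build_ambiguated_index(conventional_index)
results = search(ambiguated_index, "4663")  # "good", "home", "gone" all map here
```

Because the translation is done once over the index, each query needs only a single lookup on the digit string rather than an expansion into every possible word.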


Patent
06 Oct 2000
TL;DR: In this paper, an improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described, where the content of these documents may be condensed based on the query.
Abstract: An improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described. Before comparing two documents for similarity, the content of these documents may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as “snippets”) is extracted from the documents and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.

290 citations
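
A rough sketch of the snippet-based comparison follows. The abstract does not say how snippets are extracted or how similarity is measured, so the word window around query terms, the Jaccard measure, the 0.8 threshold, and all function names here are assumptions chosen for illustration.

```python
def extract_snippet(text, query, window=3):
    """Keep only the words within `window` positions of any query term."""
    words = text.lower().split()
    terms = set(query.lower().split())
    keep = set()
    for i, word in enumerate(words):
        if word in terms:
            keep.update(range(max(0, i - window), min(len(words), i + window + 1)))
    return [words[i] for i in sorted(keep)]


def jaccard(a, b):
    """Jaccard similarity of two word lists; 0.0 when both are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def are_duplicates(doc_a, doc_b, query, threshold=0.8):
    """Compare only the query-relevant snippets, not the whole documents."""
    snippet_a = extract_snippet(doc_a, query)
    snippet_b = extract_snippet(doc_b, query)
    return jaccard(snippet_a, snippet_b) >= threshold


# Two pages that share the query-relevant passage but differ in boilerplate.
doc_a = "site header alpha the quick brown fox jumps over the lazy dog footer one"
doc_b = "different banner beta the quick brown fox jumps over the lazy dog other footer two"
doc_c = "a red fox ran away"

same_for_query = are_duplicates(doc_a, doc_b, "fox")
different_for_query = are_duplicates(doc_a, doc_c, "fox")
whole_document_similarity = jaccard(doc_a.split(), doc_b.split())
```

In this toy case the two pages fall below the threshold when compared in full but match when only the text around the query term is compared, which is the effect the abstract describes.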


Journal ArticleDOI
01 Jun 2000
TL;DR: This paper suggests ways of improving sampling based on random walks of the Web graph to make the samples closer to uniform and suggests a natural test bed based on random graphs for testing the effectiveness of the procedures.
Abstract: We consider the problem of sampling URLs uniformly at random from the Web. A tool for sampling URLs uniformly can be used to estimate various properties of Web pages, such as the fraction of pages in various Internet domains or written in various languages. Moreover, uniform URL sampling can be used to determine the sizes of various search engines relative to the entire Web. In this paper, we consider sampling approaches based on random walks of the Web graph. In particular, we suggest ways of improving sampling based on random walks to make the samples closer to uniform. We suggest a natural test bed based on random graphs for testing the effectiveness of our procedures. We then use our sampling approach to estimate the distribution of pages over various Internet domains and to estimate the coverage of various search engine indexes.

287 citations


Patent
19 Dec 2000
TL;DR: Location-blocking and identity-blocking services that can be commercially offered by a service promoter, e.g., a cellular service provider or a web advertiser, are discussed in this paper.
Abstract: Location-blocking and identity-blocking services that can be commercially offered by a service promoter, e.g., a cellular service provider or a web advertiser. In the identity-blocking service, the service promoter may disclose the current physical location of a mobile subscriber (i.e., a cellular phone operator) to a third party (e.g., a web advertiser) subscribing to the identity-blocking service. However, the service promoter may not send any identity information for the mobile subscriber to the third party. On the other hand, in the location-blocking service, the service promoter may disclose the mobile subscriber's identity information to the third party, but not the current physical location of the mobile subscriber. Blocking of the mobile subscriber's identity or location information may be desirable for privacy reasons, to comply with a government regulation, or to implement a telecommunication service option selected by the mobile subscriber. However, in the case of the mobile subscriber requesting emergency help, the service promoter may not block identity and/or location information. Instead, the service promoter may send all such information to the emergency service provider (e.g., the police or a hospital).

238 citations


Patent
Kin Lun Law1, Georges R. Harik1
06 Apr 2000
TL;DR: Techniques are provided for finding related hyperlinked documents using link-based analysis; scores for links from web pages on the same host and from web pages with numerous links can be reduced to produce a better list of related web pages.
Abstract: Techniques for finding related hyperlinked documents using link-based analysis are provided. Backlink and forwardlink sets can be utilized to find web pages that are related to a selected web page. The scores for links from web pages that are from the same host and links from web pages with numerous links can be reduced to achieve a better list of related web pages. The list of related web pages can be utilized as a feature to a word-based search engine or an addition to a web browser.

217 citations


Patent
26 Dec 2000
TL;DR: In this paper, a sequence of numbers received from a user of a standard telephone keypad is translated into a set of potentially corresponding alphanumeric sequences, provided as an input to a conventional search engine, using a boolean "OR" expression.
Abstract: Methods and apparatus consistent with the invention allow a user to submit an ambiguous search query and to receive relevant search results. In one embodiment, a sequence of numbers received from a user of a standard telephone keypad is translated into a set of potentially corresponding alphanumeric sequences. These potentially corresponding alphanumeric sequences are provided as an input to a conventional search engine, using a boolean “OR” expression, and the search results are presented to the user. The search engine effectively limits search results to those in which the user was likely interested.

162 citations
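
The translation step in this embodiment can be illustrated with a short sketch. It assumes the standard telephone keypad letter groups; the function names are hypothetical, and the resulting string is only a generic boolean "OR" expression, not the syntax of any particular search engine.

```python
from itertools import product

# Assumption: standard telephone keypad letter groups.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}


def expand_digits(digits):
    """List every letter sequence a digit string could stand for.

    Keys without letters (0 and 1) are kept as literal characters.
    """
    choices = [KEYPAD.get(d, d) for d in digits]
    return ["".join(combo) for combo in product(*choices)]


def build_or_query(digits):
    """Join all candidate sequences into one boolean OR expression."""
    return " OR ".join(expand_digits(digits))


candidates = expand_digits("4663")
query = build_or_query("23")
```

Most of the candidate sequences are not words, so a conventional index returns nothing for them; that is how the search engine itself ends up limiting the results to sequences the user plausibly meant.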


Patent
02 Feb 2000
TL;DR: A computer program product, method and system for producing seat availability information for a mode of travel such as airline travel produce a prediction of availability of a seat in accordance with an availability query.
Abstract: A computer program product, method and system for producing seat availability information for a mode of travel such as airline travel produce a prediction of availability of a seat in accordance with an availability query. The prediction is used in place of making an actual query to an airline or other travel mode availability system.

154 citations


Patent
15 Dec 2000
TL;DR: A caching system and method prioritize cached documents by quality or value attributes, which may come from a recommender system (explicit user recommendations), from statistical analysis of site visits by unique users (implicit recommendations), or from a combination of the two.
Abstract: A system and method of caching uses quality or value attributes, provided for example, by a recommender system or by a dynamical analysis of site accesses, which are attached to cached information to prioritize items in the cache. Documents are prioritized in the cache according to the relative value of their content. Value data may be provided from a recommender system which provides a value for a document according to user recommendations (using explicit recommendations) or from statistical analysis of site visits from unique users (implicit recommendations) or a combination of the two to identify the higher value documents. The caching method may also be used to improve performance of a recommender system.

144 citations
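
One way to picture value-prioritized caching is the sketch below. The eviction rule (drop the lowest-value entry, and refuse a newcomer worth no more than it), the linear blend of explicit and implicit signals, and the names `ValueCache` and `combined_value` are all assumptions for illustration; the abstract does not specify them.

```python
class ValueCache:
    """Fixed-capacity cache that evicts the entry with the lowest value."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items = {}  # url -> (value, content)

    def put(self, url, content, value):
        """Insert or update an entry; return False if it was not worth caching."""
        if url not in self.items and len(self.items) >= self.capacity:
            lowest = min(self.items, key=lambda u: self.items[u][0])
            if self.items[lowest][0] >= value:
                return False  # nothing cached is worth less than the newcomer
            del self.items[lowest]
        self.items[url] = (value, content)
        return True

    def get(self, url):
        entry = self.items.get(url)
        return entry[1] if entry else None


def combined_value(explicit_rating, unique_visitors, weight=0.5, visitor_cap=100):
    """Blend an explicit rating in [0, 1] with a capped unique-visitor count."""
    implicit = min(unique_visitors, visitor_cap) / visitor_cap
    return weight * explicit_rating + (1 - weight) * implicit


cache = ValueCache(capacity=2)
cache.put("a.html", "page A", value=0.9)
cache.put("b.html", "page B", value=0.2)
stored_c = cache.put("c.html", "page C", value=0.5)   # evicts b.html
stored_d = cache.put("d.html", "page D", value=0.1)   # rejected
```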


Journal Article
Monika Henzinger1
TL;DR: This survey describes two successful link analysis algorithms and the state of the art of the field.
Abstract: The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.

130 citations


Patent
28 Jan 2000
TL;DR: In this article, the semantic space is created by a lexicon of concepts and relations between concepts, and each data element in the target data set being searched is associated with a location in semantic space.
Abstract: The present invention is directed to a system in which a semantic space is searched in order to determine the semantic distance between two locations. A further aspect of the present invention provides a system in which a portion of semantic space is purchased and associated with a target data set element which is returned in response to a search input. The semantic space is created by a lexicon of concepts and relations between concepts. An input is associated with a location in the semantic space. Similarly, each data element in the target data set being searched is associated with a location in the semantic space. Searching is accomplished by determining a semantic distance between the first and second location in semantic space, wherein this distance represents their closeness in meaning and where the cost for retrieval of target data elements is based on this distance.

Patent
Moroney Paul1
16 Aug 2000
TL;DR: A consumer set-top terminal (400) and method receive and store digital programming services such as television programs for subsequent playback by the user, in a manner analogous to a conventional video cassette recorder (VCR).
Abstract: A consumer set-top terminal (400) and method that receives and stores digital programming services such as television programs for subsequent playback by the user in a manner analogous to a conventional video cassette recorder (VCR). An interface (480) allows the terminal's user to control a transcoding process (427) based on the desired quality level for the transcoded data, e.g., high, medium or low. The transcoding is provided without the expense and complexity of a full encoder. By performing transcoding at the terminal (400), the bit rate of the data can be reduced sufficiently to allow economical storage at the terminal. Moreover, the user can set the quality level to be different for different programs, different parts of the same program, or for different channels.

Patent
03 Aug 2000
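TL;DR: A voice web browser system lets a telephone user reach web pages and e-mail over a TCP/IP network; the access system parses the HTML of web pages into text and non-text portions and reads the text, including retrieved e-mail, to the user with a text-to-speech system.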
Abstract: A voice web browser system includes a telephone, an access system coupled to a TCP/IP network, a telephone system coupling the telephone to the access system, and a speech-to-text system for “reading” text that had been sent over the TCP/IP network to the telephone user. Preferably, the access system receives TCP/IP packets from web pages accessible over the TCP/IP network and parses the HTML code of the web pages into text and non-text portions, such that the text portion can be read to the telephone user. A computer implemented process for obtaining web page information over a TCP/IP network includes implementing a connection of a telephone user to an access system that is coupled to a TCP/IP network, detecting a selection of at least one navigation command by the telephone user to access a web page accessible over the TCP/IP network, and navigating over the TCP/IP network to the web page in response to the navigation command, resulting in a verbal communication of at least some information derivable from the web page to the telephone user. A method for retrieving e-mail that was sent over a TCP/IP network includes calling from a user telephone to an access computer coupled to a TCP/IP network, providing user identification to the access computer, retrieving e-mail via the access computer that was sent over the TCP/IP network and addressed to the user, and reading the e-mail to the user of the user telephone utilizing a text-to-speech system.

Patent
22 Nov 2000
TL;DR: In this paper, a three-dimensional design and modeling environment allows users to draw the outlines, or perimeters, of objects in a two-dimensional manner, similar to pencil and paper, already familiar to them.
Abstract: A three-dimensional design and modeling environment allows users to draw the outlines, or perimeters, of objects in a two-dimensional manner, similar to pencil and paper, already familiar to them. The two-dimensional, planar faces created by a user can then be pushed and pulled by editing tools within the environment to easily and intuitively model three-dimensional volumes and geometries.

Journal ArticleDOI
TL;DR: In this article, the authors compare several algorithms for identifying mirrored hosts on the World Wide Web, based on URL strings and linkage data, the type of information about Web pages easily available from Web proxies and crawlers.
Abstract: We compare several algorithms for identifying mirrored hosts on the World Wide Web. The algorithms operate on the basis of URL strings and linkage data: the type of information about Web pages easily available from Web proxies and crawlers. Identification of mirrored hosts can improve Web-based information retrieval in several ways: first, by identifying mirrored hosts, search engines can avoid storing and returning duplicate documents. Second, several new information retrieval techniques for the Web make inferences based on the explicit links among hypertext documents—mirroring perturbs their graph model and degrades performance. Third, mirroring information can be used to redirect users to alternate mirror sites to compensate for various failures, and can thus improve the performance of Web browsers and proxies. We evaluated four classes of “top-down” algorithms for detecting mirrored host pairs (that is, algorithms that are based on page attributes such as URL, IP address, and hyperlinks between pages, and not on the page content) on a collection of 140 million URLs (on 230,000 hosts) and their associated connectivity information. Our best approach is one which combines five algorithms and achieved a precision of 0.57 for a recall of 0.86 considering 100,000 ranked host pairs.


Patent
13 Dec 2000
TL;DR: A server includes a processor and a memory that stores instructions and a group of query themes; the processor receives a search query containing at least one search term, retrieves one or more objects based on the at least one search term, determines whether the search query corresponds to at least one of the query themes, and ranks the objects accordingly.
Abstract: A server improves the ranking of search results. The server includes a processor and a memory that stores instructions and a group of query themes. The processor receives a search query containing at least one search term, retrieves one or more objects based on the at least one search term and determines whether the search query corresponds to at least one of the group of query themes. The processor then ranks the one or more objects based on whether the search query corresponds to at least one of the group of query themes and provides the ranked one or more objects to a user.

Patent
15 Dec 2000
TL;DR: A fuel cell device has a base portion formed of a singular body, with at least one fuel cell membrane electrode assembly, including a plurality of hydrophilic threads for wicking reaction water, formed on the major surface of the base portion.
Abstract: A fuel cell device and method of forming the fuel cell device including a base portion, formed of a singular body, and having a major surface. At least one fuel cell membrane electrode assembly including a plurality of hydrophilic threads for the wicking of reaction water is formed on the major surface of the base portion. A fluid supply channel including a mixing chamber is defined in the base portion and communicating with the fuel cell membrane electrode assembly for supplying a fuel-bearing fluid to the membrane electrode assembly. An exhaust channel including a water recovery and recirculation channel is defined in the base portion and communicating with the membrane electrode assembly and the plurality of hydrophilic threads. The membrane electrode assembly and the cooperating fluid supply channel and cooperating exhaust channel forming a single fuel cell assembly.

Patent
Daniel E. Tsai1
13 Mar 2000
TL;DR: In this paper, a schema-based navigational layer is used on top of conventional physical, logical and conceptual database schema layers, to dynamically map data stored in a relational database onto web pages.
Abstract: Relational databases are browsed in a manner that mirrors the interactive browsing of world wide web pages. A schema-based navigational layer is used on top of conventional physical, logical and conceptual database schema layers, to dynamically map data stored in a relational database onto web pages. The navigational schema or schema base is an independent abstraction from the underlying conceptual database schema. The schema base is constructed from relationships and information models. The schema base can be reused or derived from the database design process or produced specifically for navigation through the database. An object-role schema base is used to demonstrate the traversal of relational information in a regenerative, propagative manner. Navigating a database via the presented-schema extends the conventional database concept of the logical view to an interactive model of logical view-transitions. The technique is a simple and powerful model for automated access to relational databases making available vast amounts of data stored in relational databases for Internet and intranet web browsing.

Patent
15 Dec 2000
TL;DR: A fuel cell device and a method of forming it are presented; the device includes a base portion, formed of a singular body and having a major surface, with at least one fuel cell membrane electrode assembly formed on that surface and a multi-dimensional fuel flow field defined in the base portion.
Abstract: A fuel cell device and method of forming the fuel cell device including a base portion, formed of a singular body, and having a major surface. At least one fuel cell membrane electrode assembly formed on the major surface of the base portion. A fluid supply channel including a mixing chamber is defined in the base portion and communicating with the fuel cell membrane electrode assembly for supplying a fuel-bearing fluid to the membrane electrode assembly. An exhaust channel is defined in the base portion and communicating with the membrane electrode. A multi-dimensional fuel flow field is defined in the multi-layer base portion and in communication with the fluid supply channel, the membrane electrode assembly and the exhaust channel. The membrane electrode assembly and the cooperating fluid supply channel, multi-dimensional fuel flow field, and cooperating exhaust channel forming a single fuel cell assembly.

Patent
05 Dec 2000
TL;DR: In this paper, a search engine for searching a corpus improves the relevancy of the results by classifying multiple terms in a search query as a single semantic unit, and the resultant semantic units are used to refine the results of the search.
Abstract: A search engine for searching a corpus improves the relevancy of the results by classifying multiple terms in a search query as a single semantic unit. A semantic unit locator of the search engine generates a subset of documents that are generally relevant to the query based on the individual terms within the query. Combinations of search terms that define potential semantic units from the query are then evaluated against the subset of documents to determine which combinations of search terms should be classified as a semantic unit. The resultant semantic units are used to refine the results of the search.

Patent
13 Dec 2000
TL;DR: In this paper, the system detects selection of one or more words in a document currently accessed by the user, generates a search query using the selected word(s) and retrieves a document based on the search query.
Abstract: A system facilitates a search by a user. The system detects selection of one or more words in a document currently accessed by the user, generates a search query using the selected word(s), and retrieves a document based on the search query. When the document includes one or more links corresponding to a linked document, the system analyzes each of the links, prefetches the linked documents corresponding to a number of the links, and presents the document to the user. The system receives selection of one of the links and retrieves the linked document corresponding to the selected link. The system identifies one or more pieces of information in the retrieved document, determines a link to a related document for each of the identified pieces of information, and provides the determined links with the related document to the user.

Patent
02 Nov 2000
TL;DR: A method in a communication system (100) transmits a first data packet (111) from a source user (101) over a first time frame (121) and a second data packet (112) over the immediately following time frame (122), repeating both transmissions in sequence until an acknowledgment of acceptable reception of either packet is detected.
Abstract: A method in a communication system (100) includes transmitting from a source user (101) a first data packet (111) over a first time frame (121) having a finite time period (131), transmitting from source user (101) a second data packet (112) over a second time frame (122) immediately subsequent to first time frame (121), detecting an acknowledgment of acceptable reception of data packet associated with either first or said second data packets (111 and 112), repeating transmission of first and second data packets (111 and 112) in a sequence of first and second time frames (121 and 122) in a time frame sequence (190) until the detection.

Proceedings ArticleDOI
01 May 2000
TL;DR: A new limited form of interprocedural analysis called field analysis is presented that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory management, and run-time checks required for type safety.
Abstract: We present a new limited form of interprocedural analysis called field analysis that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory management, and run-time checks required for type safety. Unlike many previous interprocedural analyses, our analysis is cheap, and does not require access to the entire program. Field analysis exploits the declared access restrictions placed on fields in a modular language (e.g. field access modifiers in Java) in order to determine useful properties of fields of an object. We describe our implementation of field analysis in the Swift optimizing compiler for Java, as well as a set of optimizations that exploit the results of field analysis. These optimizations include removal of run-time tests, compile-time resolution of method calls, object inlining, removal of unnecessary synchronization, and stack allocation. Our results demonstrate that field analysis is efficient and effective. Speedups average 7% on a wide range of applications, with some times reduced by up to 27%. Compile time overhead of field analysis is about 10%.

Patent
01 Nov 2000
TL;DR: A method determines a travel itinerary with a first segment scheduled to arrive at a location and a second segment scheduled to depart from it, computes from a statistical model of arrival delays the likelihood that the traveler will fail to make the connection, and annotates the itinerary accordingly.
Abstract: A method includes determining a travel itinerary that includes a first segment that is scheduled to arrive at a location at an arrival time and a second segment that is scheduled to depart from the location at a departure time. The method also includes deriving a probability distribution of delays in the arrival time based on an arrival statistical model of the first segment, retrieving a minimum connection time required by a traveler traveling in the first segment to connect to the second segment, and computing a likelihood that the traveler will fail to connect to the second segment based on the probability distribution of delays in the arrival time. Annotations are derived from the computed likelihood and added to the travel itinerary.
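
The likelihood computation can be sketched under simple assumptions. Here the "arrival statistical model" is stood in for by an empirical list of past delays in minutes, times are minutes since midnight, and the function names, the 20% warning threshold, and the sample data are all hypothetical.

```python
def missed_connection_probability(delay_samples, arrival_min, departure_min,
                                  min_connect_min):
    """Fraction of observed delays that would leave less than the minimum
    connection time before the second segment departs."""
    slack = departure_min - arrival_min - min_connect_min
    misses = sum(1 for delay in delay_samples if delay > slack)
    return misses / len(delay_samples)


def annotate(itinerary, probability, warn_at=0.2):
    """Return a copy of the itinerary with a human-readable risk annotation."""
    note = "risky connection" if probability >= warn_at else "connection likely"
    return {**itinerary, "annotation": f"{note} ({probability:.0%} miss chance)"}


# Hypothetical past delays (minutes) for the first segment; negative = early.
past_delays = [0, 5, 10, 15, 20, 30, 45, 60, -5, 0]

# Arrive 10:00, depart 11:00, 40-minute minimum connection: 20 minutes of slack.
p_miss = missed_connection_probability(past_delays, arrival_min=600,
                                       departure_min=660, min_connect_min=40)
itinerary = annotate({"segments": ["first", "second"]}, p_miss)
```

A parametric delay model could replace the sample list; only the tail probability beyond the available slack matters for the annotation.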

Patent
18 Dec 2000
TL;DR: A method and system provide access control using the Lightweight Directory Access Protocol (LDAP): an LDAP operation directed to an LDAP search engine is received from a user, the user is associated with an access control group, and the operation is reformatted based on that group before being provided to the LDAP search engine.
Abstract: A method and system for providing access control using Lightweight Directory Access Protocol (LDAP). The method can include a series of steps which can include receiving from a user an LDAP operation directed to an LDAP search engine. The method can include associating the user with an access control group and reformatting the LDAP operation based on the access control group. Additionally, the step of providing the reformatted LDAP operation to the LDAP search engine can be included.

Patent
22 Dec 2000
TL;DR: In this paper, the processor selects which information to display in accordance with a predetermined relationship based on a group-based recommendation criteria and user interest, and a plurality of sensors disposed behind the screen detect user interest in information displayed on the screen near the sensor.
Abstract: An electronic board system includes an electronic board having a screen for displaying information of interest to a work group or community, an input device for receiving information from users in a group or community to be displayed on the electronic board, a memory for storing information received from the input device and a processor for selecting which information stored in the memory to display on the screen, where and how to display the selected information on the screen and displaying the selected information on the screen. The processor selects which information to display in accordance with a predetermined relationship based on a group-based recommendation criteria and user interest. A plurality of sensors disposed behind the screen detect user interest in information displayed on the screen near the sensor. User interest may also be determined by monitoring user requests for copies of information displayed on the screen.

Patent
13 Dec 2000
TL;DR: A system limits search results based on context information: it obtains the context information and a search query, obtains a set of references to documents in response to the query, and filters that set based on the context information.
Abstract: A system limits search results based on context information. The system obtains the context information and a search query, and obtains a set of references to documents in response to the search query. The system then filters the set of references based on the context information and presents the filtered set of references to a user.

Journal ArticleDOI
01 Jun 2000
TL;DR: The Term Vector Database is a database that provides term vector information for large numbers of pages (hundreds of millions), enabling a large class of applications that would be impractical without such a database.
Abstract: We have built a database that provides term vector information for large numbers of pages (hundreds of millions). The basic operation of the database is to take URLs and return term vectors. Compared to computing vectors by downloading pages via HTTP, the Term Vector Database is several orders of magnitude faster, enabling a large class of applications that would be impractical without such a database. This paper describes the Term Vector Database in detail. It also reports on two applications built on top of the database. The first application is an optimization of connectivity-based topic distillation. The second application is a Web page classifier used to annotate results returned by a Web search engine.

Patent
14 Dec 2000
TL;DR: In this paper, the authors present an efficient and effective quality of service for information that is time sensitive (e.g., real-time data) by using cut through switching.
Abstract: The present invention provides efficient and effective quality of service for information that is time sensitive (e.g., real time data). An intermediate network communication system and method (e.g., a router) of the present invention performs cut through switching to reduce latency problems for time sensitive information. In one embodiment of the present invention, communication packet header information is encoded with a time sensitive identifier that identifies the information as time sensitive. In one exemplary transmission control protocol/internet protocol (TCP/IP) implementation of the present invention, time sensitive indication is provided in the link layer information. In one embodiment of the present invention, time sensitive information is dropped if the intermediate network device cannot communicate the information within specified timing constraints. In one embodiment of the present invention, time sensitive information is cut through routed on a virtual channel and pre-empts non time sensitive information. In one embodiment, a communication path probe is cut through routed via intermediate network devices to establish a communication path before other information is communicated from an originating source to a final destination. In one embodiment, the present invention leverages previously collected information to establish a communication path. In one embodiment of the present invention, an intermediate network device establishes a second communication link if a first communication link is unavailable.