Author

Taher H. Haveliwala

Bio: Taher H. Haveliwala is an academic researcher from Google. The author has contributed to research in topics including Web search query and Web page. The author has an h-index of 29 and has co-authored 49 publications receiving 7585 citations. Previous affiliations of Taher H. Haveliwala include Stanford University.

Papers

Proceedings Article•DOI•

Topic-sensitive PageRank

Taher H. Haveliwala¹•Institutions (1)

Stanford University¹

07 May 2002

TL;DR: A set of PageRank vectors, each biased toward a representative topic, is proposed to capture importance with respect to a particular topic; these vectors are shown to generate more accurate rankings than a single, generic PageRank vector.

Abstract: In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. By using these (precomputed) biased PageRank vectors to generate query-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared.
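The abstract does not spell out how each vector is biased. One common way, assumed here, is to run the usual power iteration but let the random jump land only on pages labelled with the topic. The sketch below follows that assumption; the toy graph, the page names, the function `biased_pagerank`, and the damping value of 0.85 are all hypothetical.

```python
def biased_pagerank(links, topic_pages, damping=0.85, iterations=100):
    """Power iteration whose random jump lands only on `topic_pages`.

    `links` maps every page to the list of pages it links to; every link
    target must itself be a key of `links`. (Illustrative assumption, not
    the paper's implementation.)
    """
    pages = sorted(links)
    jump = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0)
            for p in pages}
    rank = dict(jump)  # start from the topic-biased distribution
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * jump[p] for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # A page without out-links hands its mass back to the topic set.
                for q in pages:
                    new_rank[q] += damping * rank[p] * jump[q]
        rank = new_rank
    return rank


# Hypothetical five-page Web graph with two topics.
links = {
    "news": ["sports", "science"],
    "sports": ["scores", "news"],
    "scores": ["sports"],
    "science": ["physics", "news"],
    "physics": ["science"],
}
sports_rank = biased_pagerank(links, {"sports", "scores"})
science_rank = biased_pagerank(links, {"science", "physics"})
```

In this toy graph the same page receives a different importance score depending on which topic vector is consulted, which is the effect the abstract describes.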

1,765 citations

Journal Article•DOI•

Topic-sensitive PageRank: a context-sensitive ranking algorithm for Web search

Taher H. Haveliwala¹•Institutions (1)

Stanford University¹

01 Jul 2003-IEEE Transactions on Knowledge and Data Engineering

TL;DR: Linear combinations of precomputed, topic-biased PageRank vectors are used to generate context-specific importance scores for pages at query time, and are shown to produce more accurate rankings than a single, generic PageRank vector.

Abstract: The original PageRank algorithm for improving the ranking of search-query results computes a single vector, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared. By using linear combinations of these (precomputed) biased PageRank vectors to generate context-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. We describe techniques for efficiently implementing a large-scale search system based on the topic-sensitive PageRank scheme.
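The query-time step described above can be sketched as a weighted sum. This is a minimal illustration, assuming the per-topic scores have already been computed and that some classifier has produced a weight for each topic given the query or its context; the function `query_sensitive_scores`, the topic names, and all numbers are hypothetical.

```python
def query_sensitive_scores(topic_ranks, topic_weights, candidate_pages):
    """Combine precomputed per-topic scores into one score per page.

    topic_ranks:   {topic: {page: precomputed biased PageRank score}}
    topic_weights: {topic: weight of that topic for the current query}
    """
    total = sum(topic_weights.values())
    if total <= 0:
        raise ValueError("topic weights must have a positive sum")
    return {
        page: sum((weight / total) * topic_ranks[topic].get(page, 0.0)
                  for topic, weight in topic_weights.items())
        for page in candidate_pages
    }


# Hypothetical precomputed vectors and query-topic weights.
topic_ranks = {
    "sports": {"a": 0.6, "b": 0.1},
    "science": {"a": 0.2, "b": 0.5},
}
scores = query_sensitive_scores(
    topic_ranks, {"sports": 0.75, "science": 0.25}, ["a", "b"]
)
```

Only the cheap weighted sum runs at query time; the expensive link analysis is done once per topic beforehand.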

1,161 citations

Patent•

Anticipated query generation and processing in a search engine

Sepandar D. Kamvar¹, Taher H. Haveliwala¹, Glen Jeh¹•Institutions (1)

Google¹

22 Jun 2004

TL;DR: A search system monitors the input of a search query by a user and, before the user finishes entering it, sends a portion of the query as a partial query to the search engine, which returns a set of predicted queries for possible selection.

Abstract: A search system monitors the input of a search query by a user. Before the user finishes entering the search query, the search system identifies and sends a portion of the query as a partial query to the search engine. Based on the partial query, the search engine creates a set of predicted queries. This process may take into account prior queries submitted by a community of users, and may take into account a user profile. The predicted queries are sent back to the user for possible selection. The search system may also cache search results corresponding to one or more of the predicted queries in anticipation of the user selecting one of the predicted queries. The search engine may also return at least a portion of the search results corresponding to one or more of the predicted queries.
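As a rough illustration of the prediction step only, the sketch below ranks prior community queries that start with the partial query by how often they were submitted. The class `QueryPredictor`, the prefix-matching rule, and the sample queries are assumptions made for this example; the patent abstract does not specify a matching or ranking method, and user profiles and result caching are left out.

```python
from collections import Counter


class QueryPredictor:
    """Predict full queries from a partial one using prior queries."""

    def __init__(self, prior_queries):
        # Count how often each normalized query was submitted.
        self.counts = Counter(q.strip().lower() for q in prior_queries)

    def predict(self, partial_query, limit=3):
        prefix = partial_query.lstrip().lower()
        matches = [(query, count) for query, count in self.counts.items()
                   if query.startswith(prefix)]
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [query for query, _ in matches[:limit]]


predictor = QueryPredictor([
    "pagerank", "pagerank algorithm", "pagerank", "page layout", "python",
])
predictions = predictor.predict("page")
```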

545 citations

Proceedings Article•DOI•

Extrapolation methods for accelerating PageRank computations

Sepandar D. Kamvar¹, Taher H. Haveliwala¹, Christopher D. Manning¹, Gene H. Golub¹•Institutions (1)

Stanford University¹

20 May 2003

TL;DR: Quadratic Extrapolation accelerates the convergence of the Power Method for PageRank by periodically subtracting estimates of the nonprincipal eigenvectors from the current iterate, exploiting the fact that the first eigenvalue of a Markov matrix is known to be 1; it is a fast method for determining the dominant eigenvector of a matrix that is too large for standard fast methods to be practical.

Abstract: We present a novel algorithm for the fast computation of PageRank, a hyperlink-based estimate of the "importance" of Web pages. The original PageRank algorithm uses the Power Method to compute successive iterates that converge to the principal eigenvector of the Markov matrix representing the Web link graph. The algorithm presented here, called Quadratic Extrapolation, accelerates the convergence of the Power Method by periodically subtracting off estimates of the nonprincipal eigenvectors from the current iterate of the Power Method. In Quadratic Extrapolation, we take advantage of the fact that the first eigenvalue of a Markov matrix is known to be 1 to compute the nonprincipal eigenvectors using successive iterates of the Power Method. Empirically, we show that using Quadratic Extrapolation speeds up PageRank computation by 25-300% on a Web graph of 80 million nodes, with minimal overhead. Our contribution is useful to the PageRank community and the numerical linear algebra community in general, as it is a fast method for determining the dominant eigenvector of a matrix that is too large for standard fast methods to be practical.
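The abstract gives no formulas, so the sketch below is a reconstruction under stated assumptions rather than the authors' code. It assumes the current iterate is roughly a combination of three eigenvectors, fits a cubic polynomial to four successive Power Method iterates by least squares, and divides out the known root at 1 to get coefficients for the principal eigenvector. The function names, the 3-page matrix, and the starting vector are hypothetical.

```python
def power_step(P, x):
    """One Power Method step, x <- x P, for a row-stochastic matrix P."""
    n = len(x)
    return [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quadratic_extrapolation(x0, x1, x2, x3):
    """Estimate the principal eigenvector from four successive iterates."""
    y1 = [a - b for a, b in zip(x1, x0)]
    y2 = [a - b for a, b in zip(x2, x0)]
    y3 = [a - b for a, b in zip(x3, x0)]
    # Least-squares fit of g1*y1 + g2*y2 = -y3 (leading coefficient g3 = 1),
    # solved through the 2x2 normal equations with Cramer's rule.
    a11, a12, a22 = dot(y1, y1), dot(y1, y2), dot(y2, y2)
    b1, b2 = -dot(y1, y3), -dot(y2, y3)
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("iterates are degenerate; cannot extrapolate")
    g1 = (b1 * a22 - a12 * b2) / det
    g2 = (a11 * b2 - a12 * b1) / det
    g3 = 1.0
    # Dividing the cubic by (lambda - 1) leaves these quadratic coefficients.
    beta0, beta1, beta2 = g1 + g2 + g3, g2 + g3, g3
    x = [beta0 * u + beta1 * v + beta2 * w for u, v, w in zip(x1, x2, x3)]
    total = sum(x)
    return [value / total for value in x]  # rescale so the entries sum to 1


# Hypothetical 3-page Markov matrix (each row sums to 1).
P = [[0.6, 0.3, 0.1],
     [0.3, 0.2, 0.5],
     [0.1, 0.5, 0.4]]
iterates = [[1.0, 0.0, 0.0]]
for _ in range(3):
    iterates.append(power_step(P, iterates[-1]))
estimate = quadratic_extrapolation(*iterates)
```

On this 3-page chain the estimate matches the stationary vector up to rounding, because a 3x3 matrix has only three eigenvalues. On a large Web graph the three-eigenvector assumption holds only approximately, so the step would be applied occasionally between ordinary power steps.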

497 citations

Patent•

Results based personalization of advertisements in a search engine

Taher H. Haveliwala¹, Glen Jeh¹, Sepandar D. Kamvar¹•Institutions (1)

Google¹

21 Jun 2005

TL;DR: Personalized advertisements are provided to a user who is using a search engine to obtain documents relevant to a search query; the advertisements are personalized in response to a search profile that is derived from personalized search results.

Abstract: Personalized advertisements are provided to a user using a search engine to obtain documents relevant to a search query. The advertisements are personalized in response to a search profile that is derived from personalized search results. The search results are personalized based on a user profile of the user providing the query. The user profile describes interests of the user, and can be derived from a variety of sources, including prior search queries, prior search results, expressed interests, demographic, geographic, psychographic, and activity information.

387 citations

Cited by

Journal Article•DOI•

The Graph Neural Network Model

Franco Scarselli¹, Marco Gori¹, Ah Chung Tsoi², Markus Hagenbuchner³, Gabriele Monfardini¹•Institutions (3)

University of Siena¹, Hong Kong Baptist University², University of Wollongong³

01 Jan 2009-IEEE Transactions on Neural Networks

TL;DR: A new neural network model, called the graph neural network (GNN) model, is proposed that extends existing neural network methods for processing data represented in graph domains; it implements a function τ(G, n) ∈ ℝ^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space.

Abstract: Many underlying relationships among data in several areas of science and engineering, e.g., computer vision, molecular chemistry, molecular biology, pattern recognition, and data mining, can be represented in terms of graphs. In this paper, we propose a new neural network model, called graph neural network (GNN) model, that extends existing neural network methods for processing the data represented in graph domains. This GNN model, which can directly process most of the practically useful types of graphs, e.g., acyclic, cyclic, directed, and undirected, implements a function τ(G, n) ∈ ℝ^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space. A supervised learning algorithm is derived to estimate the parameters of the proposed GNN model. The computational cost of the proposed algorithm is also considered. Some experimental results are shown to validate the proposed learning algorithm, and to demonstrate its generalization capabilities.

5,701 citations

Proceedings Article•DOI•

Approximate nearest neighbors: towards removing the curse of dimensionality

Piotr Indyk¹, Rajeev Motwani¹•Institutions (1)

Stanford University¹

23 May 1998

TL;DR: Two algorithms are presented for the approximate nearest neighbor problem in high-dimensional spaces; for data sets of size n living in R^d, they require space that is only polynomial in n and d, with query times that are sub-linear in n and polynomial in d.

Abstract: We present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces. For data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the approximate minimum spanning tree. The article is based on the material from the authors' STOC'98 and FOCS'01 papers. It unifies, generalizes and simplifies the results from those papers.

4,478 citations

Journal Issue•DOI•

The link-prediction problem for social networks

David Liben-Nowell¹, Jon Kleinberg²•Institutions (2)

Carleton College¹, Cornell University²

01 May 2007-Journal of the Association for Information Science and Technology

TL;DR: Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures.

Abstract: Given a snapshot of a social network, can we infer which new interactions among its members are likely to occur in the near future? We formalize this question as the link-prediction problem, and we develop approaches to link prediction based on measures for analyzing the “proximity” of nodes in a network. Experiments on large coauthorship networks suggest that information about future interactions can be extracted from network topology alone, and that fairly subtle measures for detecting node proximity can outperform more direct measures. © 2007 Wiley Periodicals, Inc.
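The abstract does not name the proximity measures it compares. As one simple example of such a measure, chosen here purely for illustration, the sketch below scores each pair of not-yet-connected nodes by the number of neighbors they share. The function `common_neighbor_scores` and the small coauthorship graph are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Rank unconnected node pairs by how many neighbors they share."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    # Highest score first; ties broken by node names for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical snapshot of a coauthorship network.
snapshot = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"),
            ("cat", "dan"), ("dan", "eve")]
ranked_pairs = common_neighbor_scores(snapshot)
```

Pairs at the top of the ranking are the ones this measure predicts are most likely to form a link next, using network topology alone.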

4,181 citations

Proceedings Article•DOI•

The Eigentrust algorithm for reputation management in P2P networks

Sepandar D. Kamvar¹, Mario T. Schlosser¹, Hector Garcia-Molina¹•Institutions (1)

Stanford University¹

20 May 2003

TL;DR: Describes an algorithm that decreases the number of downloads of inauthentic files in a peer-to-peer file-sharing network by assigning each peer a unique global trust value based on the peer's history of uploads.

Abstract: Peer-to-peer file-sharing networks are currently receiving much attention as a means of sharing and distributing information. However, as recent experience shows, the anonymous, open nature of these networks offers an almost ideal environment for the spread of self-replicating inauthentic files. We describe an algorithm to decrease the number of downloads of inauthentic files in a peer-to-peer file-sharing network that assigns each peer a unique global trust value, based on the peer's history of uploads. We present a distributed and secure method to compute global trust values, based on Power iteration. By having peers use these global trust values to choose the peers from whom they download, the network effectively identifies malicious peers and isolates them from the network. In simulations, this reputation system, called EigenTrust, has been shown to significantly decrease the number of inauthentic files on the network, even under a variety of conditions where malicious peers cooperate in an attempt to deliberately subvert the system.
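A centralized toy version of the power-iteration idea can be sketched as follows. It is not the distributed, secure method the abstract refers to. It assumes each peer holds a local score for every other peer (for example, satisfactory minus unsatisfactory downloads), clips negative scores to zero, normalizes them per peer, and then repeatedly propagates trust. The function `global_trust`, the uniform fallback for peers with no positive opinions, and the sample scores are assumptions.

```python
def global_trust(local_scores, iterations=50):
    """Compute one global trust value per peer by power iteration.

    local_scores: {peer: {other_peer: local score from upload history}}
    """
    peers = sorted(local_scores)
    n = len(peers)
    normalized = {}
    for i in peers:
        positive = {j: max(local_scores[i].get(j, 0.0), 0.0)
                    for j in peers if j != i}
        total = sum(positive.values())
        if total > 0:
            normalized[i] = {j: s / total for j, s in positive.items()}
        else:
            # Assumed fallback: a peer with no positive opinion trusts
            # all other peers equally.
            normalized[i] = {j: 1.0 / (n - 1) for j in peers if j != i}
    trust = {i: 1.0 / n for i in peers}
    for _ in range(iterations):
        trust = {j: sum(trust[i] * normalized[i].get(j, 0.0) for i in peers)
                 for j in peers}
    return trust


# Hypothetical network: "mallory" uploads inauthentic files.
local_scores = {
    "alice": {"bob": 5, "carol": 4, "mallory": -3},
    "bob": {"alice": 6, "carol": 3, "mallory": -2},
    "carol": {"alice": 4, "bob": 4, "mallory": 0},
    "mallory": {"alice": 1, "bob": 1, "carol": 1},
}
trust = global_trust(local_scores)
```

Because no peer assigns "mallory" a positive local score, its global trust falls to zero, so peers that pick download sources by global trust would avoid it.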

3,715 citations

Journal Article•DOI•

The University of Florida sparse matrix collection

Timothy A. Davis¹, Yifan Hu²•Institutions (2)

University of Florida¹, AT&T Labs²

07 Dec 2011-ACM Transactions on Mathematical Software

TL;DR: The University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications, is described; graph visualization of the matrices is provided, and a new multilevel coarsening scheme is proposed to facilitate it.

Abstract: We describe the University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications. The Collection is widely used by the numerical linear algebra community for the development and performance evaluation of sparse matrix algorithms. It allows for robust and repeatable experiments: robust because performance results with artificially generated matrices can be misleading, and repeatable because matrices are curated and made publicly available in many formats. Its matrices cover a wide spectrum of domains, include those arising from problems with underlying 2D or 3D geometry (as structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that typically do not have such geometry (optimization, circuit simulation, economic and financial modeling, theoretical and quantum chemistry, chemical process simulation, mathematics and statistics, power networks, and other networks and graphs). We provide software for accessing and managing the Collection, from MATLAB™, Mathematica™, Fortran, and C, as well as an online search capability. Graph visualization of the matrices is provided, and a new multilevel coarsening scheme is proposed to facilitate this task.

3,456 citations
