Author

Madhu Sudan

Other affiliations: IBM, Weizmann Institute of Science, Vassar College
Bio: Madhu Sudan is an academic researcher from Harvard University. The author has contributed to research in topics including Property testing and Linear code. The author has an h-index of 76 and has co-authored 318 publications receiving 26,104 citations. Previous affiliations of Madhu Sudan include IBM and the Weizmann Institute of Science.


Papers
Journal ArticleDOI
TL;DR: This work describes schemes that enable a user to access k replicated copies of a database and privately retrieve information stored in the database, so that each individual server gets no information on the identity of the item retrieved by the user.
Abstract: Publicly accessible databases are an indispensable resource for retrieving up-to-date information. But they also pose a significant risk to the privacy of the user, since a curious database operator can follow the user's queries and infer what the user is after. Indeed, in cases where the user's intentions are to be kept secret, users are often cautious about accessing the database. It can be shown that when accessing a single database, completely guaranteeing the privacy of the user requires downloading the whole database; namely, n bits must be communicated (where n is the number of bits in the database). In this work, we investigate whether, by replicating the database, more efficient solutions to the private retrieval problem can be obtained. We describe schemes that enable a user to access k replicated copies of a database (k ≥ 2) and privately retrieve information stored in the database. This means that each individual server (holding a replicated copy of the database) gets no information on the identity of the item retrieved by the user. Our schemes use the replication to gain substantial savings. In particular, we present a two-server scheme with communication complexity O(n^{1/3}).
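The paper's two-server O(n^{1/3}) scheme arranges the database as a cube; as a simpler illustration of why replication helps at all, the sketch below implements the folklore two-server XOR scheme, which has O(n) communication rather than O(n^{1/3}). Each server sees a single uniformly random bitmask, so neither non-colluding server learns anything about the queried index. The function names are ours, not the paper's.

```python
import secrets

def make_queries(n, i):
    """Client: build the two query bitmasks for index i of an n-bit
    database. s1 is uniformly random; s2 is s1 with position i flipped.
    Each mask alone is uniformly distributed, so a single server
    learns nothing about i."""
    s1 = [secrets.randbelow(2) for _ in range(n)]
    s2 = s1.copy()
    s2[i] ^= 1
    return s1, s2

def server_answer(db, mask):
    """Server: XOR of the database bits selected by the mask (one bit)."""
    acc = 0
    for bit, sel in zip(db, mask):
        acc ^= bit & sel
    return acc

def reconstruct(a1, a2):
    """Client: the two answers differ exactly by the contribution of
    bit i, since the masks differ only in that position."""
    return a1 ^ a2

# Usage: retrieve db[5] without either server learning the index 5.
db = [1, 0, 1, 1, 0, 1, 0, 0]
q1, q2 = make_queries(len(db), 5)
assert reconstruct(server_answer(db, q1), server_answer(db, q2)) == db[5]
```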

1,918 citations

Journal ArticleDOI
TL;DR: It is proved that no MAX SNP-hard problem has a polynomial time approximation scheme, unless NP = P, and there exists a positive ε such that approximating the maximum clique size in an N-vertex graph to within a factor of N^ε is NP-hard.
Abstract: We show that every language in NP has a probabilistic verifier that checks membership proofs for it using a logarithmic number of random bits and by examining a constant number of bits in the proof. If a string is in the language, then there exists a proof such that the verifier accepts with probability 1 (i.e., for every choice of its random string). For strings not in the language, the verifier rejects every provided "proof" with probability at least 1/2. Our result builds upon and improves a recent result of Arora and Safra [1998], whose verifiers examine a nonconstant number of bits in the proof (though this number is a very slowly growing function of the input length). As a consequence, we prove that no MAX SNP-hard problem has a polynomial time approximation scheme, unless NP = P. The class MAX SNP was defined by Papadimitriou and Yannakakis [1991], and hard problems for this class include vertex cover, maximum satisfiability, maximum cut, metric TSP, Steiner trees, and shortest superstring. We also improve upon the clique hardness results of Feige et al. [1996] and Arora and Safra [1998] and show that there exists a positive ε such that approximating the maximum clique size in an N-vertex graph to within a factor of N^ε is NP-hard.
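Constant-query proof checking is easiest to see in one of the standard ingredients of such verifiers, the Blum-Luby-Rubinfeld linearity test, which reads only three positions of a purported function table per round. The sketch below is our toy illustration of that building block, not the paper's verifier, which composes several such tests.

```python
import random

def blr_linearity_test(f, n, rounds=100):
    """Probabilistic check that f: {0,1}^n -> {0,1} (an oracle over
    n-bit inputs) is linear, i.e. f(x ^ y) == f(x) ^ f(y) for all x, y.
    Each round reads only three values of f; a function far from every
    linear function is rejected with high probability."""
    for _ in range(rounds):
        x = random.getrandbits(n)
        y = random.getrandbits(n)
        if f(x ^ y) != f(x) ^ f(y):
            return False  # witnessed a violation: reject
    return True  # accept (can err only on functions close to linear)

# Usage: parity over a fixed mask is linear; flipping its output is not.
MASK = 0b1011
linear = lambda x: bin(x & MASK).count("1") % 2
not_linear = lambda x: linear(x) ^ 1  # fails the check on every round

assert blr_linearity_test(linear, 4)
assert not blr_linearity_test(not_linear, 4)
```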

1,501 citations

Proceedings ArticleDOI
30 Jun 2002
TL;DR: In this article, the authors describe a fuzzy vault construction that allows Alice to place a secret value κ in a secure vault and lock it using an unordered set A of elements from some public universe U. If Bob tries to "unlock" the vault using a set B, he obtains the secret value only if B is close to A, i.e., only if A and B overlap substantially.
Abstract: We describe a simple and novel cryptographic construction that we call a fuzzy vault. Alice may place a secret value κ in a fuzzy vault and "lock" it using an unordered set A of elements from some public universe U. If Bob tries to "unlock" the vault using an unordered set B, he obtains κ only if B is close to A, i.e., only if A and B overlap substantially.
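A minimal sketch of the idea, under our own simplifications: κ is encoded as the coefficients of a low-degree polynomial over a small prime field, genuine points lie on that polynomial, chaff points do not, and unlocking brute-forces interpolation over the points that B selects (the paper uses Reed-Solomon decoding instead). All names, parameters, and field sizes here are illustrative.

```python
import random
from itertools import combinations

P = 97  # small prime field, for illustration only

def poly_eval(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) over GF(P)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def _mul_by_linear(poly, root):
    """Multiply poly by (x - root) over GF(P), coefficients lowest-first."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i] = (out[i] - root * c) % P
        out[i + 1] = (out[i + 1] + c) % P
    return out

def interpolate(pts):
    """Lagrange interpolation over GF(P): the unique polynomial of
    degree < len(pts) through the given points, lowest-first."""
    coeffs = [0] * len(pts)
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = _mul_by_linear(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P  # Fermat inverse of denom
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

def lock(secret, A, n_chaff=20):
    """Genuine points lie on the secret polynomial; chaff points do not."""
    vault = [(a, poly_eval(secret, a)) for a in A]
    used = set(A)
    while len(vault) < len(A) + n_chaff:
        x, y = random.randrange(P), random.randrange(P)
        if x not in used and y != poly_eval(secret, x):
            used.add(x)
            vault.append((x, y))
    random.shuffle(vault)
    return vault

def unlock(vault, B, k):
    """Recover the degree-(k-1) polynomial best supported by the vault
    points that B selects; brute force stands in for RS decoding."""
    cand = [(x, y) for (x, y) in vault if x in B]
    best, best_agree = None, k - 1
    for subset in combinations(cand, k):
        coeffs = interpolate(list(subset))
        agree = sum(1 for (x, y) in cand if poly_eval(coeffs, x) == y)
        if agree > best_agree:
            best, best_agree = coeffs, agree
    return best

# Usage: B shares five of Alice's six elements, so the vault opens.
secret = [42, 7, 13]         # kappa as coefficients of 42 + 7x + 13x^2
A = {3, 11, 19, 27, 35, 43}  # Alice's unordered locking set
B = {3, 11, 19, 27, 35, 61}  # Bob's set, close to A
assert unlock(lock(secret, A), B, k=3) == secret
```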

1,481 citations

Proceedings ArticleDOI
24 Oct 1992
TL;DR: The authors show that MAX SNP-hard problems do not have polynomial-time approximation schemes unless P = NP, and that for some ε > 0 the size of the maximal clique in a graph cannot be approximated within a factor of n^ε unless P = NP.
Abstract: The class PCP(f(n), g(n)) consists of all languages L for which there exists a polynomial-time probabilistic oracle machine that uses O(f(n)) random bits, queries O(g(n)) bits of its oracle, and behaves as follows: if x is in L then there exists an oracle y such that the machine accepts for all random choices, but if x is not in L then for every oracle y the machine rejects with high probability. Arora and Safra (1992) characterized NP as PCP(log n, (log log n)^{O(1)}). The authors improve on their result by showing that NP = PCP(log n, 1). The result has the following consequences: (1) MAX SNP-hard problems (e.g., metric TSP, MAX-SAT, MAX-CUT) do not have polynomial-time approximation schemes unless P = NP; and (2) for some ε > 0 the size of the maximal clique in a graph cannot be approximated within a factor of n^ε unless P = NP.

1,277 citations

Journal ArticleDOI
TL;DR: An improved list decoding algorithm is presented for Reed-Solomon codes, alternant codes, and algebraic-geometry codes, together with a solution to a weighted curve-fitting problem, which may be of use in soft-decision decoding algorithms for Reed-Solomon codes.
Abstract: Given an error-correcting code over strings of length n and an arbitrary input string also of length n, the list decoding problem is that of finding all codewords within a specified Hamming distance from the input string. We present an improved list decoding algorithm for decoding Reed-Solomon codes. The list decoding problem for Reed-Solomon codes reduces to the following "curve-fitting" problem over a field F: given n points (x_i, y_i) with x_i, y_i ∈ F for i = 1, ..., n, a degree parameter k, and an error parameter e, find all univariate polynomials p of degree at most k such that y_i = p(x_i) for all but at most e values of i ∈ {1, ..., n}. We give an algorithm that solves this problem for e < n − √(kn), which improves over the previous best result for every choice of k and n; of particular interest is the case of k/n > 1/3, where the result yields the first asymptotic improvement in four decades. The algorithm generalizes to solve the list decoding problem for other algebraic codes, specifically alternant codes (a class of codes including BCH codes) and algebraic-geometry codes. In both cases, we obtain a list decoding algorithm that corrects up to n − √(n(n − d′)) errors, where n is the block length and d′ is the designed distance of the code. The improvement for the case of algebraic-geometry codes extends the methods of Shokrollahi and Wasserman (Proc. 29th Annu. ACM Symp. Theory of Computing, p. 241-48, 1998) and improves upon their bound for every choice of n and d′. We also present some other consequences of our algorithm, including a solution to a weighted curve-fitting problem, which may be of use in soft-decision decoding algorithms for Reed-Solomon codes.
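The paper's algorithm is algebraic (bivariate interpolation followed by root finding) and runs in polynomial time; purely to pin down the curve-fitting problem being solved, the sketch below lists every polynomial of degree at most k over a tiny prime field that agrees with the input on all but at most e points. The field size and example data are ours.

```python
from itertools import product

P = 5  # tiny prime field so exhaustive search is feasible

def poly_eval(coeffs, x):
    """Horner evaluation over GF(P); coefficients highest-degree first."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % P
    return acc

def list_decode(points, k, e):
    """Return every polynomial of degree at most k over GF(P) with
    y_i == p(x_i) for all but at most e of the points. Exponential in
    k; the paper solves this in polynomial time for e < n - sqrt(kn)."""
    result = []
    for coeffs in product(range(P), repeat=k + 1):
        errors = sum(1 for x, y in points if poly_eval(coeffs, x) != y)
        if errors <= e:
            result.append(coeffs)
    return result

# Usage: p(x) = 2x + 1 evaluated at x = 0..4, with the values at x = 2
# and x = 4 corrupted; the transmitted polynomial still makes the list.
received = [(0, 1), (1, 3), (2, 4), (3, 2), (4, 0)]
assert (2, 1) in list_decode(received, k=1, e=2)
```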

1,108 citations


Cited by
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving, and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream data, mining social networks, and mining spatial, multimedia, and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Proceedings ArticleDOI
22 Jan 2006
TL;DR: This survey reviews some of the major results in random graphs and some of the more challenging open problems, touching on newer models, including those related to the WWW.
Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Book ChapterDOI
John R. Douceur
07 Mar 2002
TL;DR: It is shown that, without a logically centralized authority, Sybil attacks are always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities.
Abstract: Large-scale peer-to-peer systems face security threats from faulty or hostile remote computing elements. To resist these threats, many such systems employ redundancy. However, if a single faulty entity can present multiple identities, it can control a substantial fraction of the system, thereby undermining this redundancy. One approach to preventing these "Sybil attacks" is to have a trusted agency certify identities. This paper shows that, without a logically centralized authority, Sybil attacks are always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities.

4,816 citations

Book
01 Jan 1995
TL;DR: This book introduces the basic concepts in the design and analysis of randomized algorithms and presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications.
Abstract: For many applications, a randomized algorithm is either the simplest or the fastest algorithm available, and sometimes both. This book introduces the basic concepts in the design and analysis of randomized algorithms. The first part of the text presents basic tools such as probability theory and probabilistic analysis that are frequently used in algorithmic applications. Algorithmic examples are also given to illustrate the use of each tool in a concrete setting. In the second part of the book, each chapter focuses on an important area to which randomized algorithms can be applied, providing a comprehensive and representative selection of the algorithms that might be used in each of these areas. Although written primarily as a text for advanced undergraduates and graduate students, this book should also prove invaluable as a reference for professionals and researchers.

4,412 citations

Book
02 Jul 2001
TL;DR: Covering the basic techniques used in the latest research work, the author consolidates progress made so far, including some very recent and promising results, and conveys the beauty and excitement of work in the field.
Abstract: Covering the basic techniques used in the latest research work, the author consolidates progress made so far, including some very recent and promising results, and conveys the beauty and excitement of work in the field. He gives clear, lucid explanations of key results and ideas, with intuitive proofs, and provides critical examples and numerous illustrations to help elucidate the algorithms. Many of the results presented have been simplified and new insights provided. Of interest to theoretical computer scientists, operations researchers, and discrete mathematicians.

4,290 citations