Author

John R. Douceur

Bio: John R. Douceur is an academic researcher at Microsoft. The author has contributed to research on topics including file systems and distributed file systems. The author has an h-index of 53 and has co-authored 150 publications receiving 15,259 citations.


Papers
Book Chapter
John R. Douceur
07 Mar 2002
TL;DR: It is shown that, without a logically centralized authority, Sybil attacks are always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities.
Abstract: Large-scale peer-to-peer systems face security threats from faulty or hostile remote computing elements. To resist these threats, many such systems employ redundancy. However, if a single faulty entity can present multiple identities, it can control a substantial fraction of the system, thereby undermining this redundancy. One approach to preventing these "Sybil attacks" is to have a trusted agency certify identities. This paper shows that, without a logically centralized authority, Sybil attacks are always possible except under extreme and unrealistic assumptions of resource parity and coordination among entities.

4,816 citations
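
The abstract above argues that redundancy stops helping once one entity can present many identities. A minimal sketch of that effect follows. It assumes replicas are placed on identities chosen uniformly at random without replacement, which is an illustrative model and not a construction from the paper; the function name `majority_compromise_probability` is hypothetical.

```python
from math import comb


def majority_compromise_probability(honest: int, sybil: int, replicas: int) -> float:
    """Probability that a strict majority of replicas land on attacker identities.

    Assumption (not from the paper): replicas are assigned to distinct
    identities drawn uniformly at random from all honest + sybil identities.
    """
    total = honest + sybil
    if replicas < 1 or replicas > total:
        raise ValueError("replicas must be between 1 and the number of identities")
    needed = replicas // 2 + 1  # smallest strict majority
    # Hypergeometric tail: count placements with at least `needed` Sybil identities.
    favorable = sum(
        comb(sybil, j) * comb(honest, replicas - j)
        for j in range(needed, replicas + 1)
    )
    return favorable / comb(total, replicas)


if __name__ == "__main__":
    # One attacker machine among 100 honest ones, presenting more and more identities.
    for fake_identities in (1, 10, 100, 900):
        p = majority_compromise_probability(100, fake_identities, replicas=3)
        print(fake_identities, round(p, 4))
```

Under this model the attacker's chance of holding a majority of any replica set grows with the number of identities it can mint, even though only one physical entity is hostile.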

Journal Article
09 Dec 2002
TL;DR: Reports on the design of Farsite and the lessons learned from implementing much of that design, including how it achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases.
Abstract: Farsite is a secure, scalable file system that logically functions as a centralized file server but is physically distributed among a set of untrusted computers. Farsite provides file availability and reliability through randomized replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of file and directory data with a Byzantine-fault-tolerant protocol; it is designed to be scalable by using a distributed hint mechanism and delegation certificates for pathname translations; and it achieves good performance by locally caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. We report on the design of Farsite and the lessons we have learned by implementing much of that design.

1,037 citations

Proceedings Article
John R. Douceur, Atul Adya, William J. Bolosky, P. Simon, Marvin M. Theimer
02 Jul 2002
TL;DR: This work presents a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication, and includes convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys.
Abstract: The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative Database for aggregating file content and location information in a decentralized, scalable, fault-tolerant manner. Large-scale simulation experiments show that the duplicate-file coalescing system is scalable, highly effective, and fault-tolerant.

690 citations
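
The convergent encryption idea in the abstract above (identical files yield identical ciphertext even when users hold different keys) can be sketched as follows. This is a toy illustration under stated assumptions: the cipher is a SHA-256 counter-mode keystream built from the standard library for self-containment, not the cipher Farsite uses, and the helper names (`convergent_encrypt`, `wrap_key`) are hypothetical. It should not be used for real data.

```python
import hashlib
from typing import Tuple


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Derive the key from the content itself, then encrypt with that key."""
    content_key = hashlib.sha256(plaintext).digest()
    ciphertext = xor(plaintext, keystream(content_key, len(plaintext)))
    return content_key, ciphertext


def convergent_decrypt(content_key: bytes, ciphertext: bytes) -> bytes:
    return xor(ciphertext, keystream(content_key, len(ciphertext)))


def wrap_key(content_key: bytes, user_key: bytes) -> bytes:
    """Protect the content key under a user's own key (XOR wrap, so it is its own inverse)."""
    return xor(content_key, keystream(user_key, len(content_key)))


if __name__ == "__main__":
    document = b"identical file stored by two users"
    key_a, cipher_a = convergent_encrypt(document)
    key_b, cipher_b = convergent_encrypt(document)
    print(cipher_a == cipher_b)  # same bytes, so the copies can share storage
    print(wrap_key(key_a, b"alice-key") != wrap_key(key_b, b"bob-key"))
```

Because the content key depends only on the file's bytes, the storage layer can detect and coalesce duplicates by comparing ciphertext, while each user keeps only a small per-user wrapped copy of the content key.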

Proceedings Article
01 Jun 2000
TL;DR: It is concluded that the measured desktop infrastructure would passably support the proposed serverless distributed file system, providing availability on the order of one unfilled file request per user per thousand days.
Abstract: We consider an architecture for a serverless distributed file system that does not assume mutual trust among the client computers. The system provides security, availability, and reliability by distributing multiple encrypted replicas of each file among the client machines. To assess the feasibility of deploying this system on an existing desktop infrastructure, we measure and analyze a large set of client machines in a commercial environment. In particular, we measure and report results on disk usage and content; file activity; and machine uptimes, lifetimes, and loads. We conclude that the measured desktop infrastructure would passably support our proposed system, providing availability on the order of one unfilled file request per user per thousand days.

569 citations

Proceedings Article
01 Oct 2007
TL;DR: Asirra is a CAPTCHA that asks users to identify cats in a set of 12 photographs of cats and dogs; the paper also describes two novel algorithms for amplifying the skill gap between humans and computers that can be used on many existing CAPTCHAs.
Abstract: We present Asirra (Figure 1), a CAPTCHA that asks users to identify cats out of a set of 12 photographs of both cats and dogs. Asirra is easy for users; user studies indicate it can be solved by humans 99.6% of the time in under 30 seconds. Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it. Asirra’s image database is provided by a novel, mutually beneficial partnership with Petfinder.com. In exchange for the use of their three million images, we display an “adopt me” link beneath each one, promoting Petfinder’s primary mission of finding homes for homeless animals. We describe the design of Asirra, discuss threats to its security, and report early deployment experiences. We also describe two novel algorithms for amplifying the skill gap between humans and computers that can be used on many existing CAPTCHAs.

519 citations


Cited by
Book Chapter
TL;DR: Pastry is a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications; it performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet.
Abstract: This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry's scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.

7,423 citations
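
The delivery rule stated in the Pastry abstract above (a message goes to the live node whose nodeId is numerically closest to the key) can be sketched as below. This shows only the end result of routing, computed with global knowledge; it does not model Pastry's hop-by-hop routing tables or its locality heuristics. The 16-bit circular identifier space and the tie-break toward the smaller nodeId are simplifying assumptions for readability, and the function names are hypothetical.

```python
from typing import Iterable

ID_BITS = 16              # assumption: small space for readable examples
ID_SPACE = 1 << ID_BITS   # identifiers live in [0, ID_SPACE)


def ring_distance(a: int, b: int) -> int:
    """Distance between two identifiers on a circular identifier space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def responsible_node(key: int, live_node_ids: Iterable[int]) -> int:
    """Return the live nodeId numerically closest to `key`.

    Ties are broken toward the smaller nodeId (an arbitrary choice made here).
    """
    return min(live_node_ids, key=lambda node: (ring_distance(node, key), node))


if __name__ == "__main__":
    live = {100, 4000, 65000}
    print(responsible_node(3000, live))   # the node nearest key 3000
    live.discard(4000)                    # simulate a node failure
    print(responsible_node(3000, live))   # responsibility shifts to another live node
```

When a node fails, the same rule applied to the remaining live nodes picks a new owner for its keys, which is the self-organizing behavior the abstract describes.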

Proceedings Article
27 Aug 2001
TL;DR: The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.
Abstract: Hash tables - which map "keys" onto "values" - are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low-latency properties through simulation.

6,703 citations

Book
01 Jan 2003
TL;DR: In this book, Sherry Turkle uses Internet MUDs (multi-user domains, or in older gaming parlance multi-user dungeons) as a launching pad for explorations of software design, user interfaces, simulation, artificial intelligence, artificial life, agents, virtual reality, and the on-line way of life.
Abstract: From the Publisher: A Question of Identity Life on the Screen is a fascinating and wide-ranging investigation of the impact of computers and networking on society, people's perceptions of themselves, and the individual's relationship to machines. Sherry Turkle, a Professor of the Sociology of Science at MIT and a licensed psychologist, uses Internet MUDs (multi-user domains, or in older gaming parlance multi-user dungeons) as a launching pad for explorations of software design, user interfaces, simulation, artificial intelligence, artificial life, agents, "bots," virtual reality, and "the on-line way of life." Turkle's discussion of postmodernism is particularly enlightening. She shows how postmodern concepts in art, architecture, and ethics are related to concrete topics much closer to home, for example AI research (Minsky's "Society of Mind") and even MUDs (exemplified by students with X-window terminals who are doing homework in one window and simultaneously playing out several different roles in the same MUD in other windows). Those of you who have (like me) been turned off by the shallow, pretentious, meaningless paintings and sculptures that litter our museums of modern art may have a different perspective after hearing what Turkle has to say. This is a psychoanalytical book, not a technical one. However, software developers and engineers will find it highly accessible because of the depth of the author's technical understanding and credibility. Unlike most other authors in this genre, Turkle does not constantly jar the technically-literate reader with blatant errors or bogus assertions about how things work. Although I personally don't have time or patience for MUDs, view most of AI as snake-oil, and abhor postmodern architecture, I thought the time spent reading this book was an extremely good investment.

4,965 citations

01 Jan 2011
TL;DR: To understand the central claims of evolutionary psychology the authors require an understanding of some key concepts in evolutionary biology, cognitive psychology, philosophy of science and philosophy of mind.
Abstract: Evolutionary psychology is one of many biologically informed approaches to the study of human behavior. Along with cognitive psychologists, evolutionary psychologists propose that much, if not all, of our behavior can be explained by appeal to internal psychological mechanisms. What distinguishes evolutionary psychologists from many cognitive psychologists is the proposal that the relevant internal mechanisms are adaptations (products of natural selection) that helped our ancestors get around the world, survive and reproduce. To understand the central claims of evolutionary psychology we require an understanding of some key concepts in evolutionary biology, cognitive psychology, philosophy of science and philosophy of mind. Philosophers are interested in evolutionary psychology for a number of reasons. For philosophers of science (mostly philosophers of biology), evolutionary psychology provides a critical target. There is a broad consensus among philosophers of science that evolutionary psychology is a deeply flawed enterprise. For philosophers of mind and cognitive science, evolutionary psychology has been a source of empirical hypotheses about cognitive architecture and specific components of that architecture. Philosophers of mind are also critical of evolutionary psychology, but their criticisms are not as all-encompassing as those presented by philosophers of biology. Evolutionary psychology is also invoked by philosophers interested in moral psychology, both as a source of empirical hypotheses and as a critical target.

4,670 citations