Author

David Gelernter

Other affiliations: Advanced Technology Center
Bio: David Gelernter is an academic researcher from Yale University. He has contributed to research on topics including tuple spaces and the parallel programming model. He has an h-index of 34 and has co-authored 92 publications receiving 10,535 citations. His previous affiliations include the Advanced Technology Center.


Papers
Journal Article
David Gelernter
TL;DR: This work introduces generative communication, the basis of a new distributed programming language intended for systems programming in distributed settings generally and on integrated network computers in particular, and is particularly concerned with implementing the dynamic global name space that the model requires.
Abstract: Generative communication is the basis of a new distributed programming language that is intended for systems programming in distributed settings generally and on integrated network computers in particular. It differs from previous interprocess communication models in specifying that messages be added in tuple-structured form to the computation environment, where they exist as named, independent entities until some process chooses to receive them. Generative communication results in a number of distinguishing properties in the new language, Linda, that is built around it. Linda is fully distributed in space and distributed in time; it allows distributed sharing, continuation passing, and structured naming. We discuss these properties and their implications, then give a series of examples. Linda presents novel implementation problems that we discuss in Part II. We are particularly concerned with implementation of the dynamic global name space that the generative communication model requires.
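A minimal sketch of the tuple-space interaction the abstract describes (out deposits a tuple, in withdraws a matching tuple, rd reads one without removing it), written here as an in-process Python class rather than the distributed implementation the paper discusses; the class, method names, and None-as-wildcard convention are illustrative assumptions, not Linda's actual API.

```python
# Hypothetical in-process tuple space: out() adds a tuple, in_() removes a
# matching tuple (blocking until one exists), rd() reads without removing.
# None in a pattern acts as a wildcard ("formal") field.
import threading

class TupleSpace:
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _match(self, pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup))

    def in_(self, pattern):              # 'in' is a Python keyword, hence in_
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(pattern, tup):
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()        # block until a matching tuple is out()'d

    def rd(self, pattern):
        with self._cond:
            while True:
                for tup in self._tuples:
                    if self._match(pattern, tup):
                        return tup
                self._cond.wait()

ts = TupleSpace()
ts.out(("sum", 3, 4))
print(ts.in_(("sum", None, None)))       # -> ('sum', 3, 4)
```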

2,584 citations

Journal Article
TL;DR: How can a system that differs sharply from all currently fashionable approaches score any kind of success?
Abstract: How can a system that differs sharply from all currently fashionable approaches score any kind of success? Here's how.

1,537 citations

Book
01 Mar 1995
TL;DR: In this article, the authors present a framework for parallel programming based on three conceptual classes for understanding parallelism and three programming paradigms for implementing parallel programs: result parallelism, which centers on parallel computation of all elements in a data structure; agenda parallelism, which specifies an agenda of tasks for parallel execution; and specialist parallelism, in which specialist agents solve problems cooperatively.
Abstract: We present a framework for parallel programming, based on three conceptual classes for understanding parallelism and three programming paradigms for implementing parallel programs. The conceptual classes are result parallelism, which centers on parallel computation of all elements in a data structure; agenda parallelism, which specifies an agenda of tasks for parallel execution; and specialist parallelism, in which specialist agents solve problems cooperatively. The programming paradigms center on live data structures that transform themselves into result data structures; distributed data structures that are accessible to many processes simultaneously; and message passing, in which all data objects are encapsulated within explicitly communicating processes. There is a rough correspondence between the conceptual classes and the programming methods, as we discuss. We begin by outlining the basic conceptual classes and programming paradigms, and by sketching an example solution under each of the three paradigms. The final section develops a simple example in greater detail, presenting and explaining code and discussing its performance on two commercial parallel computers, an 18-node shared-memory multiprocessor and a 64-node distributed-memory hypercube. The middle section bridges the gap between the abstract and the practical by giving an overview of how the basic paradigms are implemented. We focus on the paradigms, not on machine architecture or programming languages: the programming methods we discuss are useful on many kinds of parallel machine, and each can be expressed in several different parallel programming languages. Our programming discussion and the examples use the parallel language C-Linda for several reasons: the main paradigms are all simple to express in Linda; efficient Linda implementations exist on a wide variety of parallel machines; and a wide variety of parallel programs have been written in Linda.
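A small illustration of the agenda-parallelism style summarized above, sketched in Python with a shared bag of tasks (a thread-safe queue) rather than C-Linda; the worker count and the squaring "task" are invented for the example.

```python
# Hypothetical agenda-parallelism sketch: a master fills a bag of tasks, a pool
# of identical workers repeatedly withdraws a task, computes, and deposits a result.
import queue
import threading

tasks, results = queue.Queue(), queue.Queue()

def worker():
    while True:
        n = tasks.get()
        if n is None:                 # poison pill: no more work
            break
        results.put((n, n * n))       # stand-in for real work on task n

for n in range(10):                   # the "agenda" of tasks
    tasks.put(n)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for _ in threads:
    tasks.put(None)
for t in threads:
    t.join()

collected = []
while not results.empty():
    collected.append(results.get())
print(sorted(collected))              # results arrive in nondeterministic order
```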

384 citations

Patent
14 Mar 2008
TL;DR: In this article, a document stream operating system and method is disclosed in which documents are stored in one or more chronologically ordered streams, the location and nature of file storage is transparent to the user, information is organized as needed instead of at the time the document is created, sophisticated logic is provided for summarizing a large group of related documents when the user wants a concise overview, and archiving is automatic.
Abstract: A document stream operating system and method is disclosed in which: (1) documents are stored in one or more chronologically ordered streams; (2) the location and nature of file storage is transparent to the user; (3) information is organized as needed instead of at the time the document is created; (4) sophisticated logic is provided for summarizing a large group of related documents at the time a user wants a concise overview; and (5) archiving is automatic. The documents can include text, pictures, animations, software programs or any other type of data.
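A hypothetical data-structure sketch of the core idea in the patent abstract: documents are appended to a chronologically ordered stream, and substreams are derived on demand by filtering rather than by fixed folders. The class and field names here are assumptions for illustration, not the patented design.

```python
# Hypothetical sketch of a chronologically ordered document stream with
# on-demand substreams: organize information when it is needed, not when stored.
import time
from dataclasses import dataclass, field

@dataclass
class Document:
    content: str
    timestamp: float = field(default_factory=time.time)

class DocumentStream:
    def __init__(self):
        self._docs = []                       # kept in arrival (chronological) order

    def add(self, content):
        self._docs.append(Document(content))

    def substream(self, predicate):
        # A substream is just a filtered view of the chronological stream.
        return [d for d in self._docs if predicate(d)]

stream = DocumentStream()
stream.add("meeting notes: Linda implementation")
stream.add("draft paper on coordination languages")
print([d.content for d in stream.substream(lambda d: "Linda" in d.content)])
```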

349 citations


Cited by
Journal Article
TL;DR: The objective is to describe the performance of design-science research in Information Systems via a concise conceptual framework and clear guidelines for understanding, executing, and evaluating the research.
Abstract: Two paradigms characterize much of the research in the Information Systems discipline: behavioral science and design science. The behavioral-science paradigm seeks to develop and verify theories that explain or predict human or organizational behavior. The design-science paradigm seeks to extend the boundaries of human and organizational capabilities by creating new and innovative artifacts. Both paradigms are foundational to the IS discipline, positioned as it is at the confluence of people, organizations, and technology. Our objective is to describe the performance of design-science research in Information Systems via a concise conceptual framework and clear guidelines for understanding, executing, and evaluating the research. In the design-science paradigm, knowledge and understanding of a problem domain and its solution are achieved in the building and application of the designed artifact. Three recent exemplars in the research literature are used to demonstrate the application of these guidelines. We conclude with an analysis of the challenges of performing high-quality design-science research in the context of the broader IS community.

10,264 citations

Journal Article
Jeffrey O. Kephart, David M. Chess
TL;DR: A 2001 IBM manifesto noted the almost impossible difficulty of managing current and planned computing systems, which require integrating several heterogeneous environments into corporate-wide computing systems that extend into the Internet.
Abstract: A 2001 IBM manifesto observed that a looming software complexity crisis - caused by applications and environments that number into the tens of millions of lines of code - threatened to halt progress in computing. The manifesto noted the almost impossible difficulty of managing current and planned computing systems, which require integrating several heterogeneous environments into corporate-wide computing systems that extend into the Internet. Autonomic computing, perhaps the most attractive approach to solving this problem, creates systems that can manage themselves when given high-level objectives from administrators. Systems manage themselves according to an administrator's goals. New components integrate as effortlessly as a new cell establishes itself in the human body. These ideas are not science fiction, but elements of the grand challenge to create self-managing computing systems.
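Kephart and Chess describe autonomic elements that pursue high-level objectives through a monitor-analyze-plan-execute control loop; the toy Python sketch below illustrates that loop shape only, with the utilization objective and scaling policy invented purely for illustration.

```python
# Hypothetical monitor-analyze-plan-execute loop: a self-managing element tries
# to keep observed utilization below an administrator-supplied target.
import random

TARGET_UTILIZATION = 0.75      # the administrator's high-level objective (invented)
servers = 2

def monitor():
    return random.uniform(0.3, 1.0)        # stand-in for observed load

def analyze(utilization):
    return utilization > TARGET_UTILIZATION  # is the objective violated?

def plan(overloaded):
    return 1 if overloaded else 0            # plan: add a server when overloaded

def execute(delta):
    global servers
    servers += delta                         # act on the environment

for _ in range(5):
    u = monitor()
    execute(plan(analyze(u)))
    print(f"utilization={u:.2f} servers={servers}")
```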

6,527 citations

Proceedings Article
22 Jun 2010
TL;DR: Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
Abstract: MapReduce and its variants have been highly successful in implementing large-scale data-intensive applications on commodity clusters. However, most of these systems are built around an acyclic data flow model that is not suitable for other popular applications. This paper focuses on one such class of applications: those that reuse a working set of data across multiple parallel operations. This includes many iterative machine learning algorithms, as well as interactive data analysis tools. We propose a new framework called Spark that supports these applications while retaining the scalability and fault tolerance of MapReduce. To achieve these goals, Spark introduces an abstraction called resilient distributed datasets (RDDs). An RDD is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost. Spark can outperform Hadoop by 10x in iterative machine learning jobs, and can be used to interactively query a 39 GB dataset with sub-second response time.
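A brief PySpark sketch of the access pattern the paper's RDD abstraction targets: a working set is cached once and reused across repeated parallel operations. The dataset, transformation, and iteration count are invented, and the snippet assumes a local Spark installation.

```python
# Hypothetical example of reusing a cached RDD across several parallel operations.
from pyspark import SparkContext

sc = SparkContext("local[*]", "rdd-demo")
points = sc.parallelize(range(1_000_000)).cache()   # working set kept in memory

total = 0
for _ in range(10):                                  # repeated parallel operations
    total += points.map(lambda x: x % 7).filter(lambda x: x == 0).count()

print(total)
sc.stop()
```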

4,959 citations

Journal Article
TL;DR: This survey characterizes an emerging research area, sometimes called coordination theory, that focuses on the interdisciplinary study of coordination and uses and extends ideas about coordination from disciplines such as computer science, organization theory, operations research, economics, linguistics, and psychology.
Abstract: This survey characterizes an emerging research area, sometimes called coordination theory, that focuses on the interdisciplinary study of coordination. Research in this area uses and extends ideas about coordination from disciplines such as computer science, organization theory, operations research, economics, linguistics, and psychology. A key insight of the framework presented here is that coordination can be seen as the process of managing dependencies among activities. Further progress, therefore, should be possible by characterizing different kinds of dependencies and identifying the coordination processes that can be used to manage them. A variety of processes are analyzed from this perspective, and commonalities across disciplines are identified. Processes analyzed include those for managing shared resources, producer/consumer relationships, simultaneity constraints, and task/subtask dependencies. Section 3 summarizes ways of applying a coordination perspective in three different domains: (1) understanding the effects of information technology on human organizations and markets, (2) designing cooperative work tools, and (3) designing distributed and parallel computer systems. In the final section, elements of a research agenda in this new area are briefly outlined.
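As one concrete instance of a coordination process named in the abstract, here is a hedged Python sketch of managing a producer/consumer dependency with a bounded buffer; the item values, buffer size, and completion signal are arbitrary choices for the example.

```python
# Hypothetical producer/consumer coordination: the bounded queue manages the
# dependency so the consumer never reads before production and the producer
# never overruns the buffer.
import queue
import threading

buffer = queue.Queue(maxsize=4)            # the managed dependency

def producer():
    for i in range(10):
        buffer.put(i)                      # blocks when the buffer is full
    buffer.put(None)                       # signal that production is finished

def consumer():
    while (item := buffer.get()) is not None:
        print("consumed", item)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```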

3,447 citations

Journal Article
TL;DR: This paper factors out the common denominator underlying these variants: full decoupling of the communicating entities in time, space, and synchronization. These three decoupling dimensions are used to better identify commonalities and divergences with traditional interaction paradigms.
Abstract: Well adapted to the loosely coupled nature of distributed interaction in large-scale applications, the publish/subscribe communication paradigm has recently received increasing attention. With systems based on the publish/subscribe interaction scheme, subscribers register their interest in an event, or a pattern of events, and are subsequently asynchronously notified of events generated by publishers. Many variants of the paradigm have recently been proposed, each variant being specifically adapted to some given application or network model. This paper factors out the common denominator underlying these variants: full decoupling of the communicating entities in time, space, and synchronization. We use these three decoupling dimensions to better identify commonalities and divergences with traditional interaction paradigms. The many variations on the theme of publish/subscribe are classified and synthesized. In particular, their respective benefits and shortcomings are discussed both in terms of interfaces and implementations.
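A hypothetical in-process sketch of the interaction scheme described above: publishers and subscribers know only a broker and a topic, never each other (space decoupling), and notifications are delivered asynchronously on a dispatcher thread (synchronization decoupling); time decoupling would additionally require event storage, which this sketch omits. The Broker class and topic string are invented for illustration.

```python
# Hypothetical publish/subscribe broker: subscribers register callbacks per topic,
# publishers post events, and a dispatcher thread notifies subscribers asynchronously.
import queue
import threading
import time
from collections import defaultdict

class Broker:
    def __init__(self):
        self._subs = defaultdict(list)
        self._events = queue.Queue()
        threading.Thread(target=self._dispatch, daemon=True).start()

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, event):
        self._events.put((topic, event))    # publisher returns immediately

    def _dispatch(self):
        while True:
            topic, event = self._events.get()
            for cb in self._subs[topic]:
                cb(event)                   # notify interested subscribers

broker = Broker()
broker.subscribe("stock/IBM", lambda e: print("notified:", e))
broker.publish("stock/IBM", {"price": 131.5})
time.sleep(0.2)                             # give the dispatcher a moment before exit
```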

3,380 citations