scispace - formally typeset
Search or ask a question
Author

M. Tamer zsu

Bio: M. Tamer zsu is an academic researcher from University of Alberta. The author has contributed to research in topics: Data administration & Distributed algorithm. The author has an hindex of 1, co-authored 1 publications receiving 2328 citations.

Papers
More filters
Book
01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.
Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this Edition: New chapters, covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management. Coverage of emerging topics such as data streams and cloud computing Extensive revisions and updates based on years of class testing and feedback Ancillary teaching materials are available.

2,395 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: This paper is aimed to demonstrate a close-up view about Big Data, including Big Data applications, Big Data opportunities and challenges, as well as the state-of-the-art techniques and technologies currently adopt to deal with the Big Data problems.

2,516 citations

Journal ArticleDOI
Amit P. Sheth, James A. Larson1
TL;DR: In this paper, the authors define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed, and define a methodology for developing one of the popular architectures of an FDBS.
Abstract: A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. In this paper, we define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed. We then define a methodology for developing one of the popular architectures of an FDBS. Finally, we discuss critical issues related to developing and operating an FDBS.

2,376 citations

Journal ArticleDOI
01 Sep 2002
TL;DR: This paper introduces the Cougar approach to tasking sensor networks through declarative queries, and proposes a natural architecture for a data management system for sensor networks, and describes open research problems in this area.
Abstract: The widespread distribution and availability of small-scale sensors, actuators, and embedded processors is transforming the physical world into a computing platform. One such example is a sensor network consisting of a large number of sensor nodes that combine physical sensing capabilities such as temperature, light, or seismic sensors with networking and computation capabilities. Applications range from environmental control, warehouse inventory, and health care to military environments. Existing sensor networks assume that the sensors are preprogrammed and send data to a central frontend where the data is aggregated and stored for offline querying and analysis. This approach has two major drawbacks. First, the user cannot change the behavior of the system on the fly. Second, conservation of battery power is a major design factor, but a central system cannot make use of in-network programming, which trades costly communication for cheap local computation.In this paper, we introduce the Cougar approach to tasking sensor networks through declarative queries. Given a user query, a query optimizer generates an efficient query plan for in-network query processing, which can vastly reduce resource usage and thus extend the lifetime of a sensor network. In addition, since queries are asked in a declarative language, the user is shielded from the physical characteristics of the network. We give a short overview of sensor networks, propose a natural architecture for a data management system for sensor networks, and describe open research problems in this area.

1,468 citations

Journal ArticleDOI
TL;DR: This survey describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
Abstract: Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate the problem: In order to manipulate large sets of complex objects as efficiently as today's database systems manipulate simple records, query-processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.

1,427 citations

Journal ArticleDOI
TL;DR: The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems, and discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems and shows how query processing works in these systems.
Abstract: Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated—such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the “textbook” architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery paralleli sm, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.

980 citations