Federated database systems for managing distributed, heterogeneous, and autonomous databases
TL;DR: The authors define a reference architecture for distributed database management systems from system and schema viewpoints, show how various FDBS architectures can be developed, and define a methodology for developing one popular FDBS architecture.
Abstract: A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. In this paper, we define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed. We then define a methodology for developing one of the popular architectures of an FDBS. Finally, we discuss critical issues related to developing and operating an FDBS.
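As a rough illustration of the definition above, the following minimal sketch uses Python's standard `sqlite3` module to treat two in-memory databases as autonomous, heterogeneous components. The table names, column names, the `EXPORTS` mapping list, and the `federated_employees` function are all hypothetical and are not taken from the paper; the sketch only shows the general idea of mapping differing local schemas onto one shared view.

```python
import sqlite3


def make_component(ddl, insert_sql, rows):
    """Create one autonomous component database (in memory, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Two components with different local schemas (heterogeneity).
hr = make_component(
    "CREATE TABLE staff (full_name TEXT, salary_usd REAL)",
    "INSERT INTO staff VALUES (?, ?)",
    [("Ada", 120000.0), ("Bob", 95000.0)],
)
payroll = make_component(
    "CREATE TABLE emp (name TEXT, monthly_pay REAL)",
    "INSERT INTO emp VALUES (?, ?)",
    [("Cyd", 8000.0)],
)

# Each mapping query translates a local schema into the shared
# (name, annual_salary) view. The components themselves are never altered.
EXPORTS = [
    (hr, "SELECT full_name AS name, salary_usd AS annual_salary FROM staff"),
    (payroll, "SELECT name, monthly_pay * 12 AS annual_salary FROM emp"),
]


def federated_employees(min_salary=0.0):
    """Run the same filter against every component and merge the results."""
    results = []
    for conn, export_sql in EXPORTS:
        sql = (
            f"SELECT name, annual_salary FROM ({export_sql}) "
            "WHERE annual_salary >= ?"
        )
        results.extend(conn.execute(sql, (min_salary,)).fetchall())
    return sorted(results)


print(federated_employees(96000.0))
```

Each component keeps its own schema and data; only the mapping queries know how the local representations relate to the shared one.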
Citations
••
01 Dec 2001
TL;DR: A taxonomy is presented that distinguishes between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers and is intended to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.
Abstract: Schema matching is a basic problem in many database application domains, such as data integration, E-business, data warehousing, and semantic query processing. In current implementations, schema matching is typically performed manually, which has significant limitations. On the other hand, previous research papers have proposed many techniques to achieve a partial automation of the match operation for specific application domains. We present a taxonomy that covers many of these existing approaches, and we describe the approaches in some detail. In particular, we distinguish between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers. Based on our classification we review some previous match implementations thereby indicating which part of the solution space they cover. We intend our taxonomy and review of past work to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.
3,693 citations
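To make the taxonomy in the abstract above more concrete, here is a minimal sketch of a schema-level, element-level matcher that combines a language-based criterion (name similarity) with a constraint-based one (equal data types). It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the function names, the threshold of 0.6, and the example schemas are assumptions made for illustration, not an algorithm from the surveyed work.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Language-based criterion: string similarity of two element names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schemas(source, target, threshold=0.6):
    """Match elements of two schemas given as {column_name: type_name}.

    Returns a list of (source_name, target_name, score) tuples, keeping at
    most one best candidate per source element.
    """
    matches = []
    for s_name, s_type in source.items():
        best = None
        for t_name, t_type in target.items():
            # Constraint-based criterion: only compare type-compatible elements.
            if s_type != t_type:
                continue
            score = name_similarity(s_name, t_name)
            if score >= threshold and (best is None or score > best[2]):
                best = (s_name, t_name, score)
        if best is not None:
            matches.append(best)
    return matches


source = {"cust_name": "text", "cust_id": "int"}
target = {"customer_name": "text", "customer_id": "int", "notes": "text"}
for s, t, score in match_schemas(source, target):
    print(f"{s} -> {t} ({score:.2f})")
```

An instance-level or structure-level matcher would look at stored values or at the neighbourhood of each element instead; this sketch covers only one corner of the taxonomy.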
••
TL;DR: This work introduces a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning, and provides a comprehensive survey of existing works on this subject.
Abstract: Today’s artificial intelligence still faces two major challenges. One is that, in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated-learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated-learning framework, which includes horizontal federated learning, vertical federated learning, and federated transfer learning. We provide definitions, architectures, and applications for the federated-learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allowing knowledge to be shared without compromising user privacy.
2,593 citations
••
[...]
TL;DR: This article places data fusion into the greater context of data integration, precisely defines the goals of data fusion, namely, complete, concise, and consistent data, and highlights the challenges of data fusion.
Abstract: The development of the Internet in recent years has made it possible and useful to access many different information systems anywhere in the world to obtain information. While there is much research on the integration of heterogeneous information systems, most commercial systems stop short of the actual integration of available data. Data fusion is the process of fusing multiple records representing the same real-world object into a single, consistent, and clean representation. This article places data fusion into the greater context of data integration, precisely defines the goals of data fusion, namely, complete, concise, and consistent data, and highlights the challenges of data fusion, namely, uncertain and conflicting data values. We give an overview and classification of different ways of fusing data and present several techniques based on standard and advanced operators of the relational algebra and SQL. Finally, the article features a comprehensive survey of data integration systems from academia and industry, showing if and how data fusion is performed in each.
1,797 citations
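The definition of data fusion in the abstract above can be illustrated with a small sketch. It groups records by a shared key and resolves conflicting values by majority vote, ignoring missing values. The voting strategy, the `fuse` function, and the sample records are assumptions chosen for brevity; the article itself discusses techniques based on relational algebra and SQL, which this Python sketch does not reproduce.

```python
from collections import Counter, defaultdict


def fuse(records, key):
    """Fuse records that share `key` into one record per real-world object.

    Conflicts are resolved by majority vote over non-missing values; ties go
    to the value seen first.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    fused = []
    for group in groups.values():
        # Collect every attribute seen in the group (completeness).
        attributes = []
        for record in group:
            for attribute in record:
                if attribute not in attributes:
                    attributes.append(attribute)
        # Pick one value per attribute (conciseness and consistency).
        merged = {}
        for attribute in attributes:
            values = [r[attribute] for r in group if r.get(attribute) is not None]
            merged[attribute] = Counter(values).most_common(1)[0][0] if values else None
        fused.append(merged)
    return fused


records = [
    {"id": 1, "name": "Ann", "city": None},
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 1, "name": "Anne", "city": "Oslo"},
    {"id": 2, "name": "Bo"},
]
print(fuse(records, "id"))
```

The sketch assumes the records have already been linked by a reliable key; deciding which records refer to the same object is a separate problem.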
••
TL;DR: This survey describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
Abstract: Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate the problem: In order to manipulate large sets of complex objects as efficiently as today's database systems manipulate simple records, query-processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
1,427 citations
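The "duality of sort- and hash-based set-matching algorithms" mentioned in the abstract above can be seen in a toy equi-join: one version builds a hash table on one input, the other sorts both inputs and merges them, and both yield the same set of matches. The following sketch is an in-memory illustration with hypothetical function names and data, not the survey's own formulation, and it ignores the memory-overflow handling that real systems need.

```python
def hash_join(left, right):
    """Equi-join two lists of (key, value) pairs using a hash table on `left`."""
    table = {}
    for key, value in left:
        table.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, rv in right for lv in table.get(key, [])]


def sort_merge_join(left, right):
    """Equi-join the same inputs by sorting both sides and merging."""
    left = sorted(left, key=lambda pair: pair[0])
    right = sorted(right, key=lambda pair: pair[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (3, "y"), (1, "z"), (2, "w")]
print(sorted(hash_join(left, right)) == sorted(sort_merge_join(left, right)))
```

The two functions differ only in how they bring matching keys together (hashing versus ordering), which is the sense in which the two families of algorithms mirror each other.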
••
TL;DR: This article reviews work on ontology mapping, which could provide a common layer from which several ontologies can be accessed and can exchange information in a semantically sound manner.
Abstract: Ontology mapping is seen as a solution provider in today's landscape of ontology research. As the number of ontologies that are made publicly available and accessible on the Web increases steadily, so does the need for applications to use them. A single ontology is no longer enough to support the tasks envisaged by a distributed environment like the Semantic Web. Multiple ontologies need to be accessed from several applications. Mapping could provide a common layer from which several ontologies could be accessed and hence could exchange information in semantically sound manners. Developing such mappings has been the focus of a variety of works originating from diverse communities over a number of years. In this article we comprehensively review and present these works. We also provide insights on the pragmatics of ontology mapping and elaborate on a theoretical approach for defining ontology mapping.
1,384 citations
References
•
17 Oct 2013
TL;DR: A data model, called the entity-relationship model, is proposed that incorporates some of the important semantic information about the real world and can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model.
Abstract: A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information in the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and the diagrammatic technique is given. Some implications on data integrity, information retrieval, and data manipulation are discussed. The entity-relationship model can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model. Semantic ambiguities in these models are analyzed. Possible ways to derive their views of data from the entity-relationship model are presented.
5,941 citations
•
01 Jan 1989
TL;DR: Fundamentals of Database Systems combines clear explanations of theory and design, broad coverage of models and real systems, and excellent examples with up-to-date introductions to modern database technologies.
Abstract: From the Publisher:
Fundamentals of Database Systems combines clear explanations of theory and design, broad coverage of models and real systems, and excellent examples with up-to-date introductions to modern database technologies. This edition is completely revised and updated, and reflects the latest trends in technological and application development. Professors Elmasri and Navathe focus on the relational model and include coverage of recent object-oriented developments. They also address advanced modeling and system enhancements in the areas of active databases, temporal and spatial databases, and multimedia information systems. This edition also surveys the latest application areas of data warehousing, data mining, web databases, digital libraries, GIS, and genome databases.
New to the Third Edition: reorganized material on data modeling to clearly separate entity relationship modeling, extended entity relationship modeling, and object-oriented modeling; expanded coverage of the object-oriented and object/relational approach to data management, including ODMG and SQL3; examples from real database systems including Oracle™ and Microsoft Access®; discussion of decision support applications of data warehousing and data mining, as well as emerging technologies of web databases, multimedia, and mobile databases; advanced modeling in the areas of active, temporal, and spatial databases; coverage of issues of physical database tuning; and discussion of current database application areas of GIS, genome, and digital libraries.
4,242 citations
•
01 Jan 1975
TL;DR: Readers of this book will gain a strong working knowledge of the overall structure, concepts, and objectives of database systems and will become familiar with the theoretical principles underlying the construction of such systems.
Abstract: From the Publisher:
For over 25 years, C. J. Date's An Introduction to Database Systems has been the authoritative resource for readers interested in gaining insight into and understanding of the principles of database systems. This revision continues to provide a solid grounding in the foundations of database technology and to provide some ideas as to how the field is likely to develop in the future. Readers of this book will gain a strong working knowledge of the overall structure, concepts, and objectives of database systems and will become familiar with the theoretical principles underlying the construction of such systems.
3,867 citations
•
01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.
Abstract: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels. The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques. This book covers the breadth and depth of this re-emerging field. The coverage consists of two parts. The first part discusses the fundamental principles of distributed data management and includes distribution design, data integration, distributed query processing and optimization, distributed transaction management, and replication. The second part focuses on more advanced topics and includes discussion of parallel database systems, distributed object management, peer-to-peer data management, web data management, data stream systems, and cloud computing. New in this Edition: new chapters covering database replication, database integration, multidatabase query processing, peer-to-peer data management, and web data management; coverage of emerging topics such as data streams and cloud computing; extensive revisions and updates based on years of class testing and feedback. Ancillary teaching materials are available.
2,395 citations
••
TL;DR: The aim of the paper is to provide first a unifying framework for the problem of schema integration, then a comparative review of the work done thus far in this area, providing a basis for identifying strengths and weaknesses of individual methodologies, as well as general guidelines for future improvements and extensions.
Abstract: One of the fundamental principles of the database approach is that a database allows a nonredundant, unified representation of all data managed in an organization. This is achieved only when methodologies are available to support integration across organizational and application boundaries. Methodologies for database design usually perform the design activity by separately producing several schemas, representing parts of the application, which are subsequently merged. Database schema integration is the activity of integrating the schemas of existing or proposed databases into a global, unified schema. The aim of the paper is to provide first a unifying framework for the problem of schema integration, then a comparative review of the work done thus far in this area. Such a framework, with the associated analysis of the existing approaches, provides a basis for identifying strengths and weaknesses of individual methodologies, as well as general guidelines for future improvements and extensions.
1,648 citations