
Showing papers on "Data access published in 1993"


Patent
26 Oct 1993
TL;DR: In this article, a data communication system providing for the secure transfer and sharing of data via a local area network and/or a wide area network is described, which includes a secure processing unit which communicates with a personal keying device and a crypto media controller attached to a user's workstation.
Abstract: A data communication system providing for the secure transfer and sharing of data via a local area network and/or a wide area network. The system includes a secure processing unit which communicates with a personal keying device and a crypto media controller attached to a user's workstation. The communication between these processing elements generates a variety of data elements including keys, identifiers, and attributes. The data elements are used to identify and authenticate the user, assign user security access rights and privileges, and assign media and device attributes to a data access device according to a predefined security policy. The data elements are manipulated, combined, protected, and distributed through the network to the appropriate data access devices, which prevents the user from obtaining unauthorized data.

448 citations


Journal ArticleDOI
TL;DR: The structure of the mm-array database and the implementation of a data analysis program are described, both of which make extensive use of Sybase, a commercial database management system with application development software.
Abstract: A relational database management system has been implemented on the Caltech millimeter-wave array for both real-time astronomical engineering data and post-processing calibration and analysis. This system provides high storage efficiency for the data and on-line access to data from multiple observing seasons. The ability to easily access the full database enables more accurate calibration of the raw data and greatly facilitates the calibration process. In this article we describe both the structure of the mm-array database and the implementation of a data analysis program, both of which make extensive use of Sybase, a commercial database management system with application development software. This use of relational database technology in real-time astronomical data storage and calibration may serve as a prototype for similar systems at other observatories.

307 citations


Journal ArticleDOI
TL;DR: This work provides experimental results and proposes a two-phase access strategy, to be implemented in a runtime system, in which the data distribution on computational nodes is decoupled from storage distribution, and shows that performance improvements of several orders of magnitude over direct access based data distribution methods can be obtained.
Abstract: As scientists expand their models to describe physical phenomena of increasingly large extent, I/O becomes crucial, and a system with limited I/O capacity can severely constrain the performance of the entire program. We provide experimental results, performed on an Intel Touchstone Delta and nCUBE 2 I/O system, to show that the performance of existing parallel I/O systems can vary by several orders of magnitude as a function of the data access pattern of the parallel program. We then propose a two-phase access strategy, to be implemented in a runtime system, in which the data distribution on computational nodes is decoupled from the storage distribution. Our experimental results show that performance improvements of several orders of magnitude over direct-access-based data distribution methods can be obtained, and that performance for most data access patterns can be improved to within a factor of 2 of the best performance. Further, the cost of redistribution is a very small fraction of the overall access cost.
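The two-phase strategy can be illustrated with a small single-process sketch (all names hypothetical; the paper targets parallel file systems, not Python): phase one issues large contiguous reads that match the storage distribution, and phase two redistributes the staged data in memory to the compute distribution the program actually wants.

```python
# Sketch of two-phase I/O (single-process simulation, names hypothetical).
def two_phase_read(file_data, n_nodes):
    n = len(file_data)
    stripe = n // n_nodes  # assumes n divisible by n_nodes
    # Phase 1: each node issues one large contiguous read (storage order).
    staged = {node: file_data[node * stripe:(node + 1) * stripe]
              for node in range(n_nodes)}
    # Phase 2: redistribute in memory to a cyclic compute distribution,
    # so element i ends up on node i % n_nodes.
    cyclic = {node: [] for node in range(n_nodes)}
    for node, buf in staged.items():
        for offset, value in enumerate(buf):
            i = node * stripe + offset  # global index of this element
            cyclic[i % n_nodes].append(value)
    return cyclic

print(two_phase_read(list(range(8)), 2))
# {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
```

Reading the cyclic pattern directly would issue many small scattered requests; staging through the storage order replaces them with a few large requests plus cheap in-memory redistribution, which is where the reported orders-of-magnitude gains come from.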

273 citations


Journal ArticleDOI
01 Mar 1993
TL;DR: New research problems include management of location dependent data, wireless data broadcasting, disconnection management and energy efficient data access in mobile computing.
Abstract: Mobile computing is an emerging computing paradigm. Data management in this paradigm poses many challenging problems to the database community. In this paper we identify these new challenges and investigate their technical significance. New research problems include management of location-dependent data, wireless data broadcasting, disconnection management, and energy-efficient data access.

182 citations


Patent
30 Apr 1993
TL;DR: In this article, a distributed data access system in which a plurality of computers maintain and provide access to a database of stock exchange information, 1-for-N redundancy is provided by operating one computer in a standby mode, while the other computers operate online.
Abstract: In a distributed data access system in which a plurality of computers maintain and provide access to a database of stock exchange information, 1-for-N redundancy is provided by operating one computer in a standby mode, while the other computers operate online. Each on-line computer provides access to the database to a predefined set of a geographically broad plurality of users. The set of users for any on-line computer is defined by user connectivity data structures that define connectivity between the user set and the computer. When a failure is detected in any one of the computers, the user connectivity data structures of that computer are provided to the standby computer, which then assumes all operations of the failed computer. An arbitrator computer facility observes the health and determines the status of each of the computers, including the standby computer, and controls the transfer of online status from a failed computer to the standby computer. The arbitrator computer facility is a pair of redundant computers, one of which executes the arbitration function and the other of which operates as a standby.

168 citations


Patent
Eung-Moon Yeon1, Young-Ho Lim1
08 Oct 1993
TL;DR: In this paper, a semiconductor memory device includes two latch circuits, each for holding data corresponding to a single normal address, one for storing new data while the other latch circuit outputs its data to the page decoder for subsequent output.
Abstract: A semiconductor memory device includes two latch circuits, each for holding data corresponding to a single normal address. When sequentially used, one after the other, one latch circuit can be storing new data while the other latch circuit outputs its data to the page decoder for subsequent output. Thus, data access delay times for page mode operation are further reduced because the delay which typically results from addressing a normal address is eliminated.
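A toy software model of the ping-pong arrangement (purely illustrative; the patent describes hardware latches, not code): while one latch drives the output, the other is loaded with the data for the next address, so the normal-address access delay is hidden.

```python
# Toy model of the two-latch page-mode scheme (illustrative only).
def page_read(rows, sequence):
    """rows: address -> data; sequence: non-empty list of addresses."""
    latches = [None, None]
    current = 0
    latches[current] = rows[sequence[0]]  # initial fill of the first latch
    out = []
    for step, addr in enumerate(sequence):
        other = 1 - current
        if step + 1 < len(sequence):
            # Load the idle latch with the next row while the active
            # latch is still driving the output (the hidden delay).
            latches[other] = rows[sequence[step + 1]]
        out.append(latches[current])
        current = other  # ping-pong: the freshly loaded latch goes live
    return out

print(page_read({"a": 1, "b": 2}, ["a", "b", "a"]))  # [1, 2, 1]
```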

131 citations


Proceedings ArticleDOI
01 Dec 1993
TL;DR: A schedulability bound for SSP (Similarity Stack Protocol) is given and simulation results show that SSP is especially useful for scheduling real-time data access on multiprocessor systems.
Abstract: We propose a class of real-time data access protocols called SSP (Similarity Stack Protocol). The correctness of SSP schedules is justified by the concept of similarity, which allows different but sufficiently timely data to be used in a computation without adversely affecting the outcome. SSP schedules are deadlock-free, subject to limited blocking, and do not use locks. We give a schedulability bound for SSP and also report simulation results which show that SSP is especially useful for scheduling real-time data access on multiprocessor systems. Finally, we present a variation of SSP which can be implemented in an autonomous fashion, in the sense that scheduling decisions can be made with local information only.
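The similarity idea can be sketched as follows (hypothetical API and bound; the paper's actual protocol integrates this test into a stack-based scheduler): a read may return any version written recently enough to count as "similar" to the current state, so readers need not block on writers' locks.

```python
# Sketch of similarity-based reads (hypothetical API and bound).
SIMILARITY_BOUND = 10  # invented: max age for a version to count as similar

def read_similar(versions, now):
    """versions: list of (write_time, value), oldest first."""
    for write_time, value in reversed(versions):  # prefer the newest
        if now - write_time <= SIMILARITY_BOUND:
            return value  # similar enough; no lock on the writer needed
    raise ValueError("no sufficiently timely version")

versions = [(0, "v0"), (5, "v1"), (12, "v2")]
print(read_similar(versions, 14))  # v2 (age 2, within the bound)
```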

123 citations


11 Jan 1993
TL;DR: This dissertation proposes and evaluates data prefetching techniques that address the data access penalty problem, and shows that an approach combining software and hardware schemes is very promising for reducing memory latency with the least overhead.
Abstract: Recent technological advances are such that the gap between processor cycle times and memory cycle times is growing. Techniques to reduce or tolerate large memory latencies become essential for achieving high processor utilization. In this dissertation, we propose and evaluate data prefetching techniques that address the data access penalty problems. First, we propose a hardware-based data prefetching approach for reducing memory latency. The basic idea of the prefetching scheme is to keep track of data access patterns in a reference prediction table (RPT) organized as an instruction cache. It includes three variations of the design of the RPT and associated logic: generic design, a lookahead mechanism, and a correlated scheme. They differ mostly on the timing of the prefetching. We evaluate the three schemes by simulating them in a uniprocessor environment using the ten SPEC benchmarks. The results show that the prefetching scheme effectively eliminates a major portion of data access penalty and is particularly suitable to an on-chip design and a primary-secondary cache hierarchy. Next, we study and compare the substantive performance gains that could be achieved with hardware-controlled and software-directed prefetching on shared-memory multiprocessors. Simulation results indicate that both hardware and software schemes can handle programs with regular access patterns. The hardware scheme is good at manipulating dynamic information, whereas software prefetching has the flexibility of prefetching larger blocks of data and of dealing with complex data access patterns. The execution overhead of the additional prefetching instructions may decrease the software prefetching performance gains. An approach that combines software and hardware schemes is shown to be very promising for reducing the memory latency with the least overhead. 
Finally, we study non-blocking caches that can tolerate read and write miss penalties by exploiting the overlap between post-miss computations and data accesses. We show that hardware data prefetching caches generally outperform non-blocking caches. We derive a static instruction scheduling algorithm to order instructions at compile time. The algorithm is shown to be effective in exploiting instruction parallelism available in a basic block for non-blocking loads.
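A much-simplified sketch of the reference prediction table (omitting the state machine and cache organization of the actual design): the table is indexed by instruction address, and each entry tracks the last data address and observed stride; when the stride repeats, the next address is prefetched.

```python
# Much-simplified reference prediction table (RPT) sketch.
class RPT:
    def __init__(self):
        self.table = {}  # pc -> (last_addr, stride), like a small cache

    def access(self, pc, addr):
        """Record a load at pc; return an address to prefetch, or None."""
        prefetch = None
        if pc in self.table:
            last_addr, stride = self.table[pc]
            new_stride = addr - last_addr
            if new_stride == stride:      # stride confirmed twice in a row
                prefetch = addr + stride  # predict the next reference
            self.table[pc] = (addr, new_stride)
        else:
            self.table[pc] = (addr, 0)    # first sighting of this load
        return prefetch

rpt = RPT()
rpt.access(0x40, 100)         # first sighting: no prediction
rpt.access(0x40, 108)         # stride 8 observed, not yet confirmed
print(rpt.access(0x40, 116))  # 124: stride repeated, prefetch issued
```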

60 citations


Patent
04 May 1993
TL;DR: In this paper, the security of data elements which represent an industrial process, which are manipulated by users on a data processing system and in which the industrial process includes a series of industrial process steps, are controlled by permitting groups of users to access predetermined data elements based on the industrial processes step at which the user is currently active.
Abstract: The security of data elements which represent an industrial process, which are manipulated by users on a data processing system and in which the industrial process includes a series of industrial process steps, are controlled by permitting groups of users to access predetermined data elements based on the industrial process step at which the industrial process is currently active. A user is prevented from accessing the requested element if the industrial process is not at an industrial process step corresponding to one of the industrial process steps for which the user has authority to access the data element. Thus, access to data is prevented based on the status of the data, in addition to the type of data. When selected database elements are associated with one of many locations, access is also denied to a user based on the location. Security access based on status and location may be provided in response to a change in the current industrial process step. Access authority to the data elements is changed compared to the access authority at the immediately preceding industrial process step based on mappings in one or more tables. Improved security of data elements which represent an industrial process is thereby provided.
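The core rule can be sketched with a hypothetical authorization table (all names invented for illustration): access is granted only while the process currently sits at a step for which the requesting group is authorized for that element.

```python
# Hypothetical step-based authorization table (names invented).
AUTH = {
    # (user group, data element) -> process steps at which access is allowed
    ("design", "blueprint"): {"drafting", "review"},
    ("shop_floor", "blueprint"): {"fabrication"},
}

def can_access(group, element, current_step):
    """Grant access only at steps authorized for this group/element pair."""
    return current_step in AUTH.get((group, element), set())

print(can_access("shop_floor", "blueprint", "fabrication"))  # True
print(can_access("shop_floor", "blueprint", "drafting"))     # False
```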

56 citations


Patent
16 Sep 1993
TL;DR: In this paper, a Shared Data Access Serialization (SADAS) mechanism for sharing data among a plurality of systems while maintaining data integrity is proposed, where each data store contains a set of lock blocks, one for each system sharing the data.
Abstract: A Shared Data Access Serialization mechanism for sharing data among a plurality of systems while maintaining data integrity. User data is maintained on a primary and optionally an alternate data store. Each data store contains a set of lock blocks, one for each system sharing the data. The contents of the lock blocks, normally a time-of-day value, indicate system ownership status of the associated data. "Lock rules" are disclosed for determining resource ownership, as well as a "lock stealing" mechanism for obtaining resource ownership from a temporarily stopped system. Suffix records and check records are used to ensure data integrity. Error indications deduced from inconsistent suffix and/or check records are used to trigger a data recovery mechanism, and the recovery mechanism can synchronize a primary and secondary data store without suspending access to the primary during the synchronization process.
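A toy model of the lock-stealing rule (illustrative only; the values and staleness threshold are invented): each system heartbeats a time-of-day value into its lock block, and a peer may take ownership only after the owner's heartbeat has gone stale.

```python
# Toy model of lock blocks with time-of-day heartbeats (illustrative).
STALE_AFTER = 30  # invented: seconds of silence before stealing is allowed

def try_steal(lock_blocks, owner, thief, now):
    """lock_blocks: dict system -> last heartbeat time-of-day value."""
    if now - lock_blocks[owner] > STALE_AFTER:
        lock_blocks[thief] = now  # owner looks stopped: thief takes over
        return thief
    return owner                  # owner's heartbeat is fresh; no steal

blocks = {"sysA": 100, "sysB": 100}
print(try_steal(blocks, "sysA", "sysB", now=120))  # sysA: heartbeat fresh
print(try_steal(blocks, "sysA", "sysB", now=140))  # sysB: sysA went stale
```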

55 citations


Patent
Norbert Lenz1
06 May 1993
TL;DR: In this paper, a system and method for controlling access to data in storage which is shared by a plurality of processors are disclosed, where the shared storage is located outside of the main storage of each of the processors and stores a lock file.
Abstract: A system and method for controlling access to data in storage which is shared by a plurality of processors are disclosed. The shared storage is located outside of main storage of each of the processors and stores a lock file. The lock file comprises a plurality of control fields containing access administration information (ZVI) authorizing the processors to access the data when not currently being accessed by another processor and a status identification code (SKC) to indicate the status of the access administration information. In response to a data access request from one of the processors, the status identification code provided by the processor is compared to the stored status identification code. If the comparison indicates that the requesting processor is authorized to update the access administration information, the access administration information associated with the requesting processor for the type of data access request is written from the requesting processor to the shared storage without first reading the stored access administration information from the lock file. The status identification code in the storage is updated to indicate that the processor has updated the access administration information.
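The status-code check resembles optimistic, version-based locking; a minimal sketch (hypothetical field layout, not the patent's actual record format): the processor presents the status code it last observed, and only a matching code permits the blind write that skips the prior read.

```python
# Sketch of status-code-gated blind writes (optimistic, version-style).
class LockFile:
    def __init__(self):
        self.status_code = 0  # stand-in for the stored status code
        self.admin = {}       # stand-in for access administration info

    def request(self, processor, seen_code, record):
        """Write admin info without a prior read, iff the code matches."""
        if seen_code != self.status_code:
            return False                # stale view: must re-read the file
        self.admin[processor] = record  # blind write, no read beforehand
        self.status_code += 1           # publish that the info changed
        return True

lf = LockFile()
print(lf.request("p1", 0, "shared"))     # True: codes match
print(lf.request("p2", 0, "exclusive"))  # False: p2's code is stale
```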

Journal ArticleDOI
TL;DR: Data conflict security (DC-security), a property that implies a system is free of covert channels due to contention for access to data, is introduced, and a definition of DC-security based on noninterference is presented.
Abstract: Concurrent execution of transactions in database management systems (DBMSs) may lead to contention for access to data, which in a multilevel secure DBMS (MLS/DBMS) may lead to insecurity. Security issues involved in database concurrency control for MLS/DBMSs are examined, and it is shown how a scheduler can affect security. Data conflict security (DC-security), a property that implies a system is free of covert channels due to contention for access to data, is introduced. A definition of DC-security based on noninterference is presented. Two properties that constitute a necessary condition for DC-security are introduced, along with two simpler necessary conditions. A class of schedulers called output-state-equivalent is identified for which another criterion implies DC-security. The criterion considers separately the behavior of the scheduler in response to those inputs that cause rollback and those that do not. The security properties of several existing scheduling protocols are characterized. Many are found to be insecure.

Proceedings Article
24 Aug 1993
TL;DR: A model of authorization for object-oriented databases which includes a set of policies, a structure for authorization rules and their administration, and evaluation algorithms is developed, and algorithms for access evaluation at compile-time and at run-time are discussed.
Abstract: Object-oriented databases are a recent and important development, and many studies of them have been performed. These consider aspects such as data modeling, query languages, performance, and concurrency control. Relatively few studies address their security, a critical aspect in systems like these that have a complex and rich data structuring. We previously developed a model of authorization for object-oriented databases which includes a set of policies, a structure for authorization rules and their administration, and evaluation algorithms. In that model the high-level query requests were resolved into reads and writes at the authorization level. In this paper we extend the set of access primitives to include ways to control the execution of methods or functions. Policy issues are discussed first, and then algorithms for access evaluation at compile-time and at run-time.

Patent
15 Apr 1993
TL;DR: In this paper, a data communication system providing for the secure transfer and sharing of data via a local area network and/or a wide area network is described, which includes a secure processing unit which communicates with a personal keying device and a crypto media controller attached to a user's workstation.
Abstract: A data communication system providing for the secure transfer and sharing of data via a local area network and/or a wide area network. The system includes a secure processing unit which communicates with a personal keying device and a crypto media controller attached to a user's workstation. The communication between these processing elements generates a variety of data elements including keys, identifiers, and attributes. The data elements are used to identify and authenticate the user, assign user security access rights and privileges, and assign media and device attributes to a data access device according to a predefined security policy. The data elements are manipulated, combined, protected, and distributed through the network to the appropriate data access devices, which prevents the user from obtaining unauthorized data.

Proceedings ArticleDOI
01 Jun 1993
TL;DR: An object oriented data model for VLSI/CAD data is presented and a design data manager based on such a model has been implemented under the UNIX/C++ environment, showing it to perform better as compared to commercial object oriented database systems.
Abstract: In this paper we present an object-oriented data model for VLSI/CAD data. A design data manager (DDB) based on this model has been implemented under the UNIX/C++ environment. It has been used by a set of diverse VLSI/CAD applications in our organization. Benchmarks have shown it to perform better than commercial object-oriented database systems. Together with the ease of data access, the data manager has improved software productivity and enabled a modular program architecture for our CAD system.


Patent
03 Nov 1993
TL;DR: In this paper, the identification and location of vehicle emission control system (ECS) components are discussed, and a method for systematically creating, updating, and using the relational database is also disclosed.
Abstract: A computer system, including a relational database, especially for use by an inspector at a Motor Vehicle Inspection facility, for capturing, storing, retrieving, and displaying visual images disclosing the identification and location of vehicle Emission Control System (ECS) components. A method for systematically creating, updating, and using the relational database is also disclosed. The database is composed of three data libraries, one for ECS Vehicle Underhood Images, one for ECS Component Overlays, and another for ECS Component Lists. These libraries include visual and factual information regarding the identity and location of ECS required components for a plurality of vehicles. The libraries are maintained and used in the database in such a way as to minimize storage space and maximize the speed of data access and display.

Proceedings Article
24 Aug 1993
TL;DR: This research introduces an integrated algebra that includes traditional database operators for pattern matching and search as well as numeric operators for scientific analysis, and identifies a set of transformation rules for this algebra that can be used to achieve significant performance improvements.
Abstract: Although scientific data analysis increasingly requires access to and manipulation of large quantities of data, current database technology fails to meet the needs of scientific processing. Shortcomings include data modeling facilities for scientific data types, physical storage structures for these types, and scientific analysis operations on data objects. Database systems for scientific users must address these shortcomings. A database system can offer numerous functionality improvements over the current combinations of scientific programs and file systems commonly used in scientific data analysis. Unfortunately, the inclusion of a database layer between the application and the file system holding the application's data can result in degraded performance. To overcome acceptance problems among scientists, scientific databases must provide performance comparable to, and functionality superior to, current systems used by scientists. Algebraic query optimization is one of many techniques used within database systems to improve performance. This technique has not been explored for scientific data types and operations. I have proposed expanding the concept of a database query to include numeric computations over scientific databases, thereby allowing algebraic query optimization to be applied to the full scientific computation and data access operations. This research introduces an integrated algebra that includes traditional database operators for pattern matching and search as well as numeric operators for scientific analysis. The use of a single integrated algebra enables automatic optimization of computations, realizing all of the benefits provided by optimization in traditional database systems. To experiment with this integrated algebra, a prototype system has been implemented for use at the University of Colorado's Space Grant College. The prototype supports many basic scientific operations such as interpolation and digital filtering, in addition to standard relational operations. I identify a set of transformation rules for this algebra, and show that these transformations can be used to achieve significant performance improvements. The results from the prototype demonstrate that scientific database computations can be effectively optimized, permitting performance gains that could not be realized without the integration of scientific operators into database systems. These results suggest that future scientific database systems will be based on integrated retrieval and computational algebras.
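One transformation rule of the kind described, selection push-down past a numeric operator, can be sketched as follows (operator and names hypothetical; the rewrite is valid only when the predicate refers solely to fields the operator leaves unchanged):

```python
# Selection push-down past a numeric operator (rule name hypothetical).
def interpolate(rec):
    """Stand-in for an expensive scientific operator."""
    return {**rec, "interp": rec["value"] * 2}

def naive(records, keep):
    # original plan: compute first, filter afterwards
    return [r for r in map(interpolate, records) if keep(r)]

def optimized(records, keep):
    # rewritten plan: filter first, run the operator only on survivors
    return [interpolate(r) for r in records if keep(r)]

records = [{"value": v} for v in range(10)]
keep = lambda r: r["value"] > 5  # touches only fields interpolate preserves
assert naive(records, keep) == optimized(records, keep)  # same answer
```

The rewritten plan produces the same result while running the expensive operator on four records instead of ten, which is the kind of gain the transformation rules are meant to expose automatically.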

Patent
22 Jan 1993
TL;DR: In this paper, a system for code generation and data access which overcomes many of the problems in conventional database and spreadsheet applications is presented, where a user is able to build up program steps, having available, as needed, information on the permissible operations, on the fields present in the data files in use, and on the actual contents of pertinent fields in data files.
Abstract: A system for code generation and data access which overcomes many of the problems in conventional database and spreadsheet applications. A user is able to build up program steps, with information available as needed on the permissible operations, on the fields present in the data files in use, and on the actual contents of pertinent fields in the data files. When setting up a selection statement, for example, the user can view in real time a concordance of the contents of a field; in the concordance display the actual contents are shown in sorted sequence with duplicates suppressed. The display allows the user to generate code without repeated referral to lists of fields. The time from start to finish of generating workable code, including meaningful selection statements, is greatly reduced.

Proceedings ArticleDOI
05 Jan 1993
TL;DR: The authors introduce a decoupled access/execute architecture (DAE) that decouples the data access tasks from the data computation tasks and overlaps execution of the two types of tasks; the simulation study showed that the access mechanisms provided by this architecture lead to substantial reductions in the data computation unit's stall time and to significantly more efficient use of the data bandwidth.
Abstract: The authors introduce a decoupled access/execute architecture (DAE) that decouples the data access tasks from the data computation tasks and overlaps execution of the two types of tasks. Such overlapping allows at least part of the data access overhead to be hidden. A method for maximizing the decoupling of structured data references from the data computation tasks and a DAE architecture with adequate mechanisms to support and take advantage of such decoupling are presented. The simulation study showed that the access mechanisms provided by this architecture lead to substantial reductions in the data computation unit's stall time and to significantly more efficient use of the data bandwidth.

Patent
02 Mar 1993
TL;DR: In this article, a system is described that sets the dividing condition of a database according to the computer resource situation and the access distribution, and rearranges temporarily arranged data according to that condition.
Abstract: PURPOSE: To distribute load efficiently by setting the dividing condition of a database according to the computer resource situation and the access distribution, and by rearranging temporarily arranged data according to the dividing condition. CONSTITUTION: The system is equipped with a computer resource managing means 21 which stores the computer resource situation of the entire site, an access distribution generating means 17 which obtains the number of accesses to data from the access log of the entire site, a dividing condition generating means 18 which generates the dividing condition from the computer resource situation obtained by the computer resource managing means 21 and the access distribution obtained by the access distribution generating means 17, and a data moving means 19 which moves the data to match the dividing condition generated by the dividing condition generating means 18.
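The dividing condition can be sketched as a greedy assignment driven by the access log (purely illustrative; the patent specifies means, not a concrete algorithm): each item is placed on the node with the least accumulated access load.

```python
# Greedy sketch of a dividing condition from per-item access counts.
def divide(access_counts, n_nodes):
    load = [0] * n_nodes  # accumulated access load per node
    placement = {}
    # Place the hottest items first, each on the least-loaded node.
    for item, count in sorted(access_counts.items(), key=lambda kv: -kv[1]):
        node = load.index(min(load))
        placement[item] = node
        load[node] += count
    return placement, load

counts = {"t1": 50, "t2": 30, "t3": 20, "t4": 10}
placement, load = divide(counts, 2)
print(load)  # [60, 50]: access load split roughly evenly across nodes
```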

Proceedings ArticleDOI
B.B. McHarg1
11 Oct 1993
TL;DR: The General Atomics DIII-D tokamak fusion experiment is now collecting over 80 MB of data per discharge once every 10 min, and that quantity is expected to double within the next year.
Abstract: The General Atomics DIII-D tokamak fusion experiment is now collecting over 80 MB of data per discharge once every 10 min, and that quantity is expected to double within the next year. The size of the data files, even in compressed format, is becoming increasingly difficult to handle. Data is also being acquired now on a variety of UNIX systems as well as MicroVAX and MODCOMP computer systems. The existing computers collect all the data into a single shot file, and this data collection is taking an ever-increasing amount of time as the total quantity of data increases. Data is not available to experimenters until it has been collected into the shot file, which conflicts with the substantial need for timely data examination between shots. The experimenters are also spread over many different types of computer systems (possibly located at other sites). To improve data availability and handling, software has been developed to allow individual computer systems to create their own shot files locally. The data interface routine PTDATA that is used to access DIII-D data has been modified so that a user's code on any computer can access data from any computer where that data might be located. This data access is transparent to the user. Breaking up the shot file into separate files in multiple locations also impacts software used for data archiving, data management, and data restoration.

Journal ArticleDOI
TL;DR: A class of constrained-latency storage access (CLSA) applications that require both large amounts of storage and guarantees for short latencies are presented and none of the current approaches supports the automatic anticipation of scripted data access as required for interactive editing and playback of video segments.
Abstract: A class of constrained-latency storage access (CLSA) applications that require both large amounts of storage and guarantees of short latencies is presented. A range of existing approaches to meeting the requirements of CLSA applications is surveyed. Their limitations indicate that the technology does not yet exist to support complex CLSA applications on general-purpose storage architectures. A variety of good solutions exists for meeting throughput and latency requirements for continuous media data, including dedication of resources, presequencing, and greedy prefetching. However, none of the current approaches supports the automatic anticipation of scripted data access as required for interactive editing and playback of video segments.

Patent
Toyohiko Yoshida1
27 Aug 1993
Abstract: A microprocessor and a data processor therefor which have separate data and instruction buses, and wherein a data address and an instruction address are output over a single address bus in a time-shared manner, thereby allowing a data access and an instruction access to be pipelined without the need for separate address buses between the microprocessor and caches holding data and instructions.

Proceedings ArticleDOI
01 Jun 1993
TL;DR: This paper develops a methodology for managing the interactions among sub-computations, avoiding strict synchronization where concurrent or pipelined relationships are possible, and demonstrates that these dynamic techniques substantially improve performance on a range of production applications including climate modeling and x-ray tomography.
Abstract: Many parallel programs contain multiple sub-computations, each with distinct communication and load balancing requirements. The traditional approach to compiling such programs is to impose a processor synchronization barrier between sub-computations, optimizing each as a separate entity. This paper develops a methodology for managing the interactions among sub-computations, avoiding strict synchronization where concurrent or pipelined relationships are possible. Our approach to compiling parallel programs has two components: symbolic data access analysis and adaptive runtime support. We summarize the data access behavior of sub-computations (such as loop nests) and split them to expose concurrency and pipelining opportunities. The split transformation has been incorporated into an extended FORTRAN compiler, which outputs a FORTRAN 77 program augmented with calls to library routines written in C and a coarse-grained dataflow graph summarizing the exposed parallelism. The compiler encodes symbolic information, including loop bounds and communication requirements, for an adaptive runtime system, which uses runtime information to improve the scheduling efficiency of irregular sub-computations. The runtime system incorporates algorithms that allocate processing resources to concurrently executing sub-computations and choose communication granularity. We have demonstrated that these dynamic techniques substantially improve performance on a range of production applications including climate modeling and x-ray tomography, especially when large numbers of processors are available.


22 Sep 1993
TL;DR: The main hypothesis supported in this thesis is that a powerful self-descriptive object-oriented data model can play a central role in achieving these goals.
Abstract: This thesis describes the design of a system to integrate data access and various forms of data analysis and editing tools (sets of operations) under a single interactive user interface. The projected system is called YANUS (Yet ANother Unifying System). One of the major goals of this study is to create an environment for the user where he/she has optimal freedom to combine operations and apply these directly to his data, and where he is supported to do so correctly, i.e. in conformance with the meaning of the data and of the operations. Direct application means that the user does not need to copy or transform the data. Another major goal is the encapsulation of existing external systems. Certain operations may, without the user's knowledge, be executed in these existing systems. Also, data to which operations are applied may be stored in existing databases. Thus, the user should be able to analyze data from existing databases with existing tools with minimal effort. He does not need to know about the different interfaces of different systems and about possible data translations. Finally, the environment must be extensible, with respect to the data and operations which may be accessed by the user, and with respect to the databases and software packages to be used to provide data and operations in the integrated system. The main hypothesis supported in this thesis is that a powerful self-descriptive object-oriented data model can play a central role in achieving these goals.
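The self-descriptive idea can be sketched minimally: data objects carry their own type description, and operations declare which types they accept, so the environment can check that an application is "correct" before dispatching it, possibly to an encapsulated external system. All class and operation names below are invented for illustration; this is not the YANUS design itself.

```python
class DataObject:
    def __init__(self, type_name, value):
        self.type_name = type_name   # the object's self-description
        self.value = value

class Operation:
    def __init__(self, name, accepts, fn):
        self.name = name
        self.accepts = accepts       # declared set of acceptable input types
        self.fn = fn                 # could delegate to an external package

    def apply(self, obj):
        # Conformance check before the operation ever touches the data.
        if obj.type_name not in self.accepts:
            raise TypeError(f"{self.name} does not apply to {obj.type_name}")
        return self.fn(obj.value)

mean = Operation("mean", accepts={"NumericSeries"},
                 fn=lambda xs: sum(xs) / len(xs))
series = DataObject("NumericSeries", [2.0, 4.0, 6.0])
print(mean.apply(series))  # 4.0
```

Because the metadata travels with the object, the user can freely combine operations while the system enforces that each one is applied in conformance with the data's meaning.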

Proceedings ArticleDOI
01 Jun 1993
TL;DR: Corporate Subject Data Bases (CSDB) are being introduced to reduce data redundancy, maintain the integrity of the data, provide a uniform data access interface, and have data readily available to make business decisions.
Abstract: Corporate Subject Data Bases (CSDB) are being introduced to reduce data redundancy, maintain the integrity of the data, provide a uniform data access interface, and have data readily available to make business decisions. During the transition phase, there is a need to maintain Legacy Systems (LS), CSDB, and to synchronize between them. Choosing the right granularity for migration of data and functionality is essential to the success of the migration strategy. Technologies being used to support the transition to CSDB include relational systems supporting stored procedures, remote procedures, expert systems, object-oriented approach, reengineering tools, and data transition tools. For our Customer CSDB to be deployed in 1993, cleanup of data occurs during initial load of the CSDB. Nightly updates are needed during the transition phase to account for operations executed through LS. There is a lack of an integrated set of tools to help in the transition phase.
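The transition pattern described here (cleanup during initial load, then nightly replay of legacy-system operations) can be sketched as follows. The record layout and the cleanup rule are invented for illustration and are not from the deployed Customer CSDB.

```python
def initial_load(legacy_rows):
    """Clean during initial load: collapse duplicates, normalize names."""
    csdb = {}
    for cust_id, name in legacy_rows:
        csdb[cust_id] = name.strip().title()   # illustrative cleanup rule
    return csdb

def nightly_update(csdb, legacy_ops):
    """Replay the day's legacy-system operations against the CSDB."""
    for op, cust_id, name in legacy_ops:
        if op == "upsert":
            csdb[cust_id] = name.strip().title()
        elif op == "delete":
            csdb.pop(cust_id, None)
    return csdb

# Duplicate legacy rows for customer 1 collapse to one cleaned record.
db = initial_load([(1, "  alice SMITH "), (1, "alice smith"), (2, "bob jones")])
# Overnight: one customer added through a legacy system, one removed.
db = nightly_update(db, [("upsert", 3, "carol ng"), ("delete", 2, "")])
```

Keeping the replay step as an explicit batch makes the synchronization between the legacy systems and the CSDB auditable during the transition phase.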

Patent
02 Nov 1993
TL;DR: A process control data processing apparatus has a database controller (20) with various components (34 - 39) constructed for real-time reception of process data, processes being initiated upon reception of upload data via an upload input interface (30).
Abstract: A process control data processing apparatus has a database controller (20) with various components (34 - 39) constructed for real-time reception of process control data, processes being initiated upon reception of upload data via an upload input interface (30). Various data structures are used including an input file structure (26) having data records and separate sections, process files (27) and a history file (24). Input file data are verified by a verifier (32). A process control interface (37) writes in parallel to both process files (27) and an input file (26), updating of the history file (24) being via the input file (26) for fast data retrieval and data integrity. The history file (24) provides fast data access via a read-only bus (22) and a data filter (21).
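The data flow in this apparatus can be sketched as three steps: uploads pass a verifier, verified records are written in parallel to a process file and an input file, and the history file is updated only from the verified input file. The verification rule and record shape below are invented; this is a sketch of the flow, not the patented implementation.

```python
def verify(record):
    """Reject malformed records (illustrative rule: value must be numeric)."""
    return isinstance(record.get("value"), (int, float))

def receive(record, process_file, input_file):
    """Parallel write: live process view plus staging for history."""
    if not verify(record):
        return False                  # verifier blocks bad data early
    process_file.append(record)
    input_file.append(record)
    return True

def update_history(input_file, history_file):
    """Propagate verified input records into the read-only history file."""
    while input_file:
        history_file.append(input_file.pop(0))

proc, inp, hist = [], [], []
receive({"tag": "T1", "value": 7}, proc, inp)
receive({"tag": "T2", "value": "bad"}, proc, inp)   # rejected by verifier
update_history(inp, hist)
```

Routing all history updates through the verified input file is what keeps the history side consistent even though process files are written concurrently.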

01 Jan 1993
TL;DR: A prototype system called BIRD (Bi-directional Incremental Revising of Data) is developed that offers one solution to the problem of providing incremental update propagation in the presence of OIDs, and a major contribution is the development of the Witness Generator Generator (WGG) algorithm that constructs witness generators from a user-specified ALD.
Abstract: With the advent of the "information superhighway", database interoperation is emerging as one of the most important topics in database research in the 90's. Most previous academic and commercial work in this area (e.g., schema integration, federated databases) has focused on providing read-only access to data in diverse databases. The research in this thesis addresses a fundamentally different issue, that of incrementally propagating updates between databases that hold overlapping information. A primary focus of the research is on the impact of object identifiers (OIDs). The presence of OIDs complicates the situation because the meaning of an OID is local to its own database, and sometimes the object classes in one database do not correspond directly to the object classes in the second database. Results presented in this thesis show that (1) in the context of uni-directional incremental update propagation involving OIDs, auxiliary witness relations are needed, and (2) for the bi-directional case, maintaining the contents of the witness relations is quite subtle; a mechanism is described in this thesis to construct new rules, called witness generators, that can be used to properly update the witness relations. This research also develops a prototype system called BIRD (Bi-directional Incremental Revising of Data) that offers one solution to the problem of providing incremental update propagation in the presence of OIDs. BIRD uses a high-level database query language for specifying the correspondence between two databases, and uses active database technology to perform incremental update propagation. One of the major contributions of the BIRD system is the development of the Witness Generator Generator (WGG) algorithm that constructs witness generators from a user-specified ALD. This algorithm is based on a variation of SLD-resolution. A theoretical analysis of the algorithm is also presented in this thesis.
The analysis demonstrates that (a) the algorithm is sound and (b) the termination of the algorithm on a given input is decidable. (Copies available exclusively from Micrographics Department, Doheny Library, USC, Los Angeles, CA 90089-0182.)
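The role of an auxiliary witness relation can be sketched concretely: since each database's OIDs are meaningful only locally, a witness relation records which OID in one database corresponds to which OID in the other, so an update applied in one can be propagated incrementally to its counterpart. The data, field names, and mapping below are invented for illustration; this is not the BIRD system itself.

```python
# Two databases holding overlapping information under local OIDs.
db1 = {"o1": {"name": "widget", "qty": 5}}
db2 = {"x9": {"name": "widget", "qty": 5}}

# Witness relation: DB1 OID -> corresponding DB2 OID.
witness = {"o1": "x9"}

def propagate_update(oid1, changes):
    """Apply an update in DB1, then push it to DB2 via the witness."""
    db1[oid1].update(changes)
    oid2 = witness.get(oid1)
    if oid2 is not None:          # a corresponding object exists in DB2
        db2[oid2].update(changes)

propagate_update("o1", {"qty": 8})
```

The subtle part the thesis addresses is keeping `witness` itself correct as objects are created and deleted on both sides, which is exactly what witness generators are constructed to do.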