Showing papers on "Data warehouse" published in 1981


Proceedings Article
02 Dec 1981
TL;DR: The design of a data management system which seeks to meet the requirements of statistical analysis in terms of data descriptions, data manipulation functions, and logical and physical data structures is described in detail.
Abstract: Statistical analysis of large data sets often involves an initial data editing and preparation phase to check the validity of individual data items, check for consistency among related data, correct erroneous data, and supply (impute) values for missing data where possible. During this preparatory phase of analysis, it is often necessary to partition the data set into a number of subsets by logical selection and/or random sampling techniques for purposes of hypothesis testing. This paper examines the data management support required by these editing and subsetting operations in terms of data descriptions, data manipulation functions, and logical and physical data structures. The design of a data management system which seeks to meet these requirements is described in detail. The system, called SDB, is built around a self-describing transposed file structure and supporting data access software. SDB representations of some logical data structures which are commonly encountered in statistical databases are also described. Experiences with a partial implementation of the system and its application in an interactive data editor have been encouraging.
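
The editing-and-subsetting workflow described above lends itself to a short illustration. The sketch below (in Python, with hypothetical names; SDB's actual interfaces are not shown) mimics the transposed, one-array-per-variable layout the abstract mentions and runs the three preparatory steps it lists: validity checking, imputation of missing values, and logical subsetting.

```python
from statistics import mean

# Transposed storage: one sequence per variable rather than one record
# per case.  None marks a missing value.
data = {
    "age":    [34, 29, None, 51, 45],
    "income": [48000, 52000, 39000, None, 61000],
}

def check_validity(column, lo, hi):
    """Return indices of values outside the plausible range [lo, hi]."""
    return [i for i, v in enumerate(column) if v is not None and not lo <= v <= hi]

def impute_missing(column):
    """Fill missing values with the mean of the observed values."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

# Editing phase: validate each variable, then impute missing values.
assert check_validity(data["age"], 0, 120) == []
data = {name: impute_missing(col) for name, col in data.items()}

# Subsetting phase: logical selection builds an index list; only the
# columns named in the predicate need to be read.
keep = [i for i, a in enumerate(data["age"]) if a >= 40]
subset = {name: [col[i] for i in keep] for name, col in data.items()}
print(subset)   # rows 3 and 4 of the original table
```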

16 citations


Proceedings Article
09 Sep 1981
TL;DR: A binary-type model defining the security conceptual schema of a data base is presented, together with a method for automatically controlling the propagation of rights in the data base.
Abstract: This paper describes a model defining the security conceptual schema of a data base. The proposed model is of a binary type. A language for defining the model is presented. Finally, a method is shown that is suitable for automatically controlling the propagation of rights in the data base.
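
As an illustration of what "automatically controlling rights propagation" can mean in practice, the sketch below treats each grant as a binary relation between a grantor and a grantee for a given right, and rejects any grant from a subject who does not hold the right with permission to pass it on. This is a hedged, generic sketch; the paper's actual model and definition language are not reproduced here.

```python
grants = {}  # right -> list of (grantor, grantee, may_propagate)

def grant(right, grantor, grantee, may_propagate=False):
    """Record a grant, but only if the grantor is allowed to propagate it."""
    if grantor != "owner" and not can_propagate(right, grantor):
        raise PermissionError(f"{grantor} may not propagate {right!r}")
    grants.setdefault(right, []).append((grantor, grantee, may_propagate))

def holds(right, subject):
    return subject == "owner" or any(g[1] == subject for g in grants.get(right, []))

def can_propagate(right, subject):
    return subject == "owner" or any(
        g[1] == subject and g[2] for g in grants.get(right, [])
    )

grant("read:salaries", "owner", "alice", may_propagate=True)
grant("read:salaries", "alice", "bob")     # allowed: alice may propagate
# grant("read:salaries", "bob", "carol")   # would raise PermissionError
print(holds("read:salaries", "bob"))       # True
```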

5 citations


Book ChapterDOI
01 Jan 1981
TL;DR: The data editor provides an environment to explore and manipulate data sets, with particular attention to the implications of large data sets; it utilizes a relational data model and a self-describing binary data format which allows data transportability to other data analysis packages.
Abstract: The process of analyzing large data sets often includes an early exploratory stage, first, to develop a basic understanding of the data and its interrelationships and, second, to prepare and clean up the data for hypothesis formulation and testing. This preliminary phase of the data analysis process usually requires facilities found in research data management systems, text editors, graphics packages, and statistics packages. It also usually requires the analyst to write special programs to clean up and prepare the data for analysis. This paper describes a technique, now implemented as a single computational tool, a data editor, which combines facilities from the systems above, with particular emphasis on research data manipulation and subsetting. The data editor provides an environment to explore and manipulate data sets with particular attention to the implications of large data sets. It utilizes a relational data model and a self-describing binary data format which allows data transportability to other data analysis packages. Some impacts of editing large data sets will be discussed. A technique for manipulating portions or subsets of large data sets without physical replication is introduced. An experimental command structure and operating environment are also presented.
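
The abstract's technique for manipulating subsets "without physical replication" can be illustrated with an index-based view: the subset records row indices into the parent table rather than copying rows, and values are materialized only on access. The class and method names below are illustrative assumptions, not the editor's actual command structure.

```python
class Table:
    def __init__(self, columns):
        self.columns = columns  # name -> list of values

    def select(self, predicate):
        """Return a SubsetView of the rows satisfying `predicate`."""
        n = len(next(iter(self.columns.values())))
        rows = ({name: col[i] for name, col in self.columns.items()}
                for i in range(n))
        keep = [i for i, row in enumerate(rows) if predicate(row)]
        return SubsetView(self, keep)

class SubsetView:
    def __init__(self, parent, indices):
        self.parent, self.indices = parent, indices  # no data is copied

    def column(self, name):
        col = self.parent.columns[name]
        return [col[i] for i in self.indices]

t = Table({"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]})
v = t.select(lambda row: row["x"] % 2 == 0)
print(v.column("y"))  # [20, 40] -- materialized only on access
```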

3 citations


Book ChapterDOI
01 Jan 1981
TL;DR: The data descriptors and data structures permitted by the file design are described; data buffering, file segmentation, and a segment overflow handler are also discussed.
Abstract: A major goal of the Analysis of Large Data Sets (ALDS) research project at Pacific Northwest Laboratory (PNL) is to provide efficient data organization, storage, and access capabilities for statistical applications involving large amounts of data. As part of the effort to achieve this goal, a self-describing binary (SDB) data file structure has been designed and implemented together with a set of data manipulation functions and supporting SDB data access routines. Logical and physical data descriptors are stored in SDB files preceding the data values. SDB files thus provide a common data representation for interfacing diverse software components. This paper describes the data descriptors and data structures permitted by the file design. Data buffering, file segmentation and a segment overflow handler are also discussed.
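
The core idea, descriptors stored ahead of the data values so that any reader can interpret the file without external metadata, can be sketched as follows. The byte layout below is an assumption for illustration; it is not the actual SDB format, and buffering, segmentation, and overflow handling are omitted.

```python
import struct, json, io

def write_sdb(fileobj, columns):
    """Write descriptors (variable names and lengths), then the values."""
    header = json.dumps({name: len(vals) for name, vals in columns.items()}).encode()
    fileobj.write(struct.pack("<I", len(header)))   # descriptor length
    fileobj.write(header)                           # descriptors precede data
    for vals in columns.values():
        fileobj.write(struct.pack(f"<{len(vals)}d", *vals))

def read_sdb(fileobj):
    """Read the descriptors first, then use them to decode the values."""
    (hlen,) = struct.unpack("<I", fileobj.read(4))
    header = json.loads(fileobj.read(hlen))
    return {name: list(struct.unpack(f"<{n}d", fileobj.read(8 * n)))
            for name, n in header.items()}

buf = io.BytesIO()
write_sdb(buf, {"age": [34.0, 29.0], "income": [48000.0, 52000.0]})
buf.seek(0)
print(read_sdb(buf))  # reconstructed purely from the file's own descriptors
```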

3 citations


Journal ArticleDOI
TL;DR: This paper aims to describe the transition procedure from conventional data files to a data base, beginning with data models of reality and ending with the data definition using the CODASYL DDL.
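
The end point of such a transition is a schema in which record types are linked by explicit owner/member sets rather than by codes repeated across flat files. The Python sketch below illustrates that structural shift with made-up record and set names; the paper's actual example and its CODASYL DDL text are not reproduced.

```python
from dataclasses import dataclass, field

@dataclass
class Department:            # owner record type
    name: str
    employees: list = field(default_factory=list)  # the DEPT-EMP set

@dataclass
class Employee:              # member record type
    emp_id: int
    name: str

# What used to be two conventional files (departments, employees) joined
# by a repeated department code becomes an explicit owner->member linkage.
sales = Department("Sales")
sales.employees.append(Employee(1, "Ada"))
sales.employees.append(Employee(2, "Grace"))
print([e.name for e in sales.employees])
```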

1 citation


Book ChapterDOI
01 Jan 1981
TL;DR: The construction of a knowledge system begins with the selection of techniques for representing elemental pieces of data and for representing their interrelationships, and the chosen techniques then define a pattern according to which actual problem domain data can be organized.
Abstract: The construction of a knowledge system begins with the selection of techniques for representing elemental pieces of data and for representing their interrelationships. The chosen techniques then define a pattern according to which actual problem domain data can be organized. This pattern according to which data are organized is referred to as a data structure or a schema. The trend in file management–data base management (DBM) has been toward the invention of techniques that allow the construction of increasingly complex data structures. This complexity is in terms of the data relationships that can be represented. A frequently drawn distinction in the DBM field is the one between logical data organization and physical data organization. The former views data organization abstractly, whereas the latter views data organization at the level of physical implementation. A physical data organization method is concerned with the relationship between the physical location of a bundle of data in auxiliary memory and the contents of that data bundle. Logical organization is not concerned about where a bundle of data is stored but is involved with specifying how various data bundles are semantically related to each other.
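
The logical/physical distinction in the passage can be made concrete with a small sketch: the two storage routines below place the same records at different physical byte offsets, one sequentially and one by hashing the record key, without changing anything about what the records mean or how they relate. The record layout and all names are illustrative assumptions.

```python
import struct

RECORD = struct.Struct("<I16s")   # fixed-size record: id + 16-byte name

def store_sequential(records):
    """Physical choice 1: records packed in arrival order; offset = i * size."""
    return b"".join(RECORD.pack(rid, name.encode().ljust(16))
                    for rid, name in records)

def store_hashed(records, buckets=8):
    """Physical choice 2: offset derived from the record id (a hash file)."""
    area = bytearray(RECORD.size * buckets)
    for rid, name in records:
        off = (rid % buckets) * RECORD.size
        area[off:off + RECORD.size] = RECORD.pack(rid, name.encode().ljust(16))
    return bytes(area)

customers = [(3, "Ada"), (9, "Grace")]
# Logically identical contents, physically different placements:
print(store_sequential(customers) != store_hashed(customers))  # True
```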