# A relational model of data for large shared data banks

E. F. Codd (IBM)

TL;DR: In this article, a model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced, and certain operations on relations are discussed and applied to the problems of redundancy and consistency in the user's model.

Abstract: Future users of large data banks must be protected from having to know how the data is organized in the machine (the internal representation). A prompting service which supplies such information is not a satisfactory solution. Activities of users at terminals and most application programs should remain unaffected when the internal representation of data is changed and even when some aspects of the external representation are changed. Changes in data representation will often be needed as a result of changes in query, update, and report traffic and natural growth in the types of stored information. Existing noninferential, formatted data systems provide users with tree-structured files or slightly more general network models of the data. In Section 1, inadequacies of these models are discussed. A model based on n-ary relations, a normal form for data base relations, and the concept of a universal data sublanguage are introduced. In Section 2, certain operations on relations (other than logical inference) are discussed and applied to the problems of redundancy and consistency in the user's model.
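The idea of an n-ary relation and of operations on relations can be made concrete with a minimal Python sketch. It is an illustration only, not the paper's notation: the helper names `relation`, `project`, and `natural_join`, and the `supply` and `uses` sample data, are assumptions introduced here. A relation is modeled as a set of tuples, each tuple being a set of (attribute, value) pairs, so neither row order nor column order carries meaning.

```python
def relation(*rows):
    """Build a relation from dicts: a set of frozensets of (attribute, value) pairs.

    Values must be hashable. Duplicate rows collapse, as in a mathematical set.
    """
    return {frozenset(row.items()) for row in rows}


def project(rel, attrs):
    """Keep only the named attributes; duplicate result tuples are removed."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def natural_join(r, s):
    """Combine tuples of r and s that agree on every attribute they share."""
    out = set()
    for row_r in r:
        dr = dict(row_r)
        for row_s in s:
            ds = dict(row_s)
            shared = dr.keys() & ds.keys()
            if all(dr[a] == ds[a] for a in shared):
                out.add(frozenset({**dr, **ds}.items()))
    return out


# Hypothetical sample data: which supplier supplies which part,
# and which project uses which part.
supply = relation(
    {"supplier": 1, "part": 1},
    {"supplier": 1, "part": 2},
    {"supplier": 2, "part": 1},
)
uses = relation(
    {"part": 1, "project": "A"},
    {"part": 2, "project": "B"},
)

suppliers = project(supply, {"supplier"})
joined = natural_join(supply, uses)
print(len(suppliers))  # 2
print(len(joined))     # 3
```

Because the user works only with attribute names and values, the stored layout behind `supply` could change without affecting these calls, which is the kind of independence from internal representation the abstract argues for.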

##### Citations


22 Sep 1975

TL;DR: A data model, called the entity-relationship model, which incorporates the semantic information in the real world is proposed, and a special diagrammatic technique is introduced for exhibiting entities and relationships.

Abstract: A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and the diagrammatic technique is given. Some implications for data integrity, information retrieval, and data manipulation are discussed. The entity-relationship model can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model. Semantic ambiguities in these models are analyzed. Possible ways to derive their views of data from the entity-relationship model are presented.

3,693 citations


01 Dec 1995

TL;DR: In this article, the authors present a two-dimensional framework for research in information technology, based on broad types of design and natural science research activities: build, evaluate, theorize, and justify.

Abstract: Research in IT must address the design tasks faced by practitioners. Real problems must be properly conceptualized and represented, appropriate techniques for their solution must be constructed, and solutions must be implemented and evaluated using appropriate criteria. If significant progress is to be made, IT research must also develop an understanding of how and why IT systems work or do not work. Such an understanding must tie together natural laws governing IT systems with natural laws governing the environments in which they operate. This paper presents a two dimensional framework for research in information technology. The first dimension is based on broad types of design and natural science research activities: build, evaluate, theorize, and justify. The second dimension is based on broad types of outputs produced by design research: representational constructs, models, methods, and instantiations. We argue that both design science and natural science activities are needed to insure that IT research is both relevant and effective.

3,433 citations


TL;DR: This paper introduces workflow management as an application domain for Petri nets, presents state-of-the-art results with respect to the verification of workflows, and highlights some Petri-net-based workflow tools.

Abstract: Workflow management promises a new solution to an age-old problem: controlling, monitoring, optimizing and supporting business processes. What is new about workflow management is the explicit representation of the business process logic which allows for computerized support. This paper discusses the use of Petri nets in the context of workflow management. Petri nets are an established tool for modeling and analyzing processes. On the one hand, Petri nets can be used as a design language for the specification of complex workflows. On the other hand, Petri net theory provides for powerful analysis techniques which can be used to verify the correctness of workflow procedures. This paper introduces workflow management as an application domain for Petri nets, presents state-of-the-art results with respect to the verification of workflows, and highlights some Petri-net-based workflow tools.

2,862 citations


[...]

TL;DR: It is shown that when the clause data base and the queries satisfy certain constraints, which still leaves us with a data base more general than a conventional relational data base, the query evaluation process will find every answer that is a logical consequence of the completed data base.

Abstract: A query evaluation process for a logic data base comprising a set of clauses is described. It is essentially a Horn clause theorem prover augmented with a special inference rule for dealing with negation. This is the negation as failure inference rule whereby ~P can be inferred if every possible proof of P fails. The chief advantage of the query evaluator described is the efficiency with which it can be implemented. Moreover, we show that the negation as failure rule only allows us to conclude negated facts that could be inferred from the axioms of the completed data base, a data base of relation definitions and equality schemas that we consider is implicitly given by the data base of clauses. We also show that when the clause data base and the queries satisfy certain constraints, which still leaves us with a data base more general than a conventional relational data base, the query evaluation process will find every answer that is a logical consequence of the completed data base.
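The negation as failure rule described above can be sketched for the simplest case. This is a toy propositional evaluator, not the query evaluator of the cited paper: the function name `prove`, the rule encoding, and the bird/penguin example are assumptions made for illustration, and the rule set is assumed to be non-recursive so that the search terminates.

```python
def prove(goal, rules):
    """Return True if `goal` is provable from propositional Horn-style rules.

    `rules` maps a head atom to a list of bodies; each body is a list of
    literals; a literal is an atom (str) or ("not", atom). A fact is a head
    with an empty body. Assumes the rules are non-recursive (illustration only).
    """
    if isinstance(goal, tuple) and goal[0] == "not":
        # Negation as failure: ~P succeeds exactly when every proof of P fails.
        return not prove(goal[1], rules)
    # An atom succeeds if some rule body for it succeeds in full.
    return any(
        all(prove(literal, rules) for literal in body)
        for body in rules.get(goal, [])
    )


# Hypothetical clause data base: "flies" holds for a bird not known to be a penguin.
rules = {
    "bird": [[]],                            # fact
    "flies": [["bird", ("not", "penguin")]],
}
print(prove("flies", rules))  # True: no proof of "penguin" exists

rules_with_penguin = dict(rules, penguin=[[]])
print(prove("flies", rules_with_penguin))  # False: "penguin" is now provable
```

The second call shows the non-monotonic character of the rule: adding a fact removes a conclusion that was previously derivable.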

2,221 citations


P. Griffiths Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, T. G. Price (IBM)

TL;DR: This paper describes how System R, an experimental database management system developed to carry out research on the relational model of data, chooses access paths for both simple (single relation) and complex queries (such as joins), given a user specification of desired data as a boolean expression of predicates.

Abstract: In a high level query and data manipulation language such as SQL, requests are stated non-procedurally, without reference to access paths. This paper describes how System R chooses access paths for both simple (single relation) and complex queries (such as joins), given a user specification of desired data as a boolean expression of predicates. System R is an experimental database management system developed to carry out research on the relational model of data. System R was designed and built by members of the IBM San Jose Research Laboratory.

2,082 citations

##### References


01 Jan 1964

TL;DR: This long-established text continues to expose students to natural proofs and set-theoretic methods and offers enough material for either a one- or two-semester course on mathematical logic.

Abstract: Retaining all the key features of the previous editions, Introduction to Mathematical Logic, Fifth Edition explores the principal topics of mathematical logic. It covers propositional logic, first-order logic, first-order number theory, axiomatic set theory, and the theory of computability. The text also discusses the major results of Gödel, Church, Kleene, Rosser, and Turing. New to the Fifth Edition: a new section covering basic ideas and results about nonstandard models of number theory; a second appendix that introduces modal propositional logic; an expanded bibliography; additional exercises and selected answers. This long-established text continues to expose students to natural proofs and set-theoretic methods. Only requiring some experience in abstract mathematical thinking, it offers enough material for either a one- or two-semester course on mathematical logic.

1,981 citations


TL;DR: A high level programming language for large, complex associative structures has been designed and implemented using a hash-coding technique and the discussion includes a comparison with other work and examples of applications.

Abstract: A high level programming language for large, complex associative structures has been designed and implemented. The underlying data structure has been implemented using a hash-coding technique. The discussion includes a comparison with other work and examples of applications of the language.

149 citations


TL;DR: This paper presents a RAND project concerned with the use of computers as assistants in the logical analysis of large collections of factual data, for which a system called the Relational Data File was developed.

Abstract: This paper presents a RAND project concerned with the use of computers as assistants in the logical analysis of large collections of factual data. A system called the Relational Data File was developed for this purpose. The Relational Data File is briefly detailed and problems arising from its implementation are discussed.

58 citations


01 Aug 1968

TL;DR: The concept of a 'complex' is introduced to resolve two problems of storage structures for arbitrarily related data (generality of the sets involved and generality of the set operations); it has the additional feature of allowing a natural extension of properties of binary relations to properties of general relations.

Abstract: The paper is motivated by an assumption that many problems dealing with arbitrarily related data can be expedited on a digital computer by a storage structure which allows rapid execution of operations within and between sets of datum names. In order for such a structure to be feasible, two problems must be considered: (1) the structure should be general enough that the sets involved may be unrestricted, thus allowing sets of sets of sets...; sets of ordered pairs, ordered triples...; sets of variable length n-tuples, n-tuples of arbitrary sets; etc.; (2) the set-operations should be general in nature, allowing any of the usual set theory operations between sets as described above, with the assurance that these operations will be executed rapidly. A sufficient condition for the latter is the existence of a well-ordering relation on the union of the participating sets. These problems are resolved in this paper with the introduction of the concept of a 'complex' which has an additional feature of allowing a natural extension of properties of binary relations to properties of general relations.

56 citations


01 Jan 1967

TL;DR: This paper describes a method, devised within the Time-Shared Data Management System (TDMS), for maintaining hierarchical associations within logical entries of a data base; the technique permits the automatic association of related data through a device known as a repeating group.

Abstract: An important consideration in the design of programming systems for the management of large files of data is the method of treating hierarchical data (that is, data among which logical relationships exist at more than two levels). Recent systems have accomplished this by simply duplicating some essential item of data at several levels. Such duplication makes the storage of even small data bases inefficient; for large masses of data, storage becomes economically unfeasible. Other systems have provided the means to specify and construct hierarchies, but none have provided language that affords control over retrieval and output levels, and control over the scope of output. Within the Time-Shared Data Management System (TDMS), currently being produced at System Development Corporation, a method has been devised for maintaining hierarchical associations within logical entries of a data base. Basically, the technique permits the automatic association of related data through a device known as a repeating group. The term “repeating group” is not new, but the TDMS treatment of the repeating group concept is. This paper describes how this technique is implemented in the language and tables of TDMS.

46 citations