The Entity-Relationship Model - Toward a Unified View of Data

PETER PIN-SHAN CHEN
Massachusetts Institute of Technology

A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and the diagrammatic technique is given. Some implications for data integrity, information retrieval, and data manipulation are discussed.

The entity-relationship model can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model. Semantic ambiguities in these models are analyzed. Possible ways to derive their views of data from the entity-relationship model are presented.

Key Words and Phrases: database design, logical view of data, semantics of data, data models, entity-relationship model, relational model, Data Base Task Group, network model, entity set model, data definition and manipulation, data integrity and consistency

CR Categories: 3.50, 3.70, 4.3, 4.34

1. INTRODUCTION

The logical view of data has been an important issue in recent years. Three major data models have been proposed: the network model [2, 3, 7], the relational model [8], and the entity set model [25]. These models have their own strengths and weaknesses. The network model provides a more natural view of data by separating entities and relationships (to a certain extent), but its capability to achieve data independence has been challenged [8]. The relational model is based on relational theory and can achieve a high degree of data independence, but it may lose some important semantic information about the real world [12, 15, 23].

The entity set model, which is based on set theory, also achieves a high degree of data independence, but its viewing of values such as "3" or "red" may not be natural to some people [25].

This paper presents the entity-relationship model, which has most of the advantages of the above three models. The entity-relationship model adopts the more natural view that the real world consists of entities and relationships. It incorporates some of the important semantic information about the real world (other work in database semantics can be found in [1, 12, 15, 21, 23, and 29]). The model can achieve a high degree of data independence and is based on set theory and relation theory.

Copyright © 1975, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted provided that ACM's copyright notice is given and that reference is made to the publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery. A version of this paper was presented at the International Conference on Very Large Data Bases, Framingham, Mass., Sept. 22-24, 1975. Author's address: Center for Information System Research, Alfred P. Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139. ACM Transactions on Database Systems, Vol. 1, No. 1, March 1976, Pages 9-36.

The entity-relationship model can be used as a basis for a unified view of data.

Most work in the past has emphasized the difference between the network model and the relational model [22]. Recently, several attempts have been made to reduce the differences of the three data models [4, 19, 26, 30, 31]. This paper uses the entity-relationship model as a framework from which the three existing data models may be derived. The reader may view the entity-relationship model as a generalization or extension of existing models.

This paper is organized into three parts (Sections 2-4). Section 2 introduces the entity-relationship model using a framework of multilevel views of data.

Section 3 describes the semantic information in the model and its implications for data description and data manipulation. A special diagrammatic technique, the entity-relationship diagram, is introduced as a tool for database design. Section 4 analyzes the network model, the relational model, and the entity set model, and describes how they may be derived from the entity-relationship model.

2. THE ENTITY-RELATIONSHIP MODEL

2.1 Multilevel Views of Data

In the study of a data model, we should identify the levels of logical views of data with which the model is concerned.

Extending the framework developed in [18, 25], we can identify four levels of views of data (Figure 1):

(1) Information concerning entities and relationships which exist in our minds.
(2) Information structure: organization of information in which entities and relationships are represented by data.
(3) Access-path-independent data structure: the data structures which are not involved with search schemes, indexing schemes, etc.
(4) Access-path-dependent data structure.

In the following sections, we shall develop the entity-relationship model step by step for the first two levels. As we shall see later in the paper, the network model, as currently implemented, is mainly concerned with level 4; the relational model is mainly concerned with levels 3 and 2; the entity set model is mainly concerned with levels 1 and 2.

2.2 Information Concerning Entities and Relationships (Level 1)

At this level we consider entities and relationships. An entity is a "thing" which can be distinctly identified.

A specific person, company, or event is an example of an entity. A relationship is an association among entities. For instance, "father-son" is a relationship between two "person" entities.¹

¹ It is possible that some people may view something (e.g. marriage) as an entity while other people may view it as a relationship. We think that this is a decision which has to be made by the enterprise administrator [27]. He should define what are entities and what are relationships so that the distinction is suitable for his environment.


There is a predicate associated with each entity set to test whether an entity belongs to it. For example, if we know an entity is in the entity set EMPLOYEE, then we know that it has the properties common to the other entities in the entity set EMPLOYEE. Among these properties is the aforementioned test predicate. Let $E_i$ denote entity sets. Note that entity sets may not be mutually disjoint. For example, an entity which belongs to the entity set MALE-PERSON also belongs to the entity set PERSON. In this case, MALE-PERSON is a subset of PERSON.

2.2.2 Relationship, Role, and Relationship Set. Consider associations among entities. A relationship set, $R_i$, is a mathematical relation [5] among $n$ entities, each taken from an entity set:

$$\{[e_1, e_2, \ldots, e_n] \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\},$$

and each tuple of entities, $[e_1, e_2, \ldots, e_n]$, is a relationship. Note that the $E_i$ in the above definition may not be distinct. For example, a "marriage" is a relationship between two entities in the entity set PERSON.
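As an illustration of the two definitions above (an entity set with a membership predicate, and a relationship set as a set of tuples whose components are drawn from entity sets), here is a minimal Python sketch. The entity names, the helper `is_relationship_set`, and the sample MARRIAGE tuples are hypothetical and are not taken from the paper.

```python
# Hypothetical entities, represented as plain strings for illustration only.
UNIVERSE = {"peter", "sam", "mary", "ann", "acme"}
_people = {"peter", "sam", "mary", "ann"}
_males = {"peter", "sam"}


def is_person(e):
    """Membership predicate for the entity set PERSON."""
    return e in _people


def is_male_person(e):
    """Membership predicate for the entity set MALE-PERSON."""
    return is_person(e) and e in _males


# Entity sets are the entities that satisfy the corresponding predicate.
PERSON = {e for e in UNIVERSE if is_person(e)}
MALE_PERSON = {e for e in UNIVERSE if is_male_person(e)}


def is_relationship_set(tuples, *entity_sets):
    """Check that the i-th component of every tuple belongs to the i-th entity set."""
    return all(
        len(t) == len(entity_sets)
        and all(e in s for e, s in zip(t, entity_sets))
        for t in tuples
    )


# A relationship set over PERSON x PERSON: the entity sets need not be distinct.
MARRIAGE = {("peter", "mary"), ("sam", "ann")}
```

Here `MALE_PERSON <= PERSON` holds, which mirrors the remark that entity sets may overlap, and `is_relationship_set(MARRIAGE, PERSON, PERSON)` checks the tuple definition directly.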

The role of an entity in a relationship is the function that it performs in the relationship. "Husband" and "wife" are roles. The ordering of entities in the definition of relationship (note that square brackets were used) can be dropped if roles of entities in the relationship are explicitly stated as follows: $(r_1/e_1, r_2/e_2, \ldots, r_n/e_n)$, where $r_i$ is the role of $e_i$ in the relationship.

2.2.3 Attribute, Value, and Value Set. The information about an entity or a relationship is obtained by observation or measurement, and is expressed by a set of attribute-value pairs. "3", "red", "Peter", and "Johnson" are values. Values are classified into different value sets, such as FEET, COLOR, FIRST-NAME, and LAST-NAME. There is a predicate associated with each value set to test whether a value belongs to it. A value in a value set may be equivalent to another value in a different value set.

For example, "12" in value set INCH is equivalent to "1" in value set FEET. An attribute can be formally defined as a function which maps from an entity set or a relationship set into a value set or a Cartesian product of value sets:

$$f: E_i \text{ or } R_i \rightarrow V_i \text{ or } V_{i_1} \times V_{i_2} \times \cdots \times V_{i_n}.$$

Figure 2 illustrates some attributes defined on entity set PERSON. The attribute AGE maps into value set NO-OF-YEARS. An attribute can map into a Cartesian product of value sets. For example, the attribute NAME maps into value sets FIRST-NAME and LAST-NAME.
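The view of an attribute as a function from an entity set into a value set (or a Cartesian product of value sets) can be sketched in Python as follows. This is only an illustration under stated assumptions: the entity identifiers `e1` and `e2`, the sample ages and names, and the concrete membership tests chosen for each value set are hypothetical.

```python
# Value sets, each with a membership predicate (the concrete tests are assumptions).
def in_no_of_years(v):
    return isinstance(v, int) and v >= 0


def in_first_name(v):
    return isinstance(v, str) and v != ""


def in_last_name(v):
    return isinstance(v, str) and v != ""


# Hypothetical observations about two PERSON entities.
_AGE = {"e1": 25, "e2": 40}
_NAME = {"e1": ("Peter", "Jones"), "e2": ("Sam", "Johnson")}


def age(entity):
    """Attribute AGE: PERSON -> NO-OF-YEARS."""
    return _AGE[entity]


def name(entity):
    """Attribute NAME: PERSON -> FIRST-NAME x LAST-NAME (a single tuple of values)."""
    return _NAME[entity]


def inch_to_feet(inches):
    """Equivalence between value sets: 12 in INCH corresponds to 1 in FEET."""
    return inches / 12
```

Because each attribute is an ordinary function, it returns exactly one value (or one tuple of values) per entity, which is the property the formal definition requires.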

Note that more than one attribute may map from the same entity set into the same value set (or same group of value sets). For example, NAME and ALTERNATIVE-NAME map from the entity set EMPLOYEE into value sets FIRST-NAME and LAST-NAME. Therefore, attribute and value set are different concepts although they may have the same name in some cases (for example, EMPLOYEE-NO maps from EMPLOYEE to value set EMPLOYEE-NO).

This distinction is not clear in the network model and in many existing data management systems. Also note that an attribute is defined as a function. Therefore, it maps a given entity to a single value (or a single tuple of values in the case of a Cartesian product of value sets).

Note that relationships also have attributes. Consider the relationship set PROJECT-WORKER (Figure 3). The attribute PERCENTAGE-OF-TIME, which is the portion of time a particular employee is committed to a particular project, is an attribute defined on the relationship set PROJECT-WORKER.

It is neither an attribute of EMPLOYEE nor an attribute of PROJECT, since its meaning depends on both the employee and project involved. The concept of attribute of relationship is important in understanding the semantics of data and in determining the functional dependencies among data.

2.2.4 Conceptual Information Structure. We are now concerned with how to organize the information associated with entities and relationships. The method proposed in this paper is to separate the information about entities from the information about relationships.

[Fig. 2. Attributes defined on an entity set: entity set E1 (EMPLOYEE), attributes such as F4 (AGE), and their value sets]

Figure 4 illustrates in table form the information about entities in an entity set. Each row of values is related to the same entity, and each column is related to a value set which, in turn, is related to an attribute. The ordering of rows and columns is insignificant. Figure 5 illustrates information about relationships in a relationship set. Note that each row of values is related to a relationship which is indicated by a group of entities, each having a specific role and belonging to a specific entity set.
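The table forms described above can be sketched with Python dictionaries: one table per entity set, and a separate table per relationship set whose rows are identified by a group of (role, entity) pairs. All identifiers and values below (`e1`, `p1`, the employee numbers, the percentages, the role names `worker` and `project`) are hypothetical, and the dictionary layout is an assumption made for illustration rather than the paper's own notation.

```python
# Entity tables: one row per entity, one column per attribute.
# Dictionaries are keyed by name, so row and column order carry no meaning.
EMPLOYEE = {
    "e1": {"EMPLOYEE-NO": 2566, "NAME": ("Peter", "Jones"), "AGE": 25},
    "e2": {"EMPLOYEE-NO": 3378, "NAME": ("Sam", "Johnson"), "AGE": 40},
}
PROJECT = {
    "p1": {"PROJECT-NO": 31},
    "p2": {"PROJECT-NO": 254},
}


def relationship(**roles):
    """Identify a relationship by its (role, entity) pairs; order is irrelevant."""
    return frozenset(roles.items())


# Relationship table: each row belongs to a relationship, not to a single entity.
PROJECT_WORKER = {
    relationship(worker="e1", project="p1"): {"PERCENTAGE-OF-TIME": 20},
    relationship(worker="e1", project="p2"): {"PERCENTAGE-OF-TIME": 80},
    relationship(worker="e2", project="p1"): {"PERCENTAGE-OF-TIME": 100},
}

# Each role draws its entity from a specific entity set.
ROLE_ENTITY_SET = {"worker": EMPLOYEE, "project": PROJECT}


def well_formed(rel_table):
    """Check that every entity in every row belongs to the entity set of its role."""
    return all(
        entity in ROLE_ENTITY_SET[role]
        for key in rel_table
        for role, entity in key
    )
```

Keeping PERCENTAGE-OF-TIME in the relationship table, rather than in EMPLOYEE or PROJECT, reflects the separation of information about entities from information about relationships.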

Note that Figures 4 and 2 (and also Figures 5 and 3) are different forms of the same information. The table form is used for easily relating to the relational model.

[Fig. 3. Attributes defined on the relationship set PROJECT-WORKER]

2.3

It is possible that more than one attribute is needed to identify the entities in an entity set. It is also possible that several groups of attributes may be used to identify entities.

[Figure: table fragment showing attribute (AGE), value set (NO-OF-YEARS), and sample values (PETER), (JONES), (SAM)]

Basically, an entity key is a group of attributes such that the mapping from the entity set to the corresponding group of value sets is one-to-one. If we cannot find such one-to-one mapping on available data, or if simplicity in identifying entities is desired, we may define an artificial attribute and a value set so that such mapping is possible. In the case where