We have covered the basics of entity-relationship modeling and now we will extend our design capabilities to include supertypes and subtypes. It is often the case that an entity type has subgroupings that are useful to include in a data model. For instance, in a University environment, persons could be grouped into employees and students. Courses could be grouped into graduate courses and undergraduate courses.
Previously, we considered a Library database where one entity type was book; instances of book are loaned out to library members. A library could have many other kinds of things that it loans out such as videos and magazines. A more general thing the library loans out can be referred to as an item. Videos, magazines, and books can be considered subtypes of item.
We will consider only supertype and subtype hierarchies. Hierarchies arise when an entity type appears as a subtype of only one supertype. So we are disallowing cases where an entity type has two or more supertypes.
B.1 Drawing Supertypes and Subtypes on the Entity Relationship Diagram
There are different ways that supertypes and subtypes can be shown on an Entity Relationship diagram (ERD or ER diagram). We will continue with the Peter Chen notation in this appendix. Between a supertype and its subtypes, we show a connection symbol (a circle) where one line is drawn from the supertype to the connection symbol and then lines are drawn from the connection symbol to each subtype.
A collection of related subtypes can be regarded as overlapping or disjoint. Subtypes are considered as disjoint if it is impossible for an instance of a supertype to be regarded as being an instance of more than one subtype. For example, a library item will be one of the subtypes (and only one). Subtypes are considered as overlapping if it is possible for an instance of a supertype to be regarded as being an instance of more than one subtype. An example of overlapping can exist with people in a university environment: it is possible that some person could be both an employee and a student at the same time. In our Peter Chen notation, we will use a “d” in the connection symbol to represent disjoint subtyping, and we will use “o” to represent overlapping.
In our notation, we also include an arc on each of the lines joining the connection symbol to the subtypes that implies “containment”. To illustrate the drawing technique, consider a library where items are loaned to members and where an item can be either a video, a magazine, or a book. Suppose also that an item belongs to exactly one (i.e. disjoint subtypes) of these subtypes. We can show this as:
We extend our notation once more. To indicate that a supertype must exist as one of its subtypes, we show total participation in subtyping by using a double line. For example, if we want to show that each item must be one of the subtypes:
The double line from Item to the connection symbol shows total participation of Item in the subtyping: whenever there is an instance of an Item, then that item must also be one of the subtypes shown – a video, a magazine, or a book. If we did not specify total participation then we would be allowing an item to exist where that item is not a video, nor is it a magazine, nor is it a book. So, participation of a supertype in the subtyping is either total or optional. The converse is always true: if we have an instance of a subtype then that instance is an instance of the supertype. In the library model then, if we have an instance of book then that instance is of course an item.
B.2 Supertypes, Subtypes AND Relationships
If a supertype participates in a relationship then all of its subtypes also participate in that relationship. We say that a supertype’s relationships are inherited by its subtypes. The converse is not true: if the model specifies specifically that a subtype participates in a relationship, then its siblings (other entity types that are subtypes of the same supertype) and its supertype do not participate in that relationship.
As an example, consider that members can borrow items (i.e. any item of any type) from the library but only books have authors. Our model can be extended as follows:
This model excludes the database from storing an author of a magazine or that a video has an author, but the model allows videos, magazines, and books to be borrowed by members.
B.3 Supertypes, Subtypes and Attributes
All entity types including supertypes and subtypes can have attributes. Continuing with our library example suppose:
- All items have a call number and a title, and call number is a key (each item has a unique call number);
- Videos have a duration (time required to play);
- Books have a length (number of pages).
Just as subtypes inherit relationships, they also inherit any attributes of their supertype. We also have know that supertypes do not inherit the attributes of their subtypes. Attributes that are common to a supertype and its subtypes are only shown at the supertype. Consider our model now:
Our examples have been two-level hierarchies. In general, a hierarchy can be as many levels as the designer requires. For instance, books could be categorized as fiction and non-fiction and so book can be a subtype of item and at the same time a supertype of fiction and non-fiction.
B.3.1 Discriminator Attributes
It is common for designers to introduce or discover an attribute such that its value can be used to explicitly determine the subtype an entity belongs to. For example, the item entity type can have an attribute, say itemType, which can have a value from the domain {“video”, “magazine”, “book”}. When this is done, the diagram must include the attribute of course, but additionally the attribute is shown as a discriminator attribute for subtyping purposes and the pertinent value for discriminating shown as well.
Below, you will see how these are laid out above and below the connection symbol.
This works well for disjoint subtyping, but not necessarily for overlapping subtypes. When overlap is possible, a designer may include a discriminator for each subtype, and so there are as many discriminator attributes as there are subtypes. Typically, this is a boolean-valued attribute. In the overlapping case, we not show discriminating values on the diagram.
B.4 Mapping Supertypes and Subtypes To A Relational Database
In chapter 8, we covered rules to be used when an ERD is mapped to a relational database. In this section, we add rules for mapping supertypes and subtypes to relations. There are three basic options a designer considers when mapping these structures to a database:
- Create a relation for each entity type in the hierarchy.
- Create relations for only the bottom-most entity types.
- Create one relation to represent the whole hierarchy.
We use two examples to exhibit the mapping options; one where total participation is specified for the supertype and the other where participation is optional.
The previous library model is modified to show that an item can be out on loan to a member, and that one of the subtypes, video, is produced by a producer:
A university model applies where a person may be a student and/or an employee, and where students declare a major subject area:
Regardless of the option selected for hierarchies, the rules for mapping an ERD to a relational database discussed previously (Chapter 8) still apply. We must apply rules regarding relationships and attributes consistently. For example, if any entity type in a hierarchy is involved in a one-to-many relationship we must ensure the proper use of foreign keys.
B.4.1 Relations For All Entity Types
With this option, each entity type in a hierarchy is represented by its own relation. Important points here are that
- All relations representing entity types in the same hierarchy have the same primary key.
- The primary key of a subtype relation will also be a foreign key that references its supertype relation.
- Attributes of a supertype (except for the primary key) appear only in the relation that represents the supertype.
Example: The library model maps to the following relational design:
Note the foreign keys:
- Item has a foreign key referencing Member
- Video has a foreign key referencing Producer
- Each of Video, Book, and Magazine has a foreign key referencing Item. If a row exists in Video, Book, or Magazine then there must be a corresponding row in Item.
The tables are shown here with sample data. Note that
- Each row of Video, Book, and Magazine has a related row in Item
- Some items are out on loan to a member
- Each video has a producer
In the relationships diagram, note the one-to-one relationships between the supertype relation and each of its subtype relations:
Example: Now consider the university model. The relational design for this mapping option:
Since subtyping is optional in the university model, there can be a row in Person with no corresponding row in Employee or Student. A person does not have to exist as one of the subtypes.
Note the foreign keys:
- Student has a foreign key referencing SubjectArea
- Employee and Student have foreign keys referencing Person. If a row exists in Employee or Student then a corresponding row must exist in Person.
We will now show tables with some sample data and the relationships diagram.
A sample database is presented below. Note that person 2 is both a student and an employee, and that person 4 is neither a student nor an employee.
In the relationships diagram, note the relationships are one-to-one between the supertype relation and each of its subtype relations:
B.4.2 Relations For Bottom-Most Entity Types
In this case, relations are created for only entity types that are at the “bottom” of the hierarchy. There are no relations created for a supertype. Important points here are that
- All relations derived from entity types in the same hierarchy will have the same primary key.
- No primary key value can be repeated (We have not seen how to handle this in MS Access. Further study of relational systems can include techniques that automate the checking for this kind of integrity constraint.)
- Attributes of a supertype must be included in each of its subtype relations.
Example: For the library model and since there is total participation in subtyping, this option works well. Every item will be stored in a relation, and each item is stored exactly once. The resulting design:
Note the foreign keys:
- Because there is no Item relation, each of Video, Book, and Magazine have foreign keys referencing Member.
- Video is the only relation with a foreign key referencing Producer.
An issue the designer should be aware of is that callNumbers across the three relations must be unique (call number is the primary key of Item). Further study of database systems is needed to know how this rule can be enforced.
It is left as an exercise for the student to create a database with sample data.
Example: Consider the university model. This approach (creating relations for bottom-most entity types) is not suitable for the university model because of the overlapping subtypes and because the participation in subtyping is not total. Applying the option we have:
If an entity exists in more than one subtype then such an entity will have data stored redundantly in the database. In the design above if a person is both an employee and a student then that person’s first and last names would be stored twice (in two different relations).
The Employee and Student relations are not sufficient to store Person data. The participation is optional and so a person may exist who is neither an employee nor a student; in such a case the data for the person cannot be stored!
It is left as an exercise for the student to create a database with sample data.
B.4.3 One Relation Representing The Whole Hierarchy
When this option is applied, one relation is created for a complete hierarchy. All attributes appearing in the hierarchy are placed in one relation. Note that the value of a discriminator attribute will enable the user to know easily the subtype of a particular entity. For our example models, when we map a hierarchy to a single relation we obtain very simple relational designs.
It is left as an exercise for the student to create databases with sample data.
Example: The library model maps to the following design
In the Item relation, the itemType attribute indicates if the row represents a video, a magazine, or a book. The memberId may have a value if the item is out on loan. producerId can only have a value if itemType is “video”.
Example: When mapping a hierarchy to a single relation for the university model, the designer should include discriminator attributes that are boolean-valued with one discriminator attribute per subtype. Applying this option to the university model we have:
With this database, each person is stored at most once in the database. There is no duplicated data as with the previous mapping option.
If a person is neither an employee nor a student then the only attributes that can have values are: personId, employeeFlag, studentFlag, first and last – the others must be null. The values of employeeFlag and studentFlag would be false.
Exercises
- Consider the database designs illustrated in this appendix. Implement one or more of these and populate with data.
- Consider the two designs used in the examples of this appendix. Combine these two designs by replacing Member with the Person hierarchy. Illustrate the relational structures when the model is mapped to a database. Choose mapping options for the hierarchies.
- Consider the design you created in Exercise 2 but modify the one-to-many borrows relationship to be a many-to-many with attributes dateBorrowed and dateReturned where dateBorrowed is a discriminator for the relationship. Recall this discriminator is not the same as the discriminators suggested for mapping supertypes and subtypes.Note that this modification to the library example will allow history to be recorded for the borrowing of items.
- For Exercise 3, create the database and populate the database with sample data.
- Create an ERD for a service station business that provides goods and services to its customers. Typically, a customer comes in with their vehicle and requests certain work to be performed. For example a customer may request an oil change and for a new set of four tires to be provided and installed.
The work items that can be performed or supplied can be of two types: a service (such as the oil change) and actual physical items (such as litres of oil). There will be several services that can be performed such as tire installation, changing oil, or fixing a flat tire. Each of these will have some cost to be charged to a customer. There are many concrete items that are supplied and charged to a customer such as fan belts, litres of oil, or tires – these are things that are kept in inventory. Consider creating a hierarchy for products (goods / services); make up reasonable attributes.
This service station has customers that fall into two groups: some are private individuals and others are businesses. Individuals will have a first name, last name, address and phone number. A business will have a business name, address, phone number and a contact person who has a first name and last name. Consider creating a hierarchy for customers.
The service station needs to keep track of all the goods and services it provides to its customers so that it has a historical record and knows what it has charged to each customer. Each visit to the service station by a customer will generate a work order that keeps track of the work that was done for the customer’s vehicle. Vehicles have license plate numbers, and other attributes to describe them (make, model, color, …). For each visit of a customer to the station, the system needs to know the date the visit occurred, the details of the work performed and goods provided, and the total charge to the customer.