cmiller1137 and Ron McFadyen
A database is an organized collection of data. A database may be on paper or held in computer files such as spreadsheets or more formally in a software system known as a computerized database management system (for example DB2, db4o, IMS, MS Access, MS SQL Server, MySQL, Oracle, Sybase, Total, Versant). In this book, we focus on Relational databases and one specific relational database system: Microsoft Access.
There are many different commercial relational database systems and what you learn here will assist you in using those others. Because Microsoft Access is a workstation/personal system it is a convenient system for beginners.
1.1 Relational Databases
Relational Databases were introduced by E. F. Codd in 1969[1]; Codd’s 1970 paper[2] is considered one of the great papers in Computer Science.
We begin with a very small example: a database with one relation, the list of employees shown in figure 1.1. You should notice this looks just like a two-dimensional table of rows and columns. The name of the table is Employees, each column of the table has its own title, and each row has the same structure. Each row has a value for employee number, first name, last name, and gender. As tables of data appear in so many places (newspaper articles, textbooks, web pages, etc.) it is very likely you have seen and used this representation for data previously.
|
Employees |
|||
|
Employee ID |
First Name |
Last Name |
Gender |
|
123 |
Joe |
Smith |
Male |
|
333 |
Jim |
Jones |
Male |
|
456 |
April |
Smith |
Female |
|
842 |
Jenny |
Jones |
Female |
|
777 |
Tom |
Lee |
Male |
Figure 1.1 A list of employees
Let us assume the Employees table in figure 1.1 has one row for each employee who works for some hypothetical company. Data kept for each employee comprises their employee identification number, their first and last names, and their gender. Information structured in tables is very concise; at a glance, we can obtain useful information.
According to the database design methodology in Information Modeling and Relational Databases [3], a database designer must be able to express structured information as verbalizations. A verbalization that fits the information in one row of the Employees table is:
Employee with ID … has a first name …, a last name …, and is of … gender
In verbalizations like this, the ellipses are placeholders: we can use values from a single row to create complete statements that explain the meaning of a row. For example,
Employee with ID 123 has a first name Joe, a last name Smith, and is of Male gender
Employee with ID 333 has a first name Jim, a last name Jones, and is of Male gender
A similar approach to organizing knowledge about data appears in the literature on literacy. In the Journal of Reading several articles by Kirsch and Mosenthal discuss the organization of information and its conceptualization as document sentences. In Building Documents by Combining Simple Lists[4], Kirsch and Mosenthal present an example based on information from The World Almanac and Book of Facts: 1980 (Newspaper Enterprise Association, p. 427). That data is reproduced in figure 1.2.
|
Circulation of Leading U.S. Magazines |
|
|
Magazines |
Circulation |
|
TV Guide |
19,547,763 |
|
Reader’s Digest |
18,094,192 |
|
National Geographic |
10, 249,748 |
|
Better Homes & Gardens |
8,007,202 |
|
Family Circle |
7,611,578 |
|
Woman ‘s Day |
7,535,855 |
|
McCall’s |
6,502,880 |
Figure 1.2 Circulation of leading U.S. magazines
A major point the authors make is that such information can be re-conceptualized as a series of simple document sentences formed from a basic document sentence. This document sentence expresses an understanding of the tabular data in natural language. The document sentence for figure 1.2 is:
Magazine X has a circulation of Y.
Kirsch and Mosenthal use variables (X and Y) to stand for data that comes from a table. Taking values from a row, we plug values for X and Y into the document sentence to obtain sentence instantiations:
TV Guide has a circulation of 19,547,763.
Reader’s Digest has a circulation of 18,094,192.
National Geographic has a circulation of 10,249,748.
Better Homes & Gardens has a circulation of 8,007,202.
Family Circle has a circulation of 7,611,578.
Woman’s Day has a circulation of 7,535,855.
McCall’s has a circulation of 6,502,880.
Document sentences and verbalization sentences are essentially the same. Both sentences use natural language to express in words the meaning of tabular data. Whether one is designing databases or reading structured information, it can be useful for understanding to re-formulate data as statements in natural language.
Let us be a bit formal for a moment. Commercial relational database systems are systems where data is organized into relations. Figure 1.3 shows the general structure of a relation. We say a relation comprises a set of tuples where each tuple has the same number of attribute values, where each attribute value is taken from some corresponding domain, and where a domain represents a set of valid values for an attribute.
Figure 1.3 General structure of a relation
The Employees table in figure 1.1 can be considered a relation of 5 tuples where each tuple has 4 values drawn from each of the employee identifier, first name, last name, and gender domains.
Similarly, we can say the lists comprising the Circulation of leading U.S. Magazines in figure 1.2 can be considered a relation with 7 tuples each having 2 attribute values.
Relations are typically implemented in commercial databases as tabular structures comprising rows and a fixed number of columns. Everybody is familiar with tables as they are commonplace in textbooks, papers, magazines, etc. This simplicity of representation is one reason why relational databases have been very successful as repositories for important data.
1.1 Exercises
To design a database, a database engineer needs to find good representations of how an organization uses data. Good sources include: input forms, reports, web pages, etc. A challenge for database designers is to find these sources and interpret them.
-
- Consider the following table example of product information sold by ABC Foods. Verbalize the information presented.
Product ID
Product Name
Unit Price
Units In Stock
1
Black Tea
$2.00
44
2
Green Tea
$3.00
33
3
Vegetarian Lasagne
$10.00
20
4
Cajun Seasoning
$11.00
29
5
Cranberry Sauce
$21.00
0
- Consider the following report example that the Human Resources department of ABC Foods must produce. Verbalize the information in that report.
Employee ID
First Name
Last Name
Department
1
John
Smith
Receiving
2
Lee
Daniels
Sales
3
April
Turner
Sales
4
Thomas
Trump
Marketing
5
Lee
Smith
Marketing
- Suppose the following input form example is used to enter contact information. Verbalize the information that is being collected.
- Consider the following table example of product information sold by ABC Foods. Verbalize the information presented.
1.2 Microsoft Access
Microsoft Access is a relational database system for workstations that run the Microsoft Windows operating system. Microsoft Access (MS Access) is an integrated Microsoft Office suite application. Microsoft Access is typically used by individuals for data they use personally, but in some situations, a single MS Access database may be used by a group of people or small department.
Microsoft Access databases are stored in a single file that has a file suffix of “.accdb” or “.mdb”. Databases created using MS Access 2007 and later have a file suffix “.accdb”, and databases created using MS Access 2003 or earlier have a file suffix “.mdb”. We will be using databases where the files have names ending in “.accdb”. You need to use MS Access 2007 or later to open these databases.
This open resources will use Microsoft Office 365 – Access. If you are using a different Access version, your version may look slightly different. The basic Access functions applying to different versions of Access will be similar so you should be able to follow along without any problems.
Our first sample database is in a file named Library.accdb; this database is available from the website associated with this text. Open the file Library.accdb included in your data files and then save the database as Library.accdb onto your storage device.
To use this database, you must first download the file containing the database, and then open the database by double-clicking the file name:
Figure 1.4 Double-click the database file to open the sample database
When you open this database you see a list of objects (figure 1.5) in the database; you will see three tables: Book, Loan, Member:
Figure 1.5 The Home tab in MS Access shows you the table names
Double-click a table name and MS Access opens the table in Datasheet View; you can see the contents of Book in figure 1.6. The datasheet view for a table is easily obtained, but it’s not a particularly user-friendly way to view and manage data in a table. We will learn other ways of handling data with MS Access Forms. The Book table has three fields (i.e. attributes): callNo, title, author. When we view a table we see data organized into rows and columns. The data in one row corresponds to one book; if there are 11 books, then we have a table of 11 rows.
Figure 1.6 Datasheet View of a table
The Book table contains one row for each book in the library. We can verbalize the content of a row as: The book identified by call number … is titled … and is authored by …
Substituting actual values from rows we can make explicit statements such as:
- The book identified by call number PC 14 V48 1965 is titled Medieval miscellany and is authored by Frederick Whitehead
- The book identified by call number QA 76.76 A65P76 2011is titled
- Programming Android and is authored by Zigurd R Mednieks
Knowing that books are identified by their call number and since the above statements use the conjunction ‘and’, the above verbalization can be expressed in an elementary form as:
- The book identified by call number … is titled …
- The book identified by call number … is authored by …
Each of these expressions is considered elementary because each states one fact about a specific book. We cannot make these statements any simpler.
Of course, we can now substitute values from the table and obtain:
- The book identified by call number PC 14 V48 1965 is titled Medieval miscellany
- The book identified by call number PC 14 V48 1965 is authored by Frederick Whitehead
- The book identified by call number QA 76.76 A65P76 2011 is titled Programming Android
- The book identified by call number QA 76.76 A65P76 2011 is authored by Zigurd R Mednieks
At this point, expressing verbalizations this way may seem trivial and unnecessary, but they do serve a purpose. These scenarios make it clear that the title and the author’s name serve only to describe a book, and that the call number identifies the book. An aim of a database designer is to understand data requirements in terms of these elementary forms. We’ll have more to say about this in a later chapter.
Up to this point, we have learned how to
- open a Microsoft Access database;
- recognize that a database contains a number of tables;
- open a table to display a collection of rows and columns;
- verbalize the information in a table.
In the next section, we will examine the basic table features in Microsoft Access to insert, modify, and delete data records using our sample database named Library.accdb.
1.2 Exercises
Recall that an elementary verbalization is one where the verbalization cannot be simplified in any further way. Simpler statements would result in a loss of information.
- Rewrite the verbalization for the Employees table using elementary verbalizations.
- Is the verbalization given for Circulation of Leading U.S. Magazines in elementary form?
- What verbalizations apply to the Loan table in the Library database?
- What verbalizations apply to the Member table in the Library database?
- View the data in the Loan table. Each row in the table corresponds to a member borrowing a book. Notice how the call number field contains values that appear in the Book table and how the id field contains values that appear in the Member table. All rows have a value for the data borrowed field. Why would some of the date returned fields appear to have no value at all?
- The web site for these notes has a number of databases. Download the University database and examine its contents. This database contains information about departments and courses in a fictional university. Typically a university is organized into faculties which comprise departments and those departments offer courses. For instance many universities have a Faculty of Science which itself may contain departments such as Mathematics, Statistics, and Physics. Each of these departments will offer courses for students to take: Introduction to Calculus, Introduction to Statistics, Discrete Mathematics, etc.
1.2.1 Modifying Rows
With Microsoft Access, open the Library.accdb database file. Open the Book table in Datasheet view. With the cursor positioned in a row, try modifying the data recorded for that book. If you position the mouse cursor, you can change the value recorded for the book’s call number, title or author. Try doing this – remember you can always download this database again if you wish to get back to what you started with. As you begin modifying a value (e.g. adding an ‘s’ to make the last name Matthews) an editing symbol appears to the left of the row:
Figure 1.7 Editing a row
If you recognize that you are making a mistake you can undo your editing action by pressing the Escape key (keyboard).
To make your change permanent you must move the cursor to another row for the update to be completed – when you do this, you will note the editing symbol disappears.
In some situations, you will find MS Access provides a formal Undo capability. Consider the following figure that shows an Undo icon in the upper left corner that appeared after changed Matthew to Matthews and moving the cursor to the next row:
Figure 1.8 The Undo icon – click it and the last action is undone
1.2.2 New Record Button to Add New Records
Try adding a new book record to the Book table. You can add a new book by first clicking on the New Record button shown near the bottom of the window:
Figure 1.9 Add new record button
To complete your action you must type values for callNo, title, and author. As a first example use a call number that does not appear for any other book. As we will soon see the Book table is designed in such a way that each book must have a different call number. Your addition will be successful if your book is given a call number that no other book has. When you add a new row you must move the cursor out of the row for the addition to be completed.
As a second example try to add a new book, but this time, use a call number that already appears in the table. In this case, MS Access will reject your new record. Try this and you will see a response similar to:
Figure 1.10 MS Access message for duplicate primary key
The important part of this message for us is the part that refers to duplicate values or duplicate data. When we try to add a row with the same call number as some other row MS Access refers to the duplicated call number value. Note that you can press the Escape key to remove the new row from the table display. Soon we discuss table design where you will see that the call number field is designed to be the primary key of the Book table.
Adding a row to a table is also referred to as inserting, or appending a row.
1.2.3 Deleting Rows
You can remove a record in the Book table by highlighting a row (click in the cell just to the left of a call number) and then press the Delete key on the keyboard:
Figure 1.11 Delete a record: select record, press delete
When you press the Delete key, Microsoft Access will respond in one of two ways depending on whether or not there is an existing reference to the row you are trying to delete:
|
No, a reference to the book does not appear in the Loan table. |
|
|
Yes, a reference to the book appears in the Loan table. |
Figure 1.12 Attempting to delete a record (row) in an Access table Datasheet View
When you view the Loan table you are able to see the books that library members have taken out and whether or not a book has been returned. Rows in the Loan table have references to rows in the Book table and to rows in the Member table. The default action in MS Access is that deletion to this record is disallowed if there is a row in a table that has a similar reference to it. So we cannot delete a book if there is a Loan row referencing it.
We have briefly shown how to modify, add and delete data in tables. Next, we will introduce the design perspective for tables.
1.2.4 Table Design View
So far, we have been opening Access tables in Datasheet View where we can view and change data in rows of a table. When in Datasheet View, we can switch from datasheet view to Design View by clicking on the design icon located near the upper left hand corner (see figure 1.13). When the Design View icon is clicked, the display changes: the icon becomes a Datasheet View icon and the display changes to reveal design information (see figure 1.14).
Figure 1.13 Click on the Design View icon to switch to Design View
When you click the Design View icon, you will see the display change as shown in figure 1.14. You will see the field names listed along with their data type. According to the field where the cursor is located, you see other properties for that field. Data types vary somewhat from one database system to another, but of course there are many similarities too. Properties are other characteristics that you can define for a field such as the maximum length of values stored for the field.
Generally, we want data in a database to be reasonable and correct. We can use data types and properties to achieve certain types of correctness. Consider the following integrity rules as rules we would like to enforce:
- Call numbers, titles, and authors are alphanumeric. Any text you can type on the keyboard is acceptable.
- Each call number must be unique (there can be no duplicates)
- Each book must have a title
- A value for call number must be no more than 50 characters long
- A value for title must be no more than 255 characters long
- A value for author must be no more than 255 characters long
The author field can be left out (it can be null).Now we discuss how these integrity rules are obtained in Table Design View.
Figure 1.14 Table Design View
In figure 1.14, the cursor is located on the callNo field; some properties of callNo are circled and discussed below:
- Beside the callNo field you can see the key icon. This means the callNo field is the primary key. A primary key is a unique identifier – every row in the table must have a unique value in that field. Every table should have a PK specified and there can be only one PK for a table. When a field is defined as the PK then a value must be provided in each and every row.
- The callNo field has a datatype of Short Text and a field size of 50. Any value you can type on the keyboard is acceptable but the overall length, number of characters, is restricted to at most 255.
- The callNo field is indexed and in this case no duplicates are allowed. The index constructed by MS Access is similar in purpose to the index at the back of any book: the index allows MS Access to quickly locate a specified row. However, this index is different from that at the back of a book because it allows only one entry per indexed value (No Duplicates is specified for the Indexed property). Each call number is unique. As you move the cursor up and down you should note the following for this sample table: For title:
- The title field has a data type of Long Text with a field size of 255. A text field can comprise any combination of letters, digits, and punctuation. Any value entered by a user cannot exceed a gigabyte of text. Access forms and reports can only display 64,000 characters.
- A value is required. When entering data for some book, the user cannot omit the title.
- There is no index on title.
For author:
- The author field has a data type of Long Text with a field size of 255. A text field can comprise any combination of letters, digits, and punctuation. Any value entered by a user cannot exceed a gigabyte of text.
- A value is not required. When entering data for some student, the user can omit the author.
- There is no index on author.
Now, open the Member table and then the Loan table in Design view. Examine the properties of each field. For reference see figure 1.15.
Member Table
|
Loan Table
| Field | Data Type |
| callNo | Short Text |
| id | Number: numeric values must be entered by the user. This data type is a proper match to the Autonumber data type. |
| dateBorrowed | Date/Time: the user must enter a date |
| dateReturned | Date/Time |
| dateDue | Date/Time |
| fine | Currency: the user must enter values in dollars and cents. |
Figure 1.15 The fields of the Member and Loan tables
Later, we will examine data types and properties in much more detail.
1.24 Exercises
Use your Library.accdb to complete the following:
- Use Design View to add the following two fields to the Member table.
Gender: Text field of length 25 to accommodate the values male, female, and prefer not to disclose. Make this a required field that is not indexed.
BirthDate: a Date/Time field; required; not indexed. Switch back to Datasheet View (You must reply yes to the system prompt to save your changes). You should notice there are no values for gender nor birthDate. - Now enter values you deem appropriate in the gender and birthdate fields for each member. Close the table and reopen it. You will see the values you entered are still there.
- When new members join the library information about them must be entered into the Member table. Each member is given an id value automatically. Add new members to the library and note how MS Access will not let you enter id values; instead, MS Access generates those values for you – id values are generated sequentially. Close the table and then reopen the table to confirm your additions worked.
- Typically, a library assesses a fine the user must pay if they keep a book out past the due date. As well the library needs to track the amount, if any, the member has paid. In this exercise, add two fields to the Loan table so we can keep track of fines that are assessed and the amount the member has paid.
- Open the Loan table in design view and add two new fields named fineAssessed and finePaid. These fields must have a Currency data type. In lower pane of the Design view window, verify the Field Properties for fineAssessed and finePaid fields. Format property should contain a Currency data type and Decimal Places property contains Auto.
- Save the Loan table and then view the rows of the table. There are no amounts for these fields.
- Choose some row(s) in the Loan table and enter values for the fineAssessed and finePaid fields. Note the values you enter will appear as dollars and cents.
- In exercise 3, you added a new member and in exercise 4 you added fields to the Loan table. Consider that the person you added now borrows a book and so a row must be entered into the Loan table. Enter such a row.
- After successfully entering data for exercises 3, 4, and 5, you are aware of a member and a book for which there are references in the Loan table.
- View the Member table and try to delete that member, and then view the Book table and try to delete that book. These deletion attempts are unsuccessful because of the references to the Loan table.
- Now open the Loan table and find the loan record you entered in exercise 5. If you delete this row you will find that you are able to delete the member (provided you did not enter more loans for this person). These actions mirror the way in which data would typically be deleted from a database: if you want to delete a row you must first delete (or modify appropriately) any rows that reference it.
- Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks, IBM Research Report, 1969. ↵
- A Relational Model of Data for Large Shared Data Banks, CACM 13, No. 6, June 1970. ↵
- Information modeling and relational databases, 2nd edition, by Terry Halpin and Tony Morgan; Morgan Kaufmann Publishers; ISBN -13 978-0-12-373568-3. ↵
- Irwin S. Kirsch and Peter B. Mosenthal. Building documents by combining simple lists. Journal of Reading, Vol. 33, No. 2, pp. 132-134. ↵