7.1 Introduction and Background

How was information managed before technology became a major influence in our lives? Before databases, tracking information was difficult, and the pen-and-paper method left plenty of room for error. It was not until the 1960s that computer-based databases came into use. Even so, most computerized databases still rely on principles and methods developed in that earlier age.

Databases are now used everywhere to store information. Whether in a customer management system or in tracking bank information, databases store the necessary data for later use. In a relational database, data is structured in rows and columns, with fields that can be queried, and is stored in multiple tables that capture the relationships among the data. According to Oracle.com, “Databases have evolved dramatically since their inception in the early 1960s” (Oracle). In the beginning, only navigational databases, such as hierarchical and network databases, were employed. As time went on, new types of databases were created based on the needs of organizations and the management of their data.

Source: https://www.smartsheet.com/database-management

As the figure above shows, a database is used through a database management system, or DBMS for short. A DBMS is a form of software that allows an organization to access and manipulate data, and that presents the data in a form other applications and users cannot change on their own.

Limitations of Conventional File Processing

Files are used to store specific data for future use and retrieval. When computers first became mainstream, files were stored much like paper records, in the form of flat files. The information was kept in plain text, with values separated by spaces, commas, semicolons, or other symbols. Files were often organized by category, each holding only related information under a specific name. The downside was that the files could not be opened and edited without a program written in a specific programming language. While this appeared convenient at the time, it is easy to identify the many disadvantages of the system.
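To make the flat-file idea concrete, here is a minimal Python sketch of a program reading such a file. The file contents, the field names, and the `parse_flat_file` helper are all assumptions made up for illustration. Notice that the meaning of each position in a line lives in the program, not in the file itself.

```python
# Hypothetical flat-file contents: one customer per line, fields separated by commas.
FLAT_FILE = """\
Ana Diaz,12 Oak St,Albany
Ben Cho,9 Elm Ave,Troy
"""

# The field order is known only to the program that reads the file (an assumption here).
FIELD_NAMES = ["name", "address", "city"]


def parse_flat_file(text, delimiter=","):
    """Turn each non-empty line into a dict keyed by FIELD_NAMES."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        values = line.split(delimiter)
        records.append(dict(zip(FIELD_NAMES, values)))
    return records


records = parse_flat_file(FLAT_FILE)
print(records[0]["city"])
```

If another programmer wrote a similar file with semicolons or with the fields in a different order, this program would silently misread it, which hints at the problems discussed in the rest of this section.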

Data Redundancy

One of the major problems with this system was data redundancy and inconsistency. Since the files, and the programs that used them, were created by several different programmers over a long period of time, the files were almost certain to be in different formats and to involve several different programming languages. Much of the information was also duplicated, because it was tedious to read through others’ code and check whether the data already existed. For example, if a bank customer had two accounts, the customer’s data would be stored in two separate files, one for each account as it was opened. This is data redundancy: the same information takes up more storage, which increases cost.
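The bank example can be sketched in a few lines of Python. The two "files" below are hypothetical in-memory stand-ins, and the record layout (account number, name, address, city, balance) is an assumption chosen for illustration.

```python
# Hypothetical contents of two account files maintained by different programs.
checking_file = ["1001,Ana Diaz,12 Oak St,Albany,250"]
savings_file = ["2001,Ana Diaz,12 Oak St,Albany,900"]


def customer_details(line):
    """Extract the customer fields (name, address, city) from one account record."""
    account_id, name, address, city, balance = line.split(",")
    return (name, address, city)


# Redundancy: the same customer details are stored once per account.
duplicated = customer_details(checking_file[0]) == customer_details(savings_file[0])

# An address change is applied to one file only; the copies now disagree.
checking_file[0] = "1001,Ana Diaz,77 Pine Rd,Albany,250"
consistent = customer_details(checking_file[0]) == customer_details(savings_file[0])

print(duplicated, consistent)
```

After the partial update, nothing in either file indicates which address is the correct one.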

Data Accuracy

The many copies of the same data could also contain discrepancies, making it impossible to know which copy was accurate. Whenever a value changed, every file containing that data had to be updated to prevent this. The work was tedious and still not guaranteed to be accurate in the end.

Retrieving data was just as laborious. For example, a company might store customer data, including name, address, and city, and then receive a request for the records of customers who live in a specific city. To fulfill it, a new program had to be written and executed, the file containing the customers’ cities had to be accessed, and every customer in that city had to be selected and copied out by the new program. This is neither convenient nor reliable. The duplicate copies also made it harder to create new applications, since developers might be unable to find the appropriate data.

This structure also made atomicity hard to achieve. Atomicity is the property that a sequence of database operations either all occur or none occur. It prevents an update from being applied only partially. However, atomicity cannot work unless the system is able to read and write every file involved, which in this structure is extremely difficult.
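The all-or-nothing behavior of atomicity can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `accounts` table, its contents, and the `transfer` helper are hypothetical. The sketch relies on the documented behavior of using a connection as a context manager, which commits if the block succeeds and rolls back if it raises an exception.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from one account to another; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # triggers the rollback
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds: both balances change
try:
    transfer(conn, 1, 2, 500)  # fails midway: the first update is undone
except ValueError:
    pass

balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(balances)
```

The failed transfer leaves no trace, because the withdrawal that had already been made is rolled back along with everything else in the transaction.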

Data was also difficult to access because of the “spaghetti code” structure of this system. If information needed to be organized in a new way, and that need had not been anticipated when the system was initially created, it was nearly impossible to do so, because the application needed to display the information in the requested way did not exist. The system did not allow data to be retrieved in a convenient manner, which led to different systems being created down the line.
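For contrast, here is a rough sketch of how the earlier request for customers in a specific city looks when the data sits in a database: no new application is needed, only a query. The table layout, the sample rows, and the `customers_in_city` helper are assumptions for illustration, using Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ana Diaz", "12 Oak St", "Albany"),
        ("Ben Cho", "9 Elm Ave", "Troy"),
        ("Cy Park", "3 Bay Rd", "Albany"),
    ],
)


def customers_in_city(conn, city):
    """Return the names of customers in one city, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


albany = customers_in_city(conn, "Albany")
print(albany)
```

A request that nobody anticipated, such as grouping customers by address instead, would only require a different query against the same table.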

Integrity problems also arose, because the data values in a database must satisfy certain consistency constraints. Since the code that handled these files was written in different languages, it was almost impossible to change all of it to enforce new constraints. The file system also lacked support for concurrent access. In modern systems, multiple users can update the data simultaneously, which gives faster response times and improves the overall performance of the system. Simultaneous updates by multiple users may result in inconsistent data, which is normally prevented by supervising access. However, in a file processing system, this supervision is weak because of the many applications and languages involved, so the same inconsistency problems appear again.
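As an illustration of a consistency constraint enforced in one place, the sketch below declares a rule on the table itself, so that every program writing to it is held to the same rule. The `accounts` table, the non-negative balance rule, and the `try_insert` helper are hypothetical; the sketch uses a SQL `CHECK` constraint through Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the consistency constraint
)


def try_insert(conn, account_id, balance):
    """Attempt an insert; return False if the database rejects the row."""
    try:
        conn.execute("INSERT INTO accounts VALUES (?, ?)", (account_id, balance))
        return True
    except sqlite3.IntegrityError:
        return False


ok_valid = try_insert(conn, 1, 50)    # satisfies the constraint
ok_invalid = try_insert(conn, 2, -5)  # violates the constraint and is rejected
count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(ok_valid, ok_invalid, count)
```

Changing the rule later means changing one table definition, not every program that touches the data.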

Data Security

Security is also a major issue in this system. In a database system, not every user should be able to access all the data. Each user should be granted access only to specific data, protected by a password of some sort. In a file processing system, since different programmers add their own applications, there is either one universal password or so many passwords that the people who need the information cannot access it. Since new files are added only as needed, it is difficult to keep changing the permissions on each individual file to meet security standards.

These disadvantages led many organizations to move from file systems to a database approach. Databases corrected many of these problems, reducing development time and improving data integrity. File processing systems had many flaws, but they were a stepping stone toward better systems of data storage.

Advantages of Databases

In today’s world, data is prevalent in every aspect of our lives. Data is constantly being created, organized, and stored. With all this data being transferred and exchanged around the world, it is important to have an efficient and organized method of storing it. This is where databases come in. Databases offer improved efficiency and versatility, allow categorization and structuring of available data, and allow multi-user access, creating an organized work environment and newer and better ways to manage data.

Efficiency matters especially to businesses. Databases can handle large amounts of data as well as multiple types of data, and businesses can use them to keep data easily accessible for daily operational decisions.

Versatility is also important in terms of accessing data. Databases can be accessed via desktop, laptop, tablet, and even mobile devices. This is incredibly helpful at a time when so much importance is placed on accessing things immediately, as data can be retrieved at any moment. This benefit applies to consumers as well as businesses.

Categorization and organization are both major advantages. They allow information to be structured in ways that are easy to understand and access. Certain DBMSs allow relationships to be defined between entities in order to simplify the organization of data.

Source: Liz Parody, Databases for Front-End Developers (Medium.com)

Another major advantage of databases is that multiple users can access the data in a variety of ways; this is called multi-access. Multi-access allows multiple authorized users to work with the same data. For example, a human resources manager at a company will have access to the same set of potential hires at a certain location as the general manager of that location. The picture below illustrates the relationship between this shared data and the users who have access to it (WD, 2005).

Source: https://www.workingdata.co.uk/spreadsheets-vs-databases-round-1-multi-user/

Databases help businesses operate more smoothly. A database language such as SQL (Structured Query Language) allows businesses to access and modify data that is stored in a relational database.
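As a small sketch of what accessing and modifying data with SQL can look like, the example below stores a customer once, links two accounts to that customer, changes the customer's city with one statement, and reads the combined data back. The table names, columns, and sample values are assumptions for illustration, run through Python's built-in `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),  -- relationship between entities
    balance INTEGER
);
INSERT INTO customers VALUES (1, 'Ana Diaz', 'Albany');
INSERT INTO accounts VALUES (1001, 1, 250), (2001, 1, 900);
""")

# Modify: the city is stored once, so a single UPDATE is enough.
conn.execute("UPDATE customers SET city = 'Troy' WHERE id = 1")

# Access: join the two tables to see each account with its owner's details.
rows = conn.execute(
    "SELECT c.name, c.city, a.id, a.balance"
    " FROM customers AS c JOIN accounts AS a ON a.customer_id = c.id"
    " ORDER BY a.id"
).fetchall()
print(rows)
```

Both accounts reflect the new city immediately, because the customer's details exist in only one row.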

Databases are constantly being used and accessed in new ways. With all the advantages that databases offer, their uses will continue to grow. The accessibility, versatility, and efficiency that a database provides when paired with a DBMS are the reasons so many successful businesses use them to this day.

Attribution

By Sarah North and Xiaohua Xu, Introduction to Database Systems. This textbook was developed as part of a Round 16 Textbook Transformation Grant and is licensed under CC BY-NC-SA 4.0.

License

Data Analytics for Public Policy and Management Copyright © 2022 by Luis F. Luna-Reyes, Erika G. Martin and Mikhail Ivonchyk is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.