A modern DBMS makes retrieving, manipulating, and producing information easier than earlier approaches did. In the conventional method, data was arranged in files, and subsequent research sought to overcome the shortcomings of that traditional system of data management. The modern database system therefore follows the basic rules of normalization and stores metadata alongside the data. This study evaluates how the database has evolved from a desktop system to a modern DBMS and explains the reasons for that evolution. In addition, it analyzes data warehouse architecture, its alternatives, and modern business intelligence approaches.
How Database Management Systems (DBMS) have evolved over the years (L1 and L5)
The traditional database system has evolved from a file-based system to a database approach. In other words, it has evolved from a simple record-oriented navigational database system to relational and object-oriented database systems that support multimedia. In the 1960s, flat file-based systems stored information in a single file in which every line held one record, with fields that either had a fixed length or were separated by tabs, commas, or whitespace (Poltavtseva, 2019).
If a record is misplaced or deleted in a flat file database, the related data in other files has to be removed manually, which makes data manipulation inefficient. In addition, the process was very time-consuming and wasted storage space on the computer.
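As an illustration, the following minimal Python sketch shows a flat file with one comma-separated record per line. The file contents and the function names read_records and delete_record are hypothetical. It suggests why deletion is costly: the whole file has to be scanned and rewritten to remove a single record.

```python
import csv
import io

# Hypothetical flat file: one record per line, comma-separated fields.
FLAT_FILE = "1,Alice,London\n2,Bob,Paris\n3,Carol,Berlin\n"

def read_records(text):
    """Parse every line of the flat file into a list of field values."""
    return list(csv.reader(io.StringIO(text)))

def delete_record(text, record_id):
    """Remove one record by scanning and rewriting the whole file."""
    kept = [row for row in read_records(text) if row[0] != record_id]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(kept)
    return out.getvalue()

# Deleting record 2 produces a complete new copy of the file.
remaining = delete_record(FLAT_FILE, "2")
```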
By the 1970s, a hierarchical database model had been introduced that organized data in a hierarchical manner. It can be visualized as a family tree of parent-child relationships: a parent can have many children, but each child has only one parent. Additionally, the model contains segments that are similar to the records of the file system, and the main entity was formed as a table. Access to information was relatively fast, which improved performance. On the other hand, the model lacked flexibility, and databases of this kind were difficult to maintain.
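The parent-child rule can be sketched in a few lines of Python. The segment names below are hypothetical, and a dictionary stands in for the stored hierarchy. Because each key can appear only once, every child has exactly one parent, while a parent may have many children.

```python
# Hypothetical segments: each child maps to exactly one parent.
parent_of = {
    "Sales": "Company",
    "Engineering": "Company",
    "Alice": "Sales",
    "Bob": "Engineering",
    "Carol": "Engineering",
}

def children_of(parent):
    """Return all children of a parent (a parent can have many)."""
    return sorted(child for child, p in parent_of.items() if p == parent)

def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path
```

Following a single upward link per step is one reason access along the hierarchy is fast, and it also hints at the inflexibility: a relationship that does not fit the tree cannot be expressed.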
In the late 1960s, Charles Bachman developed the network model, which allows a record to take part in multiple parent and child relationships and therefore forms a graph structure (Anwar et al., 2020). It improved database performance and made it possible to represent complicated data in a simpler form. It also ensured data integrity and data independence. However, it lacked structural independence, and for this reason a user-friendly database management system could not be built on it.
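A minimal Python sketch of the graph structure follows, assuming hypothetical supplier and part records and a hypothetical link function. Unlike the hierarchical model, a child record here can have more than one parent.

```python
from collections import defaultdict

# Two indexes so that links can be followed in either direction.
parents = defaultdict(set)
children = defaultdict(set)

def link(parent, child):
    """Record a parent-child relationship; a child may have many parents."""
    parents[child].add(parent)
    children[parent].add(child)

# Hypothetical records: "Part 1" is supplied by two suppliers.
link("Supplier A", "Part 1")
link("Supplier B", "Part 1")
link("Supplier A", "Part 2")
```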
The relational database model, introduced by E.F. Codd in 1970 and still widely used today, relates entities to one another through common attributes (Cao, 2021). The main benefits of relational databases are that data can be stored in small tables and that the structure of the database is flexible. The languages used with relational databases are human-readable, and the model supports security controls, including authentication (Liu et al., 2018).
Another modern approach is the object-oriented database model, in which information is stored as objects, as in object-oriented programming languages. Navigating the data is easier in this model, and it accommodates complex data. Thus, it can be stated that the main drivers of database evolution are better performance, more flexibility, support for complex data, greater efficiency, higher speed, and time savings.
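The idea of relating entities through a common attribute can be sketched with Python's built-in sqlite3 module. The table and column names are hypothetical; dept_id is the attribute that the two small tables share, and the query is written in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Engineering');
INSERT INTO employee VALUES (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Carol', 2);
""")

# Join the two tables on the attribute they have in common.
rows = conn.execute("""
    SELECT e.name, d.name
    FROM employee AS e
    JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
```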
Describe a DBMS system used for Transactional Processing (L2)
In a DBMS, a transaction is a logical unit of processing that consists of a collection of database access operations. All of these operations fall between the statements that begin and end the transaction, so they can be treated as a single logical unit (Groomer & Murthy, 2018). While a transaction is executing on the server, the database can be temporarily inconsistent.
A DBMS used for transaction processing is therefore structured to optimize transaction performance. Relational databases remain one of the main data management technologies despite the rise of big data analytics and NoSQL.
A DBMS is capable of storing, updating, and querying large tables of structured data. However, the main challenge is analyzing large volumes of data efficiently, and researchers have proposed many efficient algorithms for this purpose. Data mining focuses on discovering knowledge, expressed as models and patterns, in a large database. More recently, the DBMS has been used as a platform for data mining on large datasets because it provides automatic data parallelism, while it continues to process transactions and SQL queries on relational tables.
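A transaction as a single logical unit can be sketched with Python's sqlite3 module. The account table and the transfer and balances helpers are hypothetical. The sketch relies on the documented behavior of a sqlite3 connection used as a context manager: it commits if the block succeeds and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Between the two updates the database is temporarily inconsistent.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit above was rolled back.
        return False

def balances(conn):
    """Return the balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
```

For example, transfer(conn, 1, 2, 500) attempts to overdraw account 1, so the credit already applied to account 2 is undone and both balances stay unchanged.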
In a DBMS, transaction processing is intended to preserve database integrity. In IT infrastructure and database management, common database problems can cause significant disruption to the management of information. At present, Oracle Database is a multi-model database management system used for online transaction processing, data warehousing, and mixed database workloads. After the basic configuration, the database system runs largely automatically, so the main problems tend to occur at configuration time. To manage an Oracle database in this way, the database must be enabled for Oracle Database QoS Management and access for the APPQOSSYS user must be enabled. Selecting the "Enable Quality of Service Management" option in Cloud Control reduces these database issues (Oracle, 2022).
During implementation, it is important to use a suitable resource plan, which helps in moving the performance classes. If the resource plan is not enabled, the database may report errors. Another problem is an unmanaged server pool, in which case Oracle Database cannot predict or measure the performance classes. The database underpins an organization's basic ability to deliver its service. Thus, if one database performs poorly, it automatically reduces the performance of dependent data. The database therefore needs to be resilient, frequently backed up, and continuously reviewed to identify potential hazards that can slow down the infrastructure.
Every business needs to focus on increasing its security, but the nature of data storage and the range of analytics options increase the complexity. Since there are many options, such as columnar, relational, and object-oriented databases, it has become difficult to evaluate and select the right solution. Database software also has limits on scalability and resource usage. At present, enterprises focus on transaction processing capacity by understanding the cataloging components and the database architecture, which affect overall scalability (Wang & Kogan, 2018). Databases are among a company's least visible assets, yet they require a strong security focus. A data breach can cost a company an average of $4 million and can also damage its reputation. In a decentralized database management system, distribution creates the risk that no one has centralized knowledge of the whole database.
Discuss the architecture of a Data Warehouse and its alternative options (L4)
A data warehouse architecture is a way of describing the overall architecture of data communication, processing, and presentation for end-client computing.
Figure 1: Basic Data Warehouse Architecture
(Source: javatpoint, 2022)
In a basic data warehouse architecture, operational systems process the day-to-day transactions and act as data sources for the warehouse. A flat file system also stores transactional data. Metadata provides information about other data. The data warehouse itself stores both lightly and highly summarized data.
Figure 2: Data warehouse architecture with a staging area and Data Mart
(Source: javatpoint, 2022)
Figure 3: Data warehouse architecture with a staging area and Data Mart
(Source: javatpoint, 2022)
A customised data warehouse needs data marts. A data mart is a segment of a data warehouse that provides information for the reporting and analysis of one section of the business. Effective decision-making in a business relies upon high-quality data, which requires agile access to a data warehouse arranged in an efficient way that helps to improve business performance. With the rapid evolution of in-memory computing and data visualization, modern database management systems and business intelligence now offer alternatives to the data warehouse.
In contrast to a data warehouse, a data lake is a collection of unstructured data. A typical data warehouse uses a relational database management system (RDBMS), such as an SQL database, to store and retrieve information. Multidimensional data cubes, by contrast, are non-relational datastores and differ from the tables found in relational databases. The data warehouse contains more information than the cubes, which enables analysts to formulate queries at a more granular level. When data is either too sparse or too unstructured to be included within the OLAP cube itself, the cube can be supplemented with relational databases in a number of ways. When linking tables together in a data warehouse, a star schema is typically utilised. The star schema relates the same facts to several different dimensions by linking primary keys and foreign keys (Bhatia, 2019). The data lake, on the other hand, comprises data that is neither structured nor schematized in any way.
Most of the data in a data lake is stored without being processed first, and no rigid structure such as that of a data warehouse is imposed. Data lakes, which can be fed by web scraping and social media, can also benefit big data organisations. They handle data at terabyte and petabyte scale, which is associated with massive deployments of sensors. At the moment, businesses are apprehensive that creating a data lake will be complex and expensive. Whereas a data warehouse has layers for data sourcing, data staging, data storage, and data presentation, a data lake has different layers: ingestion, distillation, processing, insight, and unified operations (Roth et al., 2018). The ingestion and distillation layers work together to produce a logical structure over the stored data that can be used for analysis. The processing layer runs the analytical tools. The insight layer offers a query interface, while the unified operations layer is responsible for workflow management.
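A star schema can be sketched with Python's sqlite3 module, assuming a hypothetical sales warehouse. The fact table fact_sales holds foreign keys that point to the primary keys of two dimension tables, dim_product and dim_date, and a query joins them to total the same facts by product and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per product and per date.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER);

-- Fact table at the centre of the star: foreign keys plus a measure.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     INTEGER
);

INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO dim_date    VALUES (1, 2021), (2, 2022);
INSERT INTO fact_sales  VALUES (1, 1, 1000), (1, 2, 1200), (2, 2, 600), (2, 2, 400);
""")

# Analyse the facts along both dimensions by joining on the keys.
totals = conn.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON f.product_id = p.product_id
    JOIN dim_date    AS d ON f.date_id    = d.date_id
    GROUP BY p.name, d.year
    ORDER BY p.name, d.year
""").fetchall()
```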
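The first layers can be illustrated with a simplified Python sketch. All names and sample payloads are hypothetical, and in-memory lists stand in for lake storage. Ingestion keeps every payload as received, distillation derives structured rows from whatever can be parsed, and a small aggregation stands in for the processing and insight layers.

```python
import json

raw_zone = []  # ingestion layer: payloads are stored exactly as received

def ingest(payload):
    """Store a payload without validating or restructuring it."""
    raw_zone.append(payload)

def distill():
    """Distillation: derive structured rows from the payloads that parse."""
    rows = []
    for payload in raw_zone:
        try:
            record = json.loads(payload)
        except json.JSONDecodeError:
            continue  # unstructured payloads stay in the raw zone only
        if isinstance(record, dict) and "sensor" in record and "value" in record:
            rows.append((record["sensor"], float(record["value"])))
    return rows

def average_by_sensor(rows):
    """Processing and insight stand-in: average reading per sensor."""
    sums = {}
    for sensor, value in rows:
        total, count = sums.get(sensor, (0.0, 0))
        sums[sensor] = (total + value, count + 1)
    return {sensor: total / count for sensor, (total, count) in sums.items()}

ingest('{"sensor": "t1", "value": 20}')
ingest("free-text log line")
ingest('{"sensor": "t1", "value": 22}')
ingest('{"sensor": "t2", "value": 5, "unit": "C"}')
averages = average_by_sensor(distill())
```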
Figure 4: Alternative data warehouse: Data lakes and their different layers
(Source: Samuel, Sharma & Varshney, 2022)
Business intelligence architecture enables fast data processing and efficient data analysis, which helps in extracting relevant information from data. At present, it has opened up the use of OLAP and reporting tools on the traditional database system. There are various alternative options for BI architecture, such as NoSQL and NewSQL databases (Armbrust et al., 2021). A NoSQL database is a non-tabular database that stores information differently from a relational database table. The main types are wide-column, key-value, and graph stores. NoSQL databases have distinctive features such as horizontal scaling, flexible schemas, and ease of use (Behbahani Nejad & Rashidi, 2022). Additionally, they support agile development and can store a high volume of data.
However, business intelligence on NoSQL is comparatively immature because of the unstructured nature of the data. Query languages are not consistent across NoSQL databases, and joins between tables are generally not supported (Lopes, Guimarães & Santos, 2020). NoSQL databases are most useful for storing and modeling unstructured, semi-structured, and structured data in one database. Databases using these models are significant from a business perspective and are commonly used in a variety of applications.
From the above study, it can be concluded that the DBMS has evolved from flat files to relational and object-oriented database management systems. A DBMS can process transactions efficiently and improve the performance of the business. However, it poses various challenges, such as data security issues and data breaches. Moreover, the study has highlighted the architecture of data warehouses and its alternatives.
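The flexible schema of a key-value or document-style store can be sketched in plain Python. A dictionary is used here as a stand-in for a real NoSQL database, and the keys, documents, and helper functions are hypothetical. Documents in the same store need not share the same fields, and queries are handled by scanning the documents, not by joining tables.

```python
store = {}  # key -> document; no schema is declared anywhere

def put(key, document):
    """Insert or replace a document under a key."""
    store[key] = document

def find(field, value):
    """Return the keys of documents whose field equals value."""
    return sorted(key for key, doc in store.items() if doc.get(field) == value)

# Documents with different sets of fields live side by side.
put("user:1", {"name": "Alice", "city": "London"})
put("user:2", {"name": "Bob", "city": "London", "tags": ["admin"]})
put("user:3", {"name": "Carol"})
```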
Anwar, A., Mahmood, A., Ray, B., Mahmud, M.A. & Tari, Z., (2020). Machine learning to ensure data integrity in power system topological network database. Electronics, 9(4), p.693. Retrieved on 11th July, From: https://www.mdpi.com/699902
Armbrust, M., Ghodsi, A., Xin, R. & Zaharia, M., (2021), January. Lakehouse: a new generation of open platforms that unify data warehousing and advanced analytics. In Proceedings of CIDR. Retrieved 11 July 2022, from: https://eva.fing.edu.uy/pluginfile.php/338191/mod_resource/content/1/cidr2021_lakehouse.pdf
Behbahani Nejad, M.R. & Rashidi, H., (2022). A Novel Architecture based on Business Intelligence Approach to Exploit Big Data. Journal of Electrical and Computer Engineering Innovations (JECEI). Retrieved 11 July 2022, from: https://jecei.sru.ac.ir/article_1727.html
Bhatia, P., (2019). Data mining and data warehousing: principles and practical techniques. Cambridge University Press. Retrieved 11 July 2022, from: https://ieeexplore.ieee.org/abstract/document/8622206/
Cao, T.H., (2021). A relational database model and algebra integrating fuzzy attributes and probabilistic tuples. Fuzzy Sets and Systems. Retrieved on 11th July, From: https://www.sciencedirect.com/science/article/pii/S0165011421004127
Data Warehouse Architecture - javatpoint. (2022). Retrieved 11 July 2022, from https://www.javatpoint.com/data-warehouse-architecture#:~:text=A%20data%20warehouse%20architecture%20is,characterized%20by%20standard%20vital%20components
Groomer, S.M. & Murthy, U.S., (2018). Continuous Auditing of Database Applications: An Embedded Audit Module Approach. In Continuous auditing. Emerald Publishing Limited. Retrieved on 11th July, From: https://www.emerald.com/insight/content/doi/10.1108/978-1-78743-413-420181005
Liu, Z.H., Lu, J., Gawlick, D., Helskyaho, H., Pogossiants, G. & Wu, Z., (2018). Multi-model database management systems-a look forward. In Heterogeneous Data Management, Polystores, and Analytics for Healthcare (pp. 16-29). Springer, Cham. Retrieved on 11th July, From: https://helda.helsinki.fi/bitstream/handle/10138/314140/Poly_3_.pdf?sequence=1
Lopes, J., Guimarães, T., & Santos, M. F. (2020). Adaptive business intelligence: A new architectural approach. Procedia Computer Science, 177, 540-545. Retrieved 11 July 2022, from https://www.sciencedirect.com/science/article/pii/S1877050920323450/pdf?md5=b05c5df5831580932f57394d375f3d65&pid=1-s2.0-S1877050920323450-main.pdf
Oracle, Quality of Service Management User's Guide. (2022). Retrieved 11 July 2022, from https://docs.oracle.com/en/database/oracle/oracle-database/12.2/apqos/common-problems.html#GUID-92D6BE79-06FE-4B6C-8820-6890549D4586
Poltavtseva, M.A., (2019), March. Evolution of data management systems and their security. In 2019 International Conference on Engineering Technologies and Computer Science (EnT) (pp. 25-29). IEEE. Retrieved on 11th July, From: https://ieeexplore.ieee.org/abstract/document/8711971/
Roth, J. A., Goebel, N., Sakoparnig, T., Neubauer, S., Kuenzel-Pawlik, E., Gerber, M., ... & PATREC Study Group. (2018). Secondary use of routine data in hospitals: description of a scalable analytical platform based on a business intelligence system. JAMIA open, 1(2), 172-177. Retrieved 11 July 2022, from: https://ieeexplore.ieee.org/abstract/document/9071769/
Samuel, N., Sharma, D., & Varshney, H. (2022). Data Lake Architecture: 10 Critical Aspects | Learn - Hevo. Retrieved 11 July 2022, from https://hevodata.com/learn/data-lake-architecture-a-comprehensive-guide/
Wang, Y. & Kogan, A., (2018). Designing confidentiality-preserving Blockchain-based transaction processing systems. International Journal of Accounting Information Systems, 30, pp.1-18. Retrieved on 11th July, From: