FOUNDATION OF BUSINESS INTELLIGENCE: DATABASE AND INFORMATION SYSTEM
Organizing Data In A Traditional File Environment
In a traditional file environment, each department stores its files separately, and those files are not readily available to other departments. This approach is associated with the following problems:
- Data Redundancy: Duplicate data are found in multiple data files because each department independently collects the same information, which wastes storage space.
- Data Inconsistency: Various copies of the same data may not agree, which undermines the integrity of the data.
- Lack of Flexibility: A traditional file system can deliver routine scheduled reports after extensive programming, but it cannot deliver ad hoc reports or respond to unanticipated information requests in a timely fashion.
- Poor Security: There is little control or management of data, so access to and dissemination of information may be out of control.
- Lack of Data Sharing and Availability: Pieces of information stored in different files and in different parts of the organization cannot be related to one another, which makes it virtually impossible for information to be shared.
- Program-Data Dependence: Data stored in files are tied to the specific programs that maintain and update them, so a change in a program requires changes to the data it accesses.
Database Approach To Data Management
Database: A collection of data organized to serve many applications efficiently by centralizing the data and controlling redundant data.
Database Management System (DBMS): Software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs. The DBMS acts as an interface between application programs and the physical data files.
Relational DBMS: A relational DBMS represents data as two-dimensional tables (called relations), which may be thought of as files. Each table contains data on an entity and its attributes.
Operations Of A Relational DBMS
Three basic operations are used to develop useful sets of data:
- Selection: Creates a subset consisting of all records that meet stated criteria.
- Join: Combines relational tables to provide the user with more information than is available in the individual tables.
- Project: Creates a subset consisting of columns in a table, which permits the user to create new tables that contain only the information required.
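The three operations can be sketched with Python's built-in sqlite3 module. This is only an illustration: the supplier and part tables, their columns, and the sample rows are invented for the example and do not come from these notes.

```python
import sqlite3

# Hypothetical supplier and part tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part (
        part_id INTEGER PRIMARY KEY,
        part_name TEXT,
        unit_price REAL,
        supplier_id INTEGER REFERENCES supplier(supplier_id)
    );
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt Co');
    INSERT INTO part VALUES
        (10, 'Door latch', 22.0, 1),
        (11, 'Door seal', 6.0, 2),
        (12, 'Hinge', 31.0, 1);
""")

# Selection: keep only the rows that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE unit_price > 20 ORDER BY part_id"
).fetchall()

# Join: combine the two tables on their shared supplier_id column.
joined = conn.execute(
    """SELECT part.part_id, part.part_name, supplier.name
       FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
       ORDER BY part.part_id"""
).fetchall()

# Project: keep only the columns that are required.
projected = conn.execute(
    "SELECT part_name, unit_price FROM part ORDER BY part_id"
).fetchall()

print(selected)
print(joined)
print(projected)
```

In SQL the WHERE clause plays the role of selection, JOIN ... ON the role of join, and the column list after SELECT the role of project.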
Object-Oriented DBMS (OODBMS): An OODBMS stores data and procedures as objects that can be automatically retrieved and shared. The objects can be graphics, multimedia, or Java applets. An OODBMS can store more complex information than a relational DBMS but is relatively slow at processing large numbers of transactions.
Hybrid Object-Relational DBMS: This system provides the capabilities of both an OODBMS and a relational DBMS.
Databases In The Cloud: Cloud computing service providers can provide a cloud DBMS, but such services generally offer less functionality than on-premises databases.
Capabilities of DBMS
- Data Definition: This capability is used to specify the structure of the content of the database. It helps in creating tables and defining the characteristics of the fields in each table.
- Data Dictionary: An automated or manual file that stores definitions of data elements and their characteristics.
- Data Manipulation: This capability is used to add, change, delete, and retrieve data in the database. It includes language commands that permit end users and programming specialists to extract data from the database to meet information requests and develop applications.
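A minimal sketch of the three capabilities, using Python's built-in sqlite3 module. The employee table and its rows are hypothetical, and SQLite's PRAGMA table_info is used here only as a stand-in for a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the table structure and field characteristics.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# Data dictionary: SQLite reports the stored field definitions.
# Each row is (cid, name, type, notnull, dflt_value, pk).
dictionary = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]

# Data manipulation: add, change, delete, and retrieve data.
conn.execute("INSERT INTO employee VALUES (1, 'Ama', 3000.0)")
conn.execute("INSERT INTO employee VALUES (2, 'Kofi', 2500.0)")
conn.execute("UPDATE employee SET salary = 2800.0 WHERE emp_id = 2")
conn.execute("DELETE FROM employee WHERE emp_id = 1")
rows = conn.execute("SELECT emp_id, name, salary FROM employee").fetchall()

print(dictionary)
print(rows)
```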
Designing Databases
Creating a database requires two design exercises:
- Conceptual Design: An abstract model of the database from a business perspective.
- Physical Design: Shows how the database is actually arranged on direct-access storage devices.
Normalization: Involves streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships.
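As a rough illustration of normalization, the sketch below splits a flat set of records, in which supplier details repeat on every row, into two smaller relations linked by a key. The records, field names, and cities are hypothetical.

```python
# Hypothetical unnormalized records: supplier details repeat on every part row.
flat = [
    {"part": "Door latch", "supplier_id": 1,
     "supplier_name": "Acme", "supplier_city": "Accra"},
    {"part": "Hinge", "supplier_id": 1,
     "supplier_name": "Acme", "supplier_city": "Accra"},
    {"part": "Door seal", "supplier_id": 2,
     "supplier_name": "Bolt Co", "supplier_city": "Kumasi"},
]

# Supplier relation: one row per supplier, keyed by supplier_id.
suppliers = {}
for row in flat:
    suppliers[row["supplier_id"]] = {
        "name": row["supplier_name"],
        "city": row["supplier_city"],
    }

# Part relation: keeps only the foreign key that points to the supplier.
parts = [{"part": row["part"], "supplier_id": row["supplier_id"]} for row in flat]

print(suppliers)
print(parts)
```

After the split, each supplier's name and city are stored once, so a change of city is made in one place only.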
Entity-Relationship Diagram: Illustrates the relationships between entities. It is used by database designers to document the data model.
Distributing Databases: This involves storing data in more than one place.
- Partitioned: Separate locations store different parts of the database.
- Replicated: The central database is duplicated in its entirety at different locations.
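The difference between the two approaches can be sketched in a few lines of Python. The customer records and site names are hypothetical, and plain dictionaries stand in for the databases held at each location.

```python
# Hypothetical customer records and site names, for illustration only.
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]
sites = ["north", "south"]

# Partitioned: each site stores only the records for its own region.
partitioned = {
    site: [c for c in customers if c["region"] == site] for site in sites
}

# Replicated: each site stores a complete copy of the central database.
replicated = {site: [dict(c) for c in customers] for site in sites}

print({site: len(rows) for site, rows in partitioned.items()})
print({site: len(rows) for site, rows in replicated.items()})
```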
Using Databases To Improve Business Performance And Decision Making
Companies with large databases require special capabilities and tools to analyze the data and to access data from multiple systems.
Three Key Techniques For Handling Large Databases
1. Data Warehousing: A data warehouse is a database that stores current and historical data of potential interest to decision makers throughout the organization. The data originate in core operational transaction systems.
Data Marts: A data mart is a subset of a data warehouse in which a summarized or focused portion of the organization's data is placed in a separate database for a specific population of users.
Tools for Business Intelligence
These tools enable users to analyze data to see patterns, relationships, and insights that guide decision making. They include:
- Software for database query and reporting
- Tools for multidimensional data analysis (online analytical processing, OLAP)
- Data mining
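To give a feel for multidimensional analysis, the sketch below aggregates sales along two dimensions (product and region) and then views the totals along either one. It is a toy stand-in for an OLAP tool, and the sales figures and the total helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, units sold).
sales = [
    ("nuts", "east", 100),
    ("nuts", "west", 150),
    ("bolts", "east", 80),
    ("bolts", "west", 60),
    ("nuts", "east", 50),
]

# Build a small "cube" keyed by the two dimensions.
cube = defaultdict(int)
for product, region, units in sales:
    cube[(product, region)] += units


def total(product=None, region=None):
    """Sum units, optionally restricted to one product and/or one region."""
    return sum(
        units
        for (p, r), units in cube.items()
        if (product is None or p == product) and (region is None or r == region)
    )


print(total(product="nuts"))
print(total(region="east"))
print(total(product="nuts", region="east"))
```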
2. Data Mining: Data mining is more discovery-driven than OLAP. It provides insights into corporate data that cannot be obtained with OLAP by finding hidden patterns and relationships in large databases and inferring rules from them to predict future behavior. The patterns and rules guide decision making and forecast the effect of those decisions.
Types Of Information Obtained From Data Mining
a) Associations: Occurrences linked to a single event.
b) Sequences: Events that are linked over time.
c) Classification: Recognizes patterns that describe the group to which an item belongs by examining existing items that have been classified and inferring a set of rules.
d) Clustering: Discovers different groupings within data where no groups have yet been defined.
e) Forecasting: Uses a series of existing values to predict what other values will be.
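As one small illustration of the first type, associations can be found by counting how often items occur together in the same purchase. The baskets and the confidence helper below are hypothetical, and real data mining tools use far more scalable algorithms than this direct count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchases; each basket is one event (a single store visit).
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(pair_counts[("chips", "cola")])
print(confidence("salsa", "chips"))
```

Here every basket that contains salsa also contains chips, which is the kind of rule a retailer might use when planning promotions.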
Predictive Analysis: Involves using data mining techniques, historical data, and assumptions about future conditions to predict the outcomes of events.
Text Mining: Text mining tools extract key elements from large unstructured data sets, discover patterns and relationships, and summarize the information. They are useful because most of an organization's information is in the form of text files such as e-mails, memos, and survey responses.
Web Mining: The discovery and analysis of useful information from the World Wide Web. Web mining helps businesses understand customer behavior and evaluate the effectiveness of a particular Web site.
- Web content mining: Extracts knowledge from the content of Web pages, which includes text, images, audio, and video.
- Web structure mining: Extracts useful information from the links embedded in Web documents.
- Web usage mining: Examines the data recorded by a Web server whenever requests for a Web site's resources are received.
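Web usage mining can be sketched as parsing server log entries and counting requests per page. The log lines below are invented, and the sketch assumes they follow the Common Log Format; a real server may be configured to log differently.

```python
import re
from collections import Counter

# Hypothetical log lines, assumed to be in Common Log Format.
log_lines = [
    '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2024:10:00:05 +0000] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.1 - - [01/Jan/2024:10:00:09 +0000] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.3 - - [01/Jan/2024:10:00:12 +0000] "GET /missing HTTP/1.1" 404 128',
]

# Capture the client address, request method, requested path, and status code.
pattern = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+$'
)

page_hits = Counter()
visitors = set()
for line in log_lines:
    match = pattern.match(line)
    if match is None:
        continue  # skip lines that do not fit the assumed format
    ip, method, path, status = match.groups()
    visitors.add(ip)
    if status == "200":
        page_hits[path] += 1  # count only successful requests

print(page_hits.most_common())
print(len(visitors))
```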
Databases And The Web
Some organizations use the Web to make internal databases available to customers. Typical configurations include:
- Web server
- Application server
- Database server
Advantages of using the Web for database access
- Browser software is easy to use
- A Web interface requires few or no changes to the database
- It is inexpensive to add a Web interface to a system
Managing Data Resources
Establishing An Information Policy: An information policy is a set of rules for sharing, disseminating, and keeping information. These policies are required to manage information well in an organization.
Data Administration: The function responsible for the specific policies and procedures through which data can be managed as an organizational resource. Responsibilities include developing information policy, planning for data, and overseeing logical database design.
Data Governance: Policies and processes for managing the availability, usability, integrity, and security of enterprise data, especially as they relate to government regulations.
Ensuring Data Quality
To ensure data quality, organizations should carry out the following:
- Data Quality Audit: A structured survey of the accuracy and level of completeness of the data in an information system. The audit may survey samples from data files and also survey end users for their perceptions of data quality.
- Data Cleansing: Involves using software to detect and correct data that are incorrect, incomplete, improperly formatted, or redundant.
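A minimal sketch of data cleansing in Python, assuming a handful of hypothetical customer records. The rules used here (trim and normalize names, lowercase e-mail addresses, reject records without a usable e-mail, drop duplicates) are illustrative choices, not a complete cleansing procedure.

```python
# Hypothetical customer records with formatting problems and a duplicate.
raw = [
    {"name": " Ama Mensah ", "email": "AMA@EXAMPLE.COM"},
    {"name": "Kofi Boateng", "email": "kofi@example.com"},
    {"name": "ama mensah", "email": "ama@example.com"},
    {"name": "No Email", "email": ""},
]


def cleanse(records):
    """Return (cleaned, rejected) after fixing format and removing duplicates."""
    cleaned, rejected, seen = [], [], set()
    for rec in records:
        # Improperly formatted: collapse extra spaces and normalize case.
        name = " ".join(rec["name"].split()).title()
        email = rec["email"].strip().lower()
        # Incomplete: set aside records without a usable e-mail address.
        if "@" not in email:
            rejected.append(rec)
            continue
        # Redundant: skip records whose e-mail address was already seen.
        if email in seen:
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned, rejected


cleaned, rejected = cleanse(raw)
print(cleaned)
print(rejected)
```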