# Essays on Business Intelligence and Enterprise Data Mining Case Study

The information that Data Mining yields can be used to increase revenue and cut costs. Technically, Data Mining is the process of finding correlations or patterns among the many fields of large relational databases. It is used by retail, financial, communication, and marketing companies with a strong consumer focus to determine relationships among internal factors (price, product positioning, or staff skills) and external factors (economic indicators, competition, and customer demographics).

Data Mining enables companies to determine the impact of these factors on sales, customer satisfaction, and corporate profits. It also helps in summarising information in order to analyze transactions. A company can, for example, use customer purchase records to send targeted promotions based on an individual's purchase history.
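To make the idea of promotions driven by purchase history more concrete, here is a minimal sketch. It assumes a tiny, made-up set of purchase records; the customer names, the products, and the helper functions `co_purchase_counts` and `suggest_promotions` are all hypothetical and serve only as illustration. It counts which products are bought together and suggests to a customer the products that frequently accompany what they already own.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records (invented for illustration):
# customer -> set of products bought.
purchases = {
    "alice": {"computer", "table", "lamp"},
    "bob": {"computer", "table"},
    "carol": {"computer"},
    "dave": {"lamp"},
}


def co_purchase_counts(records):
    """Count how many customers bought each unordered pair of products."""
    counts = Counter()
    for basket in records.values():
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def suggest_promotions(records, customer, min_count=2):
    """Suggest products the customer lacks that are often bought
    together with a product the customer already owns."""
    counts = co_purchase_counts(records)
    owned = records[customer]
    suggestions = set()
    for (a, b), n in counts.items():
        if n < min_count:
            continue  # pair is too rare to act on
        if a in owned and b not in owned:
            suggestions.add(b)
        elif b in owned and a not in owned:
            suggestions.add(a)
    return sorted(suggestions)


if __name__ == "__main__":
    for name in purchases:
        print(name, suggest_promotions(purchases, name))
```

The threshold `min_count` is an arbitrary choice in this sketch. A real system would more likely use measures such as support and confidence computed over far larger transaction sets.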

The company can also use demographic data from comment or warranty cards to develop products and target promotions to specific customer segments.

## Attributes of the Data Mining Process

Classification: Data is classified by type or category, e.g. marketing data can be divided into people who buy and people who do not buy.

Association: Data items are linked by their relationship, e.g. people buy tables along with computers.

Sequence: Data is linked by the order of events, e.g. people buy insurance after buying a car.

Clustering or Segmentation: Data is grouped into segments of similar items, e.g. products are separated and grouped together in malls so that they can be easily found.

## Major Data Mining Techniques

Three major types of Data Mining techniques are:

Statistical Approach: Statistical approaches use tools such as Bayesian Networks, Regression Analysis, Cluster Analysis, and Correlation Analysis for data mining. Statistical models are built from a set of training data, and the optimal model can be built based on a hypothesis. These models yield a set of rules, patterns, and regularities. A Bayesian Network is a directed acyclic graph whose probabilities are computed using Bayes' theorem.

It represents the causal relationships among the variables. Regression derives a function that maps a set of attributes of objects to an output variable. Cluster analysis finds groups within a set of objects based on distance measures. Correlation analysis studies how variables correspond to each other.

Example of a Bayesian Network for a medical problem: the patient's age, occupation, and diet are nodes that represent variables. Irregularities in these nodes affect the patient and cause symptoms of disease. Arrows between the variables represent the dependencies between nodes.
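The medical example can be sketched as a very small network in code. The structure below (diet and age pointing to disease, disease pointing to symptom) is a simplification chosen for illustration, and every probability in it is invented rather than taken from medical data. The names `joint` and `probability` are hypothetical helpers. The sketch factorises the joint distribution along the arrows and answers queries by enumerating every assignment, which is only practical for networks this small.

```python
from itertools import product

# Illustrative, made-up probabilities (not from any medical source).
P_POOR_DIET = 0.3  # P(poor_diet = True)
P_OLDER = 0.4      # P(older = True)

# P(disease = True | poor_diet, older): arrows diet -> disease, age -> disease
P_DISEASE = {
    (True, True): 0.5,
    (True, False): 0.2,
    (False, True): 0.1,
    (False, False): 0.05,
}

# P(symptom = True | disease): arrow disease -> symptom
P_SYMPTOM = {True: 0.8, False: 0.1}

NAMES = ("poor_diet", "older", "disease", "symptom")


def bernoulli(p_true, value):
    """Probability of a boolean value given P(value = True)."""
    return p_true if value else 1.0 - p_true


def joint(poor_diet, older, disease, symptom):
    """Joint probability, factorised along the arrows of the network."""
    return (
        bernoulli(P_POOR_DIET, poor_diet)
        * bernoulli(P_OLDER, older)
        * bernoulli(P_DISEASE[(poor_diet, older)], disease)
        * bernoulli(P_SYMPTOM[disease], symptom)
    )


def probability(query, evidence=None):
    """P(query | evidence), computed by enumerating all assignments."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the observed evidence
        p = joint(**world)
        denominator += p
        if all(world[k] == v for k, v in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print("P(disease):", round(probability({"disease": True}), 4))
    print("P(disease | symptom):",
          round(probability({"disease": True}, {"symptom": True}), 4))
```

Under these assumed numbers, observing the symptom raises the probability of disease above its prior, which is the kind of dependency the arrows are meant to capture.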
