Business Intelligence and Enterprise Data Mining Case Study

What is Data Mining?

It is estimated that the amount of information (data) in the world doubles every 20 months, creating a data explosion. Businesses, companies, and shops of all kinds store the data they collect for record-keeping purposes. This data is important to any business because it contains information and knowledge about the business that help it take decisions leading to better business opportunities.

Data: Any facts, numbers, or text that can be processed by a computer.
Operational or transactional data: Sales, cost, inventory, payroll, and accounting.
Nonoperational data: Industry sales, forecast data, and macroeconomic data.
Metadata: Data about the data itself, such as logical database design or data dictionary definitions.
Information: Patterns, associations, or relationships among all data.
Knowledge: Summary information about historical patterns and future trends of a product or service, used for decision making.

Data Mining is the process in which data from different sources and perspectives is analysed, categorised, and summarised into useful information that supports better business decisions. This information can be used to increase revenue and cut costs. Technically, Data Mining is the process of finding correlations or patterns among various fields in large relational databases.

Data Mining is used by retail, financial, communication, and marketing companies with a strong consumer focus to determine relationships among internal factors (price, product positioning, or staff skills) and external factors (economic indicators, competition, and customer demographics). It enables these companies to determine the impact of such factors on sales, customer satisfaction, and corporate profits. It also helps in summarising the data to analyse transactions.
Data Mining helps a company use customer purchase records to send targeted promotions based on an individual's purchase history. The company can also use demographic data from comment or warranty cards to develop products and target promotions at specific customer segments.

Attributes of Data Mining Process

Classification: Data is classified according to type or category, e.g. marketing data can be classified into people who buy and people who do not buy.
Association: Data items are associated according to their relation, e.g. people buy tables along with computers.
Sequence: Data can be associated according to the sequence of events, e.g. people buy insurance after buying a car.
Clustering or Segmentation: Data can be grouped into clusters or segments, e.g. products are separated and grouped together in malls so that they can be easily found.

Major Data Mining Techniques

The three major types of Data Mining technique are:

1. Statistical Approach: Statistical approaches use statistical tools such as Bayesian Networks, Regression Analysis, Cluster Analysis, and Correlation Analysis for data mining. Statistical models are built using a set of training data, and an optimal model is selected based on a hypothesis. These models yield a set of rules, patterns, and regularities.
A Bayesian Network is a directed graph that is computed using Bayes' theorem. It represents the causal relationships among the variables.
Regression derives a function that maps a set of attributes of objects to an output variable.
Cluster analysis finds groups in a set of objects based on distance measures.
Correlation analysis studies how variables correspond to each other.
Example of a Bayesian Network for a medical problem: the patient's age, occupation, and diet are nodes that represent variables. Irregularities in these nodes affect the patient, causing symptoms of disease.
Arrows between the variables and their effects represent the dependencies between nodes.

2. Machine Learning Approach: Machine learning methods for data mining search for the best model that matches the training data. They use cognitive techniques and heuristics in the search process. The methods include decision trees, inductive concept learning, and conceptual clustering.
A Decision Tree classifies data into object classes along a root-to-leaf path, where the branches are attributes of the object. A decision tree can be built from the training data set, and classification rules can be extracted from it.
Inductive Concept Learning derives a concise and logical description of a concept from a set of examples.
Conceptual Clustering finds groups or clusters in a set of objects based on conceptual closeness among the objects.
Example of the Machine Learning Approach: determining the mileage of a car based on its size, transmission type, and weight. The leaf nodes represent mileage classes, so the decision tree tells us which cars give better mileage.

3. Database-oriented Approach: Database-oriented methods use data modelling or database-specific heuristics to exploit the characteristics of the data in hand. They include Attribute-oriented Induction, Iterative Database Scanning for frequent item sets, and Attribute Focusing.
In Attribute-oriented Induction, primitive, low-level data is generalised into high-level concepts using conceptual hierarchies.
Iterative Database Scanning searches a transactional database for frequent item sets, from which association rules are derived.
Attribute Focusing looks for patterns with unusual probabilities by adding attributes selectively into the patterns.
Example of the Database-oriented Approach: an organization chart with employee names, designations, qualifications, age, salary, and company codes presented in a simple conceptual hierarchy.
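The iterative database scanning described above can be illustrated with a minimal sketch. It assumes a level-wise (Apriori-style) search, which is one common way to find frequent item sets and is not necessarily the exact method the cited sources use. The baskets, the thresholds, and the function names are made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: each pass scans the database once to count size-k candidates."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Pass 1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One database scan: count the transactions that contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-item sets into (k+1)-item candidates and keep only
        # those whose k-item subsets are all frequent.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules lhs -> rhs with confidence = support(itemset) / support(lhs)."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), confidence))
    return rules


# Hypothetical baskets, echoing the "tables along with computers" example.
baskets = [
    {"computer", "table"},
    {"computer", "table", "chair"},
    {"computer", "printer"},
    {"table", "chair"},
    {"computer", "table", "printer"},
]
freq = frequent_itemsets(baskets, min_support=0.5)
rules = association_rules(freq, min_confidence=0.7)
for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), round(conf, 2))
```

With these made-up baskets, only "computer", "table", and the pair of the two pass the support threshold, so the derived rules relate computers and tables in both directions.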
Case Study: Tayko Mail Promotion

Tayko is a software firm that sells games. Over the years, the company has collected a large number of customer names and wants to mail its catalogues to these customers. The customer list is very big, consisting of 500,000 names. Sending product catalogues to all the customers would be costly, and the customers may also become tired of frequent catalogue promotion mails. Before the company sends out a new catalogue, the marketing department mails it to a random sample of customers as a trial and records the responses. The department will use the trial data to find potential customers for the final promotion list. The company has collected the customer data in an Excel file. We have to mine this data to help the company improve its mailing promotion performance.

How Can This Problem Be Resolved by Data Mining Techniques?

To improve the mailing promotion performance of the Tayko software firm, we can use any of the techniques explained above. For Tayko, the attributes of the data mining process apply as follows:

Classification of data: Classify the data according to customer names and their addresses.
Association of available data: Associate customer names with regions, states, cities, etc.
Sequencing of events: Sequence the events, i.e. the mails received by the customers, in order of occurrence and frequency.
Clustering or Segmentation of data: Group the customer names that have received mails frequently.

Using Data Mining Techniques

1. Using the Bayesian Network technique from the Statistical Approach, the mailing performance can be improved. A Bayesian Network is a statistical model of the causal relationships among variables. For Tayko, the variables can be customer names, mails sent to customers, and mail responses from customers, with the dependencies between them as causes and effects. We can then determine which customers receive frequent mails and what effect this has on the mailing performance.

2. Using the Machine Learning Approach, we can search for the best model that matches the training data. Using the Decision Tree method, we can obtain classification rules for the data. We can classify the names of Tayko's customers according to their addresses and the mails they have received. This gives us the exact number of customers in each class.

3. Using the Database-oriented Approach, the iterative database scanning method helps us search for frequent item sets in the transactional database. Association rules are derived from these frequent item sets. In Tayko's case, frequent mailing to the same customer can be detected using this technique.

Data Pre-Processing Method Used

We have used "Sampling" as the pre-processing method to select a representative subset from a large population of data. The marketing department of Tayko has sent the new catalogue to some of the customers out of the total of 500,000.

Attributes of Data Mining

Extract, transform, and load (ETL) transaction data into the Data Warehouse System.
Store and manage the data in a multi-dimensional Database System.
Provide access to the data to Business Analysts and Information Technology professionals for decision making.
Analyse the data with relevant application software.
Present the data in a useful format, such as a graph or a table.

Data Mining Method Used

We have used the Neural Network method as the Data Mining tool in this case study. A Neural Network is a set of interlinked nodes called neurons. A neuron is a simple device that computes a function of its inputs. The inputs can be outputs of other neurons or attribute values of an object. By adjusting the connections and the functional parameters of the neurons, a Neural Network can be trained to model the relationship between a set of input and output attributes.
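The idea of a neuron as a device that computes a function of its inputs, and of training by adjusting its parameters, can be sketched with a single neuron. This is a deliberate simplification of a full network, assuming a sigmoid activation and plain gradient descent. The two input attributes and the tiny data set are hypothetical and are not taken from the Tayko file.

```python
import math
import random


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """A neuron: a function (here a sigmoid) of the weighted sum of its inputs."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(samples, labels, epochs=2000, lr=0.5, seed=0):
    """Adjust the connection weights and bias by gradient descent on log loss."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # gradient w.r.t. the weighted sum
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical attributes: [is_us_resident, bought_before]; label 1 = responded.
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 1]

weights, bias = train_neuron(samples, labels)
for x, y in zip(samples, labels):
    print(x, "target", y, "output", round(predict(weights, bias, x), 3))
```

A real network would link many such neurons, feeding the outputs of some into the inputs of others, but the training principle of nudging weights to reduce the output error is the same.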
A Neural Network can be used in classification when the output attribute is the object class. As a data mining tool, a Neural Network is useful for improving the mailing promotion performance of the Tayko software firm. Using this tool, we can represent the input data, such as the customer list and the trial data, as interlinked nodes. The neurons then compute over the input data, and because the inputs of some neurons are the outputs of others, we can set the relationships between them. This helps classify the customer names for better mailing promotion performance.

Data Mining Results & How the Company Can Use Them

1,000 customers made a purchase through this promotion, spending $205,091 in total (about $205 per purchaser). Of these purchasers, 833 are US residents, 514 are male, and 223 ordered to their residential address.

Tayko should focus on the following observations, which are based on the results above:
US residents account for 83.3% of purchasers, which suggests that they are more responsive to such promotional offers.
Both genders are about equally responsive (51.4% of purchasers are male).
Only 22.3% of purchasers ordered to their residential address, which suggests that most customers are not purchasing for themselves but are gifting the products to relatives or business contacts.

Improving Data Mining Results

Data mining in the interactive games space can contribute considerably to a company's bottom line. To further improve the data mining results, the company should focus on contacting prospects who have a high likelihood of responding to an offer. More sophisticated methods should be used to optimise resources across campaigns by predicting which channel and which offer an individual is most likely to respond to. The company can also use an automated mailing system to send offers and capture user responses. Once data mining has identified the potential prospects or customers, such an application can automatically send either an e-mail or regular mail.
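Focusing on prospects with a high likelihood of responding can be sketched as a simple selection rule: mail a customer only when the expected revenue exceeds the mailing cost. The average spend below is derived from the reported totals ($205,091 across 1,000 purchasers). The response scores, the cost per mail, and the function name are assumptions made purely for illustration.

```python
# Average spend per purchaser, from the reported results.
AVG_SPEND = 205091 / 1000


def select_prospects(scores, cost_per_mail, avg_spend=AVG_SPEND):
    """Keep customers whose expected revenue (score * avg_spend) beats the mailing cost.

    `scores` maps a customer id to a predicted response probability, for example
    the output of a trained classifier. Results are ordered best prospect first.
    """
    chosen = [(cid, p) for cid, p in scores.items() if p * avg_spend > cost_per_mail]
    chosen.sort(key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in chosen]


# Hypothetical predicted response probabilities for five customers.
scores = {"c1": 0.002, "c2": 0.03, "c3": 0.008, "c4": 0.12, "c5": 0.015}

# Hypothetical cost of printing and posting one catalogue, in dollars.
mailing_list = select_prospects(scores, cost_per_mail=2.0)
print(mailing_list)
```

With a $2 cost per catalogue, a customer is worth mailing only if the predicted response probability is above roughly 1%, so low-scoring customers drop off the list.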
Last but not least, an uplift model can be used to identify the prospects or customers who are more likely to respond if an offer is made to them. A company employing data mining techniques can expect a return on its investment. However, the company should also recognise that the number of predictive models can quickly become very large, because it is better to build a separate model for each region than to rely on one model to predict customer response to a promotional offer. Instead of sending a promotional offer to all people in all regions, the company can then make the intelligent decision to send an offer only to customers who are likely to take it. Finally, the company may also want to determine which customers are going to be profitable over a period of time and send offers only to those customers. To maintain this quantity of models, the company needs to manage model versions and move towards automated data mining.

Conclusion

A company can employ data mining as a secret weapon to gain competitive advantage. At present, one highly successful and popular business application of data mining is Database Marketing. Customer patterns and trends are extracted by mining historical customer data, and specific customer profiles are built from the results in order to produce effective marketing campaigns. The company must perform its data mining operations within a specific time interval so that it can respond to a market opportunity before its competitors. For example, a company may send out a new version of its product catalogue every two months. During this period, it must collect data about a set of consumers by sampling (collecting the sales data from the previous version of the catalogue and combining it with demographic data), mine the collected data, identify the customer segments, i.e. determine the consumers to whom a particular catalogue will be sent, and send the catalogue.

References

Data Mining. Retrieved Oct 07, 2009 from http://en.wikipedia.org/wiki/Data_mining
http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm
Han, J. and Kamber, M. (2006). Data Mining: Concepts and Techniques (Second Edition). University of Illinois at Urbana-Champaign.
Sumathi, S. and Sivanandam, S.N. (2006). Data Mining Tasks, Techniques, and Applications. Studies in Computational Intelligence (SCI) 29, 195–216.
