Essays on Data Driven Decision Making with SAS Enterprise Miner Case Study


The paper "Data-Driven Decision Making with SAS Enterprise Miner" is a good example of a management case study. The health care facility provides health care services to approximately five million veterans. Of the five million, approximately 200,000 are HIV positive. To provide HIV care services effectively, accurate data on the number of infected and registered persons are vital. A large clinical database called the Immunology Case Registry (ICR) currently holds records for approximately 20,000 HIV-infected patients. The current algorithm for entering patients into the ICR database is based on positive HIV test results.

Additionally, the algorithm is based on a minimal number of selected HIV and AIDS ICD-9 codes. The accuracy and completeness of the algorithm are key determinants of how well patients can be managed. However, the algorithm is prone to errors from both random and systematic sources, which calls the accuracy and completeness of the ICR into question. Applying supervised data mining methods to develop an algorithm for patient identification and testing is intended to address the shortfalls of the preceding methodologies. The current algorithm depends entirely on diagnostic codes to determine the number of persons with HIV infection.
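The rule-based entry logic described above can be pictured with a minimal Python sketch. It is only an illustration: the text does not list the codes the registry actually uses, so the code set below (042 and V08, two ICD-9-CM codes for HIV disease and asymptomatic HIV infection status) and the `Patient` fields are assumptions.

```python
from dataclasses import dataclass, field

# Illustrative code list only: 042 and V08 are real ICD-9-CM codes, but the
# registry's actual selection of codes is not given in the text.
HIV_ICD9_CODES = {"042", "V08"}


@dataclass
class Patient:
    # Hypothetical record layout used only for this sketch.
    patient_id: str
    hiv_test_positive: bool = False
    icd9_codes: set = field(default_factory=set)


def qualifies_for_registry(patient: Patient) -> bool:
    """Rule-based entry: a positive test or any selected diagnostic code."""
    return patient.hiv_test_positive or bool(patient.icd9_codes & HIV_ICD9_CODES)


patients = [
    Patient("A", hiv_test_positive=True),
    Patient("B", icd9_codes={"042", "250.00"}),
    Patient("C", icd9_codes={"401.9"}),
]
registry = [p.patient_id for p in patients if qualifies_for_registry(p)]
print(registry)
```

A rule of this kind misses any infected patient whose record carries neither a positive test nor one of the selected codes, and it admits any patient with a miscoded diagnosis, which is one way systematic error can enter the registry.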

Beyond diagnostic codes, additional variables, including socioeconomic, geographic, laboratory, service utilization and pharmacy data, also play a major role in identifying patients. The proposed algorithm is developed by incorporating SAS Enterprise Miner with logistic regression (LR), neural network (NN) and decision tree (DT) models. The aim is to predict a binary outcome variable that determines HIV status and entry into the ICR.

Data needs

Data requirements and needs are met through a data mining approach.

Data mining is the process of finding meaningful patterns in large data sets that explain past events, in such a way that the patterns can be applied to new data to predict future events. A source of cases with known status is needed for training and validating predictive models. The data requirements are met through a series of steps broadly classified into two groups: data preparation and data analysis. Data preparation comprises four steps: building the analysis dataset, preparing the data, identifying pre-classified cases and creating the modelling sample. The first step entails building an Analysis Dataset (AD), which is drawn from varied data sources.

The AD contains a single record for each patient. The VA's databases include, among others, the National Patient Care Database (NPCD), the Decision Support System (DSS) and the Pharmacy Benefits Management (PBM) package. The data preparation step entails imputing missing values, cleaning the data and reducing its dimensionality, deriving new variables, and finally summarizing and transforming the data to keep one record per patient.
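As a rough illustration of these two steps, the sketch below summarizes three source extracts into one record per patient and then fills a missing laboratory value with the median of the observed values. The table layouts, field names (`cd4`, `antiretroviral`, `n_visits`) and the choice of median imputation are all assumptions made for the example; the text does not describe the actual NPCD, DSS or PBM structures or the imputation method used.

```python
from collections import defaultdict
from statistics import median

# Hypothetical extracts; the real source layouts are not described in the text.
visits = [  # one row per outpatient visit
    {"patient_id": 1, "clinic": "ID"},
    {"patient_id": 1, "clinic": "primary"},
    {"patient_id": 2, "clinic": "primary"},
]
labs = [  # one row per lab result; None marks a missing value
    {"patient_id": 1, "cd4": 350.0},
    {"patient_id": 2, "cd4": None},
    {"patient_id": 3, "cd4": 800.0},
]
prescriptions = [  # one row per pharmacy fill
    {"patient_id": 1, "antiretroviral": True},
    {"patient_id": 3, "antiretroviral": False},
]


def build_analysis_dataset(visits, labs, prescriptions):
    """Summarize the three sources into a single record per patient."""
    records = defaultdict(lambda: {"n_visits": 0, "cd4": None, "any_arv": False})
    for row in visits:
        records[row["patient_id"]]["n_visits"] += 1  # derived utilization count
    for row in labs:
        records[row["patient_id"]]["cd4"] = row["cd4"]  # last listed value wins
    for row in prescriptions:
        rec = records[row["patient_id"]]
        rec["any_arv"] = rec["any_arv"] or row["antiretroviral"]
    return dict(records)


def impute_median(records, key):
    """Replace missing values of `key` with the median of the observed ones."""
    observed = [r[key] for r in records.values() if r[key] is not None]
    fill = median(observed)
    for r in records.values():
        if r[key] is None:
            r[key] = fill
    return records


analysis = impute_median(build_analysis_dataset(visits, labs, prescriptions), "cd4")
for pid in sorted(analysis):
    print(pid, analysis[pid])
```

Counting visits per patient is one example of a derived variable, and collapsing many rows into one summary row per patient is what keeps the dataset at a single record per patient.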

Data preparation is carried out in consultation with clinical experts, who have better knowledge of the most important variables. Using available records, it is possible to identify pre-classified cases, which are either true positives or true negatives. True positives are patients who have already tested positive, whilst true negatives are patients who have tested negative. Once the data have been pre-classified, samples can be created from them. The samples represent the different variables identified.
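To make the role of pre-classified cases concrete, the following Python sketch splits labelled cases into training and validation samples and fits a small logistic regression to predict the binary status. It is a stand-in written from scratch, not the SAS Enterprise Miner workflow the case study refers to. The two features, the synthetic values, the 70/30 split, the learning rate and the number of epochs are all assumptions chosen for illustration.

```python
import math
import random


def stratified_split(cases, train_fraction=0.7, seed=0):
    """Split labelled cases so both classes appear in training and validation."""
    rng = random.Random(seed)
    train, valid = [], []
    for label in (0, 1):
        group = [c for c in cases if c[1] == label]
        rng.shuffle(group)
        cut = int(len(group) * train_fraction)
        train.extend(group[:cut])
        valid.extend(group[cut:])
    return train, valid


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(train, epochs=2000, lr=0.5):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(train[0][0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in train:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(train) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(train)
    return w, b


def predict(model, x):
    """Return 1 (predicted positive) when the estimated probability is >= 0.5."""
    w, b = model
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Synthetic pre-classified cases: ([has_hiv_code, scaled_arv_fills], label).
# Label 1 = true positive, 0 = true negative. All values are invented.
cases = [([1.0, 0.60 + 0.04 * i], 1) for i in range(10)]
cases += [([0.0, 0.04 * i], 0) for i in range(10)]

train, valid = stratified_split(cases)
model = fit_logistic(train)
accuracy = sum(predict(model, x) == y for x, y in valid) / len(valid)
print(len(train), len(valid), accuracy)
```

Splitting within each class keeps true positives and true negatives in both samples, so the validation sample can check how well a model trained on known cases generalizes to cases it has not seen.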

