About the Book
This book arose out of a data mining course at MIT’s Sloan School of Management. Preparation for the course revealed that there are a number of excellent books on the business context of data mining, but their coverage of the statistical and machine learning algorithms and theoretical underpinnings is not sufficiently detailed to provide a practical guide for users who possess the raw skills and tools to analyze data. This book is intended for the business student (and practitioner) of data mining techniques, and the goal is threefold: (1) to provide both a theoretical and practical understanding of the key methods of classification, prediction, reduction and exploration that are at the heart of data mining; (2) to provide a business decision-making context for these methods; and (3) using real business cases and data, to illustrate the application and interpretation of these methods. The book uses XLMiner™, an Excel® add-in available at no cost to registered instructors, to illustrate and interpret the data sets presented throughout. Real-life business cases are also presented so that readers can implement algorithms with a very low learning hurdle.
About the Authors
Galit Shmueli is Assistant Professor of Statistics in the Department of Decision & Information Technologies of the Robert H. Smith School of Business at the University of Maryland. Her main research areas include models for unique data structures, discrete distributions, anomaly detection, data visualization, and nonparametric smoothing methods.
Nitin R. Patel is Chairman, Founder, and Chief Technology Officer of Cambridge-based Cytel Incorporated and a Visiting Professor in the Engineering Systems Division at MIT. He has published well over 100 journal articles in a multitude of applied areas.
Peter C. Bruce is President and Owner of statistics.com, a leading provider of online professional development courses in statistics. He is also President of Resampling Stats, Inc., a statistical software developer.
Table of Contents
Foreword
Preface
Acknowledgments
1. Introduction
2. Overview of the Data Mining Process
3. Data Exploration and Dimension Reduction
4. Evaluating Classification and Predictive Performance
5. Multiple Linear Regression
6. Three Simple Classification Methods
7. Classification and Regression Trees
8. Logistic Regression
9. Neural Nets
10. Discriminant Analysis
11. Association Rules
12. Cluster Analysis
13. Cases
References
Index