J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
Optimization Technique in Data Mining YOGENDRA YATI Department of Computer Science and Application S.V. College, Bairagarh, Bhopal(M.P.) INDIA ABSTRACT Electronic customer relationship management (eCRM) problems has used either datamining (DM) or optimization methods, but has not combined the two approaches. By leveraging the strengths of both approaches, the eCRM problems of customer analysis, customer interactions, and the optimization of performance metrics (such as the lifetime value of a customer on the Web) can be better analyzed. In particular, many eCRM problems have been traditionally addressed using DM methods. There are opportunities for optimization to improve these methods, and this paper describes these opportunities. Key Words: Data Mining, Optimization, eCRM.
1. INTRODUCTION Optimization methods for solving data-mining (DM) problems1,5 and used DM methods for solving optimization problems2. In this survey, we systematically explore how optimization and DM can help one another for certain customer relationship management (CRM) applications in e-commerce, termed analytical eCRM3. Analytical eCRM includes customer analysis, customer interactions, and the optimization of various performance metrics, such as customer lifetime value in Web-based e-commerce. To illustrate how optimization and DM interact in eCRM settings, consider the following two important eCRM problems. The first deals with finding the optimal lifetime value (LTV) of a customer by determining proactive customer interaction strategies resulting in maximal lifetime profits from that customer. Because this LTV optimization problem also relies on the analysis of large volumes of customer data in the eCRM setting, it can be augmented with DM techniques to help improve the LTV model by better
estimating its various parameters. The second eCRM problem, primarily in the DM domain, pertains to preprocessing click-stream data as the basis for building DM models, such as purchase prediction models.4 show that inappropriate preprocessing of data can result in significantly worse DM models for critical eCRM problems. Given the nature of clickstream data and the fact that hundreds of derived variables can be created from this data for a user session, it is important to partition the click-stream data into an optimal set of sessions and to select an optimal set of derived variables to produce the best DM model from this data. Both these examples show how DM and optimization can interact when solving various eCRM problems. Understanding interactions between DM and optimization is particularly important in the eCRM context. First, many eCRM problems can be framed as optimization problems, and the domain is characterized by the availability of large volumes of data, often measured in terabytes. In this way, we can explore how patterns generated from DM can be combined with optimization approaches.
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)
681
Yogendra Yati, J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
Second, many eCRM problems deal with online customers interacting with an eCRM system and, therefore, require real-time solutions. Using the broader context of eCRM applications as motivation, this paper surveys from a theoretical perspective how optimization and DM can be synergistically used. In particular, we present in [section] 2, various eCRM applications and discuss opportunities for optimization. These eCRM applications include (1) maximization of customer LTV,(2) customer analysis, including preprocessing clickstream data and building profiles, and (3) customer interaction methods, including website design and personalization. Many of the eCRM applications discussed in §2 have been traditionally addressed using DM, and we note that there are significant opportunities for optimization to help the DM approaches. We present a theoretical discussion of how optimization can help in various DM problems in §3. The seven DM problems we discuss are: (1) feature selection, (2) active learning, (3) DM model optimization, (4) selection of the best DM model or pattern, (5) classification using mathematical programming, (6) clustering, and (7) rule and constraint discovery. While the interactions between DM and optimization can be bidirectional, a detailed treatment of how DM can help optimization is beyond the scope of this paper. However, an online appendix (at mansci.pubs.informs.org/ecompanion.html) of this paper presents various examples of how DM can help optimization and discusses additional research opportunities. We conclude in §4 by listing several key research opportunities that emerge from this paper. At a general level, in eCRM, these opportunities lie in (1) using DM to help exploit the data better for eCRM problems that are naturally formulated as optimization problems (e.g.,
selecting the 10 best items to recommend), and (2) using optimization for eCRM problems that are more naturally formulated or currently considered as DM problems (e.g., building good predictive models or learning customer patterns from data to build profiles). 2. OPTIMIZATION IN eCRM The purpose of CRM is to identify, acquire, serve, and retain profitable customers by interacting with them in an integrated way across a range of communication channels.3 describes analytical eCRM as a four-step iterative process consisting of (1) collecting and integrating online customer data, (2) analyzing this data, (3) building interactions with customers based on this analysis such that certain performance metrics such as LTV are optimized, and (4) measuring the effectiveness of these interactions in terms of these performance metrics. In this section, we will examine how various optimization and DM methods help in providing better eCRM solutions for these steps, with a focus on steps (2)–(4). We start in [section] 2.1 by studying one of the most important performance metrics—the LTV of a customer. In[section] 2.2, we discuss customer analysis problems and, in [section]2.3, we discuss customer interaction problems in eCRM. 2.1. Maximizing Lifetime Value A typical performance metric used in many CRM applications is the LTV of a customer. One of the key questions in CRM is how to develop proactive customer interaction strategies that maximize LTV. This key problem is viewed by some as the “holy grail” of eCRM. Substantial work has been done on modeling LTV, mainly in the marketing literature5,6.
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)
682
Yogendra Yati, J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
Traditionally, the problem of estimating LTV is divided into two components: (1) estimating how long a customer will stay, and (2) estimating the flow of revenue from the customer during this period. Estimating the flow of revenue during a customer’s lifetime was done with parametric models in the marketing literature. With respect to estimating customer tenure,5 provide analytical models that determined whether a customer at any point in time is “active,” by identifying a set of qualitative criteria that capture when a customer is more likely to be active. The concept of customer retention is a natural extension of this approach, because it deals with trying to prevent a customer from becoming inactive and, hence, is a proactive method of increasing LTV. This notion of customer interaction is further explored in7 in which the problem of optimal communication to maximize LTV is addressed. In their approach, they assume that communications (e.g., mailings) are sent at a fixed time interval (ө) and the problem is determining the optimal(ө). Based on simplifying assumptions regarding the probability of a customer being active at time i to be pi and on equal expected profit, A, from each communication, they derive value of a customer, V , as a function of (ө) as v(ө)=∑A(ө).(p(ө)i / (1+r) өi 7
show that both too little and too much communication can result in a firm’s failure to capture adequate value from its customers. This analysis provides useful insights into the LTV optimization problem. However, this work considers only optimal communication frequency with customers and the model does not take into account the existence of data on customer behavior.
In contrast, the DM community studied the LTV problem in the presence of large volumes of customer data8,9. In particular,8 predict customer tenure using classical survival analysis methods by building a neural network and training it on past customer data. However,8 do not address the problem of computing optimal parameters for LTV models. Similarly,9 compute LTV based on large volumes of customer data by focusing on using DM to estimate customer churn and future revenues. They use statistical and DM methods to estimate future revenues v(t) from the customer and probabilities S(t) that the customer will still be active at various times t in the future. To compute more realistic estimates of S(t),9 group customers into segments and make a simplifying segment homogeneity assumption. This assumption makes it possible to use simple nonparametric estimations of probabilities S(t) for the segments. Once the values v(t) and S(t) are estimated and the LTV values are computed, the next step is to design incentives for the customers that maximize their LTVs. As an example,9 discuss a case in the wireless industry where service providers need to select the best incentive (such as reduced prices, handset upgrade, a free battery) to offer to each customer. In general, this is an optimization problem, although the9 paper determines the best incentive by exhaustive search across a relatively small number of incentives. Because DM methods are used for computing LTV values, and optimization methods determine incentives resulting in highest LTV values, such applications provide examples of how DM methods can help solve optimization problems. We would like to point out that it is difficult to solve an LTV optimization problem in the presence of large volumes of data, and9 stopped short of doing this. Similarly,7 studied
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)
683
Yogendra Yati, J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
only an optimization problem and did not deal with large volumes of data. Therefore, developing new algorithms computing optimal LTVs and utilizing optimization and DM methods in the presence of large volumes of data constitutes an important and challenging problem for operations research/management science (OR/MS) researchers. One way to deal with the complexity of the general LTV optimization problem is to reduce it to simpler types of problems. One such reduction may consider various heuristics producing higher LTV values, rather than attempting to find an optimal solution. For example, we can study customer attrition problems and determine policies that will result in lower attrition rates and, therefore, higher customer LTVs. Another way to deal with this complexity is to seek to optimize simpler performance measures that can serve as proxies for LTV. For example, one can use customer satisfaction rates with various offerings as one such measure. Much of the eCRM literature on customer analysis and interactions discussed below illustrate these two ways of dealing with the complexity of the general LTV optimization problem. 2.2. Customer Analysis in eCRM Customer analysis includes two main steps in the eCRM context: (1) preprocessing data that tracks various online activities of the customers—this involves starting with individual user clicks on a site and constructing logical user “sessions” and summary variables; and (2) building customer profiles from this and other data. At a general level, a profile is a set of patterns that describe a user. DM is used to learn these patterns from data. Customer profiles are then built from these results. 2.2.1. Preprocessing Click-Stream Data. As argued by4, data preprocessing is a critical step of the knowledge discovery
process in eCRM, and the success of most DM methods to a large extent depends on this step. Therefore, any method that directly improves data preprocessing affects DM results. In this section, we examine how optimization can facilitate better data preprocessing of clickstream data. An alternative measure is the variance of page content as defined by a set of keywords in each page, i.e., based on how much the keywords in one page differ from those on another page. The problem can then be defined as follows. Given a number of desired sessions n, a minimum and maximum length for each session, and the requirement that only consecutive pages form a session, find an optimal partitioning of the click-stream data into n sessions. Optimality can be defined in many ways, one of which is to achieve dual objectives (maximizing intrasession similarities and intersession differences): Minimize ∑ σi / n and Maximize ∑ ( µi – avg(µi) )2 / n An opportunity for the optimization community is to determine how to formalize the session optimization approach outlined above, determine good similarity measures, and find optimal solutions for these measures. In this mode of interaction, optimization helps DM by preprocessing data to build better DM solutions. Note that a related use of optimization during preprocessing is the use of optimization models to select the set of variables used to build a DM model. Because this is not specific to eCRM, we discuss this further in §3.1. 2.3. Customer Interactions in eCRM In this section, we will review the problems facilitating better interactions with the customers. In particular, we will focus on website design and on personalization problems.
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)
684
Yogendra Yati, J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
2.3.1. Website Design. Websites can be viewed as user interfaces to information sources, and it is a challenge to construct efficient websites that provide customers with a good interface. Current DM approaches are based on mining Web logfiles for path traversal patterns and restructuring sites based on the analysis. 3. HOW OPTIMIZATION TRADITIONAL DM PROBLEMS Optimization can contribute to DM in one of two ways: (1) optimization can be a component of a larger DM process, or (2) new DM techniques can be built using entirely optimization-based methods. For example, optimization as a component of a larger DM technique can be used as a method for determining the selection of the best decision tree out of a set of previously generated decision trees using a genetic algorithm5, and estimating the optimal weights of a multilayer neural network using gradient descent10. Support Vector Machines (SVM)1 fall into the second category, because selection of an optimal separating hyperplane (or a surface) constitutes the DM method. 4. CONCLUSIONS Most firms today recognize the importance of building and maintaining strong relationships with their customers. As firms increasingly rely on their online presence to interact with customers, eCRM will continue to grow in importance. Addressing eCRM problems by utilizing the strengths of both DM and optimization can, therefore, be a useful area for future research. A better understanding 7. of the problems in eCRM and of the possible 8. ways in which DM and optimization9.can
interact, can help exploit the opportunities for synergistically using both approaches in the solution of eCRM problems. To this end, in this paper we (1) surveyed various eCRM applications where such interactions are important, and showed that relatively little work has been done on exploring these interactions in the eCRM context and (2) described different ways in which optimization can help DM by surveying extensive existing literature. More generally, the interactions between DM and optimization are bidirectional and the online appendix presents a treatment of how DM can help optimization. REFERENCES 1. Vapnik, V. N. The Nature of the Statistical Learning Theory. Springer, New York (1995). 2. Brijs, T., G. Swinnen, K. Vanhoof, G. Wets. Using association rules for product assortment decisions: A case study. Proc. ACM Internat. Conf. Knowledge Discovery Data Mining, San iego, CA (1999). 3. Swift, R. Analytical CRM powers profitable relationships: Creating success by letting customers guide you. DM Rev. (February) (2002). 4. Zheng, Z., B. Padmanabhan, S. Kimbrough. On data preprocessing biases in web usage mining. INFORMS J. Comput. 15(2) 148–170 (2003). 5. Fu, Z., B. Golden, S. Lele, S. Raghavan, E. Wasil. 2003. A genetic algorithm-based approach for building accurate decision trees. INFORMS J. Comput. 15(1) 3–22. 6. Dwyer, F. R. 1989. Customer lifetime valuation to support marketing decision making. J. Direct Marketing 3(4) 8–15. 7. Dreze, X., A. Bonfrer. To pester or leave alone: Lifetime value maximization through ptimal communication timing.
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)
685
Yogendra Yati, J. Comp. & Math. Sci. Vol. 1 (6), 680-685 (2010)
Working paper, Marketing Department, University of California, Los Angeles, CA (2002). 8. Mannino, M. V., V. S. Mookerjee. Optimizing expert systems: Heuristics for efficiently generating low cost information acquisition strategies. INFORMS J. Comput. 11(3) 278–291 (1999).. 9. Rosset, S., E. Neumann, U. Eick, N. Vatnik, Y. Idan. Customer lifetime value
modeling and its use for customer retention planning. Proc. ACM Internat. Conf. Knowledge Discovery Data Mining. Edmonton, Alberta, Canada (2002).. 10. Rumelhart, D. E., G. E. Hinton, R. J. Williams. Learning internal representations by error propagation. D. E. Rumelhart, J. L. McClelland, eds. Parallel Data Processing, Vol. 1. MIT Press, Cambridge, MA, 318–362 (1986).
Journal of Computer and Mathematical Sciences Vol. 1, Issue 6, 31 October, 2010 Pages (636-768)