A Nonparametric Approach to Modeling Choice with Limited Data

V. Farias, S. Jagabathula, D. Shah
Management Science, Volume 59, No. 2, pp. 305-322, 2013

Choice models today are ubiquitous across a range of applications in operations and marketing. Real-world implementations of many of these models face the formidable stumbling block of simply identifying the “right” model of choice to use. Because models of choice are inherently high-dimensional objects, the typical approach to dealing with this problem is positing, a priori, a parametric model that one believes adequately captures choice behavior. This approach can be substantially suboptimal in scenarios where one cares about using the choice model learned to make fine-grained predictions; one must contend with the risks of mis-specification and overfitting/underfitting. Thus motivated, we visit the following problem: For a “generic” model of consumer choice (namely, distributions over preference lists) and a limited amount of data on how consumers actually make decisions (such as marginal information about these distributions), how may one predict revenues from offering a particular assortment of choices? An outcome of our investigation is a nonparametric approach in which the data automatically select the right choice model for revenue predictions. The approach is practical. Using a data set consisting of automobile sales transaction data from a major U.S. automaker, our method demonstrates a 20% improvement in prediction accuracy over state-of-the-art benchmark models; this improvement can translate into a 10% increase in revenues from optimizing the offer set. We also address a number of theoretical issues, among them a qualitative examination of the choice models implicitly learned by the approach. We believe that this paper takes a step toward “automating” the crucial task of choice model selection.
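To make the problem statement concrete, the following minimal sketch shows how expected revenue for one assortment could be computed when a choice model is represented as a distribution over preference lists. It is an illustration only, not the paper's estimation method: it assumes the distribution is already known, whereas the paper works from limited marginal data. The names `NO_PURCHASE`, `choice`, and `expected_revenue`, the toy probabilities, and the prices are all hypothetical, as is the assumption that each customer buys the highest-ranked offered product unless the no-purchase option ranks above it.

```python
from typing import Dict, Set, Tuple

NO_PURCHASE = 0  # hypothetical label for the outside (no-purchase) option


def choice(ranking: Tuple[int, ...], offered: Set[int]) -> int:
    """Return the first item in the preference list that is available.

    The no-purchase option is treated as always available.
    """
    for product in ranking:
        if product == NO_PURCHASE or product in offered:
            return product
    return NO_PURCHASE


def expected_revenue(
    model: Dict[Tuple[int, ...], float],
    prices: Dict[int, float],
    offered: Set[int],
) -> float:
    """Average the price of the chosen product over the distribution."""
    return sum(
        prob * prices.get(choice(ranking, offered), 0.0)
        for ranking, prob in model.items()
    )


# Toy distribution over preference lists (most preferred first); illustrative values.
model = {
    (1, 2, 0, 3): 0.5,
    (2, 3, 1, 0): 0.3,
    (3, 0, 1, 2): 0.2,
}
prices = {1: 10.0, 2: 8.0, 3: 5.0}

revenue = expected_revenue(model, prices, offered={1, 3})
print(revenue)
```

In this toy instance, the three customer types choose products 1, 3, and 3 respectively when the assortment {1, 3} is offered, so the expected revenue is 0.5 × 10 + 0.3 × 5 + 0.2 × 5 = 7.5.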