The paper examines the opportunities in and possibilities arising from big data in retailing, particularly along five major data dimensions—data pertaining to customers, products, time, (geo-spatial) location, and channel. Much of the increase in data quality and application possibilities comes from a mix of new data sources, a smart application of statistical tools, and domain knowledge combined with theoretical insights. The importance of theory in guiding any systematic search for answers to retailing questions, as well as for streamlining analysis, remains undiminished, even as the role of big data and predictive analytics in retailing is set to rise in importance, aided by newer sources of data and large-scale correlational techniques. The statistical issues discussed include a particular focus on the relevance and uses of Bayesian analysis techniques (data borrowing, updating, augmentation, and hierarchical modeling), predictive analytics using big data, and a field experiment, all in a retailing context. Finally, the ethical and privacy issues that may arise from the use of big data in retailing are also highlighted.
Firms using online advertising regularly run experiments with multiple versions of their ads since they are uncertain about which ones are most effective. During a campaign, firms try to adapt to intermediate results of their tests, optimizing what they earn while learning about their ads. Yet how should they decide what percentage of impressions to allocate to each ad? This paper answers that question, resolving the well-known “learn-and-earn” trade-off using multi-armed bandit (MAB) methods. The online advertiser’s MAB problem, however, contains particular challenges, such as a hierarchical structure (ads within a website), attributes of actions (creative elements of an ad), and batched decisions (millions of impressions at a time), that are not fully accommodated by existing MAB methods. Our approach captures how the impact of observable ad attributes on ad effectiveness differs by website in unobserved ways, and our policy generates allocations of impressions that can be used in practice. We implemented this policy in a live field experiment delivering over 750 million ad impressions in an online display campaign with a large retail bank. Over the course of two months, our policy achieved an 8% improvement in the customer acquisition rate, relative to a control policy, without any additional costs to the bank. Beyond the actual experiment, we performed counterfactual simulations to evaluate a range of alternative model specifications and allocation rules in MAB policies. Finally, we show that customer acquisition would decrease by about 10% if the firm were to optimize click-through rates instead of conversions directly, a finding that has implications for understanding the marketing funnel.
Data are available at https://doi.org/10.1287/mksc.2016.1023.
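The paper’s hierarchical, attribute-based MAB policy is far richer than anything shown here, but the batched “learn-and-earn” allocation idea can be sketched with plain Thompson sampling. The ad names, conversion counts, and Beta priors below are made up for illustration; the share of posterior “wins” per ad serves as the fraction of the next batch of impressions it receives.

```python
import random

# Hypothetical per-ad conversion data; Beta(1, 1) priors updated with
# observed conversions (alpha) and non-conversions (beta).
ads = {
    "ad_A": {"alpha": 1 + 40, "beta": 1 + 9960},   # 40 conversions / 10,000 impressions
    "ad_B": {"alpha": 1 + 55, "beta": 1 + 9945},
    "ad_C": {"alpha": 1 + 30, "beta": 1 + 9970},
}

def batched_thompson_allocation(ads, n_draws=10_000, seed=0):
    """Approximate the probability that each ad has the highest conversion
    rate by repeated posterior sampling; use those probabilities as the
    shares of the next batch of impressions."""
    rng = random.Random(seed)
    wins = {name: 0 for name in ads}
    for _ in range(n_draws):
        samples = {name: rng.betavariate(p["alpha"], p["beta"])
                   for name, p in ads.items()}
        wins[max(samples, key=samples.get)] += 1
    return {name: w / n_draws for name, w in wins.items()}

shares = batched_thompson_allocation(ads)
print(shares)  # ad_B, with the best observed rate, gets the largest share
```

Allocating by posterior win probability (rather than sending everything to the current best ad) is what lets the policy keep “learning” about weaker ads while still “earning” from the stronger ones.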
Market structure analysis is a basic pillar of marketing research. Classic challenges in marketing such as pricing, campaign management, brand positioning, and new product development are rooted in an analysis of product substitutes and complements inferred from market structure. In this article, the authors present a method to support the analysis and visualization of market structure by automatically eliciting product attributes and brands' relative positions from online customer reviews. First, the method uncovers attributes and attribute dimensions using the "voice of the consumer," as reflected in customer reviews, rather than that of manufacturers. Second, the approach runs automatically. Third, the process supports rather than supplants managerial judgment by reinforcing or augmenting attributes and dimensions found through traditional surveys and focus groups. The authors test the approach on six years of customer reviews for digital cameras during a period of rapid market evolution. They analyze and visualize results in several ways, including comparisons with expert buying guides, a laboratory survey, and correspondence analysis of automatically discovered product attributes. The authors evaluate managerial insights drawn from the analysis with respect to proprietary market research reports from the same period analyzing digital imaging products.
In recent years, customer lifetime value (CLV) has gained increasing importance in both academia and practice. Although many advanced techniques have been proposed, the recency/frequency/monetary value (RFM) segmentation framework, and its related probability models, remain a CLV mainstay. In this article, we demonstrate the deficiency in RFM as a basis for summarizing customer history (data compression), and extend the framework to include clumpiness (C) via a metric-based approach. Our main empirical finding is that C adds to the predictive power, above and beyond RFM and firm marketing action, of the churn, incidence, and monetary value components of CLV. Hence, we recommend a significant implementation change: from RFM to RFMC.
This work is also motivated by noting that although statistical models based on RFM summaries can fit well in aggregate, their use can lead to significant micro-level (e.g., ranking of customers) prediction errors unless C is captured. A set of detailed empirical studies using data from a large North American retailer, in addition to six companies that vary in their business models: two traditional (e.g., CDNow.com) and four Internet (e.g., Hulu.com), demonstrate that the “clumpiness phenomenon” is widely prevalent, and that companies with “bingeable content” have both high-potential and high-risk segments that were previously unseen but are uncovered by the new framework: RFM to RFMC.
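The abstract does not define the clumpiness metric itself. One entropy-style score in the spirit of the paper's metric-based approach (the exact definition and scaling here are assumptions, not taken from the paper) compares the observed inter-event gaps against perfectly even spacing:

```python
import math

def clumpiness(event_times, horizon):
    """Entropy-style clumpiness score in [0, 1] for events on (0, horizon].
    0 = perfectly even spacing; higher values = events bunched together.
    (Illustrative formulation, not necessarily the paper's exact metric.)"""
    ts = sorted(event_times)
    n = len(ts)
    if n < 1:
        raise ValueError("need at least one event")
    # n + 1 inter-event gaps, including both boundaries, normalized to sum to 1
    bounds = [0.0] + [float(t) for t in ts] + [float(horizon)]
    gaps = [(b - a) / horizon for a, b in zip(bounds, bounds[1:])]
    entropy = -sum(g * math.log(g) for g in gaps if g > 0)
    return 1 - entropy / math.log(n + 1)

# Three store visits over a 52-week year:
print(clumpiness([13, 26, 39], 52))    # evenly spaced shopper: score is 0
print(clumpiness([10, 10.5, 11], 52))  # all visits in one week: much higher score
```

Under this formulation, two customers with identical R, F, and M can receive very different C scores, which is exactly the micro-level information the abstract argues RFM alone throws away.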
CONTEXT In response to concerns about the quality of care in US hospitals, the Centers for Medicare & Medicaid Services began measuring hospital performance and reporting this performance on their Web site, Hospital Compare. It is unknown whether these process performance measures are related to hospital-level outcomes. OBJECTIVE To determine whether quality measured with the process measures used in Hospital Compare are correlated with and predictive of hospitals' risk-adjusted mortality rates. DESIGN, SETTING, AND PARTICIPANTS Cross-sectional study of hospital care between January 1 and December 31, 2004, for acute myocardial infarction, heart failure, and pneumonia at acute care hospitals in the United States included on the Hospital Compare Web site. Ten process performance measures included in Hospital Compare were compared with hospital risk-adjusted mortality rates, which were measured using Medicare Part A claims data. MAIN OUTCOME MEASURES Condition-specific inpatient, 30-day, and 1-year risk-adjusted mortality rates. RESULTS A total of 3657 acute care hospitals were included in the study based on their performance as reported in Hospital Compare. Across all acute myocardial infarction performance measures, the absolute reduction in risk-adjusted mortality rates between hospitals performing in the 25th percentile vs those performing in the 75th percentile was 0.005 for inpatient mortality, 0.006 for 30-day mortality, and 0.012 for 1-year mortality (P<.001 for each comparison). For the heart failure performance measures, the absolute mortality reduction was smaller, ranging from 0.001 for inpatient mortality (P = .03) to 0.002 for 1-year mortality (P = .08). For the pneumonia performance measures, the absolute reduction in mortality ranged from 0.001 for 30-day mortality (P = .05) to 0.005 for inpatient mortality (P<.001).
Differences in mortality rates for hospitals performing in the 75th percentile on all measures within a condition vs those performing lower than the 25th percentile on all reported measures for acute myocardial infarction ranged between 0.008 (P = .06) and 0.018 (P = .008). For pneumonia, the effects ranged between 0.003 (P = .09) and 0.014 (P<.001); for heart failure, the effects ranged between −0.013 (P = .06) and −0.038 (P = .45). CONCLUSIONS Hospital performance measures predict small differences in hospital risk-adjusted mortality rates. Efforts should be made to develop performance measures that are tightly linked to patient outcomes.
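The study's core comparison (mean risk-adjusted mortality for hospitals at or below the 25th percentile of a process measure versus those at or above the 75th) is a simple quantile-group contrast. The hospital scores and rates below are invented for illustration; the real analysis uses Medicare claims data.

```python
import statistics

# Hypothetical hospital-level data: (process-measure score,
# risk-adjusted 30-day mortality rate). Illustrative numbers only.
hospitals = [
    (0.60, 0.190), (0.70, 0.186), (0.75, 0.185), (0.80, 0.182),
    (0.85, 0.181), (0.88, 0.180), (0.90, 0.179), (0.95, 0.178),
]

scores = [s for s, _ in hospitals]
quartiles = statistics.quantiles(scores, n=4)  # [q25, q50, q75]
q25, q75 = quartiles[0], quartiles[2]

def mean_mortality(keep):
    """Mean mortality over hospitals whose score satisfies `keep`."""
    return statistics.mean(m for s, m in hospitals if keep(s))

# Absolute mortality reduction: low performers minus high performers
reduction = mean_mortality(lambda s: s <= q25) - mean_mortality(lambda s: s >= q75)
print(round(reduction, 4))
```

A positive `reduction` means higher-scoring hospitals have lower mortality; the paper's point is that these absolute differences, while statistically detectable, are small in magnitude.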
We examine three sets of established behavioral hypotheses about consumers’ in‐store behavior using field data on grocery store shopping paths and purchases. Our results provide field evidence for the following empirical regularities. First, as consumers spend more time in the store, they become more purposeful—they are less likely to spend time on exploration and more likely to shop/buy. Second, consistent with “licensing” behavior, after purchasing virtue categories, consumers are more likely to shop at locations that carry vice categories. Third, the presence of other shoppers attracts consumers toward a store zone but reduces consumers’ tendency to shop there.
Recent trends in marketing have demonstrated an increased focus on in-store expenditures with the hope of "grabbing consumers" at the point of purchase, but does this make sense? To help answer this question, the authors examine the interplay between in-store and out-of-store factors on consumer attention to and evaluation of brands displayed on supermarket shelves. Using an eye-tracking experiment, they find that the number of facings has a strong impact on evaluation that is entirely mediated by its effect on visual attention and works particularly well for frequent users of the brand, for low-market-share brands, and for young and highly educated consumers who are willing to trade off brand and price. They also find that gaining in-store attention is not always sufficient to drive sales. For example, top- and middle-shelf positions gain more attention than low-shelf positions; however, only top-shelf positions carry through to brand evaluation. The results underscore the importance of combining eye-tracking and purchase data to obtain a full picture of the effects of in-store and out-of-store marketing at the point of purchase.
While single-brand reward programs encourage customers to remain loyal to that one brand, coalition programs encourage customers to be “promiscuous” by offering points redeemable across partner stores. Despite the benefits of this “open relationship” with customers, store managers face uncertainty as to how rewards offered by partners influence transactions at their own stores. We use a model of multi-store purchase incidence and spend to show how the value of points shared among partner stores can explain patterns in customer-level purchases across them. We also allow reward spillovers to be moderated by three measures of store affinity that characterize a coalition’s portfolio: the relative popularity, geographic distance, and overlap in product categories between each pair of stores.
Using an event in which the loyalty program uniformly devalued reward points across the entire coalition, we show that cross-reward effects weakened, leading to larger financial losses for the most popular stores. For the coalition studied, popularity affinity was the main determinant of the valence of cross-reward effects, both before and after the devaluation; in contrast, category and geographic affinity had a smaller and more heterogeneous impact. While we do not observe changes to the composition of the coalition’s portfolio, our results also suggest that the value of a shared reward currency may be driven by the inclusion of smaller partners.
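The mechanics of a cross-reward spillover moderated by pairwise affinity can be sketched with a toy purchase-incidence model. Everything below (the three stores, the utility parameters, the single affinity weight per pair) is an invented simplification, not the authors' specification, but it shows how a coalition-wide point devaluation propagates through both own- and cross-store terms:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Hypothetical three-store coalition. gamma_cross scales how much points
# earned at partner stores spill over, weighted by pairwise affinity
# (standing in for popularity / geographic / category affinity).
base_utility = {"grocer": -1.0, "fuel": -1.5, "pharmacy": -2.0}
affinity = {  # symmetric affinity weights between store pairs
    ("grocer", "fuel"): 0.6,
    ("grocer", "pharmacy"): 0.3,
    ("fuel", "pharmacy"): 0.1,
}
beta_own, gamma_cross = 0.8, 0.4

def purchase_prob(store, points, point_value=1.0):
    """P(purchase at `store`) given each store's outstanding reward points,
    all scaled by the coalition-wide point value (a devaluation lowers it)."""
    u = base_utility[store] + beta_own * point_value * points[store]
    for other, pts in points.items():
        if other != store:
            w = affinity.get((store, other)) or affinity.get((other, store))
            u += gamma_cross * w * point_value * pts
    return sigmoid(u)

points = {"grocer": 1.0, "fuel": 0.5, "pharmacy": 0.2}
p_before = purchase_prob("grocer", points, point_value=1.0)
p_after = purchase_prob("grocer", points, point_value=0.5)  # uniform devaluation
print(p_before, p_after)  # devaluation lowers the purchase probability
```

Because the devaluation multiplies both the own-points and the affinity-weighted partner-points terms, stores with high-affinity, point-rich partners lose the most — consistent with the finding that the most popular stores bore the largest losses.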
Following trends in entertainment streaming services, online educational platforms are increasingly offering users flexible “on-demand” content options. It is important to understand how the timing of content release affects learning behaviors and firm revenue drivers. The current research studies over 67,000 users taking a marketing course before versus after a natural experiment in which the platform switched the course from a scheduled weekly-release format to an on-demand format with all content immediately available. The switch to on-demand positively impacted short-term firm revenue by increasing the number and proportion of certificate-paying users, suggesting that on-demand content can attract a broader set of consumers who value flexibility. On the downside, the switch resulted in users exhibiting lower lecture completion rates and quiz performance and taking fewer additional business courses on the platform, representing a long-term cost. The results were robust to propensity score matching and stratification. The analyses also revealed that on-demand content enabled learning patterns that deviated from a standard evenly paced schedule, including “strategic” binge learning and stretching out engagement past the recommended course period. Thus, while on-demand formats can boost revenues by bringing in more paying users, managers must consider new strategies for maintaining performance and engagement levels within these environments.
We predict the popularity of short messages called tweets created in the micro-blogging site known as Twitter. We measure the popularity of a tweet by the time-series path of its retweets, i.e., the times at which people forward the tweet to others. We develop a probabilistic model for the evolution of the retweets using a Bayesian approach, and form predictions using only observations on the retweet times and the local network or "graph" structure of the retweeters. We obtain good step-ahead forecasts and predictions of the final total number of retweets even when only a small fraction (i.e., less than one tenth) of the retweet path is observed. This translates to good predictions within a few minutes of a tweet being posted, and has potential implications for understanding the spread of broader ideas, memes or trends in social networks.
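The authors' model is a full Bayesian time-series model that also exploits the retweet graph; the toy Gamma-Poisson sketch below captures only the simplest ingredient of that approach — updating a prior on the retweet rate from a short observation window and projecting a final total. The prior parameters, constant-rate assumption, and numbers are all illustrative.

```python
# Toy Gamma-Poisson updating for retweet counts (illustrative only; the
# paper's model is far richer and uses the retweeters' graph structure).
# Prior: retweet rate lambda ~ Gamma(a0, b0) per minute (assumed values).
a0, b0 = 2.0, 1.0

def posterior_predict(retweets_so_far, minutes_observed, horizon_minutes):
    """Conjugate update of lambda from the observed window, then a point
    forecast of the total retweets at the horizon (assumes a constant
    rate, which real retweet cascades violate)."""
    a = a0 + retweets_so_far          # Gamma posterior shape
    b = b0 + minutes_observed         # Gamma posterior rate
    rate = a / b                      # posterior mean retweets per minute
    return retweets_so_far + rate * (horizon_minutes - minutes_observed)

# After observing 12 retweets in the first 3 minutes, forecast the
# 60-minute total:
print(posterior_predict(12, 3, 60))   # → 211.5
```

Even this crude version illustrates the abstract's point that a forecast can be formed from a small early fraction of the retweet path; the paper's contribution is doing so accurately by modeling how the rate decays and spreads over the local network.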