• A formal decision analysis clarifies the economics of the customer targeting problem.
• Optimal targeting considers the marketing effect and customer response probability.
• Causal hurdle models jointly estimate treatment effect and response probability.
• Results from e-coupon targeting confirm the superiority of the analytical policy.
This study provides a formal analysis of the customer targeting problem when the cost of a marketing action depends on the customer response and proposes a framework to estimate the decision variables for campaign profit optimization. Targeting a customer is profitable if the impact and associated profit of the marketing treatment are higher than its cost. Despite the growing literature on uplift models to identify the strongest treatment-responders, no research has investigated optimal targeting when the costs of the treatment are unknown at the time of the targeting decision. Stochastic costs are ubiquitous in direct marketing and customer retention campaigns because marketing incentives are conditioned on a positive customer response. This study makes two contributions to the literature, which are evaluated on an e-commerce coupon targeting campaign. First, we formally analyze the targeting decision problem under response-dependent costs. Profit-optimal targeting requires an estimate of the treatment effect on the customer and an estimate of the customer response probability under treatment. The empirical results demonstrate that the consideration of treatment cost substantially increases campaign profit when used for customer targeting in combination with an estimate of the average or customer-level treatment effect. Second, we propose a framework to jointly estimate the treatment effect and the response probability by combining methods for causal inference with a hurdle mixture model. The proposed causal hurdle model achieves competitive campaign profit while streamlining model building. Code is available at https://github.com/Humboldt-WI/response-dependent-costs.
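The targeting condition described in this abstract can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the paper: every conversion yields a fixed margin, the coupon cost is incurred only if the customer converts, and `p_treated` and `p_control` are hypothetical model estimates of the conversion probability with and without the coupon. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class CustomerEstimate:
    """Hypothetical model outputs for one customer (names are illustrative)."""
    p_treated: float  # estimated conversion probability if the coupon is sent
    p_control: float  # estimated conversion probability without the coupon


def expected_profit_gain(est, margin, coupon_cost):
    """Expected incremental profit of targeting under response-dependent cost.

    Assumption: the coupon is paid for only on conversion, so the expected
    cost is p_treated * coupon_cost rather than a fixed amount.
    """
    uplift = est.p_treated - est.p_control
    return uplift * margin - est.p_treated * coupon_cost


def should_target(est, margin, coupon_cost):
    """Target only if the expected incremental profit is positive."""
    return expected_profit_gain(est, margin, coupon_cost) > 0


persuadable = CustomerEstimate(p_treated=0.30, p_control=0.10)
loyal = CustomerEstimate(p_treated=0.82, p_control=0.80)
```

With a margin of 20 and a coupon cost of 5, the persuadable customer has an expected gain of about 2.5 and is targeted, while the loyal customer has an expected gain of about -3.7: the small uplift does not cover the coupon that would almost certainly be redeemed.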
• New methodology to build predictive decision support models for campaign targeting.
• Integration of statistical and economic objectives to improve business performance.
• Comprehensive empirical analysis using 25 real-world marketing data sets.
• Substantial improvements over challenging benchmarks in accuracy and profitability.
Marketing messages are most effective if they reach the right customers. Deciding which customers to contact is an important task in campaign planning. The paper focuses on empirical targeting models. We argue that common practices to develop such models do not account sufficiently for business goals. To remedy this, we propose profit-conscious ensemble selection, a modeling framework that integrates statistical learning principles and business objectives in the form of campaign profit maximization. Studying the interplay between data-driven learning methods and their business value in real-world application contexts, the paper contributes to the emerging field of profit analytics and provides original insights into how to implement profit analytics in marketing. The paper also estimates the degree to which profit-conscious modeling adds to the bottom line. The results of a comprehensive empirical study confirm the business value of the proposed ensemble learning framework in that it recommends substantially more profitable target groups than several benchmarks.
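The abstract does not spell out the selection algorithm. As a rough sketch of the general idea, the code below assumes a greedy forward selection with replacement over a library of candidate models, where each candidate is judged by validation-set campaign profit rather than a statistical loss. The profit function, parameter names, and stopping rule are illustrative assumptions, not the authors' implementation.

```python
def campaign_profit(scores, responded, contact_cost, response_value, target_fraction):
    """Profit of contacting the top-scored fraction of customers."""
    n_target = max(1, int(len(scores) * target_fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(responded[i] * response_value - contact_cost for i in ranked[:n_target])


def select_ensemble(model_scores, responded, n_rounds=10, **profit_args):
    """Greedy forward selection (with replacement) of models by campaign profit.

    model_scores maps a model name to its validation-set scores. Returns the
    selected model names (duplicates act as weights) and the profit achieved.
    """
    selected = []
    ensemble_sum = [0.0] * len(responded)
    best_profit = float("-inf")
    for _ in range(n_rounds):
        best_name = None
        for name, scores in model_scores.items():
            # Average score of the ensemble if this model were added.
            candidate = [(s + x) / (len(selected) + 1)
                         for s, x in zip(ensemble_sum, scores)]
            profit = campaign_profit(candidate, responded, **profit_args)
            if profit > best_profit:
                best_name, best_profit = name, profit
        if best_name is None:  # no candidate improves validation profit
            break
        selected.append(best_name)
        ensemble_sum = [s + x for s, x in zip(ensemble_sum, model_scores[best_name])]
    return selected, best_profit


# Toy validation data: model "a" ranks all responders first, model "b" last.
responded = [1, 0, 1, 0, 0, 1]
model_scores = {
    "a": [0.9, 0.1, 0.8, 0.2, 0.3, 0.7],
    "b": [0.1, 0.9, 0.2, 0.8, 0.7, 0.3],
}
selected, profit = select_ensemble(
    model_scores, responded,
    contact_cost=1, response_value=10, target_fraction=0.5,
)
```

Selecting on campaign profit means that a model is added only if it changes who ends up in the targeted group in a profitable way, regardless of how it affects accuracy elsewhere in the ranking.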
• We assess the applicability of graph metrics to predict purchase probabilities.
• Real-world clickstream data of two online retailers is used.
• Graphs are derived out of sessions of website visitors.
• Distance- and centrality-based graph metrics are useful for prediction.
• Closeness vitality, radius, number of circles and self-loops are most important.
The prediction of online user behavior (next clicks, repeat visits, purchases, etc.) is a well-studied subject in research. Prediction models typically rely on clickstream data that is captured during a website visit and embodies user agent-, path-, time- and basket-related information. The aim of this paper is to propose an alternative approach to extract auxiliary information from the website navigation graph of individual users and to test the predictive power of this information. Using two large real-world datasets of online retailers, we develop an approach to construct within-session graphs from clickstream data and demonstrate the relevance of corresponding graph metrics to predict purchases.
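To make the idea of a within-session graph concrete, here is a small sketch. It assumes that a session is an ordered list of page identifiers and that each pair of consecutive page views becomes a directed edge, so a repeated view of the same page produces a self-loop. This construction and the function names are assumptions for illustration; the paper's exact procedure may differ. Only two of the metrics named in the highlights (radius and self-loops) are computed.

```python
from collections import deque


def build_session_graph(pages):
    """Directed graph of one session: an edge for each consecutive page view."""
    graph = {page: set() for page in pages}
    for src, dst in zip(pages, pages[1:]):
        graph[src].add(dst)
    return graph


def count_self_loops(graph):
    return sum(1 for node, nbrs in graph.items() if node in nbrs)


def eccentricity(undirected, start):
    """Longest shortest-path distance from start (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in undirected[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return max(dist.values())


def radius(graph):
    """Radius of the graph, ignoring edge direction and self-loops."""
    undirected = {node: set() for node in graph}
    for src, nbrs in graph.items():
        for dst in nbrs:
            if src != dst:
                undirected[src].add(dst)
                undirected[dst].add(src)
    return min(eccentricity(undirected, node) for node in undirected)


def session_features(pages):
    """Graph metrics of one session, usable as inputs to a purchase model."""
    graph = build_session_graph(pages)
    return {
        "n_nodes": len(graph),
        "n_edges": sum(len(nbrs) for nbrs in graph.values()),
        "self_loops": count_self_loops(graph),
        "radius": radius(graph),
    }


features = session_features(
    ["home", "category", "product", "product", "cart", "category", "checkout"]
)
```

Because consecutive page views are linked, the undirected version of a session graph is connected by construction, so the radius is always defined.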
• We show the prevalence of email tracking in marketing communication.
• We propose features that facilitate tracking detection using machine learning.
• The new features are resilient against manipulation by trackers.
• We assess the detection model through out-of-time-and-universe validation.
• Tree learning algorithms achieve high detection rates and few false alarms.
Email tracking allows email senders to collect fine-grained behavior and location data on email recipients, who are uniquely identifiable via their email address. Such tracking invades user privacy in that email tracking techniques gather data without user consent or awareness. Striving to increase privacy in email communication, this paper develops a detection engine to serve as the core of a selective tracking-blocking mechanism and makes three contributions. First, a large collection of email newsletters is analyzed to show the wide usage of tracking across different countries, industries and time. Second, we propose a set of features geared towards the identification of tracking images under real-world conditions. Novel features are devised to be computationally feasible and efficient, generalizable and resilient towards changes in tracking infrastructure. Third, we test the predictive power of these features in a benchmarking experiment using a selection of state-of-the-art classifiers to clarify the effectiveness of model-based tracking identification. We evaluate the expected accuracy of the approach on out-of-sample data, over increasing periods of time, and when faced with unknown senders.
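The abstract does not enumerate the proposed features. Purely as an illustration of what per-image feature extraction for a tracking classifier could look like, the sketch below parses an HTML email body with Python's standard library and derives a few simple attributes for each image. The chosen features (`tiny`, `has_query`, `url_length`, `path_depth`) are assumptions for this example and are not the authors' feature set.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collects the attributes of every <img> tag in an HTML email body."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs))


def as_int(value):
    """Parse a declared dimension; return None if missing or not a plain integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def image_features(attrs):
    """Illustrative (assumed) features for one image."""
    src = attrs.get("src") or ""
    url = urlparse(src)
    width, height = as_int(attrs.get("width")), as_int(attrs.get("height"))
    return {
        "tiny": width is not None and height is not None and width <= 1 and height <= 1,
        "has_query": bool(url.query),
        "url_length": len(src),
        "path_depth": len([part for part in url.path.split("/") if part]),
    }


def extract_features(html):
    collector = ImageCollector()
    collector.feed(html)
    return [image_features(img) for img in collector.images]


email_body = (
    '<p>Hello</p>'
    '<img src="https://example.com/logo.png" width="200" height="50">'
    '<img src="https://t.example.com/o/abc123?uid=42" width="1" height="1">'
)
rows = extract_features(email_body)
```

Each resulting row could then be passed to a classifier. Declared image size in particular is easy for a sender to change, which suggests why the paper stresses features that are resilient to manipulation.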
The Price of Privacy. Baumann, Annika; Haupt, Johannes; Gebert, Fabian ...
Business & information systems engineering, 08/2019, Volume 61, Issue 4
Journal Article, peer reviewed
The analysis of clickstream data facilitates the understanding and prediction of customer behavior in e-commerce. Companies can leverage such data to increase revenue. For customers and website users, on the other hand, the collection of behavioral data entails privacy invasion. The objective of the paper is to shed light on the trade-off between privacy and the business value of customer information. To that end, the authors review approaches to convert clickstream data into behavioral traits, which they call clickstream features, and propose a categorization of these features according to the potential threat they pose to user privacy. The authors then examine the extent to which different categories of clickstream features facilitate predictions of online user shopping patterns and approximate the marginal utility of using more privacy-adverse information in behavioral prediction models. Thus, the paper links the literature on user privacy to that on e-commerce analytics and takes a step toward an economic analysis of privacy costs and benefits. In particular, the results of empirical experimentation with large real-world e-commerce data suggest that the inclusion of short-term customer behavior based on session-related information leads to large gains in predictive accuracy and business performance, while storing and aggregating usage behavior over longer horizons has comparably less value.
E-mail tracking provides companies with fine-grained behavioral data about e-mail recipients, which can be a threat to individual privacy and enterprise security. This problem is especially severe since e-mail tracking techniques often gather data without the informed consent of the recipients. So far, e-mail recipients lack a reliable protection mechanism.
This article presents a novel protection framework against e-mail tracking that closes an important gap in the field of enterprise security and privacy-enhancing technologies. We conceptualize, implement and evaluate an anti-tracking mail server that is capable of identifying tracking images in e-mails via machine learning with very high accuracy, and can selectively replace them with arbitrary images containing warning messages for the recipient. Our mail protection framework implements a selective prevention strategy as enterprise-grade software using the design science research paradigm. It is flexibly extensible, highly scalable, and ready to be applied under actual production conditions. Experimental evaluations show that these goals are achieved through solid software design, adoption of recent technologies and the creation of novel flexible software components.
• Conceptualization of a novel protection framework against e-mail tracking.
• First server-side implementation of reliable e-mail tracking protection.
• Framework shows a high accuracy without any manual user effort.
• Framework is developed as enterprise-grade software (extensible, scalable).
The digitalization of the economy makes customer targeting an important intersection of marketing and information systems. Marketers can use sociodemographic and behavioral data to address individual customers with personalized messages.
This thesis broadens the perspective of research on the model-based prediction of customer behavior by 1) developing and validating new machine learning methods that are explicitly designed to optimize the profitability of customer targeting in direct marketing and customer retention management, and 2) examining data collection for the purpose of customer targeting from the company's and the customer's point of view.
The thesis develops methods that make the full scope of e-commerce data usable and that take the conditions of the marketing decision into account during model building. The underlying machine learning models scale to high-dimensional customer data and enable application in practice. The proposed methods are also based on an understanding of customer targeting as a problem of identifying causal relationships. The model estimates are designed for implementing profit-optimized targeting campaigns under complex cost structures.
The thesis further quantifies the savings potential of efficient experimental design in data collection and the monetary cost of implementing the principle of data minimization. An analysis of data collection practices in email direct marketing also shows that monitoring of reading behavior without explicit customer consent is widespread in the marketing communication of e-commerce companies. This finding forms the basis for a machine-learning-based system for detecting and deleting tracking elements in emails.
The digitization of the economy has fundamentally changed the way in which companies interact with customers and made customer targeting a key intersection of marketing and information systems. Building models of customer behavior at scale requires development of tools at the intersection of data management and statistical knowledge discovery.
This dissertation widens the scope of research on predictive modeling by focusing on the intersections of model building with data collection and decision support. Its goals are 1) to develop and validate new machine learning methods explicitly designed to optimize customer targeting decisions in direct marketing and customer retention management and 2) to study the implications of data collection for customer targeting from the perspective of the company and its customers.
First, the thesis proposes methods that utilize the richness of e-commerce data, reduce the cost of data collection through efficient experiment design and address the targeting decision setting during model building. The underlying state-of-the-art machine learning models scale to high-dimensional customer data and can be conveniently applied by practitioners. These models further address the problem of causal inference that arises when the causal attribution of customer behavior to a marketing incentive is difficult. Marketers can directly apply the model estimates to identify profitable targeting policies under complex cost structures.
Second, the thesis quantifies the savings potential of efficient experiment design and the monetary cost of an internal principle of data privacy. An analysis of data collection practices in direct marketing emails reveals the ubiquity of tracking mechanisms without user consent in e-commerce communication. These results form the basis for a machine-learning-based system for the detection and deletion of tracking elements from emails.
Customer scoring models are the core of scalable direct marketing. Uplift models provide an estimate of the incremental benefit from a treatment that is used for operational decision-making. Training and monitoring of uplift models require experimental data. However, the collection of data under randomized treatment assignment is costly, since random targeting deviates from an established targeting policy. To increase the cost-efficiency of experimentation and facilitate frequent data collection and model training, we introduce supervised randomization. It is a novel approach that integrates existing scoring models into randomized trials to target relevant customers, while ensuring consistent estimates of treatment effects through correction for active sample selection. An empirical Monte Carlo study shows that data collection under supervised randomization is cost-efficient, while downstream uplift models perform competitively.
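One way to picture the mechanism described above is the following sketch. It makes two assumptions that the abstract does not confirm: the existing model score is mapped to a treatment probability by simple clipping, and the correction for the non-uniform assignment is done by inverse probability weighting (a Horvitz-Thompson estimator). Function names and the clipping bounds are illustrative.

```python
import random


def treatment_probability(score, floor=0.1, ceiling=0.9):
    """Map a model score in [0, 1] to a treatment probability inside (0, 1).

    Keeping the probability away from 0 and 1 (assumed bounds) ensures that
    every customer can end up in either group.
    """
    return min(ceiling, max(floor, score))


def assign_treatment(scores, rng):
    """Randomize each customer with their own, score-dependent probability."""
    probs = [treatment_probability(s) for s in scores]
    treated = [rng.random() < p for p in probs]
    return treated, probs


def ipw_average_treatment_effect(outcomes, treated, probs):
    """Inverse-probability-weighted estimate of the average treatment effect.

    Each observation is weighted by the inverse probability of the group it
    was assigned to, which corrects for the non-uniform assignment.
    """
    n = len(outcomes)
    treated_mean = sum(y / p for y, t, p in zip(outcomes, treated, probs) if t) / n
    control_mean = sum(y / (1 - p) for y, t, p in zip(outcomes, treated, probs) if not t) / n
    return treated_mean - control_mean


rng = random.Random(0)  # fixed seed so the assignment is reproducible
treated, probs = assign_treatment([0.0, 1.0, 0.5], rng)
```

High-scoring customers are treated most of the time, which keeps the experiment close to the established targeting policy, while the recorded assignment probabilities make it possible to reweight the data afterwards.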