Peer-reviewed, Open access
  • Robust identification of em...
    Haupt, Johannes; Bender, Benedict; Fabian, Benjamin; Lessmann, Stefan

    European journal of operational research, 11/2018, Volume: 271, Issue: 1
    Journal Article

    • We show the prevalence of email tracking in marketing communication.
    • We propose features that facilitate tracking detection using machine learning.
    • The new features are resilient against manipulation by trackers.
    • We assess the detection model through out-of-time-and-universe validation.
    • Tree learning algorithms achieve high detection rates and few false alarms.

    Email tracking allows email senders to collect fine-grained behavior and location data on email recipients, who are uniquely identifiable via their email address. Such tracking invades user privacy in that email tracking techniques gather data without user consent or awareness. Striving to increase privacy in email communication, this paper develops a detection engine to be the core of a selective tracking blocking mechanism in the form of three contributions. First, a large collection of email newsletters is analyzed to show the wide usage of tracking over different countries, industries and time. Second, we propose a set of features geared towards the identification of tracking images under real-world conditions. Novel features are devised to be computationally feasible and efficient, generalizable and resilient towards changes in tracking infrastructure. Third, we test the predictive power of these features in a benchmarking experiment using a selection of state-of-the-art classifiers to clarify the effectiveness of model-based tracking identification. We evaluate the expected accuracy of the approach on out-of-sample data, over increasing periods of time, and when faced with unknown senders.
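
    The out-of-time-and-universe evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the `Email` record, the sender names, the cutoff date, and the function `out_of_time_and_universe_split` are hypothetical and chosen only for illustration. The idea shown is that the test set contains only emails received after the training period and only from senders the model has never seen.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Email:
    sender: str     # hypothetical sender identifier (e.g. a newsletter domain)
    received: date  # date the email arrived


def out_of_time_and_universe_split(emails, cutoff, holdout_senders):
    """Split emails so the test set is later in time AND from unseen senders.

    train: emails received before `cutoff` from senders not held out
    test:  emails received on or after `cutoff` from held-out senders only
    """
    train = [e for e in emails
             if e.received < cutoff and e.sender not in holdout_senders]
    test = [e for e in emails
            if e.received >= cutoff and e.sender in holdout_senders]
    return train, test


# Illustrative data only; these senders and dates are made up.
emails = [
    Email("shop-a.example", date(2016, 3, 1)),
    Email("shop-a.example", date(2017, 6, 1)),
    Email("shop-b.example", date(2016, 5, 1)),
    Email("shop-c.example", date(2016, 4, 1)),
    Email("shop-c.example", date(2017, 8, 1)),
]
train, test = out_of_time_and_universe_split(
    emails, cutoff=date(2017, 1, 1), holdout_senders={"shop-c.example"}
)
```

    Emails that satisfy only one of the two conditions (for example, a known sender after the cutoff) are dropped from both sets in this sketch, which keeps the two sources of distribution shift from mixing with the training data.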