E-resources
Full text
Peer reviewed, open access
  • Data engineering for fraud ...
    Baesens, Bart; Höppner, Sebastiaan; Verdonck, Tim

    Decision Support Systems, November 2021, Volume: 150
    Journal Article

    Financial institutions increasingly rely on data-driven methods to develop fraud detection systems that automatically detect and block fraudulent transactions. From a machine learning perspective, detecting suspicious transactions is a binary classification problem, so many techniques can be applied. Interpretability, however, is of utmost importance: management needs it to have confidence in the model and to design fraud prevention strategies. Moreover, models that let fraud experts understand why a case is flagged as suspicious greatly facilitate their job of investigating those transactions. We therefore propose several data engineering techniques that improve the performance of an analytical model while retaining its interpretability. Our data engineering process is decomposed into several feature and instance engineering steps. We illustrate the performance improvement from these steps for popular analytical models on a real payment transactions data set.

    • Companies increasingly rely on data-driven methods for detecting fraud.
    • Data engineering is of utmost importance for improving the performance of most machine learning models.
    • Our data engineering process is decomposed into several feature and instance engineering steps.
    • The benefits of data engineering are illustrated on a payment transactions data set from a large European bank.