Peer reviewed
  • Autoencoders for improving ...
    Nguyen, Hoang Thi Cam; Lee, Suhwan; Kim, Jongchan; Ko, Jonghyeon; Comuzzi, Marco

    Expert systems with applications, 10/2019, Volume: 131
    Journal Article

    • Innovative use of autoencoders to reconstruct missing values in event logs.
    • Focus on anomalous and missing information at the level of event log attributes.
    • Methods tested on real life and artificial event logs.
    • Qualitative evaluation of impact on process discovery is also presented.

    Low quality of business process event logs, as determined by anomalous and missing values, is often unavoidable in practical contexts. The output of process analysis that uses event logs with missing and anomalous values is also likely to be of low quality, thus decreasing the quality of any decisions based on it. While previous work has focused on reconstructing missing events in an event log or removing anomalous traces, in this paper we focus on detecting anomalous values and reconstructing missing values at the level of attributes in event logs. We propose methods based on autoencoders, which are a class of neural networks that can reconstruct their own input and are particularly suitable to learn a model of the complex relationships among attribute values in an event log. These methods do not rely on any a-priori knowledge about the business process that generated an event log and are evaluated using real world and artificially-generated event logs. The paper also discusses a qualitative analysis of the impact of event log cleaning and reconstruction on the output of process discovery. The proposed approach shows remarkable performance regarding activity labels and timestamps in artificial event logs. The performance in the case of real world event logs, in particular timestamp anomaly detection, is lower, which may be due to high variability of attribute values in the chosen event logs. Process models discovered from reconstructed event logs are characterised by lower variability of allowed behaviour and, therefore, are more usable in practice.
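
The abstract's description of an autoencoder as a network that reconstructs its own input can be illustrated with a toy sketch. The code below is not the architecture or data from the paper: it assumes a tied-weight linear autoencoder with a single hidden unit, two numeric attributes per record, and a simple threshold on the reconstruction error. All names (`reconstruct`, `reconstruction_error`, `train`, `normal`, `anomalous`) are hypothetical and chosen only for illustration.

```python
# Toy illustration only: a tied-weight linear autoencoder with one hidden unit,
# trained by full-batch gradient descent. Not the architecture from the paper.

def reconstruct(w, x):
    """Encode x to a single hidden value h = w.x, then decode it back as h * w."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return [h * wi for wi in w]

def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    return sum((xi - ri) ** 2 for xi, ri in zip(x, reconstruct(w, x)))

def train(data, w, lr=0.05, epochs=300):
    """Minimise the mean reconstruction error over `data`."""
    w = list(w)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            h = sum(wi * xi for wi, xi in zip(w, x))
            r = [xi - h * wi for xi, wi in zip(x, w)]   # residual x - x_hat
            wr = sum(wi * ri for wi, ri in zip(w, r))
            for j in range(len(w)):
                # Gradient of ||x - w (w.x)||^2 with respect to w[j]
                grad[j] += -2.0 * (h * r[j] + wr * x[j]) / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Assumed "normal" records: the second attribute is always twice the first.
normal = [(t / 10.0, 2.0 * t / 10.0) for t in range(-10, 11)]
w = train(normal, [0.5, 0.5])

# Flag a record when its error exceeds the largest error seen on normal data.
threshold = max(reconstruction_error(w, x) for x in normal) + 1e-6
anomalous = (1.0, -2.0)   # violates the learned relationship
is_anomaly = reconstruction_error(w, anomalous) > threshold
```

Records that follow the learned relationship are reconstructed almost exactly, while the record that breaks it has a large reconstruction error and is flagged. Real event logs mix categorical activity labels and timestamps, so a realistic model would need suitable encodings and a nonlinear network; this sketch only shows the underlying principle.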