Peer reviewed
  • Multidimensional subgroup d...
    Ribeiro, J.; Fontes, T.; Soares, C.; Borges, J.L.

    Expert systems with applications, 07/2024, Volume: 246
    Journal Article

    Subgroup discovery (SD) aims to find significant subgroups of a given population of individuals characterized by statistically unusual properties of interest. SD on event logs provides insight into particular behaviors of processes, which may be a valuable complement to traditional process analysis techniques, especially for low-structured processes. This paper proposes a scalable and efficient method to search for significant SD rules over frequent sequences of events, exploiting their multidimensional nature. The method is intended to identify significant subsequences of events for which the distribution of values of some target aspect differs significantly from the same distribution over the entire event log. A publicly available real-life event log of a Dutch hospital is used as a running example to demonstrate the applicability of the method. The proposed approach was also applied to a real-life case study based on the public transport of a medium-sized European city (Porto, Portugal), for which the event data consists of 133 million smartcard travel validations from buses, trams and trains. The results include a characterization of mobility flows over multiple aspects, as well as the identification of unexpected behaviors in the flow of commuters (public transport). The generated knowledge provided useful insight into the behavior of travelers, which can be applied at operational, tactical and strategic business levels, enhancing the current view of the transport services for transport authorities and operators.

    • Significant subgroup discovery rules on sequences of multidimensional events.
    • Process discovery and conformance checking on low-structured processes.
    • Real-life case study based on smartcard travel validations of a transport network.
    • Characterization of mobility flows over multiple aspects.
    • Identification of unexpected behaviors in the flow of commuters.
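
The core idea in the abstract, comparing the distribution of a target aspect among cases that contain a given subsequence against the same distribution over the whole event log, can be illustrated with a small sketch. This is not the authors' method: the toy log, the function names, the contiguous-match rule, and the use of weighted relative accuracy (WRAcc, a common SD quality measure) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy event log: each case is (sequence of activities, target value).
LOG = [
    (["register", "triage", "xray", "discharge"], "long"),
    (["register", "triage", "xray", "surgery"], "long"),
    (["triage", "xray", "discharge"], "long"),
    (["register", "triage", "xray"], "short"),
    (["register", "triage", "discharge"], "short"),
    (["register", "discharge"], "short"),
    (["register", "triage", "bloodtest", "discharge"], "long"),
    (["triage", "discharge"], "short"),
]


def contains_subsequence(trace, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `trace`.

    Contiguity is an assumption of this sketch; gapped matching is another option.
    """
    m = len(pattern)
    return any(trace[i:i + m] == pattern for i in range(len(trace) - m + 1))


def distribution(cases):
    """Relative frequency of each target value among the given cases."""
    counts = Counter(target for _, target in cases)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def wracc(log, pattern, target_value):
    """Weighted relative accuracy: coverage * (p_subgroup - p_population)."""
    covered = [case for case in log if contains_subsequence(case[0], pattern)]
    if not covered:
        return 0.0
    p_sub = distribution(covered).get(target_value, 0.0)
    p_all = distribution(log).get(target_value, 0.0)
    return len(covered) / len(log) * (p_sub - p_all)
```

In this toy log, `wracc(LOG, ["triage", "xray"], "long")` evaluates to 0.125: half of the cases contain the subsequence, and 75% of those have the target value "long" against 50% overall. A positive score marks a subsequence whose cases are over-represented in the target value; a real implementation would also need a significance test and a scalable search over candidate subsequences, which this sketch omits.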