UP - logo
E-viri
Celotno besedilo
Recenzirano Odprti dostop
  • Human-centered deep composi...
    Koporec, Gregor; Perš, Janez

    Pattern recognition, June 2023, 2023-06-00, Letnik: 138
    Journal Article

    •CNNs cannot handle occlusions, are hard to explain, and are not knowledge-driven.•A mixture of CNNs and Hierarchical Compositional model can overcome the problems.•The proposed HCDC model is robust to occlusions, is discriminative, and generative.•Domain knowledge from human studies and physical laws is encoded in the HCDC model.•The HCDC model outperforms Mask-RCNN in instance segmentation tasks. Display omitted Despite their powerful discriminative abilities, Convolutional Neural Networks (CNNs) lack the properties of generative models. This leads to a decreased performance in environments where objects are poorly visible. Solving such a problem by adding more training samples can quickly lead to a combinatorial explosion, therefore the underlying architecture has to be changed instead. This work proposes a Human-Centered Deep Compositional model (HCDC) that combines low-level visual discrimination of a CNN and the high-level reasoning of a Hierarchical Compositional model (HCM). Defined as a transparent model, it can be optimized to real-world environments by adding compactly encoded domain knowledge from human studies and physical laws. The new FridgeNetv2 dataset and a mixture of publicly available datasets are used as a benchmark. The experimental results show the proposed model is explainable, has higher discriminative and generative power, and better handles the occlusion than the current state-of-the-art Mask-RCNN in instance segmentation tasks.