ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • Cross-lingual text classification for inferring variables describing silk fabrics
    Rei, Luis ; Mladenić, Dunja
    We present a cross-lingual method for the classification of texts describing silk fabrics with the aim of inferring properties of the fabric. We focus on properties relevant to the heritage context, namely, materials, techniques, production place and date (century). This method can be used to fill missing properties in an archive’s categorical description or to align them with a specific ontology with the ultimate goal of facilitating discovery and exploration of silk heritage in general and specifically, across languages. Our text classifier consists of a Convolutional Neural Network (CNN) with pre-trained aligned word embeddings. Our dataset was obtained by crawling web sites of museums in English and Spanish. We classify text descriptions of silk fabrics present in archives into a predefined set of domain-expert defined labels for each property. We evaluate our classifier in different scenarios: trained and tested with data from a single museum; trained with data from one museum and evaluated on a different museum; and finally, trained on data from a museum in one language and evaluated on data from a museum in a different language. Our results vary across scenarios and across properties with above 90% accuracy in certain cases, and around 50% in others.
    Type of material - conference contribution ; adult, serious
    Publish date - 2020
    Language - English
    COBISS.SI-ID - 131702787