ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • Segmentation and detection of text in document images
    Zelenika, Darko ; Povh, Janez, 1973- ; Ženko, Bernard
    Text detection in document images plays an important role in optical character recognition systems and is a challenging task. The proposed text detection method uses a self-adjusting bottom-up ... segmentation algorithm to segment a document image into a set of connected components. The segmented connected components are then described in terms of 27 features, and a machine learning algorithm is used to classify these components as text or non-text. To test the method, we have collected a dataset (called ASTRoID), which contains 500 images of text blocks and 500 images of non-text blocks. We empirically compare the performance of the proposed text detection method across seven different machine learning algorithms; the best performance is obtained with the radial support vector machine.
    Source: SOR '15 proceedings (pp. 155-160)
    Type of material - conference contribution
    Publish date - 2015
    Language - English
    COBISS.SI-ID - 17624665
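
    Illustrative note: the abstract above outlines a pipeline of segmentation into connected components, feature description, and classification. The sketch below shows only the general idea of the first two steps and is not the authors' method. It assumes a binary image given as a list of rows of 0/1 values, uses plain 4-connected labelling in place of the self-adjusting segmentation algorithm, and computes five hypothetical geometric features, since the record does not list the 27 features actually used. The function names `connected_components` and `describe` are chosen here for illustration.

```python
from collections import deque


def connected_components(image):
    """Return 4-connected foreground regions (pixels equal to 1).

    Assumption: `image` is a list of equal-length rows of 0/1 values.
    This is plain breadth-first labelling, not the paper's algorithm.
    """
    if not image or not image[0]:
        return []
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Start a new component and flood outwards from (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def describe(pixels):
    """Compute a few hypothetical bounding-box features for one component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    area = len(pixels)
    return {
        "width": width,
        "height": height,
        "area": area,
        "aspect_ratio": width / height,
        "fill_ratio": area / (width * height),
    }


# Tiny example: a 2x2 block and a 1x3 vertical stroke.
example = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
features = [describe(p) for p in connected_components(example)]
```

    Feature vectors of this kind would then be passed to a trained classifier to label each component as text or non-text; the abstract reports the best results with a radial support vector machine, which is not reproduced here.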