Document image binarization is the first essential step in digitizing documents and is considered a core technique in both document image analysis applications and optical character recognition (OCR). The binarization process obtains a binary image from the original image, and the binary image is the proper representation for image segmentation, recognition, and restoration, as underlined by several studies showing that the subsequent steps of document image analysis depend on the binarization result. However, old and historical document images typically suffer from several types of degradation, such as bleed-through, blur, and uneven illumination, which make binarization a difficult task. Extracting the foreground from a degraded background therefore depends on the degradation, and also on the type of paper used and the age of the document. Improved binarization methods are needed to reduce the impact of degradation in the document background. To address this difficulty, this paper proposes an effective, enhanced binarization technique for degraded and historical document images. The proposed method builds on an existing binarization method by modifying its parameters and adding a post-processing stage, thus improving the resulting binary images. The technique is also robust, as no parameter tuning is required. Evaluated on the Document Image Binarization Contest (DIBCO) datasets, the proposed method shows promising efficiency, producing better results than those obtained by some of the DIBCO winners.
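The enhanced method above builds on global thresholding plus a post-processing stage. As a point of reference only, Otsu's classic global threshold, a common baseline in DIBCO evaluations, can be sketched in a few lines of NumPy; the function names are illustrative, and the paper's actual parameter changes and post-processing are not reproduced here.

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's global threshold: pick the gray level that maximizes
    the between-class variance of the two resulting pixel classes."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                   # class-0 probability per level
    mu = np.cumsum(prob * np.arange(256))     # cumulative mean per level
    mu_t = mu[-1]                             # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))

def binarize(gray):
    """Map pixels above the threshold to 1 (background), the rest to 0 (ink)."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)
```

A pipeline for degraded documents would then apply post-processing (e.g. speckle removal) to the binary map, as the abstract describes.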
•We propose a novel iterative deep learning framework which can improve the input image iteratively.•We apply the proposed iterative deep learning for document enhancement and binarization in two possible ways: recurrent refinement and stacked refinement.•Our proposed method provides a new, clean version of the degraded image, one that is suitable for visualization and which shows promising results for binarization using Otsu’s global threshold.
This paper presents a novel iterative deep learning framework and applies it to document enhancement and binarization. Unlike the traditional methods that predict the binary label of each pixel on the input image, we train the neural network to learn the degradations in document images and produce uniform images of the degraded input images, which in turn allows the network to refine the output iteratively. Two different iterative methods have been studied in this paper: recurrent refinement (RR) that uses the same trained neural network in each iteration for document enhancement and stacked refinement (SR) that uses a stack of different neural networks for iterative output refinement. Given the learned nature of the uniform and enhanced image, the binarization map can be easily obtained through use of a global or local threshold. The experimental results on several public benchmark data sets show that our proposed method provides a new, clean version of the degraded image, one that is suitable for visualization and which shows promising results for binarization using Otsu’s global threshold, based on enhanced images learned iteratively by the neural network.
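The two refinement schemes described above reduce to a simple control-flow difference that can be sketched independently of any particular network. Here `enhance` and `enhancers` are hypothetical stand-ins for the trained enhancement model(s), not the paper's actual implementation.

```python
def recurrent_refine(image, enhance, iterations=3):
    """Recurrent refinement (RR): one trained enhancement model,
    applied repeatedly to its own output."""
    out = image
    for _ in range(iterations):
        out = enhance(out)
    return out

def stacked_refine(image, enhancers):
    """Stacked refinement (SR): a different trained model at each stage,
    each refining the previous stage's output."""
    out = image
    for enhance in enhancers:
        out = enhance(out)
    return out
```

In either case, the final uniform image is then binarized with a simple global threshold such as Otsu's.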
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
•A selectional autoencoder approach for document image binarization is studied.•The neural network is devoted to learning an image-to-image binarization.•Comprehensive experimentation with datasets of different typology is presented.•Results demonstrate that the approach is able to outperform the state of the art.
Binarization plays a key role in automatic information retrieval from document images. This process is usually performed in the first stages of document analysis systems and serves as a basis for subsequent steps. Hence it has to be robust in order to allow the full analysis workflow to be successful. Several methods for document image binarization have been proposed so far, most of which are based on hand-crafted image processing strategies. Recently, Convolutional Neural Networks have shown remarkable performance in many disparate tasks related to computer vision. In this paper we discuss the use of convolutional auto-encoders devoted to learning an end-to-end map from an input image to its selectional output, in which activations indicate the likelihood of pixels being either foreground or background. Once trained, documents can therefore be binarized by passing them through the model and applying a global threshold. This approach has proven to outperform existing binarization strategies on a number of document types.
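The inference step described above (run the trained auto-encoder over the document, then apply one global threshold to the activation map) can be sketched as a patch-wise loop. `predict_patch` is a placeholder for the trained selectional auto-encoder, and the tiling is simplified (no patch overlap or blending).

```python
import numpy as np

def binarize_document(img, predict_patch, patch=256, threshold=0.5):
    """Run a patch-wise model over the page, assemble the per-pixel
    foreground likelihoods, and apply one global threshold."""
    h, w = img.shape
    act = np.zeros((h, w), dtype=float)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            tile = img[y:y + patch, x:x + patch]
            act[y:y + tile.shape[0], x:x + tile.shape[1]] = predict_patch(tile)
    return (act >= threshold).astype(np.uint8)
```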
The intrinsic features of documents, such as paper color, texture, aging, translucency, and the kind of printing, typing, or handwriting, are important with regard to how to process and enhance their image. Image binarization is the process of producing a monochromatic image from its color version. It is a key step in the document processing pipeline. The recent Quality-Time Binarization Competitions for documents have shown that no binarization algorithm is good for every kind of document image. This paper uses a sample of the texture of the scanned historical document as the main document feature to select which of 63 widely used algorithms, applied to five different versions of the input images (315 document image-binarization schemes in total), provides a reasonable quality-time trade-off.
•Hardware implementation of video compression and decompression processes.•Design and FPGA implementation of an improved high-throughput HEVC CABAC binarizer.•Design and FPGA implementation of an improved high-throughput HEVC CABAC de-binarizer.•New architectures for CABAC binarization and de-binarization in the H.265 video codec.
The High Efficiency Video Coding (HEVC) video codec applies different techniques in order to achieve the high compression ratios and video quality needed for real-time applications. One of the critical techniques in HEVC is Context-Adaptive Binary Arithmetic Coding (CABAC), a type of entropy coding. CABAC comes at the cost of increased computational complexity, especially for parallelizing and pipelining its blocks: binarization, context modeling, and binary arithmetic encoding. The binarization (BZ) and de-binarization (DBZ) methods are important techniques in the HEVC CABAC encoder and decoder, respectively. Indeed, an important goal is high throughput in hardware architectures of the CABAC BZ and DBZ in order to support high-resolution applications. This work is, to our knowledge, the only one in the recent literature that focuses on the design and implementation of a full BZ and a full DBZ compatible with both H.265 and H.264. Hardware architectures of the BZ and DBZ are designed and implemented in VHDL, targeting an FPGA Virtex-4 xc4vsx25-12ff668 board, and simulated with ModelSim. As a result, the BZ and DBZ implementations can process 2 bins/cycle for each syntax element when operated at 697.83 MHz and 789.26 MHz, respectively. The proposed designs exhibit an improved high throughput of 1395.66 Mbins/s for the BZ and 1578.52 Mbins/s for the DBZ. The obtained area efficiencies of the proposed BZ and DBZ are about 0.544 Mbins/s/slice and 0.606 Mbins/s/slice, respectively, which is better than many recent works.
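For context, the binarization step that the BZ/DBZ hardware accelerates maps syntax-element values to bin strings. Two of the standard CABAC binarization schemes, unary and 0th-order Exp-Golomb, can be sketched in software as follows; this illustrates the coding schemes themselves, not the paper's hardware design.

```python
def unary(v):
    """Unary binarization: v ones followed by a terminating zero."""
    return "1" * v + "0"

def exp_golomb0(v):
    """0th-order Exp-Golomb binarization: write v+1 in binary,
    prefixed by one zero per bit after the leading one."""
    code = bin(v + 1)[2:]              # v+1 in binary, no '0b' prefix
    return "0" * (len(code) - 1) + code
```

A de-binarizer simply inverts these mappings, consuming bins until a codeword is complete.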
One of the most fundamental issues in image processing is the thresholding (binarization) method. This method is generally used for segmenting regions with different homogeneity in grayscale images. In other words, it performs clustering based on the intensity levels of pixels in an image histogram. This paper presents a new and effective approach to the global thresholding of grayscale images. In the proposed method, alpha and beta regions are determined using the mean and standard deviation values of an image histogram. The optimum threshold value is obtained by calculating the average of the gray-scale values of the alpha and beta regions. The experiments were carried out on three different image sets to demonstrate the effectiveness of the thresholding method. The results of the experimental studies show that the proposed method achieves promising performance compared to many traditional and state-of-the-art thresholding and document binarization methods on the H-DIBCO'14 (Document Image Binarization Competition), Human HT29 colon-cancer cell (BBBC008), and C. elegans live/dead assay (BBBC010) datasets, based on various evaluation criteria.
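The abstract does not fully specify how the alpha and beta regions are bounded, so the sketch below assumes one plausible reading: alpha covers gray levels in [mean − std, mean), beta covers [mean, mean + std], and the threshold is the average of the two regions' mean gray values. Treat it as an illustration of the idea, not the paper's exact algorithm.

```python
import numpy as np

def region_threshold(gray):
    """Global threshold from the means of two histogram regions around
    the image mean (assumed alpha/beta definition, see lead-in)."""
    mu, sigma = gray.mean(), gray.std()
    alpha = gray[(gray >= mu - sigma) & (gray < mu)]
    beta = gray[(gray >= mu) & (gray <= mu + sigma)]
    if alpha.size == 0 or beta.size == 0:
        return float(mu)  # degenerate histogram: fall back to the mean
    return (alpha.mean() + beta.mean()) / 2.0
```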
•We propose a two-stage color document image enhancement and binarization method using generative adversarial networks.•Four color-independent adversarial networks extract color foreground information from an input image for document image enhancement.•Two independent adversarial networks with global and local features are trained for image binarization of documents of variable size.•The proposed method outperforms state-of-the-art algorithms on various datasets.
Document image enhancement and binarization methods are often used to improve the accuracy and efficiency of document image analysis tasks such as text recognition. Traditional non-machine-learning methods are constructed on low-level features in an unsupervised manner but have difficulty with binarization on documents with severely degraded backgrounds. Convolutional neural network (CNN)-based methods focus only on grayscale images and on local textual features. In this paper, we propose a two-stage color document image enhancement and binarization method using generative adversarial neural networks. In the first stage, four color-independent adversarial networks are trained to extract color foreground information from an input image for document image enhancement. In the second stage, two independent adversarial networks with global and local features are trained for image binarization of documents of variable size. For the adversarial neural networks, we formulate loss functions between a discriminator and generators having an encoder–decoder structure. Experimental results show that the proposed method achieves better performance than many classical and state-of-the-art algorithms over the Document Image Binarization Contest (DIBCO) datasets, the LRDE Document Binarization Dataset (LRDE DBD), and our shipping label image dataset. We plan to release the shipping label dataset as well as our implementation code at github.com/opensuh/DocumentBinarization/.
Documents often exhibit various forms of degradation, which make them hard to read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean-up, binarization, deblurring, and watermark removal), DE-GAN can produce a high-quality enhanced version of the degraded document. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017, and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The results obtained on a wide variety of degradations reveal the flexibility of the proposed model for other document enhancement problems.
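The abstract does not give DE-GAN's loss in detail; the sketch below shows the standard pix2pix-style conditional-GAN generator objective (adversarial binary cross-entropy plus a weighted L1 term toward the clean target) that image-restoration cGANs of this kind commonly use. The function names and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy over an array of probabilities."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def generator_loss(d_on_fake, fake, clean, lam=100.0):
    """cGAN generator objective: fool the discriminator (label the
    generated image as real) plus an L1 penalty toward the clean target."""
    adv = bce(d_on_fake, np.ones_like(d_on_fake))
    l1 = float(np.abs(fake - clean).mean())
    return adv + lam * l1
```

The L1 term is what pulls the restored document toward the ground-truth clean image; the adversarial term sharpens it beyond what L1 alone would produce.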