•A novel document image binarization method based on the structural symmetric pixels (SSPs) is presented.
•The adaptive gradient binarization algorithm for degraded document images is effective.
•The new iterative stroke width estimation algorithm takes no parameters and is robust against noise.
•Multiple local thresholds are used to increase the binarization accuracy through a voting strategy.
•Extensive experiments on various datasets show that our method is robust and effective.
This paper presents an effective approach for the local-threshold binarization of degraded document images. We use structural symmetric pixels (SSPs) to compute the local threshold within a neighborhood, and a vote over multiple thresholds determines whether a pixel belongs to the foreground. The SSPs are defined as pixels around strokes whose gradient magnitudes are sufficiently large and whose gradient orientations are symmetrically opposite. A compensated gradient map is used to extract the SSPs so as to weaken the influence of document degradations. To extract SSP candidates with large magnitudes while distinguishing faint characters from bleed-through background, we propose an adaptive global threshold selection algorithm. To further extract pixels with opposite orientations, an iterative stroke width estimation algorithm ensures a properly sized neighborhood for the orientation test. Finally, we present a framework based on voting over multiple thresholds to handle occasional inaccurate SSP detections. Experimental results on seven public document image binarization datasets show that, under multiple evaluation measures, our method is accurate and robust compared with many traditional and state-of-the-art document binarization approaches.
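For intuition, here is a minimal Python sketch of the voting idea only, not the authors' implementation: high-gradient pixels stand in for SSPs (the symmetric-orientation test, compensated gradient map, and stroke width estimation are omitted), each window size yields a local threshold from the mean intensity of nearby candidates, and the window sizes vote per pixel. The percentile cutoff and window sizes are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def binarize_ssp_vote(gray, grad_percentile=90, window_sizes=(15, 25, 35)):
    """Toy SSP-style binarization: strong-gradient pixels near strokes
    supply local thresholds; several window sizes vote per pixel."""
    gray = gray.astype(np.float64)
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    # Crude stand-in for SSP detection: keep only high-magnitude pixels.
    ssp = mag >= np.percentile(mag, grad_percentile)

    votes = np.zeros(gray.shape, dtype=int)
    for w in window_sizes:
        # Local threshold = mean intensity of SSP pixels in a w x w window.
        ssp_sum = ndimage.uniform_filter(gray * ssp, size=w)
        ssp_cnt = ndimage.uniform_filter(ssp.astype(np.float64), size=w)
        local_t = np.where(ssp_cnt > 0,
                           ssp_sum / np.maximum(ssp_cnt, 1e-9),
                           gray.mean())
        votes += (gray < local_t).astype(int)  # dark pixels vote foreground

    # Majority vote over the window sizes decides foreground membership.
    return votes > len(window_sizes) // 2
```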
To compare the agreement between optical coherence tomography (OCT)-based choroidal vascularity markers measured by two previously reported image binarization techniques.
Spectral-domain OCT using enhanced-depth imaging was performed in 100 eyes of 52 normal subjects. Choroidal images were binarized into luminal and stromal areas using two different algorithms. The choroidal vascularity marker was defined as the ratio of luminal area to total choroidal area and was termed "luminal/choroidal area ratio (L/C ratio)" or "choroidal vascularity index (CVI)" depending on the algorithm. The agreement between the choroidal vascularity markers measured by the two techniques was assessed using the intraclass correlation coefficient (ICC) and Bland-Altman analysis.
The mean values of the choroidal vascularity markers were 70.12% (range, 56.76%-78.55%) for CVI and 67.44% (range, 51.09%-81.31%) for the L/C ratio. A low level of absolute agreement between the two binarization techniques was reflected by an adjusted ICC of 0.353, obtained using a linear mixed model with age, sex, and spherical equivalent as covariates.
There was a discrepancy between measurements of choroidal vascularity obtained with two commonly adopted image binarization techniques. It remained unclear which value reflected the true choroidal vascularity and which binarization algorithm was more accurate. Future studies with enhanced image quality and improved image analysis algorithms are required to establish the ground truth for choroidal vascularity.
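For readers unfamiliar with the agreement analysis used here, the following is a small Python sketch of Bland-Altman statistics on paired measurements; the numeric values are hypothetical, not data from the study.

```python
import numpy as np

def bland_altman(m1, m2):
    """Bland-Altman statistics for two paired measurement series,
    e.g., CVI vs. L/C ratio on the same eyes (values in percent)."""
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    diff = m1 - m2
    bias = diff.mean()                           # mean difference
    sd = diff.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)   # 95% limits of agreement
    return bias, loa

# Hypothetical paired measurements from the two binarization algorithms.
cvi      = np.array([70.1, 68.5, 72.3, 69.8])
lc_ratio = np.array([67.4, 66.0, 70.9, 65.2])
bias, (lo, hi) = bland_altman(cvi, lc_ratio)
print(f"bias = {bias:.2f} pp, 95% LoA = [{lo:.2f}, {hi:.2f}]")
```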
To evaluate the features of the choroidal structures in the eyes of myopic children obtained by enhanced depth imaging optical coherence tomography (EDI-OCT).
Ninety-six myopic children with low to moderate myopia (spherical equivalent refractive error [SER], −5.75 to −1.00 diopters) were included in this cross-sectional study. Ocular biometrics were measured using an optical low-coherence reflectometry device. Choroidal structural data, extracted from a 7500-µm cross-sectional arc of the choroid extending from the temporal optic disc margin and including the total choroidal area, luminal area, stromal area, and choroidal vascularity index, were determined by image binarization of the EDI-OCT scans. Associations between demographic factors, ocular parameters, and choroidal structures were evaluated using univariate and multiple linear regression analyses.
The study participants (mean age, 11.02 ± 1.70 years) had a mean axial length (AL) of 24.94 ± 0.70 mm. The mean total choroidal area was 2.64 ± 0.49 mm² (luminal area, 1.68 ± 0.32 mm²; stromal area, 0.95 ± 0.19 mm²), and the mean choroidal vascularity index was 0.64 ± 0.03. Multiple regression analysis showed that the luminal area was significantly associated with AL (standardized β = −0.24, P = 0.022) after adjusting for sex and corneal radius (CR), whereas the stromal area (standardized β = −0.30, P = 0.003) and choroidal vascularity index (standardized β = 0.36, P = 0.001) were significantly associated with age after adjusting for sex, CR, and lens thickness (LT). Sex, CR, LT, and SER showed no significant association with choroidal structures after adjusting for age and AL (all P > 0.05).
The luminal area of the choroid tends to decrease with a longer AL, whereas the stromal area tends to decrease with increasing age in myopic children. These findings require further exploration in a longitudinal study.
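The area computations behind such measurements reduce to pixel counting on a binarized choroid region. Below is a Python sketch under stated assumptions: dark pixels are taken as luminal (the usual convention in binarization-based CVI studies), and the pixel spacing `um_per_px` is a hypothetical instrument parameter.

```python
import numpy as np

def choroid_metrics(binary_choroid, roi_mask, um_per_px=(3.9, 3.9)):
    """Derive CVI-style metrics from a binarized EDI-OCT B-scan.
    binary_choroid: True where pixels were classified dark (luminal).
    roi_mask: True inside the segmented choroid.
    um_per_px: (axial, lateral) pixel spacing -- instrument-specific,
    hypothetical values here."""
    px_area_mm2 = (um_per_px[0] * um_per_px[1]) / 1e6  # um^2 -> mm^2
    total   = roi_mask.sum() * px_area_mm2
    luminal = (binary_choroid & roi_mask).sum() * px_area_mm2
    stromal = total - luminal
    cvi = luminal / total if total > 0 else float("nan")
    return {"total": total, "luminal": luminal,
            "stromal": stromal, "cvi": cvi}
```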
•We propose a supervised binarization method based on deep supervised networks.
•A multi-scale deep supervised network for binarization has not been reported yet.
•A hierarchical architecture is designed to distinguish text from background noise.
•Different feature levels are handled by the multi-scale architecture.
•The performance results are considerably better than those of state-of-the-art methods.
The binarization of degraded document images is a challenging problem in document analysis. Binarization is a classification process in which the pixels of an image are assigned to one of two classes: foreground text or background. Most algorithms are built on low-level features in an unsupervised manner; because they cannot fully exploit input-domain knowledge, their ability to distinguish background noise from the foreground is considerably limited. In this paper, a novel supervised binarization method is proposed, in which a hierarchical deep supervised network (DSN) architecture is trained to predict text pixels at different feature levels. With higher-level features, the network can differentiate text pixels from background noise, so that severe degradations occurring in document images can be managed. Meanwhile, foreground maps predicted from lower-level features offer higher visual quality at boundary areas. Compared with those of traditional algorithms, binary images generated by our architecture have a cleaner background and better-preserved strokes. The proposed approach achieves state-of-the-art results on widely used DIBCO datasets, demonstrating the robustness of the presented method.
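As a rough illustration of deep supervision at multiple feature levels (not the authors' DSN architecture), the PyTorch sketch below attaches a side output to each scale so that a loss can be applied at every level; layer widths are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDSN(nn.Module):
    """Toy deeply supervised net: each scale emits its own foreground map
    (side output), so supervision reaches every feature level."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, 2 * ch, 3, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(2 * ch, 4 * ch, 3, padding=1), nn.ReLU())
        self.side1 = nn.Conv2d(ch, 1, 1)
        self.side2 = nn.Conv2d(2 * ch, 1, 1)
        self.side3 = nn.Conv2d(4 * ch, 1, 1)

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(F.max_pool2d(f1, 2))
        f3 = self.enc3(F.max_pool2d(f2, 2))
        size = x.shape[-2:]
        # Side outputs upsampled to input size; each gets its own loss.
        s1 = self.side1(f1)
        s2 = F.interpolate(self.side2(f2), size=size, mode="bilinear",
                           align_corners=False)
        s3 = F.interpolate(self.side3(f3), size=size, mode="bilinear",
                           align_corners=False)
        return s1, s2, s3

# Training would sum a binary cross-entropy loss over all side outputs:
# loss = sum(F.binary_cross_entropy_with_logits(s, target) for s in outs)
```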
•Introduce fuzzy classification function (FCF) for vaguely separating texts from background.
•Present FCF-based source that is defined in a local manner and does not involve any threshold.
•Propose anisotropic diffusion with FCF-based source for binarizing degraded document images.
•Design parallel-serial algorithm by finite difference and parallel/serial splitting methods.
•Proposed model is superior to eight related models in terms of degraded document binarization.
Document image binarization plays a vital role in document image analysis systems; however, it remains challenging due to various degradations. In this paper, we propose an anisotropic diffusion model with a fuzzy-based source for binarizing degraded document images, in which the diffusion term is responsible for edge-preserving smoothing and the source term groups the intensity values of foreground and background pixels into two dominant modes separated by zero. Specifically, a fuzzy classification function (FCF) is first introduced to vaguely separate foreground from background; it is defined in a local neighborhood of each point rather than over the entire image domain. The fuzzy-based source is then constructed from the FCF and a speed restrictor, involving no threshold. Numerically, we develop a parallel-serial algorithm by combining finite differencing with parallel/serial splitting methods from the literature. The algorithm is tested on seven publicly available datasets (DIBCO 2009 to 2014 and 2016) and compared with six PDE-based models and two variational models on degraded document binarization. Experimental results show that our model is very effective for the binarization of degraded document images and is superior to the compared models both subjectively and objectively.
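To make the diffusion-plus-source structure concrete, here is a heavily simplified Python sketch: an explicit Perona-Malik-style scheme with an added source term. The source used below (a tanh of the deviation from the global mean) is only a placeholder standing in for the paper's fuzzy classification function, and all constants are illustrative.

```python
import numpy as np

def diffuse_with_source(u, steps=50, dt=0.1, k=0.1, lam=1.0):
    """Explicit anisotropic diffusion plus a sign-pushing source term.
    Expects an intensity image scaled to [0, 1]."""
    u = u.astype(np.float64)
    for _ in range(steps):
        # Neighbor differences with replicated (edge) boundaries.
        un = np.pad(u, 1, mode="edge")
        dn = un[:-2, 1:-1] - u
        ds = un[2:, 1:-1] - u
        de = un[1:-1, 2:] - u
        dw = un[1:-1, :-2] - u
        g = lambda d: np.exp(-(d / k) ** 2)   # edge-stopping function
        diffusion = g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw
        # Placeholder source (NOT the paper's FCF): pushes values above
        # and below the global mean apart, mimicking two-mode grouping.
        source = np.tanh(u - u.mean())
        u = u + dt * (diffusion + lam * source)
    return u  # threshold at the mean/zero crossing to obtain the binary map
```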
The determination of the threshold is crucial for the binarization of scanning electron microscopy (SEM) images. In this paper, a new method for determining the optimal threshold, which considers the three-dimensional characteristics of shale samples, is proposed based on GIS. Firstly, the SEM images of seven shale samples at six magnifications are 3D-reconstructed and visualized using ArcMap and ArcScene, and the relevant parameters are obtained to calculate the 3D porosity φ3D. Secondly, the effects of magnification and segmentation size on φ3D are considered; by establishing the relationship between threshold and porosity, the optimal magnification, optimal segmentation size, and optimal threshold are obtained for each of the seven shale samples. Finally, the optimal threshold obtained by this method is applied to the finite element model of the pore structure and to the fractal dimension analysis. The results show that the overall effect of the 3D reconstruction and visualization is good. The optimal thresholds of the seven shale samples are 94, 93, 105, 93, 81, 98, and 88, respectively. The porosity obtained from the finite element model of the pore structure based on the optimal threshold is suitable for SEM images of larger sizes. The fractal dimensions obtained with the optimal threshold are all larger than those obtained at the six magnifications, characterizing a more complex pore structure and a more inhomogeneous pore size distribution. The method fully accounts for the three-dimensional characteristics of different samples and improves the accuracy of threshold studies of SEM images.
•3D reconstruction and visualization of SEM images based on GIS.
•Determination of the optimal threshold considering 3D characteristics of each shale sample.
•Application of this method to the finite element model of pore structure and fractal dimension.
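The threshold-porosity relationship at the core of the method can be illustrated with a short Python sketch. It only shows how a porosity-versus-threshold curve is built for an 8-bit image and matched against an independently obtained porosity (in the paper, the GIS-derived φ3D would supply that target); the 3D reconstruction itself is out of scope here.

```python
import numpy as np

def porosity_curve(gray, thresholds=range(256)):
    """Porosity as a function of the binarization threshold for an
    8-bit SEM image: pixels darker than t are counted as pore space."""
    gray = np.asarray(gray)
    return {t: float((gray < t).mean()) for t in thresholds}

def pick_threshold(curve, target_porosity):
    """Pick the threshold whose porosity is closest to an independently
    measured porosity value (e.g., a 3D-reconstruction-based estimate)."""
    return min(curve, key=lambda t: abs(curve[t] - target_porosity))

# Usage sketch: t_opt = pick_threshold(porosity_curve(img), phi_3d)
```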
•Propose a novel document binarization method by cascading pre-trained U-Nets.
•Use pre-trained U-Nets to address the problem of a shortage of training images.
•Study optimal inter-module skip-connections between U-Net modules.
•Analyze results on DIBCO images containing various types of noise.
•Compare on all DIBCO datasets (2009–2018) and show robust performance.
Artificial neural networks have shown significant performance in various image-to-image conversion tasks. However, complex conversions often require a large number of images for model training. We therefore propose a convolutional model that performs image-to-image conversion through a pipeline of simpler image-processing modules. To verify the approach, we use document image binarization as the task. Document image binarization is an important process that affects the accuracy of document analysis and recognition. In this paper, we propose a novel document binarization method called Cascading Modular U-Nets (CMU-Nets). CMU-Nets consist of pre-trained U-Net modules, which help overcome the shortage of training images. We also propose a novel cascading scheme that improves the overall performance of the cascaded model. We verify the proposed model on all available Document Image Binarization Competition (DIBCO) and Handwritten-DIBCO (H-DIBCO) datasets.
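A minimal PyTorch sketch of the cascading idea might look as follows; this is not the authors' CMU-Nets, the stages are stand-ins for pre-trained U-Net modules, and the additive inter-module skip-connection is one simple choice among those the paper studies.

```python
import torch
import torch.nn as nn

class UNetModule(nn.Module):
    """Stand-in for one pre-trained U-Net stage (identical I/O shape)."""
    def __init__(self, ch=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, x):
        return self.body(x)

class CascadedUNets(nn.Module):
    """Pipeline of modules; an inter-module skip adds each stage's input
    to its output so later stages refine rather than restart."""
    def __init__(self, n_modules=3):
        super().__init__()
        self.stages = nn.ModuleList([UNetModule() for _ in range(n_modules)])

    def forward(self, x):
        for stage in self.stages:
            x = x + stage(x)   # inter-module skip-connection
        return torch.sigmoid(x)
```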
Overcoming the challenges in current cancelable palmprint templates is essential for enhancing privacy and security in biometric systems, as these templates exhibit vulnerabilities in both template security and recognition performance. To address these challenges, we present a novel cancelable palmprint template protection scheme with a deep attention net and a randomized hashing security mechanism. Firstly, to enhance recognition performance, we design a deep attention net that is integrated into the model to improve its feature extraction ability. Additionally, in recognition of the paramount importance of security and privacy, we introduce a randomized hashing security mechanism. This mechanism, which incorporates chaotic sequences as weights in a neuron activation layer, is appended to the model and enables dynamic control of neuron activation while generating diverse palmprint feature templates. It further enhances security and privacy by combining a Logistic-Tent-Sine (LTTS) random key with the palmprint feature values through matrix multiplication. Furthermore, to optimize efficiency, particularly with large datasets, a binarization layer is implemented using the straight-through estimation (STE) algorithm; this layer contributes to computational efficiency and expedites data processing, further improving the performance of our palmprint template protection. Experimental results validate the outstanding accuracy of the scheme on the TJU and PolyU palmprint datasets, establishing it as a state-of-the-art solution with remarkable recognition performance. Moreover, security analysis confirms its compliance with cancelable biometric template protection criteria, ensuring superior irreversibility, unlinkability, revocability, and privacy against various attacks.
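The straight-through estimator mentioned for the binarization layer is a standard trick; the PyTorch sketch below shows one common form (sign binarization in the forward pass, hard-tanh-clipped identity gradient in the backward pass), which may differ in detail from the paper's layer.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator: the forward
    pass quantizes to {-1, +1}; the backward pass lets the gradient
    through unchanged inside the linear region."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (hard-tanh clipping).
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

x = torch.randn(4, requires_grad=True)
y = BinarizeSTE.apply(x)
y.sum().backward()   # gradients flow despite the non-differentiable step
```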
The classical Otsu method is a common tool in document image binarization. Often, the two classes, text and background, are imbalanced, which means that the assumption of the classical Otsu method is not met. In this work, we considered imbalanced pixel classes of background and text: the weights of the two classes differ, but their variances are the same. We experimentally demonstrated that employing a criterion that takes the imbalance of the class weights into account attains higher binarization accuracy. We described a generalization of the criterion to a two-parametric model, for which we proposed an algorithm for optimal linear separation search via fast linear clustering. We also demonstrated that the two-parametric model with the proposed separation increases image binarization accuracy for documents with a complex background or spots.
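For comparison, here is a Python sketch of the classical Otsu criterion alongside one plausible imbalance-aware variant: a minimum-error-style criterion that assumes equal class variances but keeps the class-weight terms, in the spirit described above. This is an illustration, not the authors' exact criterion.

```python
import numpy as np

def otsu(hist):
    """Classical Otsu: maximize between-class variance w1*w2*(m1-m2)^2."""
    p = hist / hist.sum()
    bins = np.arange(len(p))
    w1 = np.cumsum(p)
    m1 = np.cumsum(p * bins) / np.maximum(w1, 1e-12)
    w2 = 1.0 - w1
    mu = (p * bins).sum()
    m2 = (mu - w1 * m1) / np.maximum(w2, 1e-12)
    crit = w1 * w2 * (m1 - m2) ** 2
    return int(np.argmax(crit[:-1]))

def weighted_otsu(hist):
    """Imbalance-aware variant (one plausible form): equal-variance
    minimum-error criterion whose -w*ln(w) terms keep a small class
    from being absorbed into the large one."""
    p = hist / hist.sum()
    bins = np.arange(len(p))
    best_t, best_j = 0, np.inf
    for t in range(1, len(p) - 1):
        w1, w2 = p[:t].sum(), p[t:].sum()
        if w1 < 1e-9 or w2 < 1e-9:
            continue
        m1 = (p[:t] * bins[:t]).sum() / w1
        m2 = (p[t:] * bins[t:]).sum() / w2
        # Pooled within-class variance (equal-variance assumption).
        var = (p[:t] * (bins[:t] - m1) ** 2).sum() + \
              (p[t:] * (bins[t:] - m2) ** 2).sum()
        j = np.log(max(var, 1e-12)) - 2 * (w1 * np.log(w1) + w2 * np.log(w2))
        if j < best_j:
            best_t, best_j = t, j
    return best_t

# Usage sketch: t = weighted_otsu(np.bincount(img.ravel(), minlength=256))
```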