Introduction. Many computer vision applications use procedures for recognizing various shapes and estimating their dimensional characteristics. The full processing pipeline consists of several stages without clearly defined boundaries between them, but it can be divided into low-, middle-, and high-level processes. Low-level processes deal only with primitive operations such as preprocessing to reduce noise, enhance contrast, or sharpen images; they are characterized by having images at both the input and the output. Middle-level processing covers tasks such as segmentation, description of objects, and their compression into a form convenient for computer processing; it receives images at the input but outputs only the features and attributes extracted from them. High-level processing involves “understanding” the set of recognized objects and recognizing their interactions.
Using the developed software models for recognizing figures and estimating their characteristics as an example, it is shown that image processing reduces to transforming spatial image data into metadata, compressing the amount of information and thereby significantly increasing the value of the data. This means that the image arriving at the middle level should be as informative as possible (high contrast, free of noise, artifacts, etc.), because once the spatial image data have been transformed into metadata, no subsequent procedure can correct the data obtained by the video sensors to improve or increase their information content.
Figures in an image can be recognized quite efficiently by determining their contours. To do this, the boundaries of objects must be determined and localized in the image; this is often the first step in procedures such as separating objects from the background, image segmentation, and the detection and recognition of various objects.
The purpose of the article is to study the image processing pipeline from the moment of image capture to the recognition of a certain set of figures in an image (for example, geometric shapes such as a triangle or quadrilateral), to develop software models for recognizing figures in an image, and to determine the centers of mass of figures by means of computer vision.
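The pipeline just described (binarize, extract contours, classify a figure by its approximated vertex count, and locate its center of mass via spatial moments) can be sketched with OpenCV. This is a minimal illustration under assumed parameters (the file name, blur kernel, and approximation tolerance are placeholders), not the article's actual software model:

```python
# Illustrative sketch: contour-based shape recognition and centroid estimation.
import cv2

img = cv2.imread("shapes.png")                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)            # low-level: noise reduction
_, binary = cv2.threshold(blur, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    # Approximate the contour by a polygon; the vertex count suggests the shape.
    approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    if len(approx) == 3:
        label = "triangle"
    elif len(approx) == 4:
        label = "quadrilateral"
    else:
        label = f"{len(approx)}-gon"

    # Center of mass from the contour's spatial moments.
    m = cv2.moments(cnt)
    if m["m00"] > 0:
        cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
        print(label, "centroid:", (cx, cy))
```

The centroid follows from the spatial moments as (m10/m00, m01/m00), which is exactly the center-of-mass formula for a uniform region.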
Results. We proposed and tested several variants of a nonlinear estimation problem. The properties of such problems depend on the value of a regularization parameter. The dependence of the estimate on the value of this parameter was studied, and a range of parameter values was defined for which the estimation problem gives an adequate result for the initial task.
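The abstract does not name the estimation problem, but the kind of study it describes, sweeping a regularization parameter and observing the range in which the estimate stays adequate, can be illustrated with a generic ridge-regularized least-squares example (the data, model, and parameter values below are invented purely for illustration):

```python
# Illustrative only: sweep a regularization parameter in a ridge-style
# estimator and observe how the estimate varies. Data and model are synthetic.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.1 * rng.normal(size=50)

for lam in [1e-6, 1e-3, 1e-1, 1.0, 10.0]:
    # Ridge solution: x = (A^T A + lam I)^{-1} A^T b
    x_hat = np.linalg.solve(A.T @ A + lam * np.eye(3), A.T @ b)
    print(f"lambda={lam:g}  error={np.linalg.norm(x_hat - x_true):.4f}")
```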
Numerical examples show how much the volume of calculations is reduced by using a dynamic branching tree.
Conclusions. The results obtained can be used in many computer vision applications, for example, counting objects in a scene, estimating their parameters, estimating the distance between objects in a scene, etc.
Keywords: contour, segmentation, image binarization, computer vision, histogram.
•This paper advances the background subtraction approach for image binarization.
•Our approach formulates a robust regression to estimate an image background.
•The proposed approach does not require any prior identification of edge pixels.
•The proposed threshold selector binarizes noisy images better after background subtraction.
•The approach was validated with 26 benchmark images, compared with nine existing methods.
This paper presents a robust regression approach for image binarization under significant background variations and observation noise. The work is motivated by the need to identify foreground regions in noisy microscopic images or degraded document images, where significant background variations and observation noise make image binarization challenging. The proposed method first estimates the background of an input image, subtracts the estimated background from the input image, and performs a global thresholding operation on the subtracted outcome, thus obtaining the binary image of the foreground. A robust regression approach is proposed to estimate the background intensity surface with minimal effects from the foreground intensities and observation noise, and a global threshold selector is proposed on the basis of a model selection criterion in a sparse regression. The proposed approach is validated using 26 test images and the corresponding ground truths, and the outcomes are compared with those of nine existing image binarization methods. The approach is also combined with three morphological segmentation methods to show how it can improve their image segmentation outcomes.
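A rough sketch of this estimate-subtract-threshold pipeline follows. Note the substitutions: a large median filter stands in for the paper's robust regression surface fit, and Otsu's method stands in for its model-selection-based threshold selector; both are my own simplifications, and the file names are placeholders.

```python
# Rough stand-in: estimate background, subtract it, apply a global threshold.
import cv2

img = cv2.imread("noisy_doc.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
background = cv2.medianBlur(img, 51)          # coarse background estimate
foreground = cv2.subtract(background, img)    # dark foreground on light background
_, binary = cv2.threshold(foreground, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("binary.png", 255 - binary)       # foreground as black on white
```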
Document image binarization refers to the conversion of a document image into a binary image. For broken and severely degraded document images, binarization is a very challenging process. Unlike traditional methods that separate the foreground from the background, this paper presents a new framework for binarizing broken and degraded document images and restoring their quality. In our approach, the non-local means method is extended and used to remove noise from the input document image in the pre-processing step. The proposed method then binarizes the document image, taking advantage of the quick adaptive thresholding proposed by Pierre D. Wellner. To obtain more pleasing binarization results, the binarized document image is finally post-processed. The post-processing step comprises three measures: de-speckling, preserving stroke connectivity, and improving the quality of text regions. Experimental results show significant improvement in the binarization of broken and degraded document images collected from various sources, including degraded and broken books, magazines, and document files.
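Wellner's quick adaptive thresholding compares each pixel against a running average of nearby pixels and marks it as ink when it falls a fixed percentage below that average. Here is a minimal 2-D sketch using a box-filtered local mean; the window size and percentage follow Wellner's suggested defaults, and the file name is a placeholder:

```python
# Wellner-style adaptive thresholding with a box-filtered local mean.
import cv2
import numpy as np

img = cv2.imread("degraded_doc.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
window = max(img.shape[1] // 8, 3)   # Wellner suggests ~1/8 of the image width
t = 15                               # percentage below the local mean => ink
local_mean = cv2.blur(img.astype(np.float32), (window, window))
binary = np.where(img < local_mean * (100 - t) / 100, 0, 255).astype(np.uint8)
cv2.imwrite("binary.png", binary)
```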
Deep learning (DL) has recently changed the development of intelligent systems and is widely adopted in many real-life applications. Despite their various benefits and potential, there is high demand for DL processing on computationally limited and energy-constrained devices. It is natural to study game-changing technologies such as Binary Neural Networks (BNNs) to increase DL capabilities. Remarkable progress has recently been made in BNNs, since they can be implemented and embedded on tiny restricted devices and save a significant amount of storage, computation cost, and energy consumption. However, nearly all BNN refinements trade extra memory and computation cost for higher performance. This article provides a complete overview of recent developments in BNNs. Contrary to previous surveys, in which low-bit works are mixed in, it focuses exclusively on convolution networks with 1-bit activations and 1-bit weights. It conducts a complete investigation of BNN development, from their predecessors to the latest BNN algorithms and techniques, presenting a broad design pipeline and discussing each module's variants. Along the way, it examines BNN (a) purpose: their early successes and challenges; (b) optimization: selected representative works that contain essential optimization techniques; (c) deployment: open-source frameworks for BNN modeling and development; (d) terminal: efficient computing architectures and devices for BNN; and (e) applications: diverse applications with BNN. Moreover, this paper discusses potential directions and future research opportunities in each section.
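As a concrete illustration of the 1-bit building block surveyed above, the following is a minimal PyTorch sketch (my own illustration, not any specific surveyed method): weights and activations are binarized with sign() in the forward pass, while a straight-through estimator (STE) passes gradients through in the backward pass.

```python
# Minimal BNN building block: sign() forward, straight-through estimator backward.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)            # values in {-1, +1} (sign(0) = 0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # STE: pass gradients through where |x| <= 1, clip elsewhere.
        return grad_out * (x.abs() <= 1).float()

class BinaryConv2d(torch.nn.Conv2d):
    def forward(self, x):
        wb = BinarizeSTE.apply(self.weight)   # 1-bit weights
        xb = BinarizeSTE.apply(x)             # 1-bit activations
        return torch.nn.functional.conv2d(xb, wb, self.bias, self.stride,
                                          self.padding, self.dilation,
                                          self.groups)
```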
Purpose
To determine the effect of low‐intensity, long‐wavelength red light therapy (LLRT) on the inhibition of myopia progression in children.
Methods
A retrospective study was conducted. One hundred and five myopic children (spherical equivalent refractive error [SER] −3.09 ± 1.74 dioptres [D]; mean age 9.19 ± 2.40 years) who underwent LLRT treatment (power 0.4 mW, wavelength 635 nm) twice per day for 3 min each session, with at least a 4‐h interval between sessions, and a control group of 56 myopic children (SER −3.04 ± 1.66 D; mean age 8.62 ± 2.45 years) were evaluated. Both groups wore single‐vision distance spectacles. Each child returned for a follow‐up examination every 3 months after the initial measurements, for a total of 9 months.
Results
At 9 months, the mean SER in the LLRT group was −2.87 ± 1.89 D, significantly greater than that of the control group (−3.57 ± 1.49 D, p < 0.001). Axial length (AL) changes were −0.06 ± 0.19 mm and 0.26 ± 0.15 mm in the LLRT group and control group (p < 0.001), respectively. The subfoveal choroidal thickness changed by 45.32 ± 30.88 μm for children treated with LLRT at the 9‐month examination (p < 0.001). Specifically, a substantial hyperopic shift (0.31 ± 0.24 D and 0.20 ± 0.14 D, respectively, p = 0.02) was found in the 8–14 year olds compared with 4–7 year old children. The decrease in AL in subjects with baseline AL >24 mm was −0.08 ± 0.19 mm, significantly greater than those with a baseline AL ≤24 mm (−0.04 ± 0.18 mm, p = 0.03).
Conclusions
Repetitive exposure to LLRT therapy was associated with slower myopia progression and reduced axial growth after short durations of treatment. These results require further validation in randomised controlled trials.
LDAHash: Improved Matching with Smaller Descriptors
Strecha, C.; Bronstein, A. M.; Bronstein, M. M.
IEEE Transactions on Pattern Analysis and Machine Intelligence, January 2012, Volume 34, Issue 1
Journal Article, Peer reviewed, Open access
SIFT-like local feature descriptors are ubiquitously employed in computer vision applications such as content-based retrieval, video analysis, copy detection, object recognition, photo tourism, and 3D reconstruction. Feature descriptors can be designed to be invariant to certain classes of photometric and geometric transformations, in particular, affine and intensity scale transformations. However, real transformations that an image can undergo can only be approximately modeled in this way, and thus most descriptors are only approximately invariant in practice. Second, descriptors are usually high-dimensional (e.g., SIFT is represented as a 128-dimensional vector). In large-scale retrieval and matching problems, this can pose challenges in storing and retrieving descriptor data. We map the descriptor vectors into the Hamming space in which the Hamming metric is used to compare the resulting representations. This way, we reduce the size of the descriptors by representing them as short binary strings and learn descriptor invariance from examples. We show extensive experimental validation, demonstrating the advantage of the proposed approach.
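The core idea, projecting a high-dimensional descriptor to a short binary string compared with the Hamming metric, can be sketched as follows. The random projection and zero thresholds here merely stand in for the matrix and thresholds that LDAHash actually learns from examples:

```python
# Sketch: binarize a descriptor via a linear projection, compare with Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(64, 128))          # stand-in for the learned projection
t = np.zeros(64)                        # stand-in for learned thresholds

def hash_descriptor(d):
    """Map a 128-D SIFT-like descriptor to a 64-bit binary code."""
    return (P @ d + t > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

d1, d2 = rng.normal(size=128), rng.normal(size=128)
print("Hamming distance:", hamming(hash_descriptor(d1), hash_descriptor(d2)))
```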
Document binarization is a crucial pre-processing step for various document analysis tasks. However, existing methods fail to accurately capture stroke edges, primarily due to the inherent limitations of vanilla convolutions and the absence of adequate boundary-related supervision during stroke edge extraction. In this paper, we formulate text extraction as the learning of gating values and propose an end-to-end network architecture based on gated convolutions, named GDB, to address the problem of imprecise stroke edge extraction. The gated convolutions enable the selective extraction of stroke features with different attention. Our proposed framework comprises two stages. First, a coarse sub-network with an extra edge branch is trained to enhance the precision of feature maps by incorporating a priori mask and edge information. Second, a refinement sub-network is cascaded to enhance the output of the first stage using gated convolutions based on the sharp edges. To effectively incorporate global information, GDB also integrates a parallelized multi-scale operation that combines local and global features. We conduct comprehensive experiments on ten Document Image Binarization Contest (DIBCO) datasets from 2009 to 2019 and on Document Deblurring Datasets. Experimental results show that our proposed method outperforms state-of-the-art methods across all metrics on average. Extensive ablation studies demonstrate the efficacy of the key components. Code is available at: https://github.com/Royalvice/GDB.
•We propose a novel end-to-end document binarization method using gated convolutions.
•We solve the problem of imprecise stroke edge extraction by gated convolutions.
•We visualize the gating values to explain the effectiveness of gated convolutions.
•The network with multi-scale fusion and edge branch is trained on multiple losses.
•The proposed method outperforms state-of-the-art methods on DIBCO datasets.
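To make the gated convolution idea concrete, here is a minimal PyTorch sketch (my own illustration, not GDB's actual layers or sizes): one branch produces features, a parallel branch produces per-pixel gating values in (0, 1), and the two are multiplied.

```python
# Minimal gated convolution: feature branch modulated by a learned gate.
import torch

class GatedConv2d(torch.nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.feature = torch.nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate = torch.nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        # Gating values act as learned soft attention over stroke features.
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

x = torch.randn(1, 1, 64, 64)            # grayscale document patch
y = GatedConv2d(1, 16)(x)
print(y.shape)                           # torch.Size([1, 16, 64, 64])
```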
•ILGCHD model can eliminate the effect of pseudo cracks while reducing the model parameters and total computation.
•Features of different levels can be combined more effectively through dense conditional random field after SegNet.
•SegNet-DCRF model can eliminate noise interference and extract clearer crack edges.
•A software interface can display various information on the crack detection.
Pavement crack detection is a key task to ensure driving safety. Manual crack detection is subjective and inefficient, so it is very important to develop an automatic crack recognition system. However, asphalt pavement crack detection from images is a challenging problem due to the interference of complex noise. Meanwhile, although deep learning-based methods have recently made great progress in crack classification and recognition, there are still difficulties, such as large parameter counts and low detection efficiency. For this purpose, we develop a novel crack recognition and analysis system. First, crack images are cut using an overlapping sliding window to establish crack datasets. Then a crack classification algorithm based on an interleaved low-rank group convolution hybrid deep network (ILGCHDN) is proposed to recognize cracks and non-cracks. Next, we propose a crack image binarization architecture called SegNet-DCRF, which fuses SegNet and the dense conditional random field (DCRF). Finally, the unidirectional crack width and the web crack area are calculated. Moreover, interactive crack detection software is developed to further display various information on the results. Experimental results show that our model is superior to other state-of-the-art algorithms in terms of accuracy, parameters, speed, and anti-interference ability. Also, for cracks with a width of 3 mm or more, the relative error is less than 0.02, which meets actual engineering requirements. Besides, the area of web cracks is calculated to comprehensively evaluate the pavement damage level.
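The first step, cutting images with an overlapping sliding window, can be sketched as follows (the patch size and stride are assumed values, not the paper's):

```python
# Overlapping sliding-window cropping to build patch datasets.
import numpy as np

def sliding_windows(img, size=256, stride=128):
    """Yield overlapping size x size patches (stride < size gives overlap)."""
    h, w = img.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield img[y:y + size, x:x + size]

pavement = np.zeros((1024, 2048), dtype=np.uint8)   # placeholder pavement image
patches = list(sliding_windows(pavement))
print(len(patches), "patches")                      # 7 * 15 = 105 patches
```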
•We present the initial results of a novel method of using neural networks for soil XCT image segmentation.
•Depending on the sample, the error in terms of computed permeability was as low as 5%.
•To segment soil images, we used a hybrid U-net + ResNet-101 architecture.
•It was shown that low representativity of XCT images could explain the low-accuracy cases.
•Larger image libraries, better ground-truth data, and improved network architectures were proposed as ways forward.
Direct imaging methods, among which X-ray computed tomography (XCT) continues to dominate, enable the study of soil structure at different scales. However, to compute different morphological parameters or assess soil physical properties using pore-scale modelling we need to perform image segmentation to divide the XCT greyscale image representing local absorption of X-ray radiation into major constituents or phases. Here we focused on the simplest type of segmentation procedure – binarization into pores and solid phases. We present the initial results for soil XCT image segmentation using convolutional neural networks (CNN). We assumed that current state-of-the-art local segmentation approaches could provide ground truth data to perform neural network training. We used hybrid U-net + ResNet-101 architecture and segmented seven soil XCT images. The training was performed by excluding the segmented image from training and validation datasets. The segmentations’ accuracy was assessed using standard computer vision metrics (precision, recall, intersection over union or IoU) and pore-scale simulations to compute the permeability of resulting 3D binary soil images. Depending on the soil sample, the error of segmentations in terms of computed hydraulic properties varied from 5% to 130%. The IoU metric was found to be the most sensitive to false positive and false negative porosity predictions by the neural network. To explain observed variations, we performed ground-truth and original XCT greyscale images analysis with the help of correlation and covariance functions. In addition to a comparison between images, we also trained another segmentation neural network that used all samples as a training/verification dataset that helped to explain the inaccuracies caused by insufficient representativeness of some soil sample structures in the training dataset. We discussed possible ways to improve the segmentation results in the future, including the usage of larger soil image libraries, physically modelled ground-truth data, and advanced neural network architectures.
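The standard computer vision metrics mentioned above (precision, recall, and IoU) for binary pore/solid masks can be computed as in this short NumPy sketch (synthetic data, purely illustrative):

```python
# Precision, recall, and IoU for binary segmentation masks.
import numpy as np

def binary_metrics(pred, truth):
    """pred, truth: boolean arrays where True marks pore voxels."""
    tp = np.count_nonzero(pred & truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou

rng = np.random.default_rng(0)
truth = rng.random((64, 64, 64)) < 0.3           # synthetic ground-truth pores
pred = truth ^ (rng.random(truth.shape) < 0.05)  # prediction with 5% voxel flips
print("precision=%.3f recall=%.3f IoU=%.3f" % binary_metrics(pred, truth))
```

As the abstract notes, IoU penalizes both false positive and false negative porosity predictions, which is why it was found to be the most sensitive of the three metrics.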