• A deep-learning-based semantic segmentation method using a fully convolutional network is proposed.
• Defect images are captured with self-developed image acquisition equipment (MTI-200a).
• The proposed method can rapidly and accurately recognize defects in metro shield tunnels.
Traditional visual inspection based on handcrafted features performs poorly on crack and leakage defects in metro shield tunnels because it cannot efficiently distinguish defects from interference such as segmental joints, bolt holes, cables and manual marks. Based on deep learning (DL), this paper proposes a novel image recognition algorithm for semantic segmentation of crack and leakage defects in metro shield tunnels, using hierarchies of features extracted by a fully convolutional network (FCN). The defect images in the training and testing datasets are captured with self-developed image acquisition equipment named Moving Tunnel Inspection (MTI-200a). Once the image datasets are established, separate FCN models for crack and leakage are trained through several iterations of forward inference and backward learning. Semantic segmentation of defect images is then performed by the corresponding FCN models using a two-stream algorithm: one stream recognizes cracks through a sliding-window-assembling operation, and the other recognizes leakage through a resizing-interpolation operation. The proposed method is compared with two frequently used traditional methods, the region growing algorithm (RGA) and the adaptive thresholding algorithm (ATA), on four typical types of defect images: crack-only, leakage-only, two-defect-nonoverlapping (TDN) and two-defect-overlapping (TDO). It shows clear superiority in recognition results, inference time and error rates. The proposed DL-based method can be employed to rapidly and accurately recognize defects for structural health monitoring and maintenance of metro shield tunnels.
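The two streams described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `threshold_model` is a hypothetical stand-in for a trained FCN, the window and target sizes are arbitrary, and nearest-neighbor resizing replaces whatever interpolation the paper uses.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values
Mask = List[List[int]]     # label map, 1 = defect pixel


def threshold_model(patch: Image) -> Mask:
    """Hypothetical stand-in for a trained FCN: labels bright pixels as defects."""
    return [[1 if value > 0.5 else 0 for value in row] for row in patch]


def segment_by_sliding_window(image: Image, predict: Callable[[Image], Mask], win: int) -> Mask:
    """Crack-style stream: cut the image into win x win tiles, segment each, reassemble."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for top in range(0, height, win):
        for left in range(0, width, win):
            # Tiles at the right/bottom border may be smaller than win x win.
            patch = [row[left:left + win] for row in image[top:top + win]]
            for dy, mask_row in enumerate(predict(patch)):
                for dx, label in enumerate(mask_row):
                    mask[top + dy][left + dx] = label
    return mask


def resize_nearest(grid: List[list], out_h: int, out_w: int) -> List[list]:
    """Nearest-neighbor resize (an assumed, simplest-possible interpolation)."""
    in_h, in_w = len(grid), len(grid[0])
    return [[grid[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def segment_by_resizing(image: Image, predict: Callable[[Image], Mask],
                        small_h: int, small_w: int) -> Mask:
    """Leakage-style stream: shrink the image, segment once, enlarge the mask back."""
    small_mask = predict(resize_nearest(image, small_h, small_w))
    return resize_nearest(small_mask, len(image), len(image[0]))


image = [[1.0, 1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
crack_mask = segment_by_sliding_window(image, threshold_model, win=2)
leak_mask = segment_by_resizing(image, threshold_model, small_h=2, small_w=2)
```

The tiled stream keeps full resolution, which suits thin structures such as cracks, while the resized stream trades detail for a single inference pass over a large region.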
In this work, we introduce Video Question Answering in the temporal domain to infer the past, describe the present and predict the future. We present an encoder–decoder approach using Recurrent Neural Networks to learn the temporal structures of videos and introduce a dual-channel ranking loss to answer multiple-choice questions. We explore approaches for finer understanding of video content using the question form of “fill-in-the-blank”, and collect our Video Context QA dataset consisting of 109,895 video clips with a total duration of more than 1000 h from existing TACoS, MPII-MD and MEDTest 14 datasets. In addition, 390,744 corresponding questions are generated from annotations. Extensive experiments demonstrate that our approach significantly outperforms the compared baselines.
• The S–kNN algorithm identifies an optimal k value for each test sample.
• Our approach takes the local structures of samples into account.
• This paper proposes a novel optimization method to solve the designed objective function.
This paper studies an example-driven k-parameter computation that identifies different k values for different test samples in kNN prediction applications, such as classification, regression and missing data imputation. This is done by reconstructing a sparse coefficient matrix between the test samples and the training data. In the reconstruction process, an ℓ1-norm regularization is employed to generate an element-wise sparse coefficient matrix, and an LPP (Locality Preserving Projection) regularization is adopted to preserve the local structures of the data and thereby achieve efficiency. With the learnt k values, the kNN approach is then applied to classification, regression and missing data imputation. We experimentally evaluate the proposed approach on 20 real datasets and show that our algorithm outperforms previous kNN algorithms on data mining tasks such as classification, regression and missing value imputation.
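A simplified sketch of the idea follows: reconstruct a test sample from the training samples under an ℓ1 penalty, then take the number of nonzero coefficients as that sample's k. It is an illustration under assumptions, not the paper's optimization method: the LPP regularizer is omitted, plain coordinate descent is used as the solver, and `lam` and `sweeps` are arbitrary values.

```python
import math
from collections import Counter


def soft_threshold(value, lam):
    """Shrink value toward zero by lam (the proximal step of the l1 penalty)."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparse_coefficients(x, train, lam, sweeps=100):
    """Coordinate descent for min_w 0.5*||x - sum_j w[j]*train[j]||^2 + lam*||w||_1."""
    w = [0.0] * len(train)
    residual = list(x)  # x minus the current reconstruction
    for _ in range(sweeps):
        for j, sample in enumerate(train):
            norm_sq = sum(v * v for v in sample)
            if norm_sq == 0.0:
                continue
            # Correlation of sample j with the residual, with its own share added back.
            rho = sum(s * r for s, r in zip(sample, residual)) + w[j] * norm_sq
            new_w = soft_threshold(rho, lam) / norm_sq
            residual = [r - (new_w - w[j]) * s for r, s in zip(residual, sample)]
            w[j] = new_w
    return w


def predict_with_learned_k(x, train, labels, lam=0.1):
    """Classify x by majority vote over its k nearest neighbors, with k learned per sample."""
    w = sparse_coefficients(x, train, lam)
    k = max(1, sum(1 for c in w if c != 0.0))  # number of training samples actually used
    nearest = sorted(range(len(train)), key=lambda j: math.dist(x, train[j]))[:k]
    votes = Counter(labels[j] for j in nearest)
    return votes.most_common(1)[0][0], k


train = [(1.0, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 1.0), (0.1, 0.9), (-0.1, 1.1)]
labels = ["a", "a", "a", "b", "b", "b"]
label, k = predict_with_learned_k((1.0, 0.1), train, labels)
```

A larger `lam` yields sparser coefficients and therefore smaller k values; for regression or imputation, the majority vote would be replaced by an average over the same neighbors.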
Crack detection is a crucial task in periodic pavement surveys. This study establishes two intelligent approaches for automatic recognition of pavement cracks and compares their performance. The first model relies on the Sobel and Canny edge detection algorithms. Since the two edge detectors require threshold values to be set, Differential Flower Pollination, a metaheuristic, is employed to fine-tune the model parameters. The second model is built on a Convolutional Neural Network (CNN), a deep learning algorithm. A CNN has the advantage of performing feature extraction and the prediction of the crack/non-crack condition in an integrated and fully automated manner. Experimental results show that the CNN-based model achieves a Classification Accuracy Rate (CAR) of 92.08%, significantly better than the method based on the edge detection algorithms (CAR = 79.99%). Accordingly, the proposed CNN-based crack detection model is a promising alternative to support transportation agencies in the task of periodic pavement inspection.
• Two approaches for automatic recognition of pavement cracks are constructed.
• The first approach relies on the Sobel and Canny edge detection algorithms.
• The second approach employs a deep neural network.
• The edge-detection-based model attains an accuracy rate of 79.99%.
• The deep neural network achieves a superior accuracy rate of 92.08%.
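To make the role of the threshold in the first approach concrete, here is a minimal sketch of Sobel edge detection followed by an accuracy computation. It assumes a small grayscale image stored as nested lists and a hand-picked threshold; the metaheuristic tuning of that threshold and the Canny detector are not reproduced here.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_edge_mask(image, threshold):
    """Mark interior pixels whose Sobel gradient magnitude exceeds the threshold."""
    height, width = len(image), len(image[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    pixel = image[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * pixel
                    gy += SOBEL_Y[ky][kx] * pixel
            mask[y][x] = math.hypot(gx, gy) > threshold
    return mask


def classification_accuracy_rate(predicted, actual):
    """Percentage of samples whose predicted crack/non-crack label matches the truth."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# A vertical intensity step between columns 1 and 2.
image = [[0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edge_mask(image, threshold=20)
car = classification_accuracy_rate([True, False, True, True], [True, False, False, True])
```

Because the mask depends directly on `threshold`, a poor choice either floods the image with false edges or misses faint cracks, which is why the study tunes it automatically.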
Localization-based super-resolution techniques open the door to unprecedented analysis of molecular organization. This task often involves complex image processing adapted to the specific topology and quality of the image to be analyzed. Here we present a segmentation framework based on Voronoï tessellation constructed from the coordinates of localized molecules, implemented in freely available and open-source SR-Tesseler software. This method allows precise, robust and automatic quantification of protein organization at different scales, from the cellular level down to clusters of a few fluorescent markers. We validated our method on simulated data and on various biological experimental data of proteins labeled with genetically encoded fluorescent proteins or organic fluorophores. In addition to providing insight into complex protein organization, this polygon-based method should serve as a reference for the development of new types of quantifications, as well as for the optimization of existing ones.
• Digital image colorimetry on smartphones is a powerful, fast and low-cost analysis method.
• The target analyte is detected from color changes in a digital image of the sample.
• The principle, color spaces, components and applications of digital image colorimetry on smartphones are summarized.
• Digital image colorimetry on smartphones will improve with the rapid development of smartphone cameras and apps.
Digital image colorimetry (DIC) on smartphones is regarded as a powerful, fast and low-cost analysis method that measures a target analyte from color changes in a digital image obtained with the built-in camera. We summarize the basic procedure of DIC, the color spaces (RGB, CMYK, HSB/HSL, CIE XYZ, L*a*b*, and YUV), the principal architectures (image-capturing tools, lighting conditions, color quantification apps and DIC apps), and the current status of smartphone-based DIC in the analysis of metals/heavy metals, herbicides, pesticides, antibiotics, biological and medical indicators, natural compounds, and bacteria/viruses. The advantages and disadvantages of DIC are also discussed. At present, DIC on smartphones still needs to be refined with controlled geometry and standard lighting sources to become a robust and reliable analytical procedure. It is expected to improve in the near future as smartphone camera technology develops rapidly and the related software is continuously optimized.
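The color quantification step can be illustrated with a short sketch: average the RGB values over a region of interest, then convert to another of the listed color spaces (here HSB, also called HSV). The pixel values are made up for illustration, and the function names are hypothetical rather than taken from any DIC app.

```python
import colorsys


def mean_rgb(pixels):
    """Average 8-bit (R, G, B) tuples over a region of interest."""
    count = len(pixels)
    return tuple(sum(pixel[channel] for pixel in pixels) / count for channel in range(3))


def rgb_to_hsb(rgb):
    """Convert an 8-bit RGB triple to HSB/HSV, each component scaled to [0, 1]."""
    red, green, blue = (value / 255.0 for value in rgb)
    return colorsys.rgb_to_hsv(red, green, blue)


# Illustrative region of interest: two pixels of a pure red test zone.
region = [(255, 0, 0), (255, 0, 0)]
average = mean_rgb(region)
hue, saturation, brightness = rgb_to_hsb(average)
```

In practice a channel such as saturation or a single RGB component would then be related to analyte concentration through a calibration series, a step that depends on the assay and is not shown here.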
This paper introduces a video representation based on dense trajectories and motion boundary descriptors. Trajectories capture the local motion information of the video. A dense representation guarantees a good coverage of foreground motion as well as of the surrounding context. A state-of-the-art optical flow algorithm enables a robust and efficient extraction of dense trajectories. As descriptors we extract features aligned with the trajectories to characterize shape (point coordinates), appearance (histograms of oriented gradients) and motion (histograms of optical flow). Additionally, we introduce a descriptor based on motion boundary histograms (MBH), which relies on differential optical flow. The MBH descriptor consistently outperforms other state-of-the-art descriptors, in particular on real-world videos that contain a significant amount of camera motion. We evaluate our video representation in the context of action classification on nine datasets, namely KTH, YouTube, Hollywood2, UCF sports, IXMAS, UIUC, Olympic Sports, UCF50 and HMDB51. On all datasets our approach outperforms current state-of-the-art results.
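The following sketch shows why a descriptor built on differential optical flow is insensitive to uniform camera motion. It histograms the spatial gradient of a single flow component; the 8 orientation bins, central differences and L1 normalization are assumptions chosen for illustration, and the trajectory-aligned cell layout is omitted.

```python
import math


def motion_boundary_histogram(flow, bins=8):
    """Orientation histogram of the spatial gradient of one optical-flow component.

    flow is a 2D grid (list of rows) holding, e.g., the horizontal flow at each pixel.
    Each interior pixel votes for its gradient orientation, weighted by gradient magnitude.
    """
    height, width = len(flow), len(flow[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = (flow[y][x + 1] - flow[y][x - 1]) / 2.0  # central differences
            dy = (flow[y + 1][x] - flow[y - 1][x]) / 2.0
            magnitude = math.hypot(dx, dy)
            if magnitude == 0.0:
                continue  # locally constant flow carries no motion boundary
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    return [value / total for value in hist] if total else hist


# Uniform translation, as produced by a panning camera: every pixel has the same flow.
camera_pan = [[2.0] * 4 for _ in range(4)]
# Flow that increases from left to right: a horizontal motion gradient.
shear = [[float(x) for x in range(4)] for _ in range(4)]

pan_hist = motion_boundary_histogram(camera_pan)
shear_hist = motion_boundary_histogram(shear)
```

Constant flow differentiates to zero and leaves the histogram empty, whereas relative motion between neighboring pixels produces votes, which is the property that helps on videos with significant camera motion.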
Existing computational models for salient object detection primarily rely on hand-crafted features, which are only able to capture low-level contrast information. In this paper, we learn the hierarchical contrast features by formulating salient object detection as a binary labeling problem using deep learning techniques. A novel superpixelwise convolutional neural network approach, called SuperCNN, is proposed to learn the internal representations of saliency in an efficient manner. In contrast to the classical convolutional networks, SuperCNN has four main properties. First, the proposed method is able to learn the hierarchical contrast features, as it is fed by two meaningful superpixel sequences, which is much more effective for detecting salient regions than feeding raw image pixels. Second, as SuperCNN recovers the contextual information among superpixels, it enables large context to be involved in the analysis efficiently. Third, benefiting from the superpixelwise mechanism, the required number of predictions for a densely labeled map is hugely reduced. Fourth, saliency can be detected independent of region size by utilizing a multiscale network structure. Experiments show that SuperCNN can robustly detect salient objects and outperforms the state-of-the-art methods on three benchmark datasets.
Zero-shot learning for visual recognition, e.g., object and action recognition, has recently attracted a lot of attention. However, it remains challenging to bridge the semantic gap between visual features and their underlying semantics and to transfer knowledge to semantic categories unseen during learning. Unlike most existing zero-shot visual recognition methods, we propose a stagewise bidirectional latent embedding framework consisting of two successive learning stages. In the bottom–up stage, a latent embedding space is first created by exploring the topological and labeling information underlying the training data of known classes via a proper supervised subspace learning algorithm, and the latent embeddings of the training data are used to form landmarks that guide the embedding of the semantics underlying unseen classes into this learned latent space. In the top–down stage, semantic representations of unseen-class labels in a given label vocabulary are then embedded into the same latent space, preserving the semantic relatedness between all classes via our proposed semi-supervised Sammon mapping with the guidance of the landmarks. The resulting latent embedding space thus allows the label of a test instance to be predicted with a simple nearest-neighbor rule. To evaluate the effectiveness of the proposed framework, we have conducted extensive experiments on four benchmark datasets in object and action recognition, i.e., AwA, CUB-200-2011, UCF101 and HMDB51. The comparative results demonstrate that our proposed approach yields state-of-the-art performance under inductive and transductive settings.
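The final prediction step, the nearest-neighbor rule in the latent space, can be sketched in a few lines. The class names and 2D coordinates below are invented for illustration; learning the embeddings themselves (the two stages described above) is outside the scope of this sketch.

```python
import math


def predict_label(test_embedding, class_embeddings):
    """Return the class whose latent embedding is closest to the test instance."""
    return min(class_embeddings,
               key=lambda label: math.dist(test_embedding, class_embeddings[label]))


# Hypothetical latent embeddings of unseen-class labels.
unseen_classes = {"zebra": (0.0, 0.0), "whale": (5.0, 5.0)}
prediction = predict_label((1.0, 0.5), unseen_classes)
```

Euclidean distance is assumed here; any metric suited to the learned space could be substituted without changing the rule.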
Digitizing side-channel signals at high sampling rates produces huge amounts of data, while side-channel analysis techniques only need those specific trace segments containing Cryptographic Operations (COs). For detecting these segments, waveform-matching techniques have been established comparing the signal with a template of the CO’s characteristic pattern. Real-time waveform matching requires highly parallel implementations as achieved by hardware design but also reconfigurability as provided by Field-Programmable Gate Arrays (FPGAs) to adapt the matching hardware to a specific CO pattern. However, currently proposed designs process the samples from analog-to-digital converters sequentially and can only process low sampling rates due to the limited clock speed of FPGAs. In this article, we present a parallel waveform-matching architecture capable of performing high-speed waveform matching on a high-end FPGA-based digitizer. We also present a workflow for calibrating the waveform-matching system to the specific pattern of the CO in the presence of hardware restrictions provided by the FPGA hardware. Our implementation enables waveform matching at 10 GS/s, offering a speedup of 50× compared to the fastest state-of-the-art implementation known to us. We demonstrate how to apply the technique for attacking the widespread XTS-AES algorithm using waveform matching to recover the encrypted tweak even in the presence of so-called systemic noise.