The representation power of convolutional neural network (CNN) models for hyperspectral image (HSI) analysis is in practice limited by the number of labeled samples available, which is often insufficient to sustain deep networks with many parameters. We propose a novel approach to boost the network representation power with a two-stream 2-D CNN architecture. The proposed method simultaneously extracts spectral features and local and global spatial features with two 2-D CNNs, and makes use of channel correlations to identify the most informative features. Moreover, we propose a layer-specific regularization and a smooth normalization fusion scheme to adaptively learn the fusion weights for the spectral-spatial features from the two parallel streams. An important asset of our model is the simultaneous training of the feature extraction, fusion, and classification processes with the same cost function. Experimental results on several hyperspectral data sets demonstrate the efficacy of the proposed method compared with the state-of-the-art methods in the field.
In this paper, we train several deep convolutional networks, using the training techniques we introduce, to classify X-ray images into three classes (normal, pneumonia, and COVID-19) based on two open-source datasets. Our data contains 180 X-ray images of persons infected with COVID-19, and we apply several methods to achieve the best possible results. We introduce training techniques that help the network learn better from an unbalanced dataset (fewer cases of COVID-19 and more cases from the other classes). We also propose a neural network that is a concatenation of the Xception and ResNet50V2 networks. This network achieved the best accuracy by utilizing multiple features extracted by two robust networks. To evaluate our network, we tested it on 11302 images to report the accuracy achievable in real circumstances. The average accuracy of the proposed network for detecting COVID-19 cases is 99.50%, and the overall average accuracy for all classes is 91.4%.
• We introduce a deep convolutional network based on the concatenation of Xception and ResNet50V2 to improve the accuracy.
• We propose a training technique for dealing with unbalanced datasets.
• We evaluate our networks on 11302 chest X-ray images.
• We have evaluated ResNet50V2 and Xception on our dataset and compared our proposed network with them.
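The abstract does not describe the training technique for unbalanced data in detail. As a generic illustration only, and not the authors' method, the sketch below computes inverse-frequency class weights, a common way to make a rare class count more in a loss function. The function name and the class counts other than the 180 COVID-19 images are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical label list: only the 180 COVID-19 count comes from the abstract;
# the pneumonia and normal counts are made up for illustration.
labels = ["covid"] * 180 + ["pneumonia"] * 900 + ["normal"] * 900
weights = inverse_frequency_weights(labels)
```

With these weights, each class contributes the same total weight to the loss regardless of how many samples it has.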
We investigate the relationship between the frequency spectrum of image data and the generalization behavior of convolutional neural networks (CNNs). We first observe the ability of CNNs to capture the high-frequency components of images. These high-frequency components are almost imperceptible to a human. This observation leads to multiple hypotheses related to the generalization behavior of CNNs, including a potential explanation for adversarial examples, a discussion of the trade-off between robustness and accuracy in CNNs, and some evidence toward understanding training heuristics.
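To make the notion of high-frequency components concrete, here is a minimal sketch that splits a grayscale image into low- and high-frequency parts with a circular mask in the Fourier domain. The function name, the `radius` parameter, and the random test image are assumptions made for illustration; the abstract does not specify how the decomposition is carried out.

```python
import numpy as np


def split_frequencies(image, radius):
    """Split a 2-D grayscale image into low- and high-frequency parts."""
    h, w = image.shape
    # Move the zero-frequency term to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius  # frequencies inside the disc count as "low"
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


# Random stand-in image (not data from the paper).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
low, high = split_frequencies(image, radius=6)
```

The two parts sum back to the original image, so either one can be fed to a classifier on its own to probe which frequency band it relies on.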
The deep Convolutional Neural Network (CNN) is a special type of neural network that has shown exemplary performance in several competitions related to Computer Vision and Image Processing. Some of the exciting application areas of CNNs include Image Classification and Segmentation, Object Detection, Video Processing, Natural Language Processing, and Speech Recognition. The powerful learning ability of deep CNNs is primarily due to the use of multiple feature extraction stages that can automatically learn representations from the data. The availability of large amounts of data and improvements in hardware technology have accelerated research in CNNs, and interesting deep CNN architectures have recently been reported. Several inspiring ideas for advancing CNNs have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the most significant improvement in the representational capacity of deep CNNs has been achieved through architectural innovations. Notably, the ideas of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing have gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey thus focuses on the intrinsic taxonomy present in the recently reported deep CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, an elementary understanding of CNN components, current challenges, and applications of CNNs is also provided.
Stress-strain curves are an important representation of a material's mechanical properties, from which important properties such as elastic modulus, strength, and toughness are defined. However, generating stress-strain curves with numerical methods such as the finite element method (FEM) is computationally intensive, especially when considering the entire failure path of a material. As a result, it is difficult to perform high-throughput computational design of materials with large design spaces, especially when considering mechanical responses beyond the elastic limit. In this work, a combination of principal component analysis (PCA) and convolutional neural networks (CNNs) is used to predict the entire stress-strain behavior of binary composites evaluated over the entire failure path, motivated by the significantly faster inference speed of empirical models. We show that PCA transforms the stress-strain curves into an effective latent space by visualizing the eigenbasis of PCA. Despite having a dataset of only 10-27% of possible microstructure configurations, the mean absolute error of the prediction is <10% of the range of values in the dataset when model performance is measured on derived material descriptors such as modulus, strength, and toughness. Our study demonstrates the potential of machine learning to accelerate material design, characterization, and optimization.
• Used convolutional neural networks to predict the entire stress-strain curve of binary composites with excellent accuracy
• Demonstrated the ability to compress stress-strain curves with negligible loss in information
• Established a novel custom loss function for neural networks predicting compressed stress-strain curves
• Formulated an interpretable evaluation metric of a machine learning model’s ability to predict stress-strain curves
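The compression step described above can be sketched as follows: PCA, computed here through a singular value decomposition, maps each curve to a handful of coefficients and back. This is a minimal sketch under stated assumptions. The synthetic curves, the choice of five components, and all function names are hypothetical and do not come from the paper; in the described pipeline a CNN would predict the compressed coefficients, which is not shown here.

```python
import numpy as np


def fit_pca(curves, n_components):
    """Return the mean curve and the leading principal directions."""
    mean = curves.mean(axis=0)
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:n_components]


def compress(curves, mean, basis):
    """Project curves onto the PCA basis (one row of coefficients per curve)."""
    return (curves - mean) @ basis.T


def reconstruct(codes, mean, basis):
    """Map PCA coefficients back to full curves."""
    return codes @ basis + mean


# Synthetic stand-in curves (NOT data from the paper): a rise followed by softening.
strain = np.linspace(0.0, 1.0, 100)
rng = np.random.default_rng(0)
modulus = rng.uniform(1.0, 3.0, size=50)
peak = rng.uniform(0.3, 0.7, size=50)
curves = modulus[:, None] * strain * np.exp(-strain / peak[:, None])

mean, basis = fit_pca(curves, n_components=5)
codes = compress(curves, mean, basis)
recovered = reconstruct(codes, mean, basis)
max_error = np.abs(recovered - curves).max()
```

Each 100-point curve is represented here by 5 numbers; `max_error` reports the worst-case reconstruction error over the synthetic set.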
LiDAR-based or RGB-D-based object detection is used in numerous applications, ranging from autonomous driving to robot vision. Voxel-based 3D convolutional networks have been used for some time to enhance the retention of information when processing point cloud LiDAR data. However, problems remain, including slow inference speed and low orientation estimation performance. We therefore investigate an improved sparse convolution method for such networks, which significantly increases the speed of both training and inference. We also introduce a new form of angle loss regression to improve the orientation estimation performance and a new data augmentation approach that can enhance the convergence speed and performance. The proposed network produces state-of-the-art results on the KITTI 3D object detection benchmarks while maintaining a fast inference speed.
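The abstract does not give the formula for the angle loss. One plausible form, shown here purely as an assumption, applies a smooth L1 penalty to the sine of the angle difference, which avoids the large penalty a plain difference would assign to two headings that differ by about π. The function names and the `beta` parameter are hypothetical.

```python
import math


def smooth_l1(x, beta=1.0):
    """Smooth L1 (Huber-like) penalty: quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def sine_angle_loss(pred, target):
    """Penalise sin(pred - target) instead of the raw angle difference."""
    return smooth_l1(math.sin(pred - target))


loss_same = sine_angle_loss(0.3, 0.3)
loss_flipped = sine_angle_loss(0.3, 0.3 + math.pi)
loss_perpendicular = sine_angle_loss(0.0, math.pi / 2)
```

A consequence of this form is that a box and its 180-degree flip receive the same loss, so a separate cue would be needed to distinguish the two directions.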
Deep learning has been successfully applied to image denoising. In this study, we take one step forward by using deep learning to suppress random noise in poststack seismic data, addressing both network architecture and training samples. On the one hand, poststack seismic data denoising mainly targets 3-D seismic data. We designed an end-to-end 3-D denoising convolutional neural network (3-D-DnCNN) that takes raw 3-D cubes as input in order to better extract the features of the 3-D spatial structure of poststack seismic data. On the other hand, denoising images with deep learning requires noisy-clean sample pairs for training. In the field of seismic data processing, researchers usually try their best to suppress noise by using complex processes that combine different methods, but clean labels of seismic data are not available. As a result, building training samples from field seismic data has become an interesting but challenging problem. Therefore, we propose a training sample selection method that contains a complex workflow to produce comparatively ideal training samples. Experiments in this study demonstrate that deep learning can directly learn the ability to denoise field seismic data from selected samples. Although building the training samples may involve a complex process, the experimental results on synthetic and field seismic data show that the 3-D-DnCNN has learned to suppress Gaussian noise and super-Gaussian noise from different training samples. Moreover, the 3-D-DnCNN has better denoising performance on arc-like imaging noise. In addition, we adopt residual learning and batch normalization in order to accelerate training. After network training is satisfactorily completed, its processing efficiency can be significantly higher than that of conventional denoising methods.
Pansharpening refers to the fusion of a panchromatic (PAN) image with a high spatial resolution and a multispectral (MS) image with a low spatial resolution, aiming to obtain a high spatial resolution MS (HRMS) image. In this article, we propose a novel deep neural network architecture with a level-domain-based loss function for pansharpening that takes into account the following double-type structures, i.e., double-level, double-branch, and double-direction, called the triple-double network (TDNet). By using the structure of TDNet, the spatial details of the PAN image can be fully exploited and progressively injected into the low spatial resolution MS (LRMS) image, thus yielding the high spatial resolution output. The specific network design is motivated by the physical formula of the traditional multi-resolution analysis (MRA) methods. Hence, an effective MRA fusion module is also integrated into the TDNet. Besides, we adopt a few ResNet blocks and some multi-scale convolution kernels to deepen and widen the network, effectively enhancing the feature extraction and the robustness of the proposed TDNet. Extensive experiments on reduced- and full-resolution datasets acquired by WorldView-3, QuickBird, and GaoFen-2 sensors demonstrate the superiority of the proposed TDNet compared with some recent state-of-the-art pansharpening approaches. An ablation study has also corroborated the effectiveness of the proposed approach. The code is available at https://github.com/liangjiandeng/TDNet .
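The traditional MRA formula referred to above is commonly written as HRMS_k = MS_k + g_k (PAN − PAN_low), where MS_k is the k-th upsampled multispectral band, PAN_low is a low-pass version of the panchromatic image, and g_k is an injection gain. The sketch below implements only this classical formula, not TDNet itself. The box filter, the unit gains, the odd filter size, and the function names are assumptions chosen for brevity.

```python
import numpy as np


def box_lowpass(image, size):
    """Low-pass filter with a size x size box mean (odd size, edge-padded)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (size * size)


def mra_fuse(pan, ms_upsampled, gains, size=5):
    """Inject PAN details into each upsampled MS band: MS_k + g_k * (PAN - PAN_low)."""
    details = pan - box_lowpass(pan, size)
    return ms_upsampled + gains[:, None, None] * details[None, :, :]


# Random stand-in data (not from any sensor): 4 MS bands already upsampled to PAN size.
rng = np.random.default_rng(0)
pan = rng.random((16, 16))
ms_upsampled = rng.random((4, 16, 16))
gains = np.ones(4)
hrms = mra_fuse(pan, ms_upsampled, gains)
```

When the PAN image is perfectly flat, the detail term vanishes and the output equals the upsampled MS input, which is a quick sanity check on the formula.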
Scene text detection is an important step in a scene text recognition system and also a challenging problem. Unlike general object detection, the main challenges of scene text detection lie in the arbitrary orientations, small sizes, and widely varying aspect ratios of text in natural images. In this paper, we present an end-to-end trainable fast scene text detector, named TextBoxes++, which detects arbitrary-oriented scene text with both high accuracy and efficiency in a single network forward pass. No post-processing other than efficient non-maximum suppression is involved. We have evaluated the proposed TextBoxes++ on four public data sets. In all experiments, TextBoxes++ outperforms competing methods in terms of text localization accuracy and runtime. More specifically, TextBoxes++ achieves an f-measure of 0.817 at 11.6 frames/s for 1024 × 1024 ICDAR 2015 incidental text images and an f-measure of 0.5591 at 19.8 frames/s for 768 × 768 COCO-Text images. Furthermore, combined with a text recognizer, TextBoxes++ significantly outperforms the state-of-the-art approaches for word spotting and end-to-end text recognition tasks on popular benchmarks. Code is available at: https://github.com/MhLiao/TextBoxes_plusplus.
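Non-maximum suppression, the only post-processing step mentioned above, can be sketched in its simplest greedy form. This version assumes axis-aligned boxes given as (x1, y1, x2, y2), whereas TextBoxes++ handles oriented text, so it illustrates the general idea rather than the detector's actual routine. The function names, the threshold, and the example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of surviving boxes, here [0, 2]
```

The second box has an IoU of 81/119 with the first and a lower score, so it is suppressed.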
Understanding deep convolutional networks
Mallat, Stéphane
Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences, 04/2016, Volume 374, Issue 2065
Journal Article, Peer reviewed, Open access
Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and nonlinearities. A mathematical framework is introduced to analyse their properties. Computations of invariants involve multiscale contractions with wavelets, the linearization of hierarchical symmetries and sparse separations. Applications are discussed.
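As a toy illustration of computing an invariant with wavelets, the sketch below filters a 1-D signal with wavelet-like filters at several scales, takes the modulus, and averages. This is a simplified sketch under stated assumptions: the Haar-like filter, circular convolution, global averaging, and all function names are choices made here for brevity and are not taken from the paper.

```python
import numpy as np


def circular_convolve(x, h):
    """Circular convolution of x with filter h, computed via the FFT."""
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n=len(x))).real


def haar_filter(scale):
    """Zero-mean Haar-like filter with support 2 * scale (a stand-in wavelet)."""
    return np.concatenate([np.ones(scale), -np.ones(scale)]) / (2 * scale)


def scattering_first_order(x, scales=(1, 2, 4)):
    """Global average of |x * psi_j| for each scale j."""
    return np.array(
        [np.abs(circular_convolve(x, haar_filter(s))).mean() for s in scales]
    )


# Random stand-in signal and a circularly shifted copy.
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
coeffs = scattering_first_order(signal)
shifted_coeffs = scattering_first_order(np.roll(signal, 5))
```

Because the convolution is circular and the average is taken over the whole signal, the coefficients are unchanged by a circular shift of the input, which is the simplest example of an invariant of this kind.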