Change detection is one of the central problems in earth observation and has been extensively investigated over recent decades. In this paper, we propose a novel recurrent convolutional neural network (ReCNN) architecture, which is trained to learn a joint spectral-spatial-temporal feature representation in a unified framework for change detection in multispectral images. To this end, we bring together a convolutional neural network and a recurrent neural network into one end-to-end network. The former generates rich spectral-spatial feature representations, while the latter effectively analyzes the temporal dependence in bitemporal images. In comparison with previous approaches to change detection, the proposed network architecture possesses three distinctive properties: 1) it is end-to-end trainable, in contrast to most existing methods whose components are separately trained or computed; 2) it naturally harnesses spatial information, which has been proven beneficial to the change detection task; and 3) it is capable of adaptively learning the temporal dependence between multitemporal images, unlike most algorithms that use fairly simple operations such as image differencing or stacking. To the best of our knowledge, this is the first time that a recurrent convolutional network architecture has been proposed for multitemporal remote sensing image analysis. The proposed network is validated on real multispectral data sets. Both visual and quantitative analyses of the experimental results demonstrate the competitive performance of the proposed model.
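The CNN-then-RNN pipeline described above can be sketched in a few lines. This is a minimal illustration of the idea, not the authors' ReCNN: the single 3x3 kernel, the plain tanh recurrent cell, all weight shapes, and the function names are invented stand-ins for a much deeper learned network.

```python
import numpy as np

def conv_features(patch, kernel):
    """Stand-in 'CNN' shared across dates: one valid 3x3 convolution
    followed by ReLU, flattened into a feature vector."""
    h, w = patch.shape
    out = np.empty((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(patch[i:i + 3, j:j + 3] * kernel)
    return np.maximum(out, 0.0).ravel()

def rnn_step(h, x, Wh, Wx):
    """Plain tanh recurrent cell modelling the temporal dependence."""
    return np.tanh(Wh @ h + Wx @ x)

def change_probability(patch_t1, patch_t2, kernel, Wh, Wx, w_out):
    """Feed both dates through the shared 'CNN', unroll the recurrent
    cell over the two timesteps, and map the final hidden state to a
    change probability with a logistic output."""
    h = np.zeros(Wh.shape[0])
    for patch in (patch_t1, patch_t2):  # temporal sequence of length 2
        h = rnn_step(h, conv_features(patch, kernel), Wh, Wx)
    return 1.0 / (1.0 + np.exp(-(w_out @ h)))
```

A real implementation would use a deep learning framework, learn all weights end-to-end by backpropagation, and evaluate the patch pair centered on every pixel.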
Change detection (CD) in multitemporal images is an important application of remote sensing. Recent technological evolution has provided very high spatial resolution (VHR) multitemporal optical satellite images that show high spatial correlation among pixels and require an effective modeling of spatial context to accurately capture change information. Here, we propose a novel unsupervised context-sensitive framework, deep change vector analysis (DCVA), for CD in multitemporal VHR images that exploits convolutional neural network (CNN) features. To obtain an unsupervised system, DCVA starts from a suboptimal pretrained multilayered CNN to extract deep features that can model the spatial relationship among neighboring pixels and thus complex objects. An automatic feature selection strategy is employed layerwise to select features emphasizing both high and low prior probability change information. Selected features from multiple layers are combined into a deep feature hypervector providing a multiscale scene representation. The use of the same pretrained CNN for semantic segmentation of single images enables us to obtain coherent multitemporal deep feature hypervectors that can be compared pixelwise to obtain deep change vectors that also model spatial context information. Deep change vectors are analyzed based on their magnitude to identify changed pixels. Then, the deep change vectors corresponding to the identified changed pixels are binarized to obtain compressed binary deep change vectors that preserve information about the direction (kind) of change. Changed pixels are analyzed for multiple CD based on these binary features, thus implicitly using the spatial information. Experimental results on multitemporal data sets of WorldView-2, Pleiades, and QuickBird images confirm the effectiveness of the proposed method.
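The core comparison step of DCVA, once the deep feature hypervectors have been extracted, can be sketched as follows. This is a minimal illustration under the assumption of precomputed (H, W, C) feature arrays; the function names and the fixed threshold are invented for the sketch (the method itself selects features layerwise and sets the threshold automatically).

```python
import numpy as np

def deep_change_vectors(feats_t1, feats_t2):
    """Pixelwise difference of the two deep feature hypervectors,
    each an (H, W, C) array produced by the same pretrained CNN."""
    return feats_t2 - feats_t1

def magnitude_change_map(dcv, threshold):
    """Magnitude analysis: a pixel is flagged as changed when the
    norm of its deep change vector exceeds the threshold."""
    return np.linalg.norm(dcv, axis=-1) > threshold

def binary_change_vectors(dcv, changed):
    """Compressed binary deep change vectors for the changed pixels:
    the per-feature sign pattern preserves the direction (kind) of
    change, which is later used to separate multiple change classes."""
    return (dcv[changed] > 0).astype(np.uint8)
```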
Automatic registration of multimodal remote sensing data, e.g., optical, light detection and ranging (LiDAR), and synthetic aperture radar (SAR), is a challenging task due to the significant nonlinear radiometric differences between these data. To address this problem, this paper proposes a novel feature descriptor named the histogram of orientated phase congruency (HOPC), which is based on the structural properties of images. Furthermore, a similarity metric named HOPCncc is defined, which uses the normalized correlation coefficient (NCC) of the HOPC descriptors for multimodal registration. In the definition of the proposed similarity metric, we first extend the phase congruency model to generate its orientation representation and use the extended model to build HOPCncc. Then, a fast template matching scheme for this metric is designed to detect the control points between images. The proposed HOPCncc aims to capture the structural similarity between images and has been tested with a variety of optical, LiDAR, SAR, and map data. The results show that HOPCncc is robust against complex nonlinear radiometric differences and outperforms the state-of-the-art similarity metrics (i.e., NCC and mutual information) in matching performance. Moreover, a robust registration method based on HOPCncc is also proposed in this paper and is evaluated using six pairs of multimodal remote sensing images. The experimental results demonstrate the effectiveness of the proposed method for multimodal image registration.
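The NCC-over-descriptors idea behind HOPCncc can be illustrated with generic arrays standing in for the HOPC descriptor maps. This sketch shows only the similarity metric and an exhaustive template search; the actual HOPC descriptor (built from the extended phase congruency model) and the fast matching scheme are not reproduced here.

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation coefficient between two descriptor
    windows (here generic arrays standing in for HOPC descriptors).
    Returns a value in [-1, 1]; 0 if either window is constant."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_match(image, template):
    """Exhaustive template matching: slide the template over every
    offset and return the position maximizing the NCC similarity."""
    H, W = image.shape
    h, w = template.shape
    best, best_pos = -2.0, (0, 0)
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            s = ncc(image[i:i + h, j:j + w], template)
            if s > best:
                best, best_pos = s, (i, j)
    return best_pos, best
```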
Hyperspectral images (HSIs) provide rich spectral-spatial information through hundreds of stacked contiguous narrow bands. Due to the existence of noise and band correlation, the selection of informative spectral-spatial kernel features poses a challenge. This is often addressed by using convolutional neural networks (CNNs) with receptive fields (RFs) of fixed size. However, these solutions do not enable neurons to effectively adjust RF sizes and cross-channel dependencies when forward and backward propagations are used to optimize the network. In this article, we present an attention-based adaptive spectral-spatial kernel improved residual network (A2S2K-ResNet) with spectral attention to capture discriminative spectral-spatial features for HSI classification in an end-to-end training fashion. In particular, the proposed network learns selective 3-D convolutional kernels to jointly extract spectral-spatial features using improved 3-D ResBlocks and adopts an efficient feature recalibration (EFR) mechanism to boost the classification performance. Extensive experiments are performed on three well-known hyperspectral data sets, i.e., IP, KSC, and UP, and the proposed A2S2K-ResNet provides better classification results in terms of overall accuracy (OA), average accuracy (AA), and Kappa than the existing methods investigated. The source code will be made available at https://github.com/suvojit-0x55aa/A2S2K-ResNet.
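Channel-wise feature recalibration of the kind the EFR mechanism performs can be sketched in squeeze-and-excitation style: pool each channel globally, pass the result through a small bottleneck, and rescale the channels by the resulting attention. This is a generic illustration of the principle, not the paper's exact EFR module; the function name and weight shapes are invented for the sketch.

```python
import numpy as np

def feature_recalibration(fmap, W1, W2):
    """Squeeze-and-excitation-style channel recalibration:
    1) 'squeeze' each of the C channels of a (C, H, W) map by global
       average pooling;
    2) 'excite' through a small bottleneck MLP (ReLU then sigmoid);
    3) rescale each channel by its attention weight in (0, 1)."""
    C = fmap.shape[0]
    squeeze = fmap.reshape(C, -1).mean(axis=1)                       # (C,)
    hidden = np.maximum(W1 @ squeeze, 0.0)                           # ReLU
    excite = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))                    # (C,)
    return fmap * excite[:, None, None]
```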
Building change detection (CD), important for its application in urban monitoring, can be performed in near real time by comparing prechange and postchange very-high-spatial-resolution (VHR) synthetic-aperture-radar (SAR) images. However, multitemporal VHR SAR images are complex: they show high spatial correlation, are prone to shadows, and exhibit inhomogeneous signatures. Spatial context needs to be taken into account to effectively detect changes in such images. Recently, convolutional-neural-network (CNN)-based transfer learning techniques have shown strong performance for CD in VHR multispectral images. However, their direct use for SAR CD is impeded by the absence of labeled SAR data and, thus, of pretrained networks. To overcome this, we exploit the availability of paired unlabeled SAR and optical images to train for the suboptimal task of transcoding SAR images into optical images using a cycle-consistent generative adversarial network (CycleGAN). The CycleGAN consists of two generator networks: one for transcoding SAR images into the optical image domain and the other for projecting optical images into the SAR image domain. After unsupervised training, the generator transcoding SAR images into optical ones is used as a bitemporal deep feature extractor to extract optical-like features from bitemporal SAR images. Thus, deep change vector analysis (DCVA) and fuzzy rules can be applied to identify changed buildings (new/destroyed). We validate our method on two data sets made up of pairs of bitemporal VHR SAR images over the cities of L'Aquila and Trento (Italy).
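The cycle-consistency objective that ties the two CycleGAN generators together can be written compactly: transcoding SAR to optical and back should return the input, and vice versa. The sketch below shows only this loss term with generic callables standing in for the two generator networks; the adversarial terms, the network architectures, and the training loop are omitted.

```python
import numpy as np

def cycle_consistency_loss(x_sar, x_opt, g_s2o, g_o2s):
    """L1 cycle-consistency term for a CycleGAN with two generators:
    g_s2o maps SAR-domain images to the optical domain, g_o2s maps
    optical-domain images back to the SAR domain. Both arguments are
    stand-in callables here, not trained networks."""
    loss_sar = np.abs(g_o2s(g_s2o(x_sar)) - x_sar).mean()  # SAR -> opt -> SAR
    loss_opt = np.abs(g_s2o(g_o2s(x_opt)) - x_opt).mean()  # opt -> SAR -> opt
    return loss_sar + loss_opt
```

Minimizing this term (together with the adversarial losses) is what lets the SAR-to-optical generator learn a meaningful mapping without any paired labels.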
While image matching has been studied in the remote sensing community for decades, matching multimodal data, e.g., optical, light detection and ranging (LiDAR), synthetic aperture radar (SAR), and map data, remains a challenging problem because of the significant nonlinear intensity differences between such data. To address this problem, we present a novel fast and robust template matching framework integrating local descriptors for multimodal images. First, a local descriptor, such as the histogram of oriented gradients (HOG), local self-similarity (LSS), or speeded-up robust features (SURF), is extracted at each pixel to form a pixelwise feature representation of an image. Then, we define a fast similarity measure based on the feature representation using the fast Fourier transform (FFT) in the frequency domain. A template matching strategy is employed to detect correspondences between images. In this procedure, we also propose a novel pixelwise feature representation using orientated gradients of images, named channel features of orientated gradients (CFOG). This novel feature is an extension of the pixelwise HOG descriptor with superior performance in image matching and computational efficiency. The major advantages of the proposed matching framework are: 1) a structural similarity representation using the pixelwise feature description and 2) high computational efficiency due to the use of the FFT. The proposed matching framework has been evaluated using many different types of multimodal images, and the results demonstrate its superior matching performance with respect to the state-of-the-art methods.
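The FFT trick that makes dense template matching fast can be demonstrated on a single channel: correlating a template against every offset costs O(N log N) in the frequency domain instead of O(N · n) spatially. This is a generic illustration of the principle (unnormalized cross-correlation on one channel), not the paper's full multi-channel similarity measure.

```python
import numpy as np

def fft_cross_correlation(image, template):
    """Dense cross-correlation of a template against an image via the
    FFT: corr[u, v] = sum_ij image[u+i, v+j] * template[i, j].
    The template is zero-padded to the image size; only offsets with
    full overlap (no circular wrap-around) are returned."""
    H, W = image.shape
    h, w = template.shape
    F_img = np.fft.rfft2(image)
    F_tpl = np.fft.rfft2(template, s=image.shape)   # zero-padded
    corr = np.fft.irfft2(F_img * np.conj(F_tpl), s=image.shape)
    return corr[: H - h + 1, : W - w + 1]

def match(image, template):
    """Return the (row, col) offset with the highest correlation."""
    corr = fft_cross_correlation(image, template)
    return np.unravel_index(np.argmax(corr), corr.shape)
```

In the actual framework this computation runs per descriptor channel, and the per-channel scores are accumulated into one similarity map.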
Change detection (CD) is one of the main applications of remote sensing. With the increasing popularity of deep learning, most recent developments in CD methods have introduced deep learning techniques to increase the accuracy and automation level over traditional methods. However, supervised CD methods need a large amount of labeled data to train deep convolutional networks with millions of parameters, and such labeled data are difficult to acquire for CD tasks. To address this limitation, a novel semisupervised convolutional network for CD (SemiCDNet) is proposed based on a generative adversarial network (GAN). First, both the labeled and unlabeled data are input into the segmentation network to produce initial predictions and entropy maps. Then, to exploit the potential of the unlabeled data, two discriminators are adopted to enforce the consistency of the feature distributions of the segmentation maps and entropy maps between the labeled and unlabeled data. During the competitive training, the generator is continuously regularized by the unlabeled information, thus improving its generalization capability. The effectiveness and reliability of our proposed method are verified on two high-resolution remote sensing data sets. Extensive experimental results demonstrate the superiority of the proposed method against other state-of-the-art approaches.
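The entropy map that the second discriminator consumes is a standard quantity: the per-pixel Shannon entropy of the softmax prediction, which is low where the network is confident and high where it is not. A minimal sketch, assuming a (C, H, W) logit tensor from the segmentation network:

```python
import numpy as np

def entropy_map(logits):
    """Per-pixel Shannon entropy of the softmax prediction.
    logits: (C, H, W) raw class scores. Returns an (H, W) map that is
    near 0 for confident pixels and near log(C) for uncertain ones."""
    z = logits - logits.max(axis=0, keepdims=True)   # stable softmax
    p = np.exp(z)
    p /= p.sum(axis=0, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=0)
```

Aligning the distribution of these maps between labeled and unlabeled inputs is what pushes the generator toward equally confident predictions on unlabeled data.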
The semantic segmentation of remote sensing images (RSIs) is important in a variety of applications. Conventional encoder-decoder-based convolutional neural networks (CNNs) use cascaded pooling operations to aggregate the semantic information, which results in a loss of localization accuracy and of fine spatial details. To overcome these limitations, we introduce the use of the high-resolution network (HRNet) to produce high-resolution features without a decoding stage. Moreover, we enhance the low-to-high features extracted from the different branches separately to strengthen the embedding of scale-related contextual information. The low-resolution features contain more semantic information and have a small spatial size; thus, they are utilized to model long-range spatial correlations. The high-resolution branches are enhanced by introducing an adaptive spatial pooling (ASP) module to aggregate more local context. By combining these context aggregation designs across different levels, the resulting architecture is capable of exploiting spatial context at both the global and local levels. The experimental results obtained on two RSI datasets show that our approach significantly improves the accuracy with respect to commonly used CNNs and achieves state-of-the-art performance.
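Multi-scale spatial pooling of the kind used to aggregate local context can be sketched as pyramid pooling: average the feature map into several grid sizes, upsample back, and concatenate with the input so each pixel sees context at several scales. This is a generic pyramid-pooling illustration in the spirit of the ASP module, not the paper's exact design; it assumes spatial dimensions divisible by every bin size.

```python
import numpy as np

def adaptive_spatial_pooling(fmap, bin_sizes=(1, 2, 4)):
    """Pyramid-style spatial context aggregation for a (C, H, W) map:
    for each bin size b, average-pool into a b x b grid, upsample back
    to (H, W) by repetition, and concatenate with the original map.
    Assumes H and W are divisible by every bin size."""
    C, H, W = fmap.shape
    pooled = []
    for b in bin_sizes:
        hs, ws = H // b, W // b
        grid = np.zeros((C, b, b))
        for i in range(b):
            for j in range(b):
                grid[:, i, j] = fmap[:, i*hs:(i+1)*hs, j*ws:(j+1)*ws].mean(axis=(1, 2))
        up = np.repeat(np.repeat(grid, hs, axis=1), ws, axis=2)
        pooled.append(up)
    return np.concatenate([fmap] + pooled, axis=0)
```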
A successful method for removing artifacts from electroencephalogram (EEG) recordings is independent component analysis (ICA), but its implementation remains largely user-dependent. Here, we propose a completely automatic algorithm (ADJUST) that identifies artifacted independent components by combining stereotyped artifact-specific spatial and temporal features. The features were optimized to capture blinks, eye movements, and generic discontinuities on a feature selection dataset. Validation on a totally different EEG dataset shows that: (1) ADJUST's classification of independent components largely matches a manual classification by experts (agreement on 95.2% of the data variance), and (2) removal of the artifacted components detected by ADJUST leads to a neat reconstruction of visual and auditory event-related potentials from heavily artifacted data. These results demonstrate that ADJUST provides a fast, efficient, and automatic way to use ICA for artifact removal.
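Once the artifact components are identified (which is ADJUST's contribution), the removal step itself is standard ICA practice: project the recording into component space, zero the flagged components, and reconstruct. A minimal sketch, assuming an unmixing matrix obtained from a prior ICA run:

```python
import numpy as np

def remove_components(data, unmixing, bad):
    """ICA-based artifact removal.
    data:     (channels, samples) EEG recording.
    unmixing: ICA unmixing matrix (components x channels), e.g. from a
              prior ICA decomposition.
    bad:      indices of the components flagged as artifacts.
    Returns the recording reconstructed without those components."""
    sources = unmixing @ data               # (components, samples)
    sources[bad, :] = 0.0                   # delete artifact components
    return np.linalg.pinv(unmixing) @ sources   # back to channel space
```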
The identification of tree species is an important issue in forest management. In recent years, many studies have explored this topic using hyperspectral, multispectral, and LiDAR data. In this study, we analyzed two multi-sensor set-ups: 1) airborne high spatial resolution hyperspectral images combined with LiDAR data; and 2) high spatial resolution satellite multispectral images combined with LiDAR data. Two LiDAR acquisitions were considered: low point density (approx. 0.48 points per m²) and high point density (approx. 8.6 points per m²). The aims of this work were: i) to understand what level of classification accuracy can be achieved using a high spectral and spatial resolution multi-sensor data set-up (very high spatial and spectral resolution airborne hyperspectral images integrated with high-point-density LiDAR data) over a mountain area characterized by many species, both broadleaf and coniferous; ii) to understand the implications of downgrading the data characteristics (in terms of the spectral resolution of the spectral data and the point density of the LiDAR data) on species separability, with respect to the previous set-up; and iii) to understand the differences between high- and low-point-density LiDAR acquisitions for tree species classification. The study region was a mountain area in the Southern Alps characterized by many tree species (7 species and a "non-forest" class), both coniferous and broadleaf. For each set-up, a specific processing chain was adopted, from the pre-processing of the raw data to the classification (two classifiers were used: support vector machine and random forest). Different class definitions were tested, including general macro-classes, forest types, and finally single tree species. Experimental results showed that the set-up based on hyperspectral data was effective for general macro-classes, forest types, and single species, reaching high kappa accuracies (93.2%, 82.1%, and 76.5%, respectively).
The use of multispectral data reduced the classification accuracy; the reduction was sharp for single tree species, while accuracy remained high for forest types. Considering general macro-classes, the multispectral set-up was still very accurate (85.8%). Regarding the LiDAR data, the experimental analysis showed that high-density LiDAR data provided more information for tree species classification than low-density data when combined with either hyperspectral or multispectral data.
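The multi-sensor classification set-up amounts to stacking per-pixel spectral features with LiDAR-derived features and training a supervised classifier on the combined vectors. The sketch below illustrates this with a deliberately simple nearest-centroid classifier standing in for the support vector machine and random forest classifiers used in the study; the function names and feature layout are invented for the sketch.

```python
import numpy as np

def stack_features(spectral, lidar):
    """Concatenate per-pixel spectral features (N, B bands) with
    LiDAR-derived features such as height statistics (N, L) into one
    multi-sensor feature vector per pixel."""
    return np.hstack([spectral, lidar])

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: a minimal stand-in for
    the SVM / random forest classifiers of the actual study."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(X, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]
```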
► Hyperspectral data allow one to distinguish similar species. ► Downscaling the spectral resolution reduces the discrimination ability of the data. ► Multispectral data are effective for macro-class discrimination. ► The addition of LiDAR data increases the classification accuracy. ► High-point-density LiDAR data allow higher classification accuracy than low-point-density data.