Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various data sets or present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning data sets and methods for scene classification is still lacking. In addition, almost all existing data sets have a number of limitations, including small numbers of scene classes and images, a lack of image variation and diversity, and the saturation of accuracy. These limitations severely hinder the development of new approaches, especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale data set, termed "NWPU-RESISC45," which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This data set contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 1) is large-scale in terms of the number of scene classes and the total number of images; 2) exhibits large variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion; and 3) has high within-class diversity and between-class similarity. The creation of this data set will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed data set, and the results are reported as a useful baseline for future research.
Object detection in optical remote sensing images, being a fundamental but challenging problem in the field of aerial and satellite image analysis, plays an important role in a wide range of applications and has received significant attention in recent years. While a great many methods exist, a deep review of the literature concerning generic object detection is still lacking. This paper aims to provide a review of the recent progress in this field. Different from several previously published surveys that focus on a specific object class such as building or road, we concentrate on more generic object categories including, but not limited to, road, building, tree, vehicle, ship, airport, and urban area. Covering about 270 publications, we survey (1) template matching-based object detection methods, (2) knowledge-based object detection methods, (3) object-based image analysis (OBIA)-based object detection methods, (4) machine learning-based object detection methods, and (5) five publicly available datasets and three standard evaluation metrics. We also discuss the challenges of current studies and propose two promising research directions, namely deep learning-based feature representation and weakly supervised learning-based geospatial object detection. It is our hope that this survey will help researchers gain a better understanding of this research field.
Most existing deep-learning-based methods struggle to deal effectively with the challenges of geospatial object detection, such as rotation variations and appearance ambiguity. To address these problems, this paper proposes a novel deep-learning-based object detection framework, comprising a region proposal network (RPN) and a local-contextual feature fusion network, designed for remote sensing images. Specifically, the RPN includes additional multiangle anchors besides the conventional multiscale and multiaspect-ratio ones, and thus can deal with the multiangle and multiscale characteristics of geospatial objects. To address the appearance ambiguity problem, we propose a double-channel feature fusion network that can learn local and contextual properties along two independent pathways. The two kinds of features are later combined in the final layers of processing in order to form a powerful joint representation. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.
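The idea of adding multiangle anchors to the usual multiscale and multiaspect-ratio set can be illustrated with a minimal sketch. The function name `generate_anchors`, the `(cx, cy, w, h, angle)` tuple layout, and the specific scales, ratios, and angles are all assumptions chosen for illustration; the abstract does not state the values or parameterization used in the paper.

```python
import math
from itertools import product


def generate_anchors(scales, ratios, angles_deg, center=(0.0, 0.0)):
    """Return one (cx, cy, w, h, angle_deg) anchor per scale/ratio/angle combination.

    Hypothetical helper: the anchor layout and values are illustrative only.
    """
    cx, cy = center
    anchors = []
    for scale, ratio, angle in product(scales, ratios, angles_deg):
        # Keep the anchor area at scale**2 while setting w / h = ratio.
        w = scale * math.sqrt(ratio)
        h = scale / math.sqrt(ratio)
        anchors.append((cx, cy, w, h, angle))
    return anchors


# Assumed example values: 3 scales x 3 ratios x 6 angles = 54 anchors per location.
anchors = generate_anchors(
    scales=[64, 128, 256],
    ratios=[0.5, 1.0, 2.0],
    angles_deg=[0, 30, 60, 90, 120, 150],
)
print(len(anchors))
```

Without the angle dimension the same call would yield only 9 anchors per location, which shows how the extra orientation axis multiplies the number of proposals the RPN has to score.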
The performance of object detection has recently been significantly improved due to the powerful features learnt through convolutional neural networks (CNNs). Despite the remarkable success, there are still several major challenges in object detection, including object rotation, within-class diversity, and between-class similarity, which generally degrade object detection performance. To address these issues, we build on existing state-of-the-art object detection systems and propose a simple but effective method to train rotation-invariant and Fisher discriminative CNN models to further boost object detection performance. This is achieved by optimizing a new objective function that explicitly imposes a rotation-invariant regularizer and a Fisher discrimination regularizer on the CNN features. Specifically, the first regularizer enforces the CNN feature representations of the training samples before and after rotation to be mapped closely to each other in order to achieve rotation invariance. The second regularizer constrains the CNN features to have small within-class scatter but large between-class separation. We implement our proposed method under four popular object detection frameworks, including region-CNN (R-CNN), Fast R-CNN, Faster R-CNN, and R-FCN. In the experiments, we comprehensively evaluate the proposed method on the PASCAL VOC 2007 and 2012 data sets and a publicly available aerial image data set. Our proposed methods outperform the existing baseline methods and achieve state-of-the-art results.
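A Fisher-style regularizer of the kind described above (small within-class scatter, large between-class separation) can be sketched as follows. This uses the classical trace form, trace(S_w) minus trace(S_b), which is an assumption: the abstract does not give the exact formula used in the paper, and the function name `fisher_regularizer` is hypothetical.

```python
def fisher_regularizer(features, labels):
    """Return trace(S_w) - trace(S_b) for feature vectors and their class labels.

    Lower values mean tighter classes that sit farther from each other.
    Illustrative sketch only, not the paper's exact regularizer.
    """
    dim = len(features[0])
    n = len(features)
    global_mean = [sum(f[d] for f in features) / n for d in range(dim)]

    # Group feature vectors by class label.
    by_class = {}
    for f, y in zip(features, labels):
        by_class.setdefault(y, []).append(f)

    within = 0.0
    between = 0.0
    for members in by_class.values():
        mean = [sum(f[d] for f in members) / len(members) for d in range(dim)]
        # Within-class scatter: squared distances of members to their class mean.
        within += sum(sum((f[d] - mean[d]) ** 2 for d in range(dim)) for f in members)
        # Between-class scatter: size-weighted squared distance of class mean to global mean.
        between += len(members) * sum((mean[d] - global_mean[d]) ** 2 for d in range(dim))
    return within - between


feats = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
print(fisher_regularizer(feats, [0, 0, 1, 1]))  # well-separated labelling
print(fisher_regularizer(feats, [0, 1, 0, 1]))  # labelling that mixes the two clusters
```

In training, a term like this would be added to the detection loss with a weighting coefficient, so that minimizing the total objective pulls same-class features together while pushing class means apart.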
As one of the fundamental research topics in remote sensing image analysis, hyperspectral image (HSI) classification has been extensively studied. However, how to discriminatively learn a low-dimensional feature space, in which the mapped features have small within-class scatter and large between-class separation, is still a challenging problem. To address this issue, this paper proposes an effective framework, named compact and discriminative stacked autoencoder (CDSAE), for HSI classification. The proposed CDSAE framework comprises two stages with different optimization objectives, which can learn discriminative low-dimensional feature mappings and train an effective classifier progressively. First, we impose a local Fisher discriminant regularization on each hidden layer of a stacked autoencoder (SAE) to train a discriminative SAE (DSAE) by minimizing reconstruction error. This stage can learn feature mappings in which the pixels from the same land-cover class are mapped as close together as possible and the pixels from different land-cover categories are separated by a large margin. Second, we learn an effective classifier and meanwhile update the DSAE with a local Fisher discriminant regularization embedded on top of the feature representations. Moreover, to learn a compact DSAE with as few hidden neurons as possible, we impose a diversity regularization on the hidden neurons of the DSAE to balance the feature dimensionality and the feature representation capability. The experimental results on three widely used HSI data sets and comprehensive comparisons with existing methods demonstrate that our proposed method is effective.
Remote sensing image scene classification is an active and challenging task driven by many applications. More recently, with advances in deep learning models, especially convolutional neural networks (CNNs), the performance of remote sensing image scene classification has been significantly improved due to the powerful feature representations learnt through CNNs. Although great success has been obtained so far, within-class diversity and between-class similarity are still two big challenges. To address these problems, in this paper, we propose a simple but effective method to learn discriminative CNNs (D-CNNs) to boost the performance of remote sensing image scene classification. Different from the traditional CNN models that minimize only the cross entropy loss, our proposed D-CNN models are trained by optimizing a new discriminative objective function. To this end, apart from minimizing the classification error, we also explicitly impose a metric learning regularization term on the CNN features. The metric learning regularization enforces the D-CNN models to be more discriminative so that, in the new D-CNN feature spaces, the images from the same scene class are mapped closely to each other and the images of different classes are mapped as far apart as possible. In the experiments, we comprehensively evaluate the proposed method on three publicly available benchmark data sets using three off-the-shelf CNN models. Experimental results demonstrate that our proposed D-CNN methods outperform the existing baseline methods and achieve state-of-the-art results on all three data sets.
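One common way to realize a metric learning regularization term of this kind is a pairwise contrastive penalty, sketched below. The contrastive form, the `margin` parameter, and the function name `metric_regularizer` are assumptions made for illustration; the abstract does not specify the exact term used for the D-CNN models.

```python
from itertools import combinations


def metric_regularizer(features, labels, margin=1.0):
    """Average pairwise contrastive penalty over all feature pairs.

    Same-class pairs are penalized by their squared distance; different-class
    pairs are penalized only while their squared distance is below `margin`.
    Illustrative sketch only, not the paper's exact regularizer.
    """
    total = 0.0
    pairs = 0
    for i, j in combinations(range(len(features)), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
        if labels[i] == labels[j]:
            total += d2  # pull same-class features together
        else:
            total += max(0.0, margin - d2)  # push different classes apart
        pairs += 1
    return total / pairs if pairs else 0.0


feats = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
print(metric_regularizer(feats, [0, 0, 1]))  # same-class pair far apart, different-class pair overlapping
print(metric_regularizer(feats, [0, 1, 0]))  # same-class pair identical, other pairs beyond the margin
```

In a full training objective this term would be weighted and added to the cross entropy loss, so that the classifier and the feature geometry are optimized jointly.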
In 1975, tau protein was isolated as a microtubule-associated factor from the porcine brain. In the previous year, a paired helical filament (PHF) protein had been identified in neurofibrillary tangles in the brains of individuals with Alzheimer disease (AD), but it was not until 1986 that the PHF protein and tau were discovered to be one and the same. In the AD brain, tau was found to be abnormally hyperphosphorylated, and it inhibited rather than promoted in vitro microtubule assembly. Almost 80 disease-causing exonic missense and intronic silent mutations in the tau gene have been found in familial cases of frontotemporal dementia but, to date, no such mutation has been found in AD. The first phase I clinical trial of an active tau immunization vaccine in patients with AD was recently completed. Assays for tau levels in cerebrospinal fluid and plasma are now available, and tau radiotracers for PET are under development. In this article, we provide an overview of the pivotal discoveries in the tau research field over the past 40 years. We also review the current status of the field, including disease mechanisms and therapeutic approaches.
Co-saliency detection, which focuses on extracting commonly salient objects in a group of relevant images, has been attracting research interest because of its broad applications. In practice, the relevant images in a group may have a wide range of variations, and the salient objects may also have large appearance changes. Such wide variations usually bring about large diversity among co-salient objects (intra-CO diversity) and high similarity between COs and background, which makes the co-saliency detection task more difficult. To address these problems, we make the first attempt to introduce metric learning to co-saliency detection. Specifically, we propose a unified metric learning-based framework to jointly learn a discriminative feature representation and a co-salient object detector. This is achieved by optimizing a new objective function that explicitly embeds a metric learning regularization term into support vector machine (SVM) training. Here, the metric learning regularization term is used to learn a powerful feature representation that has small intra-CO scatter but large separation between background and COs, and the SVM classifier is used for subsequent co-saliency detection. In the experiments, we comprehensively evaluate the proposed method on two commonly used benchmark data sets. State-of-the-art results are achieved in comparison with the existing co-saliency detection methods.
Object detection in very high resolution optical remote sensing images is a fundamental problem in remote sensing image analysis. Due to the advances of powerful feature representations, machine-learning-based object detection is receiving increasing attention. Although numerous feature representations exist, most of them are handcrafted or shallow-learning-based features. As the object detection task becomes more challenging, their description capability becomes limited or even impoverished. More recently, deep learning algorithms, especially convolutional neural networks (CNNs), have shown much stronger feature representation power in computer vision. Despite the progress made on natural scene images, it is problematic to directly use CNN features for object detection in optical remote sensing images because it is difficult to deal effectively with object rotation variations. To address this problem, this paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, which is achieved by introducing and learning a new rotation-invariant layer on the basis of the existing CNN architectures. Unlike the training of traditional CNN models, which only optimizes the multinomial logistic regression objective, our RICNN model is trained by optimizing a new objective function that imposes a regularization constraint, which explicitly enforces the feature representations of the training samples before and after rotation to be mapped close to each other, hence achieving rotation invariance. To facilitate training, we first train the rotation-invariant layer and then domain-specifically fine-tune the whole RICNN network to further boost the performance. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.
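The regularization constraint described above, which keeps features before and after rotation close to each other, can be sketched on precomputed feature vectors. Comparing each original feature against the mean feature of its rotated copies, the 1/(2N) scaling, and the function name `rotation_invariance_penalty` are assumptions made for illustration; the abstract does not give the exact formulation.

```python
def rotation_invariance_penalty(original_feats, rotated_feats):
    """Mean squared gap between each feature and the mean feature of its rotations.

    original_feats: list of N feature vectors (one per training sample).
    rotated_feats:  list of N lists, each holding the K feature vectors of the
                    rotated copies of the corresponding sample.
    Illustrative sketch only, not the paper's exact regularizer.
    """
    total = 0.0
    for feat, rotated in zip(original_feats, rotated_feats):
        k = len(rotated)
        # Average the features of all rotated versions of this sample.
        mean_rot = [sum(r[d] for r in rotated) / k for d in range(len(feat))]
        total += sum((a - b) ** 2 for a, b in zip(feat, mean_rot))
    return total / (2 * len(original_feats))


# A perfectly rotation-invariant feature gives zero penalty.
print(rotation_invariance_penalty([[1.0, 0.0]], [[[1.0, 0.0], [1.0, 0.0]]]))
# A feature that changes under rotation is penalized.
print(rotation_invariance_penalty([[1.0, 0.0]], [[[0.0, 1.0], [0.0, 1.0]]]))
```

Such a penalty would be added, with a weighting coefficient, to the multinomial logistic regression objective, so that classification accuracy and rotation invariance are optimized together.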
Weakly supervised object detection (WSOD) in remote sensing images (RSIs) plays an essential role in RSI understanding applications. Currently, most existing works tend to first activate the most discriminative region and then pursue the whole object by analyzing the context information of the activated region. However, the most discriminative region usually covers only a small crucial part of the object. Besides, many same-class instances often appear in adjacent locations. In such a case, treating proposals with large spatial overlap as same-class instances not only introduces potential ambiguities but also misleads the detection model into recognizing multiple adjacent instances as one object instance. To address these challenges, a novel triple context-aware network (TCANet) is proposed to learn complementary and discriminative visual patterns for WSOD in RSIs. Specifically, a global context-aware enhancement (GCAE) module is first designed to activate the features of the whole object by capturing the global visual scene context. Then, a dual-local context residual (DLCR) module is further developed to capture the instance-level discriminative cues by leveraging the semantic discrepancy of the local context. Furthermore, an effective adaptive-weighted refinement loss is integrated into the DLCR module to reduce the ambiguities in the label propagation process. The combination of GCAE and DLCR forms a TCANet that can be learned in an end-to-end manner. Comprehensive experiments are carried out on the challenging NWPU VHR-10.v2 and DIOR data sets. We achieve a 58.8% mAP and a 25.8% mAP on the NWPU VHR-10.v2 and DIOR data sets, respectively, both of which significantly outperform the state of the art.