This article defines the land cover classes used in the Meter-scale Urban Land Cover (MULC) dataset, a unique, high-resolution (one meter per pixel) land cover dataset developed for 30 US communities for the United States Environmental Protection Agency (US EPA) EnviroAtlas. MULC data categorize the landscape into the following land cover classes: impervious surface, tree, grass-herbaceous, shrub, soil-barren, water, wetland, and agriculture. MULC data are used to calculate approximately 100 EnviroAtlas metrics that serve as indicators of nature's benefits (ecosystem goods and services). MULC, whose development is ongoing, is produced by multiple classification methods using aerial photography and LiDAR datasets. The mean overall fuzzy accuracy across the EnviroAtlas communities is 88%, and the mean Kappa coefficient is 0.84. MULC is available in EnviroAtlas via web browser, as a web map service (WMS) in the user's geographic information system (GIS), and as downloadable data from the EPA Environmental Data Gateway. Fact sheets and metadata for each MULC community are available through EnviroAtlas. Applications of MULC include mapping green and grey infrastructure, connecting land cover with socioeconomic and demographic variables, street tree planting, urban heat island analysis, mosquito habitat risk mapping, and bikeway planning. This article provides practical guidance for using MULC effectively and for developing similar high-resolution (HR) land cover data.
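The Kappa coefficient reported for MULC is Cohen's kappa, which corrects overall accuracy for chance agreement on a classification confusion matrix. A minimal sketch of the computation follows; the 3-class confusion matrix here is purely illustrative, not actual MULC accuracy-assessment data:

```python
import numpy as np

def kappa_coefficient(confusion: np.ndarray) -> float:
    """Cohen's kappa from a square confusion matrix
    (rows: reference labels, columns: predicted labels)."""
    n = confusion.sum()
    observed = np.trace(confusion) / n  # overall accuracy
    # Chance agreement from the row/column marginals
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
    return (observed - expected) / (1 - expected)

# Illustrative 3-class confusion matrix (hypothetical counts)
cm = np.array([[50,  2,  3],
               [ 4, 40,  1],
               [ 2,  3, 45]])
print(round(kappa_coefficient(cm), 3))  # → 0.849
```

Kappa near 1 indicates agreement well beyond chance; MULC's reported mean of 0.84 sits in the range usually read as "almost perfect" agreement.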
Few-shot learning for fine-grained image classification has gained recent attention in computer vision. Among the approaches to few-shot learning, metric-based methods are state-of-the-art on many tasks owing to their simplicity and effectiveness. Most metric-based methods assume a single similarity measure and thus obtain a single feature space. However, if samples can simultaneously be well classified under two distinct similarity measures, the samples within a class can be distributed more compactly in a smaller feature space, producing more discriminative feature maps. Motivated by this, we propose a Bi-Similarity Network (BSNet) that consists of a single embedding module and a bi-similarity module comprising two similarity measures. After the support and query images pass through the convolution-based embedding module, the bi-similarity module learns feature maps according to two similarity measures of diverse characteristics. In this way, the model learns more discriminative and less similarity-biased features from few shots of fine-grained images, so that its generalization ability is significantly improved. Through extensive experiments that slightly modify established metric/similarity-based networks, we show that the proposed approach yields a substantial improvement on several fine-grained image benchmark datasets. Code is available at: https://github.com/PRIS-CV/BSNet.
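The bi-similarity idea of scoring a query against class prototypes under two measures with different characteristics can be sketched with cosine similarity and negated Euclidean distance; this is an illustrative simplification over fixed embeddings, not the released BSNet code, and the combination rule (z-normalised sum) is an assumption for the sketch:

```python
import numpy as np

def cosine_sim(q, s):
    return q @ s / (np.linalg.norm(q) * np.linalg.norm(s))

def neg_euclidean(q, s):
    return -np.linalg.norm(q - s)

def bi_similarity_scores(query, supports):
    """Score a query embedding against per-class support prototypes under
    two similarity measures, then combine after z-normalising each measure
    so their scales are comparable."""
    cos = np.array([cosine_sim(query, s) for s in supports])
    euc = np.array([neg_euclidean(query, s) for s in supports])
    z = lambda v: (v - v.mean()) / (v.std() + 1e-8)
    return z(cos) + z(euc)

# Three hypothetical class prototypes and one query embedding
protos = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.9, 0.1])
print(int(np.argmax(bi_similarity_scores(query, protos))))  # → 0
```

A query is assigned to the class that scores well under both measures; classes favoured by only one measure are penalised, which is the intuition behind training with two similarity heads.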
Hyperspectral imaging (HSI) has been extensively utilized in many real-life applications because it benefits from the detailed spectral information contained in each pixel. Notably, the complex characteristics of HSI data, i.e., the nonlinear relation between the captured spectral information and the corresponding object, make accurate classification challenging for traditional machine learning (TML) methods. In the last few years, deep learning (DL) has been established as a powerful feature extractor that effectively addresses the nonlinear problems arising in a number of computer vision tasks. This has prompted the deployment of DL for HSI classification (HSIC), which has shown good performance. This survey presents a systematic overview of DL for HSIC and compares state-of-the-art strategies on the topic. First, we summarize the main challenges of TML for HSIC and then introduce the advantages of DL in addressing these problems. This article divides the state-of-the-art DL frameworks into spectral-feature, spatial-feature, and joint spectral-spatial-feature approaches to systematically analyze their achievements, as well as future research directions, for HSIC. Moreover, we consider the fact that DL requires a large number of labeled training examples, whereas acquiring such numbers for HSIC is challenging in terms of time and cost. Therefore, this survey discusses strategies to improve the generalization performance of DL approaches, which can provide some future guidelines.
A multiple-feature-based adaptive sparse representation (MFASR) method is proposed for the classification of hyperspectral images (HSIs). The proposed method mainly includes the following steps. First, four different features are separately extracted from the original HSI; they reflect different kinds of spectral and spatial information. Second, for each pixel, a shape-adaptive (SA) spatial region is extracted. Third, an adaptive sparse representation algorithm is introduced to obtain the sparse coefficients for the multiple-feature matrix set of pixels in each SA region. Finally, these coefficients are jointly used to determine the class label of each test pixel. Experimental results demonstrate that the proposed MFASR method outperforms several well-known classifiers in terms of both qualitative and quantitative results.
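The decision rule behind sparse-representation classifiers of this family is residual-based: a test pixel is assigned to the class whose training dictionary reconstructs it best. For brevity this sketch uses a class-wise least-squares fit rather than a true joint sparse solver, and the dictionaries and features are synthetic, so it illustrates only the residual rule, not MFASR itself:

```python
import numpy as np

def classify_by_residual(x, class_dicts):
    """Assign x to the class whose dictionary D_c reconstructs it with the
    smallest residual: argmin_c || x - D_c a_c ||, a_c by least squares."""
    residuals = []
    for D in class_dicts:
        a, *_ = np.linalg.lstsq(D, x, rcond=None)
        residuals.append(np.linalg.norm(x - D @ a))
    return int(np.argmin(residuals))

rng = np.random.default_rng(0)
# Two synthetic classes, each spanned by a different pair of basis vectors
D0 = rng.normal(size=(10, 2))
D1 = rng.normal(size=(10, 2))
x = D0 @ np.array([0.5, -1.0]) + 0.01 * rng.normal(size=10)  # near class 0's subspace
print(classify_by_residual(x, [D0, D1]))  # → 0
```

MFASR's contribution is in how the coefficients are computed, jointly over multiple features and all pixels of a shape-adaptive region with an adaptive sparsity structure, but the final label is still chosen by this minimum-residual rule.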
With the development of information technology, multi-platform collaborative collection and processing of remote sensing images has become a significant trend. However, it is challenging for existing models to achieve accurate and efficient image interpretation on multi-platform remote sensing systems. To solve this problem, we propose a novel distributed convolutional neural network (DCNNet) and demonstrate the superiority of our method in remote sensing image classification. First, a progressive inference mechanism is introduced so that most images can be classified early with satisfactory accuracy, which minimises redundant cloud transmission and achieves higher inference acceleration. Meanwhile, a distributed self-distillation paradigm is designed to integrate and refine in-depth features, performing efficient knowledge transfer between terminals and the cloud network. Second, a multi-scale feature fusion (MSFF) module is presented to extract valid receptive fields and assign weights to crucial channel-dimension features. Finally, a sampling augmentation (SA) attention is proposed to enhance the effective feature representation of RS images through a bottom-up and top-down feedforward structure. We conducted extensive experiments and visual analyses on three benchmark scene classification datasets and one fine-grained dataset. Compared with existing methods, DCNNet consolidates several advantages in terms of accuracy, computation, transmission, and processing efficiency into a single framework for multi-platform remote sensing image classification.
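The progressive-inference mechanism is an early-exit scheme: classify on the terminal when a lightweight classifier is confident, and forward to the cloud model only otherwise, saving transmission. A generic sketch follows; the two "models" are hypothetical stand-in functions and the confidence threshold is an assumption, not DCNNet's actual architecture:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def progressive_infer(x, early_model, cloud_model, threshold=0.9):
    """Early-exit inference: return the on-device prediction when its
    softmax confidence clears the threshold; otherwise defer to the cloud."""
    probs = softmax(early_model(x))
    if probs.max() >= threshold:
        return int(np.argmax(probs)), "edge"
    return int(np.argmax(softmax(cloud_model(x)))), "cloud"

# Hypothetical stand-in models returning class logits
early = lambda x: np.array([4.0, 0.1, 0.1]) if x > 0 else np.array([0.4, 0.5, 0.6])
cloud = lambda x: np.array([0.1, 5.0, 0.1])

print(progressive_infer(1.0, early, cloud))   # confident → decided at the edge
print(progressive_infer(-1.0, early, cloud))  # uncertain → deferred to the cloud
```

The threshold trades accuracy against transmission cost: raising it sends more images to the cloud, lowering it keeps more decisions on-device.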
Hyperspectral images (HSIs) have gained high spectral resolution due to recent advances in spectral imaging technologies. This incurs problems such as an increased data scale and an increased number of bands, which results in complex correlations between different bands. In remote sensing and earth observation applications, the ground objects represented by each HSI pixel have physically and chemically non-Euclidean structures, and HSI classification (HSIC) is becoming a more challenging task. To solve these problems, we propose a framework based on a deep attention graph convolutional network (DAGCN). Specifically, we first integrate an attention mechanism into the spectral similarity measurement and propose a new similarity measure, the mixed measurement of a kernel spectral angle mapper and spectral information divergence (KSAM-SID), to aggregate similar spectra. Considering the non-Euclidean structural characteristics of HSIs, we design deep graph convolutional networks (DeepGCNs) as a feature extraction method to extract deep abstract features and explore the internal relationships within HSI data. Finally, we dynamically update the attention graph adjacency matrix to adapt to the changes in each feature graph. Experiments on three standard HSI datasets, namely the Indian Pines, Pavia University, and Salinas datasets, demonstrate that DAGCN outperforms the baselines under various evaluation criteria. For example, on the Indian Pines dataset, the overall accuracy of the proposed method reaches 98.61% with 10% of the samples used for training.
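The two classical components underlying the KSAM-SID measure, the spectral angle mapper (SAM) and spectral information divergence (SID), can be sketched for a pair of spectra; the paper's kernelisation and attention weighting are omitted, so this shows only the base measures:

```python
import numpy as np

def spectral_angle(x, y):
    """SAM: angle (radians) between two spectral vectors; invariant to
    overall brightness scaling."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def spectral_information_divergence(x, y, eps=1e-12):
    """SID: symmetric KL divergence between spectra normalised to
    probability distributions over bands."""
    p = x / x.sum() + eps
    q = y / y.sum() + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

a = np.array([0.2, 0.4, 0.6, 0.8])
b = np.array([0.1, 0.2, 0.3, 0.4])  # same spectral shape as a, half the brightness
print(spectral_angle(a, b))                   # ~0: identical shape
print(spectral_information_divergence(a, b))  # ~0: identical band distribution
```

Both measures ignore brightness and respond to spectral shape, which is why mixing them (and kernelising SAM, as in the paper) gives a more discriminative notion of "similar spectra" for building the attention graph.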
Recently, hyperspectral image (HSI) classification has been attracting attention from more and more researchers. HSIs contain abundant spectral and spatial information; how to fuse these two types of information thus remains a problem worth studying. In this paper, we propose a Double-Branch Multi-Attention mechanism network (DBMA) for HSI classification. The network extracts spectral and spatial features in two separate branches, which reduces the interference between the two types of features. Furthermore, reflecting the different characteristics of the two branches, a different type of attention mechanism is applied in each branch, ensuring that more discriminative spectral and spatial features are extracted. The extracted features are then fused for classification. Extensive experimental results on three hyperspectral datasets show that the proposed method performs better than state-of-the-art methods.
Multi-label image classification is an essential yet challenging task that requires recognizing multiple objects in images. To this end, recent studies have sought to acquire visual representations for each label via attention models and then train binary classifiers for prediction. However, these methods have two major drawbacks: 1) they rely heavily on precise alignment between the two modalities, which is still challenging for current attention models; and 2) they ignore patch-level representations rich in local object features, which are also of great importance for label recognition. In this paper, we propose a semantic-guided representation enhancement framework, which augments patch-level representations with object-level representations for robust label recognition. Concretely, the proposed framework consists of two significant components: 1) an inter-modal attention module that coarsely locates object regions and produces object-level representations for each label; and 2) an intra-modal attention module that aggregates object representations to enhance patch representations based on their correlations. In this way, both local clues and global glances at objects are fully exploited simultaneously, rather than relying solely on the object-level representations obtained by the inter-modal attention, thus improving label recognition performance. Experimental results show that our framework outperforms the state-of-the-art methods by 0.5%, 0.6%, 0.7%, and 0.8% in mAP on the Pascal VOC 2007, Microsoft COCO, NUS-WIDE, and Visual Genome datasets, respectively. Code and models are available at https://github.com/jasonseu/SGRE.
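The inter-modal step, letting each label embedding attend over patch features to assemble one object-level representation per label, is essentially scaled dot-product cross-attention. A minimal sketch with random features follows (illustrative only, not the released SGRE code; dimensions are arbitrary):

```python
import numpy as np

def cross_attention(labels, patches):
    """Each label embedding (query) attends over patch features
    (keys/values), yielding one object-level representation per label."""
    d = labels.shape[-1]
    scores = labels @ patches.T / np.sqrt(d)        # (num_labels, num_patches)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over patches
    return weights @ patches                        # (num_labels, d)

rng = np.random.default_rng(1)
labels = rng.normal(size=(5, 16))    # 5 label embeddings
patches = rng.normal(size=(49, 16))  # 7x7 grid of patch features
obj = cross_attention(labels, patches)
print(obj.shape)  # → (5, 16)
```

The paper's intra-modal module then runs attention the other way, using these object representations to re-weight and enhance the patch representations, so that both levels contribute to the final binary classifiers.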
The local binary pattern (LBP) and its variants have shown their effectiveness in texture image classification, face recognition, and other applications. However, most LBP methods focus only on the histogram of LBP patterns and ignore the spatial contextual information between patterns. In this paper, we propose a 2D-LBP method that uses a sliding window to count the weighted occurrences of rotation-invariant uniform LBP pattern pairs, thereby capturing spatial contextual information. Multi-resolution 2D-LBP features can also be obtained by varying the radius of the 2D-LBP. Finally, a two-stage classifier, acting as an ensemble learning step, combines the predictions of each single-resolution 2D-LBP to achieve accurate classification. A theoretical argument shows that the proposed 2D-LBP is a general framework that can be combined with other LBP variants to derive new feature extraction methods. Experimental results show that the proposed method achieves 99.71%, 97.09%, 98.48%, and 49.00% classification accuracy on the public Brodatz, CUReT, UIUC, and FMD texture image databases, respectively. Compared with the original LBP and its variants, the proposed method obtains higher classification accuracy in different settings while having lower time complexity.
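The building block that 2D-LBP extends with pattern-pair co-occurrence is the basic 8-neighbour LBP code of a single pixel, which can be sketched as follows (the neighbour ordering is one common convention; the original LBP uses circular sampling, simplified here to the 3x3 grid):

```python
import numpy as np

def lbp_code(patch: np.ndarray) -> int:
    """8-neighbour LBP code for the centre of a 3x3 patch: each neighbour
    contributes a set bit when its value is >= the centre value."""
    center = patch[1, 1]
    # Clockwise neighbour order starting at the top-left corner
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, v in enumerate(neighbours) if v >= center)

patch = np.array([[9, 1, 9],
                  [1, 5, 9],
                  [1, 1, 9]])
print(lbp_code(patch))  # → 29 (bits 0, 2, 3, 4 set)
```

Classical LBP histograms these codes over the image; 2D-LBP instead slides a window and counts weighted co-occurrences of code *pairs*, which is where the spatial context between patterns enters.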