Unsupervised domain adaptation relies on well-labeled auxiliary source-domain data to achieve better performance on the unlabeled target domain, and has proven important for various classification and segmentation problems. Classical methods reduce the domain discrepancy in the latent space but ignore class-wise information, which destroys the inherent structure of the data. To preserve this structure during unsupervised domain adaptation, we propose a Bi-Directional Class-level Adversaries cross-domain model (BDCA) with two symmetric classifiers interpolating two latent spaces to build a tunnel between the source and target domains. Specifically, we propose a class-level discrepancy metric to enforce domain consistency during adaptation, and the two symmetric classifiers are collectively optimized to maximize the discrepancy on target-sample predictions. Extensive experiments are conducted on four publicly available datasets (i.e., Office-31, Office-Home, GTAV, and Cityscapes) and two challenging computer vision prediction problems, i.e., image classification and semantic segmentation. Quantitative and qualitative results demonstrate the effectiveness of the proposed model.
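The abstract does not give implementation details; the following is a minimal PyTorch-style sketch of the classifier-discrepancy idea it describes (two symmetric classifiers whose disagreement on target predictions is maximized while they stay accurate on the source). All module and variable names are hypothetical, not the released BDCA code.

```python
import torch
import torch.nn.functional as F

def classifier_discrepancy(p1, p2):
    """L1 distance between the softmax outputs of two symmetric classifiers."""
    return (p1.softmax(dim=1) - p2.softmax(dim=1)).abs().mean()

def discrepancy_step(feature_extractor, clf1, clf2, x_src, y_src, x_tgt, opt_clf):
    """One update of the two classifiers: stay accurate on labeled source data
    while maximizing their disagreement on unlabeled target predictions."""
    opt_clf.zero_grad()
    f_src, f_tgt = feature_extractor(x_src), feature_extractor(x_tgt)
    # Supervised loss on the source domain keeps both classifiers well trained.
    loss_src = F.cross_entropy(clf1(f_src), y_src) + F.cross_entropy(clf2(f_src), y_src)
    # Maximize disagreement on the target domain (hence the minus sign).
    loss = loss_src - classifier_discrepancy(clf1(f_tgt), clf2(f_tgt))
    loss.backward()
    opt_clf.step()
```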
Object detection results have rapidly improved over a short period of time with the development of deep convolutional neural networks. Although impressive results have been achieved on large and medium-sized objects, the performance on small objects is far from satisfactory, and detecting small objects in unconstrained conditions (e.g., the COCO and WIDER FACE benchmarks) remains an open challenge. The reason is that small objects usually lack sufficient detailed appearance information to distinguish them from the background or from similar objects. To deal with the small object detection problem, in this paper we propose an end-to-end multi-task generative adversarial network (MTGAN) as a general framework. In the MTGAN, the generator is a super-resolution network that up-samples small blurred images into fine-scale ones and recovers detailed information for more accurate detection. The discriminator is a multi-task network that describes each input image patch with a real/fake score, object category scores, and bounding-box regression offsets. Furthermore, to make the generator recover more details for easier detection, the classification and regression losses in the discriminator are back-propagated into the generator during training. Extensive experiments on the challenging COCO and WIDER FACE datasets demonstrate the effectiveness of the proposed method in restoring a clear super-resolved image from a blurred small one, and show that the detection performance, especially for small objects, improves over state-of-the-art methods by a large margin.
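A rough sketch of how the multi-task feedback described above could be combined into a generator objective: an adversarial term plus the classification and box-regression losses coming back from the discriminator. The tensor layout, discriminator outputs, and loss weights below are hypothetical assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def generator_loss(sr_patch, hr_patch, disc, gt_label, gt_offsets,
                   w_adv=1e-3, w_cls=1.0, w_reg=1.0):
    """Generator objective for an MTGAN-style setup: pixel reconstruction plus
    adversarial, classification and box-regression terms back-propagated from
    the multi-task discriminator (the w_* weights are illustrative)."""
    real_fake, cls_scores, box_offsets = disc(sr_patch)
    loss_pix = F.mse_loss(sr_patch, hr_patch)                       # reconstruction
    loss_adv = F.binary_cross_entropy_with_logits(
        real_fake, torch.ones_like(real_fake))                      # fool the discriminator
    loss_cls = F.cross_entropy(cls_scores, gt_label)                # category supervision
    loss_reg = F.smooth_l1_loss(box_offsets, gt_offsets)            # localization supervision
    return loss_pix + w_adv * loss_adv + w_cls * loss_cls + w_reg * loss_reg
```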
We demonstrate a mechanochemically assisted approach to synthesize Ti₃C₂Tₓ MXene with a crinkled morphology and enhanced energy-storage performance. The fabrication efficiency and capacitive properties of the resulting Ti₃C₂Tₓ MXene were significantly improved with the aid of a high-energy ball mill: (i) removal of Al from pristine Ti₃AlC₂ powder was achieved after 8 h of etching in 2% hydrochloric acid, whereas the conventional procedure reported in previous literature takes 18 h in 5% hydrochloric acid; (ii) the capacitive properties of the as-prepared samples increase with etching time; the 8-h mechanochemically etched sample showed a specific capacitance of 129 F/g at 10 mV/s in 1 M H₂SO₄ electrolyte, while no typical energy-storage behavior was found for the sample prepared without mechanochemical aid. The contributions of the double-layer, pseudocapacitive, and diffusion-limited capacitance to the total specific capacitance were quantitatively analyzed for the first time. The as-prepared sample exhibits a higher specific capacitance than previously reported MXene and MXene-based composites. The mechanochemically assisted approach thus shows good capability for preparing Ti₃C₂Tₓ MXene with enhanced capacitive properties.
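The abstract does not state how the specific capacitance was computed or how the double-layer/pseudocapacitive and diffusion-limited contributions were separated; the expressions below show how such analyses are commonly carried out from cyclic voltammetry data and are given only as a hedged illustration, not necessarily the procedure used in this work.

```latex
% Specific capacitance from a CV curve at scan rate \nu over window \Delta V
% (m: active-material mass, i: measured current):
C_s = \frac{\oint i \, \mathrm{d}V}{2\, m\, \nu\, \Delta V}

% Dunn-type separation of surface-controlled (double-layer + pseudocapacitive)
% and diffusion-limited contributions at a fixed potential V:
i(V) = k_1 \nu + k_2 \nu^{1/2}
```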
Disruptive technology has three evident and meaningful characteristics: a zeroing effect, in which its remarkable and unprecedented progress renders sustaining technology useless; the reshaping of the technological and economic landscape; and the leading of the future mainstream of the technology system, all of which have profound impacts and positive influences. Identifying disruptive technology is a universally difficult task. This paper therefore aims to enhance the technical relevance of potential disruptive technology identification results and to improve the granularity and effectiveness of the identified topics. Based on life-cycle theory, we divide the time span into stages and then construct and analyze the dynamics of technology networks to identify potential disruptive technologies; the Latent Dirichlet Allocation (LDA) topic model is then used to further clarify the topic content of these potential disruptive technologies. This paper takes large civil unmanned aerial vehicles (UAVs) as an example to prove the feasibility and effectiveness of the model. The results show that the potential disruptive technologies in this field are data acquisition, main equipment, and ground-platform intelligence.
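As a minimal sketch of the LDA step described above, the snippet below extracts topic keywords from a document collection with scikit-learn. The corpus placeholder, vocabulary size, and number of topics are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical LDA topic extraction for one life-cycle stage of the corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = ["...patent or paper abstracts for one life-cycle stage..."]  # placeholder corpus

vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X = vectorizer.fit_transform(documents)

lda = LatentDirichletAllocation(n_components=10, random_state=0)
lda.fit(X)

# Print the top terms of each topic to label candidate disruptive-technology topics.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-10:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```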
CNN-based Martian rock image processing has attracted much attention in recent Mars missions, since it can help planetary rovers autonomously recognize and collect high-value science targets. However, due to the difficulty of acquiring Martian rock images, the accuracy of such processing models is limited. In this paper, we introduce a new dataset called “GMSRI” that mixes real Mars images with synthetic counterparts generated by a GAN. GMSRI aims to provide a set of Martian rock images sorted by the texture and spatial structure of the rocks. This paper offers a detailed analysis of GMSRI in its current state: five sub-trees with 28 leaf nodes and 30,000 images in total. We show that GMSRI is much larger in scale and diversity than existing datasets of the same kind. Constructing such a database is a challenging task, and we carefully describe the data collection, selection, and generation processes. Moreover, we evaluate the usefulness of GMSRI on an image super-resolution task. We hope that the scale, diversity, and hierarchical structure of GMSRI will offer opportunities to researchers in the Mars exploration community and beyond.
High-resolution (HR) Mars images are of great significance for studying Martian landforms and analyzing the Martian climate. The current mainstream image super-resolution methods are based on deep learning (CNNs) and outperform traditional methods. However, these deep-learning-based methods usually obtain low-resolution (LR) images with an ideal down-sampling method (e.g., bicubic interpolation), which leads to two limitations in existing SR methods: 1) models trained on such paired LR-HR data achieve satisfactory results on ideal datasets but often fail on real Mars image super-resolution, since real Mars images rarely obey an ideal down-sampling rule; 2) LR images obtained by ideal down-sampling are noise-free, whereas real Mars images usually contain noise, so the super-resolved images lack realistic texture details. To solve these problems, in this article we propose a novel two-step framework for Mars image super-resolution. Specifically, to address limitation 1), we design a new degradation framework based on estimated blur kernels; to address limitation 2), a generative adversarial network (GAN) is trained to model the noise distribution. Extensive experiments on the Mars32k dataset demonstrate the effectiveness of the proposed method, which achieves better qualitative and quantitative results than other state-of-the-art methods.
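A hedged sketch of the classical degradation model such two-step frameworks typically build on, LR = (HR ∗ k)↓s + n, with an estimated blur kernel k and noise n drawn from a learned (e.g. GAN-generated) noise pool. The kernel and noise below are placeholders, not the estimates produced by the paper's pipeline.

```python
import numpy as np
from scipy.ndimage import convolve

def degrade(hr, kernel, scale, noise_patch):
    """Blur with an estimated kernel, down-sample by `scale`, then inject noise."""
    blurred = convolve(hr, kernel, mode="reflect")          # HR * k
    lr = blurred[::scale, ::scale]                          # down-sampling by simple decimation
    return lr + noise_patch[: lr.shape[0], : lr.shape[1]]   # + n

hr = np.random.rand(128, 128)            # stand-in for a real Mars HR crop
kernel = np.ones((7, 7)) / 49.0          # stand-in for an estimated blur kernel
noise = 0.01 * np.random.randn(64, 64)   # stand-in for GAN-sampled noise
lr = degrade(hr, kernel, scale=2, noise_patch=noise)
```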
In this paper, we propose a novel single-image defogging technique based on the dark channel prior. Traditional dark-channel-prior defogging methods suffer from high time complexity, edge artifacts, and failure of the prior in bright regions. To overcome the first two problems, a four-point weighting algorithm is first proposed to estimate the atmospheric light accurately, and the dark channel prior is used to estimate a rough transmittance; the gray-scale version of the input image is then used to refine the transmittance, after which an atmospheric scattering model is applied to restore the fog-free image. To solve the problem that the dark channel prior cannot handle high-brightness regions, a combination of edge detection and maximum inter-class variance is used to segment the sky and non-sky areas; the improved defogging method then processes the non-sky area, while an enhancement algorithm based on sequential decomposition handles the sky area. Extensive experiments show that the improved algorithm not only reduces the time complexity but also effectively suppresses edge artifacts, while addressing the failure of the dark channel prior in bright regions.
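For reference, a minimal sketch of the standard dark-channel-prior pipeline (He et al.) that this method refines; the four-point atmospheric-light weighting, gray-scale transmittance refinement, and sky/non-sky segmentation described above are not reproduced here, and the patch size and thresholds are conventional defaults.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel channel minimum followed by a local minimum filter."""
    return minimum_filter(img.min(axis=2), size=patch)

def defog(img, omega=0.95, t0=0.1, patch=15):
    """img: H x W x 3 float image in [0, 1]."""
    dark = dark_channel(img, patch)
    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels.
    idx = np.argsort(dark.ravel())[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission estimate from the dark channel of the normalized image.
    t = 1.0 - omega * dark_channel(img / A, patch)
    # Recover the scene radiance via the atmospheric scattering model I = J*t + A*(1-t).
    J = (img - A) / np.maximum(t, t0)[..., None] + A
    return np.clip(J, 0.0, 1.0)
```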
The representation power of convolutional neural network (CNN) models for hyperspectral image (HSI) analysis is in practice limited by the available amount of labeled samples, which is often insufficient to sustain deep networks with many parameters. We propose a novel approach to boost the network representation power with a two-stream 2-D CNN architecture. The proposed method simultaneously extracts spectral features together with local and global spatial features using two 2-D CNN networks, and exploits channel correlations to identify the most informative features. Moreover, we propose a layer-specific regularization and a smooth normalization fusion scheme to adaptively learn the fusion weights for the spectral-spatial features from the two parallel streams. An important asset of our model is the simultaneous training of the feature extraction, fusion, and classification processes with the same cost function. Experimental results on several hyperspectral datasets demonstrate the efficacy of the proposed method compared with the state-of-the-art methods in the field.
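A hedged sketch of adaptively fusing two feature streams with learnable, smoothly normalized weights, which is one simple way to realize the kind of fusion described above; the module, dimensions, and normalization choice are illustrative assumptions rather than the paper's scheme.

```python
import torch
import torch.nn as nn

class SmoothFusion(nn.Module):
    """Combine spectral and spatial feature vectors with softmax-normalized
    learnable weights, then classify; a simplified stand-in for an adaptive
    fusion scheme trained jointly with the classifier."""
    def __init__(self, dim, n_classes):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(2))      # one weight per stream
        self.head = nn.Linear(dim, n_classes)

    def forward(self, f_spectral, f_spatial):
        a = torch.softmax(self.w, dim=0)           # smooth weights that sum to one
        fused = a[0] * f_spectral + a[1] * f_spatial
        return self.head(fused)
```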
Recent progress in spectral classification is largely attributed to the use of convolutional neural networks (CNNs). While a variety of successful architectures have been proposed, they all extract spectral features from various portions of adjacent spectral bands. In this article, we take a different approach and develop a deep spectral feature fusion method that extracts both local and interlocal spectral features, thereby also capturing the correlations among nonadjacent bands. To our knowledge, this is the first reported deep spectral feature fusion method. Our model is a two-stream architecture in which an intergroup and a groupwise spectral classifier operate in parallel. Interlocal spectral correlation features are extracted elegantly by reshaping the input spectral vectors into so-called nonadjacent spectral matrices. We introduce the concept of groupwise band convolution to enable the efficient extraction of discriminative local features with multiple kernels adapted to the local spectral content. Another important contribution of this work is a novel dual-channel attention mechanism to identify the most informative spectral features. The model is trained in an end-to-end fashion with a joint loss. Experimental results on real datasets demonstrate excellent performance compared with the current state of the art.
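To make the reshaping idea concrete, the sketch below shows one plausible construction: folding a length-B spectrum into a G x (B/G) matrix places nonadjacent bands into the same column, so a convolution across the G rows mixes interlocal spectral information. The group count, band count, and layer sizes are assumptions for illustration, not the paper's definition of the nonadjacent spectral matrix.

```python
import torch

B, G = 200, 10                          # number of bands and groups (illustrative)
spectrum = torch.randn(1, B)            # one pixel's spectral vector
matrix = spectrum.view(1, G, B // G)    # row g holds the adjacent bands g*(B//G) .. (g+1)*(B//G)-1
# Column j now stacks bands {j, j + B//G, j + 2*B//G, ...}, i.e. nonadjacent bands.
conv = torch.nn.Conv1d(G, 16, kernel_size=3, padding=1)
interlocal_features = conv(matrix)      # each output mixes nonadjacent bands within a local window
```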
While convolutional neural networks have had a tremendous impact on various computer vision tasks, they generally demonstrate limitations in explicitly modeling long-range dependencies due to the intrinsic locality of the convolution operation. Initially designed for natural language processing tasks, Transformers have emerged as alternative architectures with innate global self-attention mechanisms to capture long-range dependencies. In this paper, we propose TransDepth, an architecture that benefits from both convolutional neural networks and transformers. To avoid the network losing its ability to capture local-level details due to the adoption of transformers, we propose a novel decoder that employs gate-based attention mechanisms. Notably, this is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels (i.e., monocular depth prediction and surface normal estimation). Extensive experiments demonstrate that the proposed TransDepth achieves state-of-the-art performance on three challenging datasets. Our code is available at: https://github.com/ygjwd12345/TransDepth.
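As a hedged illustration of the gate-based attention idea in a decoder, the module below re-weights CNN skip features using globally informed context (e.g. from a transformer branch). This simplified block is an assumption for illustration only and does not reproduce the released TransDepth implementation; see the repository linked above for the actual code.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Spatially gate skip-connection features with transformer-derived context."""
    def __init__(self, skip_ch, ctx_ch, mid_ch):
        super().__init__()
        self.skip_proj = nn.Conv2d(skip_ch, mid_ch, kernel_size=1)
        self.ctx_proj = nn.Conv2d(ctx_ch, mid_ch, kernel_size=1)
        self.gate = nn.Sequential(nn.ReLU(), nn.Conv2d(mid_ch, 1, 1), nn.Sigmoid())

    def forward(self, skip, ctx):
        # Upsample the global context to the skip-connection resolution before gating.
        ctx = nn.functional.interpolate(ctx, size=skip.shape[-2:], mode="bilinear",
                                        align_corners=False)
        g = self.gate(self.skip_proj(skip) + self.ctx_proj(ctx))  # spatial gate in [0, 1]
        return skip * g                                           # keep local detail where relevant
```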