Deep learning (DL) has attracted wide attention in hyperspectral unmixing (HU) owing to its powerful feature representation ability. As a representative unsupervised DL approach, the autoencoder (AE) has been proven more effective at capturing the nonlinear components of hyperspectral images than traditional model-driven linearized methods. However, using hyperspectral images alone for unmixing fails to distinguish objects in complex scenes, especially different endmembers with similar materials. To overcome this limitation, we propose a novel multimodal unmixing network for hyperspectral images, called MUNet, which exploits the height differences in light detection and ranging (LiDAR) data in a squeeze-and-excitation (SE)-driven attention fashion to guide the unmixing process, yielding improved performance. MUNet fuses multimodal information and uses the attention map derived from LiDAR to help the network focus on more discriminative and meaningful spatial information about the scene. Moreover, attribute profiles (APs) are adopted to extract the geometrical structures of different objects, better modeling the spatial information of LiDAR. Experimental results on synthetic and real datasets demonstrate the effectiveness and superiority of the proposed method compared with several state-of-the-art unmixing algorithms. The code will be available at https://github.com/hanzhu97702/IEEE_TGRS_MUNet, contributing to the remote sensing community.
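The SE-driven attention gating described above can be sketched as follows. This is a minimal numerical illustration, not the authors' MUNet implementation: the function name `se_attention`, the tensor shapes, and the reduction ratio `r` are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_attention(lidar_feat, hsi_feat, w1, w2):
    """Squeeze-and-excitation-style gating (sketch): LiDAR-derived
    features produce per-channel weights that rescale the
    hyperspectral features."""
    # Squeeze: global average over spatial positions -> (channels,)
    z = lidar_feat.mean(axis=(0, 1))
    # Excitation: bottleneck dense layers, ReLU then sigmoid
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))
    # Rescale hyperspectral feature channels by the attention weights
    return hsi_feat * s  # broadcasts over the spatial dimensions

rng = np.random.default_rng(0)
C, r = 8, 2
lidar = rng.standard_normal((4, 4, C))
hsi = rng.standard_normal((4, 4, C))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = se_attention(lidar, hsi, w1, w2)
print(out.shape)  # (4, 4, 8)
```

In the sketch, the squeeze step pools the LiDAR-derived features into a channel descriptor and the excitation step turns it into gates in (0, 1), so the LiDAR branch can only attenuate, never amplify, the hyperspectral channels.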
•The original SAE is extended to handle the multiplicative noise in SAR change detection.•The features extracted by SFAE are more discriminative than those of the original stacked autoencoder because the Fisher discriminant criterion is incorporated into SFAE.•Experiments on simulated and real SAR datasets reveal that the proposed SFAE algorithm is effective for multitemporal single/multi-polarization SAR change detection. Specifically, the proposed SFAE method clearly outperforms real-time methods in detection accuracy and non-real-time methods in computational complexity.
Stacked autoencoders are effective for image denoising and classification when used for synthetic aperture radar (SAR) change detection. However, the resulting features may not be sufficiently discriminative. To alleviate this problem, in this paper we propose a stacked Fisher autoencoder (SFAE) for SAR change detection. Specifically, in the SFAE framework, unsupervised layer-wise feature learning and supervised fine-tuning are performed jointly when training the network. The trained network can detect changes in both single- and multi-polarization SAR datasets in real time. The proposed SFAE has two advantages. First, it extends the stacked autoencoder to handle the multiplicative noise encountered in SAR change detection. Second, the features extracted by SFAE are more discriminative than those of the original stacked autoencoder because the Fisher discriminant criterion is incorporated into SFAE. Results on simulated and real SAR datasets indicate that the proposed SFAE algorithm has a significant advantage in multitemporal single/multi-polarization SAR (SAR/PolSAR) change detection.
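The effect of a Fisher discriminant criterion can be illustrated with a minimal sketch: a penalty that is small when within-class scatter is low relative to between-class scatter. This is not the SFAE training code; the function name and the exact form of the ratio are assumptions.

```python
import numpy as np

def fisher_loss(features, labels):
    """Fisher discriminant penalty (sketch): minimizing the ratio of
    within-class scatter to between-class scatter pushes same-class
    features together and different-class features apart."""
    overall_mean = features.mean(axis=0)
    sw, sb = 0.0, 0.0
    for c in np.unique(labels):
        fc = features[labels == c]
        mc = fc.mean(axis=0)
        sw += np.sum((fc - mc) ** 2)                      # within-class
        sb += len(fc) * np.sum((mc - overall_mean) ** 2)  # between-class
    return sw / (sb + 1e-8)

# Tightly clustered, well-separated classes score lower than mixed data
tight = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels = np.array([0] * 5 + [1] * 5)
mixed = np.random.default_rng(1).standard_normal((10, 2))
print(fisher_loss(tight, labels) < fisher_loss(mixed, labels))  # True
```

Adding such a term to the reconstruction loss is one way a Fisher criterion can be incorporated into autoencoder training.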
Developing intelligent systems that detect abnormal events in the real world and in real time is challenging due to difficult environmental conditions, hardware limitations, and computational and algorithmic restrictions. As a result, degraded detection performance in dynamically changing environments is often encountered. In next-generation factories, however, an anomaly detection system based on acoustic signals is especially needed to quickly detect and intervene in abnormal events during industrial processes, given the increased cost of complex equipment and facilities. In this study we propose a real-time Acoustic Anomaly Detection (AAD) system based on sequence-to-sequence Autoencoder (AE) models for industrial environments. The proposed processing pipeline uses audio features extracted from the streaming audio signal captured by a single-channel microphone. The reconstruction error generated by the AE model measures the degree of abnormality of a sound event. The performance of a Convolutional Long Short-Term Memory AE (Conv-LSTMAE) is evaluated and compared with a sequential Convolutional AE (CAE) using sounds captured from various industrial manufacturing processes. Experiments with the real-time AAD system show that the Conv-LSTMAE-based AAD achieves better detection performance than the CAE-based AAD under different signal-to-noise ratio conditions for sound events such as explosions, fire, and glass breaking.
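The reconstruction-error scoring described above can be sketched as follows, with a toy stand-in for the trained AE; the thresholding rule and all names are assumptions, not the authors' pipeline.

```python
import numpy as np

def anomaly_scores(frames, reconstruct):
    """Per-frame anomaly score (sketch): mean squared reconstruction
    error between a feature frame and its AE reconstruction."""
    return np.array([np.mean((f - reconstruct(f)) ** 2) for f in frames])

# Toy stand-in for a trained AE: it shrinks its input slightly, so
# frames far from the training range reconstruct poorly.
reconstruct = lambda f: 0.9 * f
normal = [np.full(16, 0.1), np.full(16, 0.2)]
anomalous = [np.full(16, 5.0)]
scores = anomaly_scores(normal + anomalous, reconstruct)
threshold = 3.0 * scores[:2].max()  # threshold set from normal sounds
flags = scores > threshold
print(flags.tolist())  # [False, False, True]
```

In a streaming deployment the same scoring would run per audio frame, with the threshold calibrated on sounds from normal operation.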
The health indicator (HI) affects the accuracy and reliability of remaining useful life (RUL) prediction models. The hidden variables of a variational autoencoder (VAE) can represent HI values for a life-cycle dataset with an obvious degradation trend. However, for an irregular dataset from a rotary machine, constructing an HI that effectively represents the machinery degradation tendency remains a great challenge. Therefore, this article proposes a novel degradation-trend-constrained VAE (DTC-VAE) to construct an HI vector with a distinct degradation trend. First, multidimensional time-domain and frequency-domain characteristics are calculated from the collected vibration samples. Second, a new degradation-constraint loss term is proposed and introduced into the VAE to construct the DTC-VAE. Third, with the multidimensional features and the DTC-VAE, various HIs can be generated without supervision. The proposed method is applied to construct HI vectors for bearing life-cycle datasets and gear fatigue datasets, and a macroscopic-microscopic-attention-based long short-term memory (MMALSTM) network is then used to predict the corresponding RULs with the constructed HIs. Several comparative experiments show that the proposed unsupervised HI construction approach is superior to other typical methods and that the obtained HI vectors are more suitable for RUL prediction.
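One plausible form for a degradation-constraint loss term is a monotonicity penalty on the HI sequence; this is a hedged sketch, not the DTC-VAE loss from the article, and the squared-hinge form is an assumption.

```python
import numpy as np

def trend_penalty(hi):
    """Monotonicity penalty (sketch): squared hinge on decreases of the
    health indicator, since damage should accumulate over the life
    cycle and the HI should therefore trend in one direction."""
    drops = np.maximum(-np.diff(hi), 0.0)
    return np.sum(drops ** 2)

monotone = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
noisy = np.array([0.0, 0.2, 0.1, 0.5, 0.4])
print(trend_penalty(monotone))  # 0.0
print(trend_penalty(noisy) > 0.0)  # True
```

Adding such a term to the VAE objective penalizes latent trajectories that move against the degradation trend while leaving monotone trajectories unpenalized.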
Feature selection is an important process in machine learning. It builds an interpretable and robust model by selecting the features that contribute most to the prediction target. However, most mature feature selection algorithms, both supervised and semi-supervised, fail to fully exploit the complex latent structure among features. We believe these structures are very important for feature selection, especially when labels are scarce and the data is noisy.
To this end, we introduce a deep learning-based self-supervised mechanism into feature selection, namely batch-Attention-based Self-supervision Feature Selection (A-SFS). First, a multi-task self-supervised autoencoder is designed to uncover the hidden structure among features with the support of two pretext tasks. Guided by the integrated information from the multi-task self-supervised learning model, a batch-attention mechanism is designed to generate feature weights according to batch-based feature selection patterns, alleviating the impact of a handful of noisy samples. The method is compared with 14 strong benchmarks, including LightGBM and XGBoost. Experimental results show that A-SFS achieves the highest accuracy on most datasets. Furthermore, this design significantly reduces the reliance on labels: only 1/10 of the labeled data is needed to match the performance of state-of-the-art baselines. The results also show that A-SFS is the most robust to noisy and missing data.
•A new feature selection method based on self-supervised pattern discovery.•A multi-task self-supervised model for latent structure discovery.•Batch-attention-based feature weight generation.
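The batch-attention idea of pooling per-sample feature scores over a batch before normalizing them into selection weights can be sketched as follows; the softmax form, the temperature `tau`, and the function name are assumptions rather than the A-SFS implementation.

```python
import numpy as np

def batch_feature_weights(scores, tau=1.0):
    """Batch-attention sketch: per-sample feature scores are averaged
    over the batch, then softmax-normalized into selection weights,
    damping the influence of a few noisy samples."""
    mean_scores = scores.mean(axis=0)                    # (n_features,)
    e = np.exp((mean_scores - mean_scores.max()) / tau)  # stable softmax
    return e / e.sum()

rng = np.random.default_rng(2)
scores = rng.standard_normal((32, 6))
scores[:, 3] += 2.0  # one consistently informative feature
w = batch_feature_weights(scores)
print(w.argmax())  # 3
```

Because the averaging happens before normalization, a few outlier samples move the weights far less than they would under per-sample attention.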
Deep learning-based soft analyzers are important for modern industrial process monitoring and measurement; they aim to establish prediction models between quality data and easy-to-measure variables. However, in traditional deep learning methods, the guidance of quality information on feature extraction is insufficient and weakens as the data dimension increases. In this paper, a stacked maximal quality-driven autoencoder (SMQAE) is proposed to extract maximally quality-relevant features for soft analyzers. In each maximal quality-driven autoencoder, the quality variables are reconstructed together with the input variables in the output layer. The SMQAE ensures that the quality part and the input part influence the reconstruction equally. In addition, the maximal information coefficient (MIC), which is not limited to any specific function type, is exploited to enhance the importance of quality-related variables in the input part. With the constraint of the quality-equivalence strategy and variable-importance evaluation based on MIC, the SMQAE maximizes the guidance of the quality variables during feature learning without interference from the data dimension. Therefore, the SMQAE can extract quality-relevant features from complex high-dimensional data. The rationality, superiority, and robustness of SMQAE-based soft analyzers are validated on four simulated scenarios and two industrial processes.
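A relevance-weighted reconstruction loss with a quality-balancing scale can be sketched as follows. This is a hedged illustration of the general idea, not the SMQAE objective: the function name, the balancing rule `alpha`, and the use of arbitrary weights in place of actual MIC values are assumptions.

```python
import numpy as np

def quality_driven_loss(x, y, x_hat, y_hat, weights):
    """Weighted reconstruction loss (sketch): input variables are
    weighted by a relevance score (MIC in the paper; here any weights
    in [0, 1]), and the quality part is scaled so it contributes as
    much to the loss as the whole weighted input part."""
    input_term = np.sum(weights * (x - x_hat) ** 2)
    alpha = weights.sum() / y.size  # balances quality vs. input part
    quality_term = alpha * np.sum((y - y_hat) ** 2)
    return input_term + quality_term

# With unit weights and unit errors, both parts contribute equally:
x, x_hat = np.zeros(4), np.ones(4)
y, y_hat = np.zeros(1), np.ones(1)
weights = np.ones(4)
print(quality_driven_loss(x, y, x_hat, y_hat, weights))  # 8.0
```

The scale `alpha` grows with the input dimension, which is one way to keep the quality variables from being drowned out as the number of input variables increases.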
Incorporating unstructured data into physical models is a challenging problem emerging in data assimilation. Traditional approaches focus on well-defined observation operators whose functional forms are typically assumed to be known. This prevents these methods from achieving a consistent model-data synthesis in configurations where the mapping from data space to model space is unknown. To address these shortcomings, in this paper we develop a physics-informed dynamical variational autoencoder (Φ-DVAE) to embed diverse data streams into time-evolving physical systems described by differential equations. Our approach combines a standard, possibly nonlinear, filter for the latent state-space model with a VAE to assimilate the unstructured data into the latent dynamical system. In our example systems, unstructured data comes in the form of video data and velocity field measurements; however, the methodology is sufficiently generic to allow for arbitrary unknown observation operators. A variational Bayesian framework is used for the joint estimation of the encoding, latent states, and unknown system parameters. To demonstrate the method, we provide case studies with the Lorenz-63 ordinary differential equation and the advection and Korteweg-de Vries partial differential equations. Our results, with synthetic data, show that Φ-DVAE provides a data-efficient dynamics encoding methodology that is competitive with standard approaches. Unknown parameters are recovered with uncertainty quantification, and unseen data are accurately predicted.
•Bayesian inference methodology for unstructured data assimilation.•Variational autoencoder embeds data to observations of latent differential equation.•Statistical FEM construction and parameter estimation account for misspecification.•Demonstrated on Lorenz-63, advection and Korteweg-de Vries differential equations.•Embedding physical prior knowledge produces data-efficient learning.
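The interplay between an encoder and a latent filter can be sketched with a scalar Kalman filter, treating encoded observations as pseudo-observations of the latent state. This is a toy linear-Gaussian stand-in, not Φ-DVAE: the random-walk dynamics, the mean-pooling "encoder", and all parameter values are assumptions.

```python
import numpy as np

def kalman_step(m, P, z, A, Q, R):
    """One predict/update cycle for a scalar latent state, where z is
    a pseudo-observation produced by encoding the unstructured data."""
    m_pred = A * m
    P_pred = A * P * A + Q     # predict under the latent dynamics
    K = P_pred / (P_pred + R)  # Kalman gain (identity observation map)
    m_new = m_pred + K * (z - m_pred)
    P_new = (1.0 - K) * P_pred
    return m_new, P_new

encode = lambda frame: frame.mean()  # toy stand-in for the VAE encoder
A, Q, R = 1.0, 0.01, 0.1             # random-walk latent dynamics
m, P = 0.0, 1.0
true_state = 2.0
rng = np.random.default_rng(3)
for _ in range(50):
    frame = true_state + 0.3 * rng.standard_normal(8)  # "unstructured" frame
    m, P = kalman_step(m, P, encode(frame), A, Q, R)
print(abs(m - true_state) < 0.5)  # True: the filtered mean tracks the state
```

In the full method the encoder is learned jointly with the filter and the system parameters under a variational objective, rather than fixed in advance as here.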
Image fusion aims to acquire a more complete image representation within a limited physical space to more effectively support practical vision applications. Although currently popular infrared and visible image fusion algorithms take practical applications into consideration, they do not fully consider the redundancy and transmission efficiency of image data. To address this limitation, this paper proposes a compression fusion network for infrared and visible images based on a joint CNN and Transformer architecture, termed CFNet. First, the idea of variational-autoencoder image compression is introduced into the image fusion framework, achieving data compression while maintaining fusion quality and reducing redundancy. Moreover, a joint CNN and Transformer network structure is proposed that comprehensively considers the local information extracted by the CNN and the global long-distance dependencies emphasized by the Transformer. Finally, a multi-channel loss based on the region of interest is used to guide network training. Not only can color visible and infrared images be fused directly, but more bits can be allocated to the foreground region of interest, resulting in a superior compression ratio. Extensive qualitative and quantitative analyses affirm that the proposed compression fusion algorithm achieves state-of-the-art performance. In particular, rate–distortion performance experiments demonstrate the great advantages of the proposed algorithm for data storage and transmission. The source code is available at https://github.com/Xiaoxing0503/CFNet.
•We propose a compression-driven image fusion network, termed CFNet.•A novel joint CNN and Transformer module is proposed.•A region-of-interest multi-channel loss is designed to guide network training.•Extensive experiments verify the superiority of our method.
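A region-of-interest-weighted loss of the general kind described above can be sketched as follows; the weighting scheme, weight values, and function name are assumptions, not the CFNet loss.

```python
import numpy as np

def roi_weighted_loss(fused, target, roi_mask, w_roi=4.0, w_bg=1.0):
    """Region-of-interest loss (sketch): errors inside the foreground
    mask are weighted more heavily, steering bits and fidelity toward
    the region of interest."""
    weights = np.where(roi_mask, w_roi, w_bg)
    return np.mean(weights * (fused - target) ** 2)

target = np.zeros((4, 4))
fused = np.full((4, 4), 0.5)  # uniform error everywhere
roi = np.zeros((4, 4), dtype=bool)
roi[1:3, 1:3] = True          # 4-pixel foreground region
plain = np.mean((fused - target) ** 2)
print(roi_weighted_loss(fused, target, roi) > plain)  # True
```

Under a rate–distortion objective, up-weighting the foreground distortion pushes the optimizer to spend relatively more bits on the region of interest.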
Nowadays, data-driven soft sensors have become mainstream for key performance indicator prediction, which guarantees the safety and stability of industrial processes. The typical autoencoder (AE) has been widely used to extract potential features through unsupervised pretraining and supervised fine-tuning. However, most existing studies fail to consider both the time-varying features of the process and the differences in the contributions of the hidden features to the target variable. Therefore, in this article, a stacked spatial-temporal autoencoder (S²TAE) is proposed to enhance representation learning for soft sensor modeling by taking spatial-temporal correlations into consideration. Specifically, to effectively model the temporal dependence on nearby times, a temporal autoencoder is proposed, in which a memory module is devised and integrated to learn valuable historical information. Moreover, a "feature recalibration" block is developed and embedded into the spatial-temporal autoencoder (STAE) to selectively capture the more informative features and suppress the less useful ones in a supervised way. Multiple STAEs are then stacked to construct the S²TAE network, extracting more robust high-level features. Finally, experimental results on two real-world datasets, from a sorbent decontamination system (SDS) desulfurization process and a high-low transformer, demonstrate that the S²TAE-based soft sensor is effective and feasible.
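A supervised feature-recalibration gate of the general kind described above can be sketched as follows; the use of absolute correlation as the relevance score and the sigmoid gate are assumptions, not the S²TAE block.

```python
import numpy as np

def recalibrate(features, relevance):
    """Feature-recalibration sketch: rescale each hidden feature by a
    sigmoid gate of its relevance to the target, emphasizing
    informative features and suppressing the rest."""
    gate = 1.0 / (1.0 + np.exp(-np.abs(relevance)))
    return features * gate

rng = np.random.default_rng(4)
h = rng.standard_normal((16, 3))             # batch of hidden features
y = h[:, 0] + 0.1 * rng.standard_normal(16)  # target mostly tracks feature 0
corr = np.array([np.corrcoef(h[:, j], y)[0, 1] for j in range(3)])
out = recalibrate(h, corr)
print(out.shape)  # (16, 3)
```

Because the relevance score is computed against the target variable, the gating is supervised: features that barely covary with the target are damped before the next layer sees them.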