As a promising way of analyzing data, sparse modeling has achieved great success throughout science and engineering. It is well known that the sparsity of a vector and the low-rankness of a matrix can be rationally measured by the number of nonzero entries (the ℓ0 norm) and the number of nonzero singular values (the rank), respectively. However, data from real applications are often generated by the interaction of multiple factors, which cannot be sufficiently represented by a vector or matrix, whereas a high-order tensor is expected to provide a more faithful representation of the intrinsic structure underlying such data ensembles. Unlike the vector/matrix case, constructing a rational high-order sparsity measure for a tensor is a relatively harder task. To this aim, in this paper we propose a measure for tensor sparsity, called the Kronecker-basis-representation based tensor sparsity measure (KBR for short), which encodes the sparsity insights delivered by both the Tucker and the CANDECOMP/PARAFAC (CP) low-rank decompositions of a general tensor. We then study the KBR regularization minimization (KBRM) problem and design an effective ADMM algorithm for solving it, in which each involved variable can be updated in closed form. Such an efficient solver makes it possible to extend KBR to various tasks such as tensor completion and tensor robust principal component analysis. A series of experiments, including multispectral image (MSI) denoising, MSI completion, and background subtraction, substantiate the superiority of the proposed methods over the state of the art.
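As a rough numerical illustration of how a measure combining both decompositions can be instantiated, the sketch below mixes a CP-flavored count of nonzero core entries (obtained via an HOSVD projection) with a Tucker-flavored product of unfolding ranks. The weighting parameter t, the HOSVD construction, and the thresholds are assumptions made for illustration, not the paper's exact formulation of KBR.

```python
import numpy as np

def unfold(X, mode):
    # Mode-k unfolding: move mode k to the front and flatten the remaining modes
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def kbr_measure(X, t=0.1, tol=1e-6):
    # Tucker core via HOSVD: project each mode onto its left singular basis
    Us = [np.linalg.svd(unfold(X, k), full_matrices=False)[0] for k in range(X.ndim)]
    C = X
    for k, U in enumerate(Us):
        C = np.moveaxis(np.tensordot(U.T, np.moveaxis(C, k, 0), axes=1), 0, k)
    core_l0 = np.count_nonzero(np.abs(C) > tol)          # CP-flavored: sparse core
    rank_prod = np.prod([np.linalg.matrix_rank(unfold(X, k), tol)
                         for k in range(X.ndim)])        # Tucker-flavored: unfolding ranks
    return t * core_l0 + (1 - t) * rank_prod
```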
This paper introduces a novel exemplar-based inpainting algorithm that investigates the sparsity of natural image patches. Two novel concepts of sparsity at the patch level are proposed for modeling the patch priority and the patch representation, which are the two crucial steps for patch propagation in the exemplar-based inpainting approach. First, patch structure sparsity is designed to measure the confidence that a patch is located on an image structure (e.g., an edge or corner) by the sparseness of its nonzero similarities to neighboring patches. A patch with larger structure sparsity is assigned higher priority for further inpainting. Second, it is assumed that the patch to be filled can be represented by a sparse linear combination of candidate patches under a local patch-consistency constraint, in a framework of sparse representation. Compared with the traditional exemplar-based inpainting approach, structure sparsity enables better discrimination between structure and texture, and the patch sparse representation forces the newly inpainted regions to be sharp and consistent with the surrounding textures. Experiments on synthetic and natural images show the advantages of the proposed approach.
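A minimal sketch of the priority computation described above follows, assuming the nonzero similarities of a patch to its neighbors are given; scoring their sparseness by the L2 norm of the normalized weights is an illustrative choice, and the paper's exact weighting may differ.

```python
import numpy as np

def structure_sparsity(similarities):
    # similarities: nonzero similarities of a target patch to neighboring patches
    w = similarities / (similarities.sum() + 1e-12)   # normalize into weights
    # A distribution concentrated on a few neighbors (edge/corner) has a large
    # L2 norm; a flat distribution (plain texture) has a small one.
    return np.linalg.norm(w)

# Patches are then propagated in decreasing order of this score:
priority = structure_sparsity(np.array([0.90, 0.05, 0.03, 0.02]))
```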
This paper presents a novel and efficient deep fusion convolutional neural network (DF-CNN) for multimodal 2D+3D facial expression recognition (FER). DF-CNN comprises a feature extraction subnet, a feature fusion subnet, and a softmax layer. In particular, each textured three-dimensional (3D) face scan is represented as six types of 2D facial attribute maps (i.e., a geometry map, three normal maps, a curvature map, and a texture map), all of which are jointly fed into DF-CNN for feature learning and fusion learning, resulting in a highly concentrated facial representation (32-dimensional). Expression prediction is performed in two ways: 1) learning linear support vector machine classifiers on the 32-dimensional fused deep features, or 2) directly performing softmax prediction on the six-dimensional expression probability vectors. Different from existing 3D FER methods, DF-CNN combines feature learning and fusion learning into a single end-to-end training framework. To demonstrate the effectiveness of DF-CNN, we conducted comprehensive experiments comparing DF-CNN with handcrafted features, pre-trained deep features, fine-tuned deep features, and state-of-the-art methods on three 3D face datasets (i.e., BU-3DFE Subset I, BU-3DFE Subset II, and Bosphorus Subset). In all cases, DF-CNN consistently achieved the best results. To the best of our knowledge, this is the first work to introduce deep CNNs to 3D FER and deep learning-based feature-level fusion to multimodal 2D+3D FER.
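To make the described pipeline concrete, here is a minimal PyTorch sketch: six attribute maps pass through a feature-extraction subnet (shared across maps here, which is an assumption), are fused into a 32-dimensional representation, and feed a softmax layer; all layer widths are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DFCNN(nn.Module):
    """Illustrative DF-CNN: six 2D attribute maps -> 32-d fused feature -> logits."""
    def __init__(self, n_maps=6, n_classes=6):
        super().__init__()
        # Feature-extraction subnet, applied to each attribute map
        self.extract = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 64), nn.ReLU())
        self.fuse = nn.Linear(n_maps * 64, 32)      # feature-fusion subnet
        self.classify = nn.Linear(32, n_classes)    # softmax layer (returns logits)

    def forward(self, maps):                        # maps: (B, 6, H, W)
        feats = [self.extract(maps[:, k:k + 1]) for k in range(maps.shape[1])]
        fused = torch.relu(self.fuse(torch.cat(feats, dim=1)))  # 32-d deep feature
        return self.classify(fused), fused          # logits, or hand `fused` to an SVM

logits, fused = DFCNN()(torch.randn(2, 6, 64, 64))
```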
Differential evolution is one of the most prestigious population-based stochastic optimization algorithms for black-box problems. The performance of a differential evolution algorithm depends highly on its mutation and crossover strategy and the associated control parameters. However, determining the most suitable parameter setting is troublesome and time-consuming. Adaptive parameter-control methods that can adapt to the problem landscape and the optimization environment are preferable to fixed parameter settings. This article proposes a novel adaptive parameter-control approach based on learning from optimization experience over a set of problems. In the approach, parameter control is modeled as a finite-horizon Markov decision process. A reinforcement learning algorithm, namely policy gradient, is applied to learn an agent (i.e., a parameter controller) that adaptively provides the control parameters of the proposed differential evolution during the search procedure. The differential evolution algorithm based on the learned agent is compared against nine well-known evolutionary algorithms on the CEC'13 and CEC'17 test suites. Experimental results show that the proposed algorithm performs competitively against the compared algorithms on these test suites.
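As a minimal sketch of where such a controller plugs in, the code below runs one generation of the standard DE/rand/1/bin scheme with the mutation factor F and crossover rate CR supplied by a controller callback rather than fixed; the stub controller and the state it receives are illustrative assumptions standing in for the policy-gradient-trained agent.

```python
import numpy as np

def de_generation(pop, fitness, controller, rng):
    """One generation of DE/rand/1/bin with controller-supplied F and CR."""
    n, d = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        F, CR = controller(fitness)                    # learned policy would act here
        a, b, c = rng.choice([j for j in range(n) if j != i], 3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])        # DE/rand/1 mutation
        cross = rng.random(d) < CR
        cross[rng.integers(d)] = True                  # keep at least one mutant gene
        new_pop[i] = np.where(cross, mutant, pop[i])   # binomial crossover
    return new_pop

# Stub controller standing in for the trained agent:
rng = np.random.default_rng(0)
pop = rng.random((20, 5))
fitness = (pop ** 2).sum(axis=1)
pop = de_generation(pop, fitness, lambda f: (0.5, 0.9), rng)
```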
This paper presents a new supervised classification algorithm for remotely sensed hyperspectral images (HSIs) that integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions, using a patch-wise training strategy to better exploit the spatial information. Next, spatial information is further incorporated by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient descent and update the class labels of all pixel vectors using an α-expansion min-cut-based algorithm. Compared with other state-of-the-art methods, the proposed classification method achieves better performance on one synthetic dataset and two benchmark HSI datasets under a number of experimental settings.
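The label-update step can be pictured as minimizing a unary cost from the CNN posterior plus a Potts smoothness prior over neighboring pixels. The sketch below uses iterated conditional modes as a simple, slower stand-in for the α-expansion graph-cut solver; the 4-neighborhood and the smoothness weight β are assumptions for illustration.

```python
import numpy as np

def smooth_labels(prob, beta=1.0, n_sweeps=5):
    # prob: (H, W, C) CNN posterior p(y|x); Potts prior on the 4-neighborhood.
    H, W, C = prob.shape
    labels = prob.argmax(-1)
    unary = -np.log(prob + 1e-12)
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                nbrs = [labels[a, b]
                        for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= a < H and 0 <= b < W]
                pairwise = np.array([sum(c != n for n in nbrs) for c in range(C)])
                labels[i, j] = (unary[i, j] + beta * pairwise).argmin()
    return labels
```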
Blind hyperspectral unmixing (HU), a crucial technique for hyperspectral data exploitation, aims to decompose mixed pixels into a collection of constituent materials weighted by the corresponding fractional abundances. In recent years, nonnegative matrix factorization (NMF)-based methods have become increasingly popular for this task and have achieved promising performance. Among these methods, two properties of the abundances, namely sparseness and structural smoothness, have been explored and shown to be important for blind HU. However, all of the previous methods ignore another insightful property possessed by a natural hyperspectral image (HSI), non-local smoothness, meaning that similar patches within a larger region of an HSI share a similar smoothness structure. Judging from previous attempts on other tasks, such a prior reflects intrinsic configurations underlying an HSI and is thus expected to substantially improve the performance of the investigated HU problem. In this paper, we first encode this prior for HSIs as a non-local total variation (NLTV) regularizer. Furthermore, by fully exploiting the intrinsic structure of HSIs, we generalize NLTV to non-local HSI TV (NLHTV) to make the model more suitable for the blind HU task. By incorporating these two regularizers, together with a non-convex log-sum regularizer characterizing the sparseness of the abundance maps, into the NMF model, we propose novel blind HU models, named NLTV- and NLHTV-based log-sum regularized NMF (NLTV-LSRNMF and NLHTV-LSRNMF), respectively. To solve the proposed models, an efficient algorithm is designed based on an alternating optimization strategy (AOS) and the alternating direction method of multipliers (ADMM). Extensive experiments conducted on both simulated and real hyperspectral datasets substantiate the superiority of the proposed approach over competing ones for the blind HU task.
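For concreteness, the resulting objective plausibly takes the following form (a sketch consistent with the abstract; the precise weights, constraints, and the NLHTV definition are in the paper), where \(Y\) is the observed HSI matrix, \(A\) the endmember signatures, and \(S\) the abundance maps:

\[
\min_{A \ge 0,\ S \ge 0}\ \tfrac{1}{2}\,\lVert Y - AS\rVert_F^2
\;+\; \lambda \sum_{i,j} \log\bigl(\lvert s_{ij}\rvert + \varepsilon\bigr)
\;+\; \mu\, \mathrm{NLTV}(S),
\qquad \text{s.t.}\ \mathbf{1}^{\top} S = \mathbf{1}^{\top},
\]

with \(\mathrm{NLTV}(S) = \sum_{j}\sum_{j' \in \mathcal{N}(j)} w_{jj'}\,\lvert s_j - s_{j'}\rvert\) accumulating weighted differences between non-locally similar patches; NLHTV would replace this term with its HSI-adapted generalization.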
An extreme learning machine (ELM) can be regarded as a two-stage feed-forward neural network (FNN) learning system that randomly assigns the connections with and within hidden neurons in the first stage and tunes the connections with output neurons in the second stage. ELM training is therefore essentially a linear learning problem, which significantly reduces the computational burden. Numerous applications show that this reduction in computational burden does not degrade the generalization capability. It has, however, remained open whether this is true in theory. The aim of this paper is to study the theoretical feasibility of ELM by analyzing its pros and cons. In Part I of this topic, we pointed out that, with appropriately selected activation functions, ELM does not degrade the generalization capability in the sense of expectation. In this paper, we take the study in a different direction and show that the randomness of ELM also leads to certain negative consequences. On the one hand, we find that the randomness causes an additional uncertainty problem for ELM, both in approximation and in learning. On the other hand, we theoretically justify that there also exist activation functions for which the corresponding ELM degrades the generalization capability. In particular, we prove that the generalization capability of ELM with the Gaussian kernel is essentially worse than that of an FNN with the Gaussian kernel. To facilitate the use of ELM, we also provide a remedy for this degradation: we find that the well-developed coefficient-regularization technique can essentially improve the generalization capability. The obtained results reveal the essential characteristics of ELM in a certain sense and give theoretical guidance on how to use ELM.
An extreme learning machine (ELM) is a feed-forward neural network (FNN)-like learning system whose connections with output neurons are adjustable, while the connections with and within hidden neurons are randomly fixed. Numerous applications have demonstrated the feasibility and high efficiency of ELM-like systems. It has, however, remained open whether this is true for general applications. In this two-part paper, we conduct a comprehensive feasibility analysis of ELM. In Part I, we answer the question by theoretically justifying the following: 1) for some suitable activation functions, such as polynomials, Nadaraya-Watson, and sigmoid functions, ELM-like systems can attain the theoretical generalization bound of FNNs with all connections adjusted, i.e., they do not degrade the generalization capability of FNNs even when the connections with and within hidden neurons are randomly fixed; 2) the number of hidden neurons needed for an ELM-like system to achieve the theoretical bound can be estimated; and 3) whenever the activation function is a polynomial, the resulting hidden-layer output matrix has full column rank, so the generalized-inverse technique can be efficiently applied to yield the solution of an ELM-like system, and, for the non-polynomial case, Tikhonov regularization can be applied to guarantee weak regularity without sacrificing the generalization capability. In Part II, however, we reveal a different aspect of the feasibility of ELM: there also exist activation functions that make the corresponding ELM degrade the generalization capability. The obtained results underlie the feasibility and efficiency of ELM-like systems, and yield various generalizations and improvements of these systems as well.
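The two-stage scheme is compact enough to state in a few lines: hidden connections are drawn once at random and only the linear output weights are solved for, with a ridge term implementing the Tikhonov (coefficient) regularization mentioned in 3) above and in the companion paper. The tanh activation, Gaussian initialization, and sizes below are illustrative assumptions.

```python
import numpy as np

def elm_fit(X, Y, n_hidden=200, lam=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    # Stage 1: connections with/within hidden neurons are drawn once, never tuned
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                 # hidden-layer output matrix
    # Stage 2: linear solve for output weights; lam > 0 is Tikhonov (ridge)
    # regularization, and small lam approximates the generalized-inverse solution
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```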