Fuzzy logic-based neural networks usually form fuzzy rules by multiplying the input membership degrees, which lacks expressiveness and flexibility. In this article, a novel neural network model is designed by integrating gene expression programming into the interval type-2 fuzzy rough neural network, aiming to generate more expressive fuzzy rules using various logical operators. Network training is treated as a multiobjective optimization problem that simultaneously considers network precision, explainability, and generalization. Specifically, the network complexity can be minimized to generate few, concise fuzzy rules, improving the network's explainability. Inspired by the extreme learning machine and the broad learning system, an enhanced distributed parallel multiobjective evolutionary algorithm is proposed. This evolutionary algorithm can flexibly explore the forms of fuzzy rules, and the weight refinement of the final layer can significantly improve precision and convergence by solving the pseudoinverse. Experimental results show that the proposed multiobjective evolutionary network framework is superior in both effectiveness and explainability.
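The final-layer refinement mentioned in the abstract above can be illustrated with a minimal sketch in the style of the extreme learning machine. The random tanh hidden layer, the array shapes, and the name `fit_output_weights` are assumptions made for illustration only; in the paper the hidden representation would come from the evolved fuzzy rules, which are not reproduced here.

```python
import numpy as np

def fit_output_weights(hidden, targets):
    """Least-squares output weights via the Moore-Penrose pseudoinverse: W = pinv(H) @ T."""
    return np.linalg.pinv(hidden) @ targets

rng = np.random.default_rng(0)

# Hypothetical stand-in for the rule layer: a fixed random tanh feature map.
X = rng.normal(size=(200, 3))        # 200 samples, 3 inputs
W_in = rng.normal(size=(3, 20))      # fixed hidden weights (never trained)
b = rng.normal(size=20)
H = np.tanh(X @ W_in + b)            # hidden-layer outputs, shape (200, 20)

# Targets that the hidden layer can represent exactly, so the fit is checkable.
true_W = rng.normal(size=(20, 1))
T = H @ true_W

W_out = fit_output_weights(H, T)
residual = float(np.max(np.abs(H @ W_out - T)))
print(W_out.shape, residual)
```

Only the last layer is solved for, in closed form, so no gradient steps are needed for these weights; the search over the earlier layers is left to whatever outer procedure is used.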
Language models have traditionally been estimated based on relative frequencies, using count statistics that can be extracted from huge amounts of text data. More recently, it has been found that neural networks are particularly powerful at estimating probability distributions over word sequences, giving substantial improvements over state-of-the-art count models. However, the performance of neural network language models strongly depends on their architectural structure. This paper compares count models to feedforward, recurrent, and long short-term memory (LSTM) neural network variants on two large-vocabulary speech recognition tasks. We evaluate the models in terms of perplexity and word error rate, experimentally validating the strong correlation of the two quantities, which we find to hold regardless of the underlying type of the language model. Furthermore, neural networks incur an increased computational complexity compared to count models, and they model context dependencies differently, often exceeding the number of words that are taken into account by count-based approaches. These differences require efficient search methods for neural networks, and we analyze the potential improvements that can be obtained when applying advanced algorithms to the rescoring of word lattices on large-scale setups.
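Perplexity, one of the two evaluation measures named in the abstract above, is the exponential of the average negative log-probability the model assigns to each word. The sketch below uses the standard definition; the function name and the toy probabilities are illustrative assumptions, not values from the paper.

```python
import math

def perplexity(word_probs):
    """Perplexity of a word sequence, given the model probability of each word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

# A model that spreads its mass uniformly over 4 words has perplexity close to 4.
uniform_ppl = perplexity([0.25] * 8)

# A model that assigns higher probability to the observed words scores lower.
confident_ppl = perplexity([0.5, 0.6, 0.4, 0.7])
print(uniform_ppl, confident_ppl)
```

Lower perplexity means the model is less surprised by the test text, which is why it is commonly reported alongside word error rate.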
•The need for deep neural networks in stock price and trend prediction is discussed.
•CNN, DQN, RNN, LSTM, GRU, ESN, DNN, RBM, and DBN are reviewed for stock prediction.
•An experimental comparison of nine models is carried out and the results are analysed.
•The prediction performance of the considered models is compared with an existing approach.
•The challenges and potential future research directions are also provided.
The stock market has been an attractive field for a large number of organizers and investors to derive useful predictions. Fundamental knowledge of the stock market can be utilised with technical indicators to investigate different perspectives of the financial market; also, the influence of various events, financial news, and/or opinions on investors’ decisions, and hence on market trends, has been observed. Such information can be exploited to make reliable predictions and achieve higher profitability. Computational intelligence has emerged with various deep neural network (DNN) techniques to address complex stock market problems. In this article, we aim to review the significance and need of DNNs in the field of stock price and trend prediction; we discuss the applicability of DNN variations to temporal stock market data and also extend our survey to include hybrid, as well as metaheuristic, approaches with DNNs. We observe the potential limitations of stock market prediction using various DNNs. To provide an experimental evaluation, we also conduct a series of experiments for stock market prediction using nine deep learning-based models; we analyse the impact of these models on forecasting the stock market data. We also evaluate the performance of individual models with different numbers of features. We discuss challenges, as well as potential future research directions, and conclude our survey with the experimental study. This survey can be consulted for recent perspectives on DNN-based stock market prediction, primarily covering research from 2017 to 2020.
Deep neural networks have been widely applied to hyperspectral image (HSI) classification, in which the recurrent neural network (RNN) is one of the most typical networks. Most of the existing RNN-based classifiers treat the spectral signature of pixels as an ordered sequence, in which only unidirectional correlation along the wavelength direction of adjacent bands is considered. However, each band image is related to not only its preceding band images but also its successive band images. In order to fully explore such bidirectional spectral correlation within an HSI, in this article, a bidirectional long short-term memory (Bi-LSTM)-based network is designed for HSI classification. Moreover, a spatial-spectral attention mechanism is designed and implemented in the proposed Bi-LSTM network to emphasize the effective information and reduce the redundant information among the spatial-spectral context of pixels, by which the performance of classification can be greatly improved. Experimental results over three benchmark HSIs, i.e., Salinas Valley, Pavia Centre, and Pavia University, demonstrate that our proposed Bi-LSTM clearly outperforms several state-of-the-art unidirectional RNN-based classification algorithms. Moreover, the proposed spatial-spectral attention mechanism can further improve the classification accuracy of our proposed Bi-LSTM algorithm by effectively weighting the spatial and spectral context of pixels. The source code of the proposed Bi-LSTM algorithm is available at https://github.com/MeiShaohui/Attention-based-Bidirectional-LSTM-Network .
Partial differential equations (PDEs) are commonly derived based on empirical observations. However, recent advances in technology enable us to collect and store massive amounts of data, which offers new opportunities for data-driven discovery of PDEs. In this paper, we propose a new deep neural network, called PDE-Net 2.0, to discover (time-dependent) PDEs from observed dynamic data with minor prior knowledge of the underlying mechanism that drives the dynamics. The design of PDE-Net 2.0 is based on our earlier work [1], where the original version of PDE-Net was proposed. PDE-Net 2.0 is a combination of numerical approximation of differential operators by convolutions and a symbolic multi-layer neural network for model recovery. Compared with existing approaches, PDE-Net 2.0 has the most flexibility and expressive power by learning both differential operators and the nonlinear response function of the underlying PDE model. Numerical experiments show that PDE-Net 2.0 has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.
•A numeric-symbolic hybrid deep network is proposed to recover PDEs from observed dynamic data.
•The symbolic network is able to recover a concise analytic form of the hidden PDE model.
•Our approach requires only minor prior knowledge of the mechanism behind the observed dynamic data.
•The network can perform accurate long-term prediction without retraining for new initial conditions.
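The PDE-Net 2.0 abstract above relies on approximating differential operators by convolutions. The following sketch shows the underlying idea with fixed, hand-written central-difference kernels in one dimension; the grid spacing, the test function sin(x), and the helper name `correlate_valid` are assumptions for illustration, and the network itself learns its filters rather than using these fixed ones.

```python
import math

def correlate_valid(signal, kernel):
    """Slide the kernel over the signal without padding or flipping, as a CNN layer does."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

h = 0.01                                    # grid spacing (assumed)
xs = [i * h for i in range(201)]            # uniform grid on [0, 2]
u = [math.sin(x) for x in xs]               # sampled test function

d1_kernel = [-1 / (2 * h), 0.0, 1 / (2 * h)]      # central difference for du/dx
d2_kernel = [1 / h**2, -2 / h**2, 1 / h**2]       # central difference for d2u/dx2

du = correlate_valid(u, d1_kernel)          # values at interior points xs[1:-1]
d2u = correlate_valid(u, d2_kernel)

# Compare with the exact derivatives cos(x) and -sin(x) at the interior points.
err1 = max(abs(a - math.cos(x)) for a, x in zip(du, xs[1:-1]))
err2 = max(abs(a + math.sin(x)) for a, x in zip(d2u, xs[1:-1]))
print(err1, err2)
```

Both errors shrink roughly with the square of the grid spacing, which is the sense in which a small convolution filter can stand in for a derivative.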
Recently, enormous datasets have made power dissipation and area usage lie at the heart of designs for artificial neural networks (ANNs). Considering the significant role of activation functions in neurons and the growth of hardware-based neural networks like memristive neural networks, this work proposes a novel design for a hyperbolic tangent activation function (Tanh) to be used in memristive-based neuromorphic architectures. The purpose of implementing a CMOS-based design for Tanh is to decrease power dissipation and area usage. This design also increases the overall speed of computation in ANNs, while keeping the accuracy in an acceptable range. The proposed design is one of the first analog designs for the hyperbolic tangent, and its performance is analyzed using two well-known datasets, namely the Modified National Institute of Standards and Technology (MNIST) dataset and Fashion-MNIST. A direct implementation of the proposed Tanh design is presented and investigated via software and hardware modeling.
This paper proposes a novel intelligent fault diagnosis method to automatically identify different health conditions of a wind turbine (WT) gearbox. Unlike traditional approaches, where feature extraction and classification are separately designed and performed, this paper aims to automatically learn effective fault features directly from raw vibration signals while classifying the type of fault in a single framework, thus providing an end-to-end learning-based fault diagnosis system for the WT gearbox without additional signal processing and diagnostic expertise. Considering the multiscale characteristics inherent in the vibration signals of a gearbox, a new multiscale convolutional neural network (MSCNN) architecture is proposed to perform multiscale feature extraction and classification simultaneously. The proposed MSCNN incorporates multiscale learning into the traditional CNN architecture, which has two merits: 1) high-level fault features can be effectively learned by the hierarchical learning structure with multiple pairs of convolutional and pooling layers; and 2) the multiscale learning scheme can capture complementary and rich diagnosis information at different scales. This greatly improves the feature learning ability and enables better diagnosis performance. The proposed MSCNN approach is evaluated through experiments on a WT gearbox test rig. Experimental results and a comprehensive comparison with the traditional CNN and traditional multiscale feature extractors demonstrate the superiority of the proposed method.
In this paper, we describe a novel deep convolutional neural network (CNN) that is deeper and wider than other existing deep networks for hyperspectral image classification. Unlike current state-of-the-art approaches in CNN-based hyperspectral image classification, the proposed network, called contextual deep CNN, can optimally explore local contextual interactions by jointly exploiting local spatio-spectral relationships of neighboring individual pixel vectors. The joint exploitation of the spatio-spectral information is achieved by a multi-scale convolutional filter bank used as an initial component of the proposed CNN pipeline. The initial spatial and spectral feature maps obtained from the multi-scale filter bank are then combined to form a joint spatio-spectral feature map. The joint feature map representing rich spectral and spatial properties of the hyperspectral image is then fed through a fully convolutional network that eventually predicts the corresponding label of each pixel vector. The proposed approach is tested on three benchmark data sets: the Indian Pines data set, the Salinas data set, and the University of Pavia data set. Performance comparison shows enhanced classification performance of the proposed approach over the current state-of-the-art on the three data sets.
Domain adaptation studies learning algorithms that generalize across source domains and target domains that exhibit different distributions. Recent studies reveal that deep neural networks can learn transferable features that generalize well to similar novel tasks. However, as deep features eventually transition from general to specific along the network, feature transferability drops significantly in higher task-specific layers with increasing domain discrepancy. To formally reduce the effects of this discrepancy and enhance feature transferability in task-specific layers, we develop a novel framework for deep adaptation networks that extends deep convolutional neural networks to domain adaptation problems. The framework embeds the deep features of all task-specific layers into reproducing kernel Hilbert spaces (RKHSs) and optimally matches different domain distributions. The deep features are made more transferable by exploiting low-density separation of target-unlabeled data in very deep architectures, while the domain discrepancy is further reduced via the use of multiple kernel learning that enhances the statistical power of kernel embedding matching. The overall framework is cast in a minimax game setting. Extensive empirical evidence shows that the proposed networks yield state-of-the-art results on standard visual domain-adaptation benchmarks.
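Kernel embedding matching, as described in the abstract above, compares two distributions through their mean embeddings in an RKHS. One common statistic for this is the maximum mean discrepancy (MMD); the abstract does not name its estimator, so the single Gaussian kernel, the bandwidth, the biased estimator, and the toy 2-D samples below are all assumptions for illustration rather than the paper's multi-kernel formulation.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, bandwidth)
            + mean_kernel(target, target, bandwidth)
            - 2 * mean_kernel(source, target, bandwidth))

rng = random.Random(0)

def sample(mean, n):
    """Draw n points from an isotropic 2-D Gaussian centred at (mean, mean)."""
    return [(rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)) for _ in range(n)]

src = sample(0.0, 100)        # stand-in for source-domain features
tgt_near = sample(0.0, 100)   # target drawn from the same distribution
tgt_far = sample(3.0, 100)    # target drawn from a shifted distribution

mmd_near = mmd_squared(src, tgt_near)
mmd_far = mmd_squared(src, tgt_far)
print(mmd_near, mmd_far)
```

A discrepancy of this kind can be added to a training loss so that features from the two domains are pushed toward matching distributions; the shifted target sample yields the larger value here.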
This paper proposes a generalized prediction system called a recurrent self-evolving fuzzy neural network (RSEFNN) that employs an on-line gradient descent learning rule to address the electroencephalography (EEG) regression problem in brain dynamics for driving fatigue. The cognitive states of drivers significantly affect driving safety; in particular, fatigue driving, or drowsy driving, endangers both the individual and the public. For this reason, the development of brain-computer interfaces (BCIs) that can identify drowsy driving states is a crucial and urgent topic of study. Many EEG-based BCIs have been developed as artificial auxiliary systems for use in various practical applications because of the benefits of measuring EEG signals. In the literature, the efficacy of EEG-based BCIs in recognition tasks has been limited by low resolutions. The system proposed in this paper represents the first attempt to use the recurrent fuzzy neural network (RFNN) architecture to increase adaptability in realistic EEG applications to overcome this bottleneck. This paper further analyzes brain dynamics in a simulated car driving task in a virtual-reality environment. The proposed RSEFNN model is evaluated using the generalized cross-subject approach, and the results indicate that the RSEFNN is superior to competing models regardless of the use of recurrent or nonrecurrent structures.