Bioinspired spiking neural networks (SNNs), operating with asynchronous binary signals (or spikes) distributed over time, can potentially lead to greater computational efficiency on event-driven hardware. State-of-the-art SNNs suffer from high inference latency, resulting from inefficient input encoding and suboptimal settings of the neuron parameters (firing threshold and membrane leak). We propose DIET-SNN, a low-latency deep spiking network trained with gradient descent to optimize the membrane leak and the firing threshold along with the other network parameters (weights). The membrane leak and threshold of each layer are optimized with end-to-end backpropagation to achieve competitive accuracy at reduced latency. The input layer directly processes the analog pixel values of an image without converting them to a spike train. The first convolutional layer converts the analog inputs into spikes: leaky-integrate-and-fire (LIF) neurons integrate the weighted inputs and generate an output spike when the membrane potential crosses the trained firing threshold. The trained membrane leak selectively attenuates the membrane potential, which increases activation sparsity in the network. The reduced latency combined with high activation sparsity provides massive improvements in computational efficiency. We evaluate DIET-SNN on image classification tasks from the CIFAR and ImageNet datasets on VGG and ResNet architectures. We achieve a top-1 accuracy of 69% with five timesteps (inference latency) on the ImageNet dataset with 12× less compute energy than an equivalent standard artificial neural network (ANN). In addition, DIET-SNN performs 20-500× faster inference compared with other state-of-the-art SNN models.
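The LIF dynamic with a trained leak and threshold is straightforward to sketch. Below is a minimal illustrative implementation for a single neuron; the function name, the soft-reset-by-subtraction choice, and the scalar treatment are assumptions for illustration, not DIET-SNN's actual code (there, `leak` and `threshold` are learnable per layer):

```python
import numpy as np

def lif_forward(inputs, leak, threshold):
    """Simulate one leaky-integrate-and-fire neuron over T timesteps.

    inputs: array of shape (T,) -- weighted input current per timestep.
    leak: membrane leak factor in (0, 1]; attenuates the potential each step.
    threshold: firing threshold the membrane potential must cross to spike.
    Returns the binary output spike train.
    """
    v = 0.0
    spikes = np.zeros_like(inputs)
    for t, x in enumerate(inputs):
        v = leak * v + x          # leaky integration of the weighted input
        if v >= threshold:        # fire when the potential crosses threshold
            spikes[t] = 1.0
            v -= threshold        # soft reset: subtract the threshold
    return spikes
```

A smaller leak (stronger attenuation) makes the neuron harder to drive to threshold, which is the mechanism behind the increased activation sparsity described above.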
To reduce the time and manual effort required to fine-tune an extended Kalman filter (EKF), we propose a new learning framework, EKFNet, that automatically estimates the best process and measurement noise covariance parameters for an EKF from real measurement data. EKFNet is trained end-to-end using backpropagation through time (BPTT) over the EKF. The forward operation of EKFNet is identical to the normal EKF operation used during tracking. During offline training, EKFNet uses BPTT to pass the gradient flow to each time step and optimize the unknown noise statistics. The proposed method can use several optimization criteria, such as maximizing the likelihood, minimizing the measurement residual error, or minimizing the posterior state estimation error, either with or without ground-truth data. The method's performance is demonstrated on real GPS data, where it outperforms an existing method and a manually tuned EKF.
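The recursion that EKFNet differentiates through is the standard Kalman predict/update cycle, in which the covariances Q and R appear explicitly. A minimal sketch of one cycle (with a linear measurement model for brevity; all names are illustrative, not from the paper):

```python
import numpy as np

def ekf_step(x, P, z, F, H, Q, R):
    """One predict+update cycle of a Kalman filter.

    Q (process noise) and R (measurement noise) are the covariances
    EKFNet learns by backpropagating through this recursion; here they
    are plain arguments.
    """
    # predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # update
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

Because every operation above is differentiable in Q and R, unrolling this step over a measurement sequence and applying BPTT yields gradients of any of the listed criteria with respect to the noise statistics.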
This paper presents a low-level controller for an unmanned surface vehicle based on adaptive dynamic programming and deep reinforcement learning. The approach uses a single deep neural network capable of self-learning a policy that drives the surge speed and yaw dynamics of a vessel. A simulation of the vehicle's mathematical model was used to train the neural network with the model-based backpropagation through time algorithm, which can handle continuous action spaces. The path-following control scenario is additionally addressed by combining the proposed low-level controller with a line-of-sight guidance law with time-varying look-ahead distance. Simulation and real-world experimental results validate the control capabilities of the proposed approach and contribute to the diversity of validated applications of adaptive-dynamic-programming-based control strategies. Results show the controller is capable of self-learning the policy to drive the surge speed and yaw dynamics, and improves on the performance of a standard controller.
• Adaptive dynamic programming can control the speed and heading dynamics of a USV.
• Backpropagation through time can train a DNN using a simulation model of the vehicle.
• Real-world results achieve low error and control effort with set-point regulation.
• When combined with LOS-based guidance, accurate path-following is demonstrated.
This study implements a recurrent neural network (RNN) by comparing two RNN structures, Elman and Jordan, trained with the backpropagation through time (BPTT) algorithm for foreign exchange forecasting. The activation functions considered are the linear transfer function, the tan-sigmoid transfer function (Tansig), and the log-sigmoid transfer function (Logsig), applied to the hidden and output layers. The comparison shows that the log-sigmoid transfer function is the most appropriate activation function for the hidden layer, while the linear transfer function is the most appropriate for the output layer. Based on the training and forecasting results for the USD against the IDR, the Elman BPTT method outperforms the Jordan BPTT method, with the best results at the 4000th iteration for both. The lowest root mean square error (RMSE) values for training and one-day-ahead forecasting produced by Elman BPTT were 0.073477 and 122.15, while the Jordan BPTT method yielded 0.130317 and 222.96, respectively.
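The winning configuration above, an Elman network with a log-sigmoid hidden layer and a linear output layer, has a compact forward pass. A minimal sketch (weight names and shapes are illustrative, not the study's implementation):

```python
import numpy as np

def logsig(x):
    """Log-sigmoid transfer function (Logsig)."""
    return 1.0 / (1.0 + np.exp(-x))

def elman_forward(xs, Wx, Wh, Wo, bh, bo):
    """Elman RNN forward pass: the hidden state feeds back to itself.

    Uses a log-sigmoid hidden activation and a linear output layer,
    the combination the study found most appropriate.
    """
    h = np.zeros(Wh.shape[0])
    ys = []
    for x in xs:
        h = logsig(Wx @ x + Wh @ h + bh)   # context = previous hidden state
        ys.append(Wo @ h + bo)             # linear output layer
    return np.array(ys)
```

A Jordan network differs only in what is fed back: the previous output rather than the previous hidden state.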
Online Spatio-Temporal Learning in Deep Neural Networks
Bohnstingl, Thomas; Wozniak, Stanislaw; Pantazi, Angeliki; et al.
IEEE Transactions on Neural Networks and Learning Systems, 11/2023, Volume 34, Issue 11
Journal Article, Open access
Biological neural networks are equipped with an inherent capability to continuously adapt through online learning. This aspect remains in stark contrast to learning with error backpropagation through time (BPTT), which involves offline computation of the gradients due to the need to unroll the network through time. Here, we present an alternative online learning algorithmic framework for deep recurrent neural networks (RNNs) and spiking neural networks (SNNs), called online spatio-temporal learning (OSTL). It is based on insights from biology and proposes a clear separation of spatial and temporal gradient components. For shallow SNNs, OSTL is gradient-equivalent to BPTT, enabling for the first time online training of SNNs with BPTT-equivalent gradients. In addition, the proposed formulation unveils a class of SNN architectures trainable online at low time complexity. Moreover, we extend OSTL to a generic form applicable to a wide range of network architectures, including networks comprising long short-term memory (LSTM) and gated recurrent units (GRUs). We demonstrate the operation of our algorithmic framework on tasks ranging from language modeling to speech recognition and obtain results on par with the BPTT baselines.
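The key contrast with BPTT, carrying the temporal gradient component forward instead of unrolling backward, can be illustrated on a scalar leaky unit y_t = a·y_{t-1} + w·x_t. The example below is only an illustration of the online-gradient flavour OSTL builds on, not OSTL itself:

```python
def online_grad(xs, w, a):
    """Online gradient for the scalar leaky unit y_t = a*y_{t-1} + w*x_t.

    Instead of unrolling the recursion backward through time (BPTT),
    carry an eligibility-style trace e_t = d y_t / d w = a*e_{t-1} + x_t
    forward alongside the state, so the gradient of any loss on y_t is
    available at every step, online.
    """
    y, e = 0.0, 0.0
    grads = []
    for x in xs:
        y = a * y + w * x
        e = a * e + x           # temporal gradient component, updated forward
        grads.append(e)         # exact d y_t / d w, with no unrolling
    return grads
```

For this unit the forward trace is exact: after two steps with x = 1, y_2 = w(a + 1), so dy_2/dw = a + 1, which is precisely what the trace accumulates.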
Fuzzy Cognitive Mapping (FCM) and the extensive family of models derived from it have firmly established their position in the landscape of machine learning algorithms. Specifically designed for pattern classification and multi-output regression, the recently introduced Recurrence-aware Long-term Cognitive Network (r-LTCN) model is one of these FCM-inspired extensions. On the one hand, this recurrent neural network connects all temporal states generated during the reasoning process with the decision-making layer. On the other hand, it uses a quasi-nonlinear reasoning rule devoted to avoiding convergence issues caused by unique fixed points, which typically emerge in other FCM models. In the original paper, the authors employed a combination of unsupervised and supervised learning to compute the r-LTCN's learnable parameters. Despite r-LTCNs' astounding performance on a wide variety of pattern classification problems, the literature reports no attempt to train these recurrent neural systems in a fully supervised manner, nor does it provide insights into their performance in other machine learning settings. This paper brings forward a modified Backpropagation Through Time (BPTT) algorithm devoted to training r-LTCN models for multi-output regression tasks rather than pattern classification. The proposed BPTT includes a simple yet effective mechanism to deal with the vanishing gradient within the recurrent layer, which operates as a closed system, while being tailored to the quasi-nonlinear reasoning mechanism. Empirical evaluation of the proposed BPTT algorithm on 20 multi-output regression problems reveals that it produces lower prediction errors than other state-of-the-art learning approaches.
Spiking neural networks (SNNs) are broadly deployed in neuromorphic devices to emulate brain function. In this context, SNN security becomes important yet lacks in-depth investigation. To this end, we target adversarial attacks against SNNs and identify several challenges distinct from attacking artificial neural networks (ANNs): 1) current adversarial attacks are mainly based on gradient information, which in SNNs presents in a spatiotemporal pattern and is hard to obtain with conventional backpropagation algorithms; 2) the continuous gradient of the input is incompatible with the binary spiking input during gradient accumulation, hindering the generation of spike-based adversarial examples; and 3) the input gradient can sometimes be all-zeros (i.e., vanishing) due to the zero-dominant derivative of the firing function. Recently, backpropagation through time (BPTT)-inspired learning algorithms have been widely introduced into SNNs to improve performance, which brings the possibility of attacking the models accurately given spatiotemporal gradient maps. We propose two approaches to address the challenges of gradient-input incompatibility and gradient vanishing. Specifically, we design a gradient-to-spike (G2S) converter to convert continuous gradients to ternary ones compatible with spike inputs. We then design a restricted spike flipper (RSF) to construct ternary gradients that can randomly flip the spike inputs with a controllable turnover rate when all-zero gradients are met. Putting these methods together, we build an adversarial attack methodology for SNNs. Moreover, we analyze the influence of the training loss function and the firing threshold of the penultimate layer on the attack effectiveness. Extensive experiments are conducted to validate our solution. Besides the quantitative analysis of the influence factors, we also compare SNNs and ANNs against adversarial attacks under different attack methods.
This work can help reveal what happens in SNN attacks and might stimulate more research on the security of SNN models and neuromorphic devices.
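The G2S idea, turning a continuous input gradient into a perturbation that keeps the input binary, can be illustrated as follows. The magnitude threshold and masking rule here are assumptions for illustration, not the paper's exact construction:

```python
import numpy as np

def g2s_convert(grad, spikes, thresh=None):
    """Sketch of a gradient-to-spike (G2S) style converter.

    Ternarize a continuous input gradient to {-1, 0, +1}, then mask out
    directions that would push a binary spike outside {0, 1}: a +1
    perturbation is only valid where the spike is 0, and a -1 only
    where it is 1.
    """
    if thresh is None:
        thresh = np.mean(np.abs(grad))   # keep only salient gradient entries
    tern = np.sign(grad) * (np.abs(grad) >= thresh)
    tern = np.where((tern > 0) & (spikes == 1), 0, tern)  # cannot raise a 1
    tern = np.where((tern < 0) & (spikes == 0), 0, tern)  # cannot lower a 0
    return tern
```

Adding the returned ternary perturbation to the spike input then always yields another valid binary spike pattern, which is the compatibility property the converter is designed for.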
This paper investigates how to train a recurrent neural network (RNN) using the Levenberg-Marquardt (LM) algorithm and how to implement optimal control of a grid-connected converter (GCC) using an RNN. To train an RNN successfully and efficiently with the LM algorithm, a new forward accumulation through time (FATT) algorithm is proposed to calculate the Jacobian matrix required by LM, and the paper explores how to incorporate FATT into the LM algorithm. The results show that the combination of the LM and FATT algorithms trains RNNs better than the conventional backpropagation through time algorithm. The paper also presents an analytical study of the optimal control of GCCs, including theoretically ideal optimal and suboptimal controllers. To overcome the inapplicability of the ideal optimal GCC controller under practical conditions, a new RNN controller with an improved input structure is proposed to approximate it. The performance of the ideal optimal controller and a well-trained RNN controller was compared in close-to-real-life power converter switching environments, demonstrating that the proposed RNN controller can achieve close-to-ideal optimal control performance even under low sampling rates. The excellent performance of the proposed RNN controller under challenging and distorted system conditions further indicates the feasibility of using an RNN to approximate optimal control in practical applications.
Computed tomography (CT) is one of the most important medical imaging technologies in use today. Most commercial CT products use a technique known as filtered backprojection (FBP), which is fast and can produce decent image quality when the X-ray dose is high. However, FBP is not good enough for low-dose X-ray CT imaging, where the reconstruction problem becomes more stochastic. A more effective technique, proposed recently and implemented in a limited number of commercial CT products, is iterative reconstruction (IR). The IR technique is based on a Bayesian formulation of the CT image reconstruction problem with an explicit model of the CT scanning process, including its stochastic nature, and a prior model that incorporates knowledge about what a good CT image should look like. However, constructing such prior knowledge is more complicated than it seems. In this article, we propose a novel neural network for CT image reconstruction. The network is based on the IR formulation and constructed with a recurrent neural network (RNN). Specifically, we transform the gated recurrent unit (GRU) into a neural network performing CT image reconstruction; we call it "GRU reconstruction." This neural network conducts concurrent dual-domain learning. Many deep learning (DL)-based methods in medical imaging use single-domain learning, but dual-domain learning performs better because it learns from both the sinogram and the image domain. In addition, we propose backpropagation through stage (BPTS) as a new RNN backpropagation algorithm: it is similar to backpropagation through time (BPTT) but tailored for iterative optimization.
Results from extensive experiments indicate that our proposed method outperforms conventional model-based methods, single-domain DL methods, and state-of-the-art DL techniques in terms of the root mean squared error (RMSE), the peak signal-to-noise ratio (PSNR), and the structure similarity (SSIM) and in terms of visual appearance.
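The gating structure that "GRU reconstruction" repurposes is the standard GRU update, where each recurrence step would correspond to one stage of the iterative reconstruction. A minimal sketch of one standard GRU step (generic weights, not the paper's architecture):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One standard gated recurrent unit (GRU) update.

    h: previous hidden state; x: current input.  In "GRU reconstruction"
    the analogue of a timestep is one iteration (stage) of the
    reconstruction, which is why BPTS backpropagates through stages.
    """
    z = sigmoid(Wz @ x + Uz @ h)             # update gate
    r = sigmoid(Wr @ x + Ur @ h)             # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h)) # candidate state
    return (1.0 - z) * h + z * h_tilde       # gated interpolation
```

The update gate z interpolates between keeping the previous estimate and adopting the candidate, a natural fit for damped iterative refinement.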
In forecasting foreign currencies (foreign exchange), both fundamental analysis and technical analysis can be used. Fundamental analysis relies on external factors or news happening in the market, while technical analysis studies the price itself through graphs and mathematical formulas. This study combines fundamental and technical analysis to predict the Rupiah (IDR) against the Dollar (USD). The artificial neural network architectures compared are Backpropagation and Recurrent Neural Networks (RNNs); the RNN architectures used are Elman and Jordan, trained with the Backpropagation Through Time (BPTT) learning algorithm. Technical analysis is applied by entering the USD exchange rate against the IDR at a given time, while fundamental analysis is applied by entering the values of fundamental factors as part of the training data set. The fundamental data used are the inflation rate, interest rate, money supply, and the value of exports and imports in Rupiah. The study also compares a prediction system using technical data alone against one using both technical and fundamental data. The results show that the prediction system using the Elman RNN algorithm is better than those using the Backpropagation and Jordan RNN algorithms, and that a prediction system trained on both time-series and fundamental data is better than one trained on time-series data alone. Thus, the best prediction system in this study uses the Elman RNN algorithm with training data consisting of the time-series USD sell rate against the IDR and fundamental data.