Online Spatio-Temporal Learning in Deep Neural Networks
Bohnstingl, Thomas; Wozniak, Stanislaw; Pantazi, Angeliki ...
IEEE Transactions on Neural Networks and Learning Systems, 11/2023, Volume: 34, Issue: 11
Journal Article
Open access
Biological neural networks are equipped with an inherent capability to continuously adapt through online learning. This aspect remains in stark contrast to learning with error backpropagation through time (BPTT), which involves offline computation of the gradients due to the need to unroll the network through time. Here, we present an alternative online learning algorithmic framework for deep recurrent neural networks (RNNs) and spiking neural networks (SNNs), called online spatio-temporal learning (OSTL). It is based on insights from biology and proposes the clear separation of spatial and temporal gradient components. For shallow SNNs, OSTL is gradient equivalent to BPTT, enabling for the first time online training of SNNs with BPTT-equivalent gradients. In addition, the proposed formulation unveils a class of SNN architectures trainable online at low time complexity. Moreover, we extend OSTL to a generic form, applicable to a wide range of network architectures, including networks comprising long short-term memory (LSTM) and gated recurrent units (GRUs). We demonstrate the operation of our algorithmic framework on various tasks from language modeling to speech recognition and obtain results on par with the BPTT baselines.
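The separation of spatial and temporal gradient components described above can be illustrated with a toy example. The sketch below is not the OSTL algorithm from the paper; it assumes a single linear leaky unit, s_t = decay * s_(t-1) + w * x_t, with a squared-error loss at every step, and all function and parameter names are hypothetical. The temporal component is carried forward in an eligibility trace, the spatial component is the instantaneous error, and their product is accumulated online without unrolling the sequence.

```python
def online_gradient(w, decay, xs, targets):
    """Accumulate dL/dw step by step for s_t = decay * s_(t-1) + w * x_t,
    with L = sum_t 0.5 * (s_t - y_t)**2. Toy model, assumed for illustration."""
    state = 0.0
    trace = 0.0  # eligibility trace: ds_t/dw (temporal component)
    grad = 0.0
    for x, y in zip(xs, targets):
        state = decay * state + w * x
        trace = decay * trace + x   # recursive update, no unrolling needed
        error = state - y           # learning signal: dL_t/ds_t (spatial component)
        grad += error * trace
    return grad


def loss(w, decay, xs, targets):
    """Total squared-error loss over the whole sequence."""
    state = 0.0
    total = 0.0
    for x, y in zip(xs, targets):
        state = decay * state + w * x
        total += 0.5 * (state - y) ** 2
    return total


def numerical_gradient(w, decay, xs, targets, eps=1e-6):
    """Central finite difference, used only as a reference for the online result."""
    upper = loss(w + eps, decay, xs, targets)
    lower = loss(w - eps, decay, xs, targets)
    return (upper - lower) / (2.0 * eps)


xs = [1.0, 0.5, -0.3, 0.8]
targets = [0.2, 0.1, 0.4, -0.1]
print(online_gradient(0.7, 0.9, xs, targets))
print(numerical_gradient(0.7, 0.9, xs, targets))
```

For this single-unit case the online accumulation matches the gradient of the full-sequence loss, which is the kind of equivalence the abstract claims for shallow networks; deeper architectures need the treatment given in the paper itself.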
In the new era of cognitive computing, systems will be able to learn and interact with the environment in ways that will drastically enhance the capabilities of current processors, especially in extracting knowledge from vast amounts of data obtained from many sources. Brain-inspired neuromorphic computing systems increasingly attract research interest as an alternative to the classical von Neumann processor architecture, mainly because of the coexistence of memory and processing units. In these systems, the basic components are neurons interconnected by synapses. The neurons, based on their nonlinear dynamics, generate spikes that provide the main communication mechanism. The computational tasks are distributed across the neural network, where synapses implement both the memory and the computational units, by means of learning mechanisms such as spike-timing-dependent plasticity. In this work, we present an all-memristive neuromorphic architecture comprising neurons and synapses realized by using the physical properties and state dynamics of phase-change memristors. The architecture employs a novel concept of interconnecting the neurons in the same layer, resulting in level-tuned neuronal characteristics that preferentially process input information. We demonstrate the proposed architecture in the tasks of unsupervised learning and detection of multiple temporal correlations in parallel input streams. The efficiency of the neuromorphic architecture along with the homogeneous neuro-synaptic dynamics implemented with nanoscale phase-change memristors represent a significant step towards the development of ultrahigh-density neuromorphic co-processors.
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. The first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-known natural language processing (NLP) tasks. However, the existing studies are mostly non-automated and tested on a very limited scale. In this work, we examined ChatGPT’s capabilities on 25 diverse analytical NLP tasks, most of them subjective even to humans, such as sentiment analysis, emotion recognition, offensiveness, and stance detection. In contrast, the other tasks require more objective reasoning like word sense disambiguation, linguistic acceptability, and question answering. We also evaluated the GPT-4 model on five selected subsets of NLP tasks. We automated the ChatGPT and GPT-4 prompting process and analyzed more than 49k responses. Our comparison of its results with available State-of-the-Art (SOTA) solutions showed that the average loss in quality of the ChatGPT model was about 25% for zero-shot and few-shot evaluation. For the GPT-4 model, the loss for semantic tasks is significantly lower than for ChatGPT. We showed that the more difficult the task (lower SOTA performance), the higher the ChatGPT loss. This applies especially to pragmatic NLP problems like emotion recognition. We also tested the ability to personalize ChatGPT responses for selected subjective tasks via Random Contextual Few-Shot Personalization, and we obtained significantly better user-based predictions. Additional qualitative analysis revealed a ChatGPT bias, most likely due to the rules imposed on human trainers by OpenAI.
Our results provide the basis for a fundamental discussion of whether the high quality of recent predictive NLP models can indicate a tool’s usefulness to society and how the learning and validation procedures for such systems should be established.
• The results of ChatGPT and GPT-4 evaluation on 25 tasks using 48k+ prompts.
• Context-awareness and personalization are valuable capabilities of ChatGPT.
• ChatGPT and GPT-4 are always worse compared to SOTA methods from 4% to over 70%.
• ChatGPT loss tends to be higher for more difficult reasoning problems.
• ChatGPT can boost AI development and change our daily lives.
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NLZOH, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UILJ, UL, UM, UPCLJ, UPUK, ZAGLJ, ZRSKP
Deep spiking neural networks (SNNs) offer the promise of low-power artificial intelligence. However, training deep SNNs from scratch or converting deep artificial neural networks to SNNs without loss of performance has been a challenge. Here we propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron. For our constructive proof, we assume that an arbitrary multi-layer ReLU network with or without convolutional layers, batch normalization and max pooling layers was trained to high performance on some training set. Furthermore, we assume that we have access to a representative example of input data used during training and to the exact parameters (weights and biases) of the trained ReLU network. The mapping from deep ReLU networks to SNNs causes zero percent drop in accuracy on CIFAR10, CIFAR100 and the ImageNet-like data sets Places365 and PASS. More generally, our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
In biologically inspired spiking neural networks (SNNs), neurons communicate by short pulses, called spikes. SNNs have the potential to be more power efficient than artificial neural networks (ANNs), thanks to the fewer computational steps required by the spike transmission and processing, as compared to the multiply-and-accumulate (MAC) operations with wide bit-vectors usually adopted in ANNs. We present the design of two types of SNNs with integrate-and-fire dynamics and single-spike-per-neuron operation, where neural communication is based on synchronous time-to-first-spike (sTTFS) and time-to-first-spike (TTFS) encoding schemes. In the considered time-encoded SNNs, the information is carried by the timing of the spikes with respect to a reference time. In 7 nm CMOS technology, both designs are synthesized as VHDL-based random-logic-macros (RLMs) and compared to an equivalent ANN design in terms of power consumption, latency and silicon area, using the Iris data set for inference. A cost function expressed as a product of energy consumption and silicon area is introduced to compare the three network designs. With respect to this cost function, it turns out that the SNN-TTFS implemented for the considered classification task outperforms the ANN used as baseline model.
Spiking neural networks (SNNs) mimic computationally powerful biologically inspired models in which neurons communicate through sequences of spikes, regarded here as sparse binary sequences of zeros and ones. In neuroscience it is conjectured that time encoding, where the information is carried by the temporal position of spikes, plays a crucial role at least in some parts of the brain where estimation of the spiking rate with a large latency cannot take place. Motivated by the efficiency of temporal coding, compared with the widely used rate coding, the goal of this paper is to develop and train an energy-efficient time-coded deep spiking neural network system. To ensure that the similarity among input stimuli is translated into a correlation of the spike sequences, we introduce correlative temporal encoding and extended correlative temporal encoding techniques to map analog input information into input spike patterns. Importantly, we propose an implementation where all multiplications in the system are replaced with at most a few additions. As a more efficient alternative to both rate-coded SNNs and artificial neural networks, such a system represents a preferable solution for the implementation of neuromorphic hardware. We consider data classification tasks where input spike patterns are presented to a feed-forward architecture with leaky-integrate-and-fire neurons. The SNN is trained by backpropagation through time with the objective to match sequences of output spikes with those of specifically designed target spike patterns, each corresponding to exactly one class. During inference, the target spike pattern with the smallest van Rossum distance from the output spike pattern determines the class. Extensive simulations indicate that the proposed system achieves a classification accuracy on par with that of state-of-the-art machine learning models.
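The inference rule in this abstract, choosing the class whose target spike pattern has the smallest van Rossum distance from the output spike pattern, can be sketched as follows. This is a minimal discrete-time approximation and not the authors' implementation: it assumes binary spike sequences on a fixed time grid, a causal exponential kernel, and hypothetical values for the time constant `tau` and step `dt`.

```python
import math


def van_rossum_distance(train_a, train_b, tau=5.0, dt=1.0):
    """Discrete-time van Rossum distance between two equal-length binary spike trains.

    Each train is filtered with a causal exponential kernel exp(-t / tau);
    the distance is the scaled L2 norm of the difference of the filtered signals.
    """
    decay = math.exp(-dt / tau)
    filtered_a = 0.0
    filtered_b = 0.0
    total = 0.0
    for spike_a, spike_b in zip(train_a, train_b):
        filtered_a = filtered_a * decay + spike_a
        filtered_b = filtered_b * decay + spike_b
        total += (filtered_a - filtered_b) ** 2
    return math.sqrt(total * dt / tau)


def classify(output_train, target_trains, tau=5.0):
    """Return the index of the target pattern closest to the output pattern."""
    distances = [van_rossum_distance(output_train, t, tau) for t in target_trains]
    return min(range(len(distances)), key=distances.__getitem__)


# Hypothetical target patterns, one per class, and a slightly jittered output.
targets = [
    [1, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
]
output = [1, 0, 0, 0, 0, 1, 0, 0]
print(classify(output, targets))
```

Because the kernel smooths each spike over time, a spike that arrives one step late is penalized far less than a missing spike, which is why a timing-based distance suits time-coded outputs.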
This review provides a high-level synthesis of significant recent advances in artificial neural network research, as well as multi-disciplinary concepts connected to the far-reaching goal of obtaining intelligent systems. We assume that a global outlook of these interconnected fields can benefit researchers by providing alternative viewpoints. Therefore, we present different network and neuron models, we discuss model parameters and the means to obtain them, and we draw a quick outline of information encoding, before proceeding to an overview of the relevant learning mechanisms, ranging from established approaches to novel ideas. We specifically focus on comparing the classical artificial model with the biologically-feasible spiking neuron, and we take this comparison further into a discussion on the biological plausibility of various learning approaches.
Neuromorphic computing takes inspiration from the brain to build highly parallel, energy- and area-efficient architectures. Recently, hardware realizations of neurons and synapses using memristive devices were proposed and applied for the task of correlation detection. However, for weakly correlated signals, this task becomes challenging because of the variability and the asymmetric conductance response of the memristive devices. In this brief, we propose a high-density memristive system realized using nanodevices based on phase-change technology. We present a noise-robust phase-change implementation of a neuron and a synaptic learning rule that is capable of capturing patterns of weakly correlated inputs. We experimentally demonstrate the operation with a correlation coefficient as low as 0.2 using a record number of 1M phase-change synapses.
Spiking neural networks (SNNs) incorporating biologically plausible neurons hold great promise because of their unique temporal dynamics and energy efficiency. However, SNNs have developed separately from artificial neural networks (ANNs), limiting the impact of deep learning advances for SNNs. Here, we present an alternative perspective of the spiking neuron that incorporates its neural dynamics into a recurrent ANN unit called a spiking neural unit (SNU). SNUs may operate as SNNs, using a step function activation, or as ANNs, using continuous activations. We demonstrate the advantages of SNU dynamics through simulations on multiple tasks and obtain accuracies comparable to, or better than, those of ANNs. The SNU concept enables an efficient implementation with in-memory acceleration for both training and inference. We experimentally demonstrate its efficacy for a music-prediction task in an in-memory-based SNN accelerator prototype using 52,800 phase-change memory devices. Our results open up an avenue for broad adoption of biologically inspired neural dynamics in challenging applications and acceleration with neuromorphic hardware.
Spiking neural networks and in-memory computing are both promising routes towards energy-efficient hardware for deep learning. Woźniak et al. incorporate the biologically inspired dynamics of spiking neurons into conventional recurrent neural network units and in-memory computing, and show how this allows for accurate and energy-efficient deep learning.
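The idea of a recurrent unit that behaves as a spiking neuron with a step activation, or as a conventional unit with a continuous activation, can be sketched for a single scalar unit. The abstract does not give the update equations, so the form below is an assumption: a leaky integrate-and-fire state that decays over time and is reset after the unit emits an output. The function name and the values of `weight`, `decay`, and `bias` are hypothetical.

```python
import math


def snu_step(x, state, prev_output, weight=1.0, decay=0.8, bias=-1.0, spiking=True):
    """One time step of a single spiking-neural-unit-style cell (assumed form).

    The state integrates the weighted input, leaks by `decay`, and is reset
    in proportion to the previous output. The output uses a step function in
    spiking mode and a sigmoid in continuous mode.
    """
    state = max(0.0, weight * x + decay * state * (1.0 - prev_output))
    pre_activation = state + bias  # bias acts as a negative firing threshold
    if spiking:
        output = 1.0 if pre_activation > 0.0 else 0.0
    else:
        output = 1.0 / (1.0 + math.exp(-pre_activation))
    return state, output


def run(inputs, spiking=True):
    """Apply the unit to a sequence of scalar inputs and collect its outputs."""
    state, output = 0.0, 0.0
    outputs = []
    for x in inputs:
        state, output = snu_step(x, state, output, spiking=spiking)
        outputs.append(output)
    return outputs


print(run([0.5] * 6))                 # spiking mode: binary outputs
print(run([0.5] * 6, spiking=False))  # continuous mode: values between 0 and 1
```

With a constant input, the spiking mode charges the state over a few steps, fires once it exceeds the threshold, and then starts over, while the continuous mode applies a soft version of the same reset.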
Plasticity circuits in the brain are known to be influenced by the distribution of the synaptic weights through the mechanisms of synaptic integration and local regulation of synaptic strength. However, the complex interplay of stimulation-dependent plasticity with local learning signals is disregarded by most of the artificial neural network training algorithms devised so far. Here, we propose a novel biologically inspired optimizer for artificial and spiking neural networks that incorporates key principles of synaptic plasticity observed in cortical dendrites: GRAPES (Group Responsibility for Adjusting the Propagation of Error Signals). GRAPES implements a weight-distribution-dependent modulation of the error signal at each node of the network. We show that this biologically inspired mechanism leads to a substantial improvement of the performance of artificial and spiking networks with feedforward, convolutional, and recurrent architectures, it mitigates catastrophic forgetting, and it is optimally suited for dedicated hardware implementations. Overall, our work indicates that reconciling neurophysiology insights with machine intelligence is key to boosting the performance of neural networks.