To solve a navigation task based on experiences, we need a mechanism to associate places with objects and recall them along the course of action. In a reward-oriented task, if the route to a reward location is simulated in the mind after experiencing it once, the reward might be obtained efficiently. One way to achieve this is to incorporate a biologically plausible mechanism. In this study, we propose a neural network that stores a sequence of events associated with a reward. The proposed network recalls the reward location by tracing the events in its mind in order. We simulated a virtual mouse that explores a figure-eight maze and recalls the route to the reward location. During the learning period, a sequence of events, represented by firing along a passage, was temporarily stored in the heteroassociative network, and the sequence was consolidated into the synaptic weight matrix when a reward was fed. For retrieval, an impetus input internally generates the sequential activation of conjunctive cue-place cells toward the reward location. In the figure-eight maze task, the location of the reward was estimated by mind travel, irrespective of whether the reward was on the counterclockwise route or the distant clockwise route. A brain-based mechanism for efficiently reaching a goal by mind travel over past experiences is beneficial for mobile service robots that perform autonomous navigation.
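The core idea of sequence storage and replay can be illustrated with a minimal sketch, assuming simple one-hot "cue-place" patterns and Hebbian heteroassociative weights; the pattern encoding and network size here are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

# Hypothetical sketch: one-hot cue-place patterns for 5 locations along a route.
n = 5
patterns = np.eye(n)  # pattern t encodes location t

# Heteroassociative weights map each event to its successor (Hebbian outer products).
W = np.zeros((n, n))
for t in range(n - 1):
    W += np.outer(patterns[t + 1], patterns[t])

# "Mind travel": an impetus input (the start location) regenerates the sequence.
state = patterns[0]
recalled = [int(state.argmax())]
for _ in range(n - 1):
    state = (W @ state > 0.5).astype(float)  # threshold the next-step activation
    recalled.append(int(state.argmax()))

# recalled now traces the stored route in order, ending at the reward location.
```

Because the weight matrix links each event to its successor, a single cue at the start suffices to regenerate the whole experienced sequence internally.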
This paper proposes a shared synapse architecture for autoencoders (AEs) and implements an AE with the proposed architecture as a digital circuit on a field-programmable gate array (FPGA). In the proposed architecture, synapse weight values are shared between the synapses of the input and hidden layers and between the synapses of the hidden and output layers. This architecture consumes fewer of the limited resources of an FPGA than an architecture that does not share synapse weights, halving the number of synapse modules used. So that the proposed circuit can be configured into various types of AEs, it exposes three kinds of parameters: one to change the number of units in each layer, one to change the bit width of internal values, and one to set the learning rate. By altering the network configuration through these parameters, the proposed architecture can also be used to construct a stacked AE. The proposed circuits are logically synthesized, and their resource utilization is determined. Our experimental results show that single and stacked AE circuits using the proposed shared synapse architecture operate as regular AEs and regular stacked AEs. The scalability of the proposed circuit and the relationship between bit widths and learning results are also determined. The clock cycles of the proposed circuits are formulated, and this formula is used to estimate the theoretical performance of the circuit when it is used to construct arbitrary networks.
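The weight-sharing idea corresponds to the familiar tied-weight AE, where the decoder reuses the transposed encoder matrix; the sketch below shows this in software with assumed layer sizes and activation, as a functional analogue of the shared-synapse circuit rather than the circuit itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tied-weight AE: a single matrix W serves both encoder and decoder,
# so only one synapse-weight array is stored (halving the synapse modules).
n_in, n_hid = 8, 3
W = rng.normal(0.0, 0.1, (n_hid, n_in))
b, c = np.zeros(n_hid), np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = sigmoid(W @ x + b)    # input -> hidden uses W
    y = sigmoid(W.T @ h + c)  # hidden -> output reuses the same W, transposed
    return h, y

x = rng.random(n_in)
h, y = forward(x)
```

Stacking then amounts to feeding each layer's hidden code `h` into another tied-weight pair, which matches the abstract's claim that the same parameterized circuit can be reconfigured into a stacked AE.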
Reservoir computing (RC) can efficiently process time-series data by mapping the input signal into a high-dimensional space via randomly connected recurrent neural networks (RNNs), referred to as a reservoir. The high-dimensional representation of time-series data in the reservoir simplifies subsequent learning tasks. Although this simple architecture allows fast learning and facile physical implementation, its learning performance is inferior to that of other state-of-the-art RNN models. In this study, to improve the learning ability of RC, we propose self-modulated RC (SM-RC), which extends RC with a self-modulation mechanism. SM-RC can perform attention tasks in which input information is retained or discarded depending on the input signal. We find that a chaotic state can emerge as a result of learning in SM-RC. Furthermore, we demonstrate that SM-RC outperforms RC on NARMA and Lorenz-model tasks. Because the SM-RC architecture requires only two additional gates, it can be physically implemented as readily as RC, providing a direction for realizing edge artificial intelligence.
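A minimal echo-state sketch conveys the flavor of such gating: a standard reservoir update is augmented with a state-dependent input gate that can retain or discard the input. The gate form below is a simplified stand-in chosen for illustration, not the paper's exact two-gate mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal echo-state reservoir with an assumed self-modulation (input) gate.
N = 50
W_res = rng.normal(0.0, 1.0, (N, N))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # spectral radius < 1
W_in = rng.normal(0.0, 1.0, N)
w_gate = rng.normal(0.0, 0.1, N)  # hypothetical learned gate weights

x = np.zeros(N)
for t in range(100):
    u = np.sin(0.2 * t)                      # toy input signal
    g = 1.0 / (1.0 + np.exp(-(w_gate @ x)))  # gate in (0, 1): retain vs. discard
    x = np.tanh(W_res @ x + W_in * (g * u))  # gated reservoir update
```

Training only the readout (and, in SM-RC, the small number of gate parameters) preserves the fast-learning advantage over fully trained RNNs.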
Boltzmann machines (BMs) are useful in various applications but are limited by their requirement for random numbers. In contrast, chaotic Boltzmann machines (CBMs) are neural networks that imitate the stochastic behavior of BMs with chaotic, deterministic dynamics, without random numbers. Because they need no random number generators, CBMs can require fewer hardware resources than the original algorithm. In this study, hardware-oriented algorithms and a differential multiply-accumulate operation are proposed to overcome the difficulties of implementing CBMs on field-programmable gate arrays (FPGAs). The hardware-oriented algorithm, which uses fixed-point operations and shift operations, reduces hardware resource utilization in the implemented circuits. In particular, the differential multiply-accumulate operation allows the multiply-accumulate operation to be implemented with block random access memory and digital signal processors, reducing the consumption of lookup tables and flip-flops in FPGAs without losing calculation speed. Our proposed approach was evaluated through numerical simulations, logic synthesis, and FPGA implementation. The calculation speed of the FPGA-implemented CBM was compared with that of a software implementation: for a 300-neuron CBM, the calculation time was reduced to 1/6,500. Moreover, a 2,048-neuron CBM was realized through logic synthesis. The proposed hardware implementation of CBMs was thus shown to be feasible, enabling combinatorial optimization problems to be solved at larger scales with fewer resources.
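The differential idea can be sketched in software: because only one binary state flips at a time when an internal state hits a boundary, the local fields can be updated by adding or subtracting a single weight column instead of recomputing the full multiply-accumulate. The CBM dynamics below follow the commonly published form; the network size, step size, and temperature are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy CBM: deterministic chaotic dynamics emulate Gibbs-like sampling.
n = 8
W = rng.normal(0.0, 1.0, (n, n))
W = (W + W.T) / 2.0
np.fill_diagonal(W, 0.0)  # symmetric couplings, no self-connections
s = rng.integers(0, 2, n)  # binary states
x = rng.random(n)          # internal analog states in (0, 1)
z = W @ s.astype(float)    # local fields, maintained incrementally below
dt, T = 0.01, 1.0

for _ in range(2000):
    # dx/dt = (1 - 2s) * (1 + exp((1 - 2s) * z / T))
    x += dt * (1 - 2 * s) * (1.0 + np.exp((1 - 2 * s) * z / T))
    for i in np.where((x <= 0.0) | (x >= 1.0))[0]:
        x[i] = np.clip(x[i], 0.0, 1.0)
        s[i] ^= 1
        # Differential MAC: a single flip changes z by +/- one column of W,
        # avoiding a full O(n^2) recomputation of W @ s.
        z += W[:, i] * (2 * s[i] - 1)
```

On an FPGA this incremental update is what lets the accumulation live in BRAM/DSP blocks while keeping calculation speed.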
This paper proposes a time-domain analog calculation model based on a pulse-width modulation (PWM) approach for neural network calculations, including the weighted-sum (multiply-and-accumulate) calculation and the rectified linear unit operation. We also propose very-large-scale integration (VLSI) circuits to implement the proposed model. Unlike conventional analog voltage- or current-mode circuits, our circuits use the transient charging/discharging of capacitors through resistors. Since the circuits calculate multiple weighted sums by charging a capacitance, they can operate with extremely low energy consumption. However, because a relatively long time constant is required to guarantee calculation resolution in the time domain, they must use very high-resistance devices, on the order of giga-ohms. We designed, fabricated, and tested a proof-of-concept complementary metal-oxide-semiconductor (CMOS) VLSI chip in a 250-nm fabrication technology to verify the weighted-sum operation based on the proposed model with binary weights and PWM input signals, realizing the BinaryConnect model. In the chip, static random-access memory (SRAM) cells store the synaptic connection weights. High-resistance operation was realized by using the subthreshold region of MOS transistors, unlike in ordinary in-memory-computing circuits. We evaluated the energy efficiency and temperature characteristics by measurement on the fabricated chip; the highest energy efficiency for the weighted-sum calculation was 300 TOPS/W (tera-operations per second per watt). The effects of temperature changes can be compensated for by adjusting the bias voltage. If state-of-the-art VLSI technology is used to implement the proposed model, an energy efficiency of more than 1,000 TOPS/W will be possible.
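The time-domain principle reduces to simple charge arithmetic: each input is a pulse width, each binary weight selects charging or discharging through a resistor, and the capacitor voltage accumulates the weighted sum. The component values below are rough illustrative assumptions (only the giga-ohm resistance scale comes from the abstract), and the current is a crude linearization.

```python
# Time-domain weighted-sum sketch: V_out ~ (1/C) * I * sum(w_i * t_i),
# where t_i is the PWM pulse width and w_i in {+1, -1} selects
# charging or discharging of a shared capacitor through a resistor.
R = 5e9       # ~giga-ohm resistance (scale per the abstract)
C = 100e-15   # 100 fF capacitor (assumed)
I = 1.0 / R   # charging current at ~1 V drop (rough linearization)

weights = [+1, -1, +1, +1]                  # BinaryConnect-style binary weights
pulse_widths = [2e-6, 1e-6, 3e-6, 0.5e-6]   # PWM-encoded inputs, in seconds

q = I * sum(w * t for w, t in zip(weights, pulse_widths))  # net charge
v_out = q / C                                              # ~9 mV here
```

The long RC time constant (here 5 GΩ × 100 fF = 0.5 ms) is what provides resolution over microsecond pulse widths, which is why subthreshold MOS devices are used to realize such high resistances.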
This study develops an intelligent system for home service robots that mimics human brain function and can manage both common knowledge applicable to any environment and local knowledge reflecting the robot's specific environment. Deep learning is effective for acquiring common knowledge, because its performance depends on large amounts of training data, which are available for such knowledge; however, it is ineffective for acquiring local knowledge, for which no big training data exist. Thus, we propose a brain-inspired learning model and system for acquiring local knowledge from small training data. We focus on the amygdala because its classical fear conditioning is effective for training with small training data. We propose an amygdala-inspired classical conditioning model comprising multiple self-organizing maps (the lateral nucleus) and a fully connected neural network (the central nucleus), imitating the function and structure of the amygdala. The proposed model is applied to the task of a waiter robot in a restaurant and can learn customers' preferences after only a few human-robot interactions. We accelerate the computation of the model and reduce its power consumption by proposing a hardware-oriented algorithm and its digital hardware design, which we implement on an XCZU9EG field-programmable gate array. The hardware-oriented algorithm reduces the multiplication operations and exponential functions that would require large hardware resources. Operated at 150 MHz, the hardware is 1,273 times faster than a software implementation on an Arm Cortex-A53, and the power consumption of the chip is 5.009 W.
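The few-shot conditioning structure can be sketched minimally: a self-organizing map (the lateral-nucleus analogue) clusters stimuli without labels, and a readout (the central-nucleus analogue) attaches a preference to the winning unit after a single interaction. All sizes, rates, and the plain table readout are illustrative assumptions standing in for the paper's fully connected network.

```python
import numpy as np

rng = np.random.default_rng(3)

# Lateral-nucleus analogue: a 1-D self-organizing map clusters stimuli.
n_units, dim = 10, 4
som = rng.random((n_units, dim))

def bmu(x):
    """Index of the best-matching (winning) SOM unit for stimulus x."""
    return int(np.argmin(((som - x) ** 2).sum(axis=1)))

# Unsupervised SOM training on unlabeled stimuli.
for _ in range(200):
    x = rng.random(dim)
    w = bmu(x)
    for j in range(n_units):
        h = np.exp(-((j - w) ** 2) / 2.0)   # neighborhood function
        som[j] += 0.1 * h * (x - som[j])

# Central-nucleus analogue: few-shot conditioning binds a preference label
# to the winning unit after a single interaction.
readout = np.zeros(n_units)
few_shots = [(rng.random(dim), 0.0), (rng.random(dim), 1.0)]
for x, label in few_shots:
    readout[bmu(x)] = label

x_last, label_last = few_shots[-1]
recalled = readout[bmu(x_last)]
```

Because learning happens only at the winning unit, one interaction is enough to form the association, which is the property that makes the scheme suitable for small local-knowledge data.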
[Figure: Summary of proposed method]
Home service robots must be able to communicate with humans, for which human detection and recognition methods are particularly important. This paper proposes image-processing-based methods for human detection and face recognition that are suitable for home service robots. For human detection, we combine the head-shape-based method proposed by Xia et al. with the results of region segmentation based on depth information, using the positional relations of the detected points. The method achieved a detection rate of 98.1% when evaluated over various postures and facing directions. We demonstrate its robustness against postural changes such as stretching the arms, resting the chin on one's hands, and drinking beverages. For face recognition, we combine the elastic bunch graph matching method proposed by Wiskott et al. with the Face Tracking SDK to extract facial feature points, and use 3D information in the deformation computation; we obtained a recognition rate of 93.6% in evaluation.
This live demonstration presents a "connective object for middleware to accelerator" (COMTA), an intelligent processing system that uses hardware accelerators (i.e., field-programmable gate arrays (FPGAs)) and robot middleware. The key idea of COMTA is to generate the system automatically via robot middleware interfaces. To realize the proposed system, we have developed a block of programs called an "object" in a hardware/software complex system. We demonstrate a human-tracking image-processing application on a vehicle robot accelerated by COMTA. The demonstration system achieved 3.3 times better power efficiency than a general-purpose PC.
In this paper, we propose an analog CMOS circuit that realizes spiking neural networks with spike-timing-dependent synaptic plasticity (STDP). In particular, we propose an STDP circuit with a symmetric window function for the first time, and we also demonstrate associative memory operation in a Hopfield-type feedback network with STDP learning. In our spiking neuron model, analog information representing processing results is carried by the relative timing of spike firing events. It is well known that a biological neuron changes its synaptic weights through STDP, which provides learning rules that depend on the relative timing between asynchronous spikes; STDP can therefore be used in spiking neural systems with a learning function. Measurement results of chips fabricated in a TSMC 0.25 µm CMOS process demonstrate that our spiking neuron circuit can form feedback networks and update synaptic weights based on the relative timing between asynchronous spikes using either a symmetric or an asymmetric STDP circuit.
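The distinction between the two window types can be made concrete with standard model curves; the exponential and Mexican-hat forms below are common textbook models chosen for illustration, since the exact analog curves are circuit-dependent.

```python
import math

# Illustrative STDP window functions (amplitude and time constant assumed).
A, tau = 1.0, 20.0  # ms

def asymmetric_stdp(dt):
    """dt = t_post - t_pre: potentiate when pre precedes post, else depress."""
    return A * math.exp(-dt / tau) if dt > 0 else -A * math.exp(dt / tau)

def symmetric_stdp(dt):
    """Depends only on |dt|; suited to symmetric Hopfield-type weights."""
    r = dt / tau
    return A * (1.0 - r * r) * math.exp(-r * r / 2.0)  # Mexican-hat shape

dw_asym = asymmetric_stdp(10.0)   # positive: causal pre-before-post pairing
dw_sym = symmetric_stdp(10.0)     # equal for +10 ms and -10 ms offsets
```

A symmetric window yields the same weight change regardless of spike order, matching the symmetric weight matrix a Hopfield-type associative memory requires, which is why the symmetric circuit enables the demonstrated feedback-network operation.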