People frequently gesture when a word is on the tip of their tongue (TOT), yet research is mixed as to whether and why gesture aids lexical retrieval. We tested three accounts: the lexical retrieval hypothesis, which predicts that semantically related gestures facilitate successful lexical retrieval; the cognitive load account, which predicts that matching gestures facilitate lexical retrieval only when retrieval is hard, as in the case of a TOT; and the motor movement account, which predicts that any motor movements should support lexical retrieval. In Experiment 1 (a between‐subjects study; N = 90), gesture inhibition, but not neck inhibition, affected TOT resolution but not overall lexical retrieval; participants in the gesture‐inhibited condition resolved fewer TOTs than participants who were allowed to gesture. When participants could gesture, they produced more representational gestures during resolved than unresolved TOTs, a pattern not observed for meaningless motor movements (e.g., beats). However, the effect of gesture inhibition on TOT resolution was not uniform; some participants resolved many TOTs, while others struggled. In Experiment 2 (a within‐subjects study; N = 34), the effect of gesture inhibition was traced to individual differences in verbal, not spatial, short‐term memory (STM) span; those with weaker verbal STM resolved fewer TOTs when unable to gesture. This relationship between verbal STM and TOT resolution was not observed when participants were allowed to gesture. Taken together, these results fit the cognitive load account; when lexical retrieval is hard, gesture effectively reduces the cognitive load of TOT resolution for those who find the task especially taxing.
LSTM: A Search Space Odyssey. Greff, Klaus; Srivastava, Rupesh K.; Koutnik, Jan; …
IEEE Transactions on Neural Networks and Learning Systems, 10/2017, Volume 28, Issue 10
Journal Article. Open access
Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful functional ANalysis Of VAriance framework. In total, we summarize the results of 5400 experimental runs (≈15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
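For readers unfamiliar with the components this study evaluates, here is a minimal sketch of one step of the standard (vanilla) LSTM cell, assuming numpy. The single stacked weight matrix `W` and the gate ordering are illustrative conventions, not the paper's code; the forget gate `f` and the tanh output activation are the two components the study identifies as most critical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard (vanilla) LSTM cell.

    x: input of shape (D,); h_prev, c_prev: previous hidden/cell state (H,).
    W: stacked gate weights of shape (4*H, D+H); b: bias of shape (4*H,).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate (found most critical in the study)
    g = np.tanh(z[2*H:3*H])    # cell candidate
    o = sigmoid(z[3*H:4*H])    # output gate
    c = f * c_prev + i * g     # cell state update
    h = o * np.tanh(c)         # tanh output activation (also found critical)
    return h, c
```

The eight variants studied in the paper can be seen as small edits to this template, e.g. removing the forget gate or replacing the final tanh with the identity.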
Working memory, the system that maintains a limited set of representations for immediate use, is a central part of human cognition. Three processes have recently been proposed to govern information storage in working memory: consolidation, refreshing, and removal. Here, we discuss in detail the theoretical construct of working memory consolidation, a process critical to the creation of a stable working memory representation. We present a brief overview of the research that indicated the need for a construct such as working memory consolidation and the subsequent research that has helped to define the parameters of the construct. We then move on to explicitly state the points of agreement as to what processes are involved in working memory consolidation.
Long Short Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vectors and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3s). In this contribution we present an open-source, end-to-end, LSTM RNN system running on limited computational resources (a single GPU) that outperforms a reference i-vector system on a subset of the NIST Language Recognition Evaluation (8 target languages, 3s task) by up to 26%. This result is in line with previously published research using proprietary LSTM implementations and huge computational resources, which made those former results hardly reproducible. Further, we extend those previous experiments by modeling unseen languages (out-of-set, OOS, modeling), which is crucial in real applications. Results show that an LSTM RNN with OOS modeling is able to detect these languages and generalizes robustly to unseen OOS languages. Finally, we also analyze the effect of even more limited test data (from 2.25s down to 0.1s), showing that with as little as 0.5s an accuracy of over 50% can still be achieved.
The international array for real-time geostrophic oceanography (Argo) project is committed to rapidly and precisely acquiring comprehensive 3-D data on ocean temperature and salinity, which is crucial for monitoring ocean climate change and natural phenomena. During buoy observation, environmental factors, human mistakes, and equipment malfunctions can cause abnormalities such as density inversion and spikes, and thus detecting errors in Argo data is essential to ensure its reliability and applicability. Traditional methods mainly rely on the knowledge and judgment of marine experts, ensuring high accuracy but requiring large amounts of effort. Machine-learning methods have been used for automatic Argo data error detection, but they still struggle to extract deep and discriminative features from profiles. Recently, deep-learning methods have received increasing attention in this field, yet their effectiveness has not been widely explored in the face of challenges such as imbalanced samples, joint detection, and complicated patterns. In this article, a novel vertical attention-based siamese ConvLSTM (VAS-CLSTM) network is proposed for the accurate error detection of Argo data. First, an oversampling approach with optimized deep clustering based on inheritance theory and Mahalanobis distance is designed to effectively augment the error samples. Second, a siamese convolutional long short-term memory (ConvLSTM) network with contextual connection and spatial-temporal adjacent profile search is built to learn interactively from temperature and salinity profiles. Third, a depth-based vertical attention mechanism with grouped weights and vertical trends is proposed for adaptive modeling and flexible learning. Experimental results on North and South Atlantic datasets show that the proposed VAS-CLSTM method effectively improves the accuracy and reliability of error detection in Argo observation data.
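The Mahalanobis distance mentioned in the oversampling step can be illustrated with a short, generic sketch (not the authors' implementation). The standard definition below assumes the class mean and covariance are estimated from the rare error samples; a candidate synthetic sample close to that distribution has a small distance.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of a profile x from a class distribution
    with the given mean and covariance matrix."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))
```

With an identity covariance this reduces to the ordinary Euclidean distance; correlated features shrink or stretch distances along the directions the class actually varies in.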
Increasing availability of data related to air quality from ground monitoring stations has provided the chance for data mining researchers to propose sophisticated models for predicting the concentrations of different air pollutants. In this paper, we propose a hybrid model based on deep learning methods that integrates Graph Convolutional networks and Long Short-Term Memory networks (GC-LSTM) to model and forecast the spatiotemporal variation of PM2.5 concentrations. Specifically, historical observations on different stations are constructed as spatiotemporal graph series, and historical air quality variables, meteorological factors, spatial terms and temporal attributes are defined as graph signals. To evaluate the performance of the GC-LSTM, we compared our results with several state-of-the-art methods over different time intervals. Based on the results, our GC-LSTM model achieved the best prediction performance. Moreover, evaluations of recall rate (68.45%) and false alarm rate (4.65%) (both at a threshold of 115 μg/m³), and a correlation coefficient R² of 0.72 for 72-hour predictions, also verify the feasibility of our proposed model. This methodology can be used for concentration forecasting of different air pollutants in the future.
Accurate load forecasting is an important issue for the reliable and efficient operation of a power system. This study presents a hybrid algorithm that combines similar days (SD) selection, empirical mode decomposition (EMD), and long short-term memory (LSTM) neural networks to construct a prediction model (i.e., SD-EMD-LSTM) for short-term load forecasting. The extreme gradient boosting-based weighted k-means algorithm is used to evaluate the similarity between the forecasting day and historical days. The EMD method is employed to decompose the SD load into several intrinsic mode functions (IMFs) and a residual. Separate LSTM neural networks are then employed to forecast each IMF and the residual. Lastly, the forecasts from each LSTM model are reconstructed into the final prediction. Numerical testing demonstrates that the SD-EMD-LSTM method can accurately forecast the electric load.
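The reconstruction step of such decompose-and-forecast pipelines (forecast each IMF and the residual separately, then sum the component forecasts) can be sketched generically. The `forecaster` callable below stands in for the per-component LSTM models and is purely illustrative; it is not the paper's code.

```python
def forecast_by_components(components, forecaster):
    """Decompose-and-forecast reconstruction: apply a per-component
    forecaster to each decomposed series (IMFs plus residual) and
    return the sum of the component forecasts as the final prediction."""
    return sum(forecaster(c) for c in components)
```

Because EMD is an additive decomposition, a forecaster that is exact on every component yields an exact forecast of the original series; in practice each LSTM only needs to capture the simpler dynamics of its own component.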
State-of-charge (SOC) estimation of lithium-ion batteries is one of the core functions of a battery management system. In order to improve the estimation accuracy of SOC, this paper proposes a long short-term memory neural network based on particle swarm optimization (PSO-LSTM). Firstly, the key parameters of the LSTM are optimized by the PSO algorithm, so that the data characteristics of the lithium-ion battery can match the network topology. In addition, random noise is added to the input layer of the PSO-LSTM neural network to improve the anti-interference ability of the network. Finally, experiments show that the proposed method can achieve accurate estimation under different conditions. The estimates based on PSO-LSTM converge to the real state-of-charge within an error of 0.5%.
• A PSO-LSTM model is established for SOC estimation of lithium-ion batteries.
• PSO is applied to optimize the hyper-parameters of the LSTM.
• Random noise is added to the sampled data to prevent over-fitting of the PSO-LSTM model.
• Results show that the proposed method has high estimation accuracy and robustness.
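A minimal, generic particle swarm optimizer of the kind used to tune LSTM hyper-parameters might look like the following. This is an illustrative sketch, not the paper's implementation; the inertia and acceleration constants are common textbook defaults, and the objective would in practice be the validation error of an LSTM trained with the candidate hyper-parameters.

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=20, iters=50, seed=0):
    """Minimize `objective` over box-constrained parameters with a
    basic global-best particle swarm.

    bounds: list of (low, high) tuples, one per parameter
            (e.g. hidden size, learning rate for an LSTM).
    Returns the best parameter vector and its objective value.
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # positions
    v = np.zeros_like(x)                               # velocities
    pbest = x.copy()                                   # personal bests
    pbest_f = np.array([objective(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()             # global best
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive/social constants
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)                     # stay in bounds
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better] = x[better]
        pbest_f[better] = f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, float(pbest_f.min())
```

For real hyper-parameter tuning the expensive part is the objective (a full train/validate cycle per particle per iteration), which is why the swarm is usually kept small.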
• Targeted interventions for high-risk patients are required to curb hospital readmissions and care costs.
• The value of this study is a deep learning framework in which both human- and machine-derived features are fed sequentially into a cost-sensitive LSTM model to predict readmission risk (AUC 0.77).
• Incorporating sequential trajectories contributed the most towards prediction performance (26%), followed by expert features appended to machine-derived features (3%).
• Heatmaps show significant cost savings if targeted interventions are offered to high-risk patients.
Unscheduled 30-day readmissions are a hallmark of Congestive Heart Failure (CHF) patients that pose significant health risks and escalate care costs. In order to reduce readmissions and curb the cost of care, it is important to initiate targeted intervention programs for patients at risk of readmission. This requires identifying high-risk patients at the time of discharge from hospital. Here, using real data from over 7500 CHF patients hospitalized between 2012 and 2016 in Sweden, we built and tested a deep learning framework to predict 30-day unscheduled readmission. We present a cost-sensitive formulation of a Long Short-Term Memory (LSTM) neural network using expert features and contextual embeddings of clinical concepts. This study targets key elements of an Electronic Health Record (EHR) driven prediction model in a single framework: using both expert and machine-derived features, incorporating sequential patterns, and addressing the class imbalance problem. We evaluate the contribution of each element towards prediction performance (ROC-AUC, F1-measure) and cost savings. We show that the model with all key elements achieves higher discrimination ability (AUC: 0.77; F1: 0.51; Cost: 22% of maximum possible savings), outperforming the reduced models in at least two evaluation metrics. Additionally, we present a simple financial analysis to estimate annual savings if targeted interventions are offered to high-risk patients.
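The cost-sensitive element of such a model is typically realized as a class-weighted loss. Below is a minimal numpy sketch of weighted binary cross-entropy as a general technique, not the authors' exact formulation; `pos_weight` up-weights the rare, costly-to-miss readmission class.

```python
import numpy as np

def weighted_bce(y_true, p_pred, pos_weight):
    """Class-weighted binary cross-entropy.

    y_true: 0/1 labels (1 = readmitted); p_pred: predicted probabilities.
    pos_weight > 1 penalizes missed positives more heavily, counteracting
    class imbalance when readmissions are rare.
    """
    eps = 1e-12
    p = np.clip(p_pred, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -(pos_weight * y_true * np.log(p)
                   + (1.0 - y_true) * np.log(1.0 - p))
    return float(per_sample.mean())
```

A common starting choice is to set `pos_weight` to the ratio of negative to positive examples in the training set, then adjust it on validation data.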
Recent research has called for the use of fine‐grained measures that distinguish implicit knowledge from automatized explicit knowledge. In the current study, such measures were used to determine how the two systems interact in a naturalistic second language (L2) acquisition context. One hundred advanced L2 speakers of Japanese living in Japan were assessed using tests of automatized explicit knowledge and implicit knowledge, along with tests of phonological short‐term memory and aptitude tests for explicit and implicit learning. Structural equation modeling demonstrated that aptitude for explicit learning significantly predicted acquisition of automatized explicit knowledge, and automatized explicit knowledge significantly predicted acquisition of implicit knowledge. The effects of implicit learning aptitude and phonological short‐term memory on acquisition of automatized explicit knowledge and implicit knowledge were limited. These findings provide the first empirical evidence that automatized explicit knowledge, which develops through explicit learning mechanisms, may impact the acquisition of implicit knowledge.
Open Practices
This article has been awarded an Open Materials badge. All original materials are publicly accessible in the IRIS digital repository at http://www.iris-database.org. Learn more about the Open Practices badges from the Center for Open Science: https://osf.io/tvyxz/wiki.