Models that represent meaning as high-dimensional numerical vectors—such as latent semantic analysis (LSA), hyperspace analogue to language (HAL), bound encoding of the aggregate language environment (BEAGLE), topic models, global vectors (GloVe), and word2vec—have been introduced as extremely powerful machine-learning proxies for human semantic representations and have seen an explosive rise in popularity over the past two decades. However, despite their considerable advances and spread in the cognitive sciences, some of their features remain inadequately presented and understood. Indeed, when these models are examined from a cognitive perspective, a number of unfounded arguments tend to appear in the psychological literature. In this article, we review the most common of these arguments and discuss (a) what exactly these models represent at the implementational level and their plausibility as a cognitive theory, (b) how they deal with various aspects of meaning such as polysemy or compositionality, and (c) how they relate to the debate on embodied and grounded cognition. We identify common misconceptions that arise as a result of incomplete descriptions, outdated arguments, and unclear distinctions between theory and implementation of the models. We clarify and amend these points to provide a theoretical basis for future research and discussions on vector models of semantic representation.
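The core representational idea shared by the models listed above can be sketched in a few lines: words are vectors, and semantic relatedness is a geometric quantity such as cosine similarity. The count vectors below are toy illustrative values, not data from any of the cited models.

```python
import numpy as np

# Toy count-based distributional vectors: each word is represented by its
# co-occurrence counts with three context words (illustrative values only).
vectors = {
    "cat":   np.array([8.0, 2.0, 0.0]),
    "dog":   np.array([7.0, 3.0, 0.0]),
    "piano": np.array([0.0, 1.0, 9.0]),
}

def cosine(u, v):
    """Cosine similarity, the standard relatedness measure in vector models."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Words used in similar contexts end up with nearby vectors.
assert cosine(vectors["cat"], vectors["dog"]) > cosine(vectors["cat"], vectors["piano"])
```

Models such as LSA, GloVe, and word2vec differ in how the vectors are induced from corpora, but all support this kind of geometric comparison.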
Nonarbitrary phenomena in language, such as systematic association in the form-meaning interface, have been widely reported in the literature. Exploiting such systematic associations, previous studies have demonstrated that pseudowords can be indicative of meaning. However, whether semantic activation from words and pseudowords is supported by the very same processes, activating a common semantic memory system, is currently not known. Here, we take advantage of recent progress in computational linguistics models that make it possible to induce meaning representations for out-of-vocabulary strings of letters via domain-general associative-learning mechanisms applied to natural language. We combined these models with data from priming tasks, in which participants are shown two strings of letters presented sequentially and are then asked to indicate whether the second is a word or a pseudoword. In Experiment 1 we reanalyzed the data of the largest behavioral database on semantic priming, while in Experiment 2 we ran an independent replication in a new language, Italian, controlling for a series of possible confounds. Results were consistent across the two experiments and showed that the prime-word meaning interferes with the semantic pattern elicited by the target pseudoword (i.e., as the estimated semantic relatedness between prime word and target pseudoword increased, participants' reaction times increased and accuracy decreased). These findings indicate that the same associative mechanisms governing word meaning also subserve the processing of pseudowords, suggesting in turn that human semantic memory can be conceived as a distributional system that builds upon a general-purpose capacity to extract knowledge from complex statistical patterns.
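One common way to induce a vector for an out-of-vocabulary string, in the spirit of the models described above, is to average learned vectors for the string's character n-grams (as in fastText-style subword models). The sketch below uses random placeholder n-gram vectors purely for illustration; the function names and dimensionality are assumptions, not the study's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 5
ngram_vectors = {}  # hypothetical n-gram embedding table

def ngrams(string, n=3):
    """Character n-grams of a word padded with boundary markers."""
    padded = f"<{string}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def induce_vector(string):
    """Induce a vector for any letter string, including out-of-vocabulary
    pseudowords, by averaging the vectors of its character n-grams."""
    grams = ngrams(string)
    for g in grams:
        # In a trained model these vectors are learned from a corpus;
        # here they are random placeholders assigned on first use.
        if g not in ngram_vectors:
            ngram_vectors[g] = rng.standard_normal(DIM)
    return np.mean([ngram_vectors[g] for g in grams], axis=0)

# Even a never-before-seen pseudoword receives a well-formed semantic vector,
# which can then be compared to the prime word's vector in a priming analysis.
pseudoword_vector = induce_vector("flurp")
```

Because pseudowords sharing sub-lexical material reuse the same n-gram vectors, their induced representations inherit systematic form-meaning structure from the lexicon.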
Effects of semantic transparency, reflected in processing differences between semantically transparent (teabag) and opaque (ladybird) compounds, have received considerable attention in the investigation of the role of constituents in compound processing. However, previous studies have yielded inconsistent results. In the present article, we argue that this is because semantic transparency is often conceptualized only as the semantic relatedness between the compound and constituent meanings as separate units. This neglects the fact that compounds are inherently productive constructions. We argue that compound processing is routinely impacted by a compositional process aimed at computing a compositional meaning, which would cause compositional semantic transparency effects to emerge in compound processing. We employ recent developments in compositional distributional semantics to quantify relatedness-based as well as composition-based semantic transparency measures and use these to predict lexical decision times in a large-scale data set. We observed semantic transparency effects on compound processing that are not captured in relatedness terms but only by adopting a compositional perspective.
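A composition-based transparency measure of the kind described above can be sketched as the similarity between a compound's observed corpus vector and a vector composed from its constituents. The vectors below are toy values, and simple vector addition stands in for the more sophisticated composition functions used in compositional distributional semantics.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Illustrative vectors (real studies induce these from large corpora).
vec = {
    "tea":      np.array([4.0, 1.0, 0.0]),
    "bag":      np.array([1.0, 4.0, 0.0]),
    "teabag":   np.array([3.0, 3.0, 0.5]),   # transparent: close to tea + bag
    "lady":     np.array([0.0, 1.0, 4.0]),
    "bird":     np.array([0.0, 4.0, 1.0]),
    "ladybird": np.array([5.0, 0.5, 0.5]),   # opaque: far from lady + bird
}

def composed_transparency(compound, c1, c2):
    """Composition-based transparency: similarity between the compound's
    observed vector and the vector composed from its constituents
    (here, plain addition as a stand-in composition function)."""
    return cosine(vec[compound], vec[c1] + vec[c2])

assert composed_transparency("teabag", "tea", "bag") > composed_transparency("ladybird", "lady", "bird")
```

By contrast, a relatedness-based measure would compare the compound vector directly with each constituent vector as a separate unit, ignoring the composed meaning.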
Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with “deep learning” methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement (e.g., capturing the correct number agreement between subject and verb when they are separated by other phrases). Although the network, a recurrent architecture with Long Short-Term Memory units, was solely trained to predict the next word in a large corpus, analysis showed the emergence of a very sparse set of specialized units that successfully handled local and long-distance syntactic agreement for grammatical number. However, the simulations also showed that this mechanism does not support full recursion and fails with some long-range embedded dependencies. We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns, with or without embedding. Human and model error patterns were remarkably similar, showing that the model echoes various effects observed in human data. However, a key difference was that, with embedded long-range dependencies, humans remained above chance level, while the model's systematic errors brought it below chance. Overall, our study shows that exploring the ways in which modern artificial neural networks process sentences leads to precise and testable hypotheses about human linguistic performance.
• A specialized mechanism for grammatical agreement emerges in Neural Language Models (NLMs).
• The mechanism consistently emerges for different grammatical features and various languages.
• Agreement performance of the NLM was found to be worse on the innermost dependency of nested grammatical structures.
• Model prediction was confirmed in humans.
• Humans too make more agreement errors on inner dependencies.
• Exploring how modern NLMs process sentences leads to precise and testable hypotheses about human linguistic performance.
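The agreement-scoring logic used in this line of work can be sketched as follows: a trial counts as correct when the language model assigns higher probability to the verb form that agrees in number with the subject. The probabilities below are hypothetical stand-ins, not values from the study.

```python
def agreement_correct(p_agreeing, p_nonagreeing):
    """A trial is correct when the model prefers the number-agreeing verb form."""
    return p_agreeing > p_nonagreeing

# Hypothetical next-word probabilities for three sentence types; in a real
# evaluation these come from a trained NLM's output distribution.
trials = [
    (0.62, 0.38),  # local agreement: "the boy ... greets/greet"
    (0.55, 0.45),  # long-distance dependency, no attractor
    (0.41, 0.59),  # long-distance dependency with an embedded attractor noun
]

accuracy = sum(agreement_correct(p, q) for p, q in trials) / len(trials)
```

Comparing such accuracies across sentence types (local vs. long-distance, with vs. without embedding) is what yields the error patterns matched against human data.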
Although mouse-tracking has been taken as a real-time window on different aspects of human decision-making processes, whether purely semantic information affects response conflict at the level of motor output as measured through mouse movements is still unknown. Here, across two experiments, we investigated the effects of semantic knowledge by predicting participants' performance in a standard keyboard task and in a mouse-tracking task through distributional semantics, a usage-based modeling approach to meaning. In Experiment 1, participants were shown word pairs and were required to perform a two-alternative forced choice task, selecting either the more abstract or the more concrete word using standard keyboard presses. In Experiment 2, participants performed the same task, yet this time response selection was achieved by moving the computer mouse. Results showed that the involvement of semantic components in the task at hand is observable both with standard reaction times (Experiment 1) and with indexes extracted from mouse trajectories (Experiment 2). In particular, mouse trajectories reflected the response conflict and its temporal evolution, with a larger deviation for increasing word semantic relatedness. These findings support the validity of mouse-tracking as a method to detect deep and implicit decision-making features. Additionally, by demonstrating that a usage-based model of meaning can account for the different degrees of cognitive conflict associated with task achievement, these findings testify to the impact of the human semantic memory on decision-making processes.
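A standard trajectory index of the kind mentioned above is the maximum perpendicular deviation of the mouse path from the straight line joining its start and end points. The sketch below shows one common way to compute it; the toy trajectories are illustrative, not data from the experiments.

```python
import numpy as np

def max_deviation(trajectory):
    """Maximum perpendicular deviation of a mouse trajectory from the straight
    line joining its start and end points, a common mouse-tracking index of
    response conflict."""
    traj = np.asarray(trajectory, dtype=float)
    start, end = traj[0], traj[-1]
    direction = end - start
    direction /= np.linalg.norm(direction)
    offsets = traj - start
    # Perpendicular distance of each sample from the start-end line
    # (2-D cross product with the unit direction vector).
    perp = np.abs(offsets[:, 0] * direction[1] - offsets[:, 1] * direction[0])
    return float(perp.max())

# A trajectory bowing toward the competing response option deviates more.
straight = [(0, 0), (1, 2), (2, 4)]
curved   = [(0, 0), (2, 1), (2, 4)]
assert max_deviation(curved) > max_deviation(straight)
```

Regressing such deviation indexes on model-derived semantic relatedness is what links the distributional model to the motor output.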
While distributional semantic models that represent word meanings as high-dimensional vectors induced from large text corpora have been shown to successfully predict human behavior across a wide range of tasks, they have also received criticism from different directions. These include concerns over their interpretability (how can numbers specifying abstract, latent dimensions represent meaning?) and their ability to capture variation in meaning (how can a single vector representation capture multiple different interpretations of the same expression?). Here, we demonstrate that semantic vectors can indeed rise to these challenges, by training a mapping system (a simple linear regression) that predicts inter-individual variation in relational interpretations for compounds such as wood brush (for example brush FOR wood, or brush MADE OF wood) from (compositional) semantic vectors representing the meanings of these compounds. These predictions consistently beat different random baselines, both for familiar compounds (moon light, Experiment 1) and for novel compounds (wood brush, Experiment 2), demonstrating that distributional semantic vectors encode variations in qualitative interpretations that can be decoded using techniques as simple as linear regression.
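The mapping system described above amounts to ordinary least-squares regression from compound vectors to interpretation ratings. The sketch below uses synthetic data (random vectors and a known ground-truth weight vector) purely to illustrate the train-then-generalize logic; nothing here reproduces the study's materials.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical training data: compositional compound vectors (X) paired with
# human ratings for one relational interpretation, e.g. "brush FOR wood" (y).
n_compounds, dim = 40, 10
X = rng.standard_normal((n_compounds, dim))
true_w = rng.standard_normal(dim)
y = X @ true_w + 0.1 * rng.standard_normal(n_compounds)  # noisy ratings

# The mapping system is just ordinary least-squares linear regression.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the interpretation rating for a held-out (novel) compound vector.
novel_compound = rng.standard_normal(dim)
predicted_rating = float(novel_compound @ w)
```

The same fitted weights apply to any compound vector, which is what allows the decoder to generalize from familiar compounds to novel ones.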
There is an ongoing, vibrant debate about whether numerical information in both nonsymbolic and symbolic notations would be supported by different neurocognitive systems or rather by a common preverbal approximate number system, which is ratio dependent and follows Weber's law. Here, we propose that the similarities between nonsymbolic and symbolic number processing can be explained based on the principle of efficient coding. To probe this hypothesis we employed a new empirical approach, by predicting the behavioural performance in number comparison tasks with symbolic (i.e., number words) and nonsymbolic (i.e., arrays of dots) information not only from numerical ratio, but for the first time also from natural language data. That is, we used data extracted from vector-space models that are informative about the distributional pattern of number-words usage in natural language. Results showed that linguistic estimates predicted the behavioural performance in both symbolic and nonsymbolic tasks. However, and critically, our results also showed a task-dependent dissociation: linguistic data better predicted the performance in the symbolic task, whereas real numerical ratio better predicted the performance in the nonsymbolic task. These findings indicate that efficient coding of environmental regularities is an explanatory principle of human behavior in tasks involving numerical information. They also suggest that the ability to discriminate a stimulus from similar ones varies as a function of the specific statistical structure of the considered learning environment.
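The two competing predictors described above can be sketched side by side: a ratio-based difficulty measure from the approximate number system, and a similarity measure over number-word vectors. The two-dimensional vectors are toy values chosen for illustration, not corpus-derived estimates.

```python
import numpy as np

def weber_ratio(a, b):
    """Ratio-based difficulty predictor: comparisons get harder (slower, less
    accurate) as the smaller/larger ratio approaches 1."""
    return min(a, b) / max(a, b)

# Hypothetical number-word vectors; real ones come from vector-space models
# trained on natural language, where nearby numbers have similar usage.
vec = {2: np.array([1.0, 0.2]),
       3: np.array([0.9, 0.5]),
       8: np.array([0.2, 1.0])}

def linguistic_similarity(a, b):
    """Language-based difficulty predictor: cosine similarity between the
    distributional vectors of the two number words."""
    u, v = vec[a], vec[b]
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Both predictors mark 2 vs 3 as a harder comparison than 2 vs 8.
assert weber_ratio(2, 3) > weber_ratio(2, 8)
assert linguistic_similarity(2, 3) > linguistic_similarity(2, 8)
```

The task-dependent dissociation reported above corresponds to which of these two predictors explains more variance in symbolic versus nonsymbolic comparison data.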
Recent decades have ushered in tremendous progress in understanding the neural basis of language. Most of our current knowledge on language and the brain, however, is derived from lab-based experiments that are far removed from everyday language use, and that are inspired by questions originating in linguistic and psycholinguistic contexts. In this paper we argue that in order to make progress, the field needs to shift its focus to understanding the neurobiology of naturalistic language comprehension. We present here a new conceptual framework for understanding the neurobiological organization of language comprehension. This framework is non-language-centered in the computational/neurobiological constructs it identifies, and focuses strongly on context. Our core arguments address three general issues: (i) the difficulty in extending language-centric explanations to discourse; (ii) the necessity of taking context as a serious topic of study, modeling it formally and acknowledging the limitations on external validity when studying language comprehension outside context; and (iii) the tenuous status of the language network as an explanatory construct. We argue that adopting this framework means that neurobiological studies of language will be less focused on identifying correlations between brain activity patterns and mechanisms postulated by psycholinguistic theories. Instead, they will be less self-referential and increasingly more inclined towards integration of language with other cognitive systems, ultimately doing more justice to the neurobiological organization of language and how it supports language as it is used in everyday life.
Pseudowords offer a unique opportunity to investigate how humans deal with new (verbal) information. Within this framework, previous studies have shown that, at the implicit level, humans exploit systematic associations in the form-meaning interface to process new information by relying on (sub-lexical) contents already mapped in semantic memory. However, whether speakers exploit such processes in explicit decisions about the meanings elicited by unfamiliar terms remains an open, important question. Here, we tested this by leveraging computational models that are able to induce semantic representations for out-of-vocabulary stimuli. Across two experiments, we demonstrate that participants' guesses about pseudoword meanings in a 2AFC task consistently align with the model's predictions. This indicates that humans' ability to extract meaningful knowledge from complex statistical patterns can affect explicit decisions.
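The model-based 2AFC prediction described above can be sketched as a nearest-neighbour decision: given an induced pseudoword vector, the model predicts the candidate meaning whose vector is closest to it. All vectors and candidate glosses below are hypothetical placeholders for illustration.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical vectors: a pseudoword representation induced from its
# sub-lexical parts, plus two candidate meanings offered in the 2AFC task.
pseudoword = np.array([0.8, 0.3, 0.1])
candidates = {
    "a small animal":  np.array([0.9, 0.2, 0.0]),
    "a musical piece": np.array([0.0, 0.3, 0.9]),
}

def model_choice(pw_vec, options):
    """The model's 2AFC prediction: the candidate meaning whose vector is
    closest (in cosine terms) to the induced pseudoword vector."""
    return max(options, key=lambda k: cosine(pw_vec, options[k]))

assert model_choice(pseudoword, candidates) == "a small animal"
```

Comparing participants' choices against `model_choice` over many trials is what quantifies the alignment reported in the abstract.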