Distributional semantics has long been a source of successful models in psycholinguistics, making it possible to obtain semantic estimates for a large number of words quickly and automatically. However, resources of this kind remain scarce or hard to access for languages other than English. The present paper describes WEISS (Word-Embeddings Italian Semantic Space), a distributional semantic model based on Italian. WEISS includes models of semantic representations that are trained with state-of-the-art word-embedding methods, which apply neural networks to induce distributed representations of lexical meanings. The resource is evaluated against two test sets, demonstrating that WEISS outperforms a baseline encoding word associations. Moreover, an extensive qualitative analysis of the WEISS output provides examples of the model's potential for capturing several semantic phenomena. Two variants of WEISS are released and made easily accessible on the web through the SNAUT graphical interface.
Massively multilingual models such as mBERT and XLM-R are increasingly valued in Natural Language Processing research and applications, due to their ability to tackle the uneven distribution of resources available for different languages. The models’ ability to process multiple languages relying on a shared set of parameters raises the question of whether the grammatical knowledge they extracted during pre-training can be considered as a data-driven cross-lingual grammar. The present work studies the inner workings of mBERT and XLM-R in order to test the cross-lingual consistency of the individual neural units that respond to a precise syntactic phenomenon, that is, number agreement, in five languages (English, German, French, Hebrew, Russian). We found that there is a significant overlap in the latent dimensions that encode agreement across the languages we considered. This overlap is larger (a) for long- vis-à-vis short-distance agreement and (b) when considering XLM-R as compared to mBERT, and peaks in the intermediate layers of the network. We further show that a small set of syntax-sensitive neurons can capture agreement violations across languages; however, their contribution is not decisive in agreement processing.
A 2-layer symbolic network model based on the equilibrium equations of the Rescorla-Wagner model (Danks, 2003) is proposed. The study first presents 2 experiments in Serbian, which reveal for sentential reading the inflectional paradigmatic effects previously observed by Milin, Filipović Đurđević, and Moscoso del Prado Martín (2009) for unprimed lexical decision. The empirical results are successfully modeled without having to assume separate representations for inflections or data structures such as inflectional paradigms. In the next step, the same naive discriminative learning approach is pitted against a wide range of effects documented in the morphological processing literature. Frequency effects for complex words as well as for phrases (Arnon & Snider, 2010) emerge in the model without the presence of whole-word or whole-phrase representations. Family size effects (Moscoso del Prado Martín, Bertram, Häikiö, Schreuder, & Baayen, 2004; Schreuder & Baayen, 1997) emerge in the simulations across simple words, derived words, and compounds, without derived words or compounds being represented as such. It is shown that for pseudo-derived words no special morpho-orthographic segmentation mechanism, as posited by Rastle, Davis, and New (2004), is required. The model also replicates the finding of Plag and Baayen (2009) that, on average, words with more productive affixes elicit longer response latencies; at the same time, it predicts that productive affixes afford faster response latencies for new words. English phrasal paradigmatic effects modulating isolated word reading are reported and modeled, showing that the paradigmatic effects characterizing Serbian case inflection have crosslinguistic scope.
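The Rescorla-Wagner learning rule underlying the model described above can be sketched as follows. This is the standard incremental update, not the equilibrium-equation solution of Danks (2003) that the abstract refers to; the equilibrium equations characterize the weights at which the expected update is zero. The letter cues, meaning outcomes, learning rate, and function name are hypothetical choices made only for illustration.

```python
def rescorla_wagner(events, cues, outcomes, rate=0.1, lam=1.0):
    """Run incremental Rescorla-Wagner updates over a sequence of learning events.

    events: iterable of (present_cues, present_outcomes) pairs of sets.
    rate:   single learning rate standing in for the product alpha * beta (assumption).
    lam:    maximum associative strength an outcome can support.
    Returns a nested dict weights[cue][outcome].
    """
    weights = {c: {o: 0.0 for o in outcomes} for c in cues}
    for present_cues, present_outcomes in events:
        for o in outcomes:
            # Summed support for this outcome from all cues present in the event.
            total = sum(weights[c][o] for c in present_cues)
            # Target is lam when the outcome occurs, 0 when it does not.
            target = lam if o in present_outcomes else 0.0
            delta = rate * (target - total)
            # Only cues present in the event have their weights adjusted.
            for c in present_cues:
                weights[c][o] += delta
    return weights


# Hypothetical toy lexicon: letters as cues, meanings as outcomes.
cues = {"h", "c", "a", "t"}
outcomes = {"HAT", "CAT"}
events = [({"h", "a", "t"}, {"HAT"}), ({"c", "a", "t"}, {"CAT"})] * 500

weights = rescorla_wagner(events, cues, outcomes)
# Activation of a meaning is the sum of the weights from the cues in the input.
activation_hat = sum(weights[c]["HAT"] for c in {"h", "a", "t"})
```

In this toy setting the discriminating letter "h" ends up with a stronger connection to HAT than the shared letters "a" and "t", while "c" acquires a negative weight, with no whole-word representation involved.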
The behavioral effects of Transcranial Magnetic Stimulation (TMS) can change qualitatively when stimulation is preceded by initial state manipulations such as priming or adaptation. In addition, the participant's baseline performance level has been shown to play a role in modulating the impact of TMS. Here we examined the link between these two factors. This was done using data from a previous study using a TMS-priming paradigm, in which, at group level, TMS selectively facilitated targets incongruent with the prime while having no statistically significant effects on other prime-target congruencies. Correlation and linear mixed-effects analyses indicated that, for all prime-target congruencies, a significant linear relationship between baseline performance and the magnitude of the induced TMS effect was present: low levels of baseline performance were associated with TMS-induced facilitations and high baseline performance with impairments. Thus, as performance level increased, TMS effects turned from facilitation to impairment. The key finding was that priming shifted the transition from facilitatory to disruptive effects for targets incongruent with the prime, such that TMS-induced facilitations were obtained up to a higher level of performance than for other prime-target congruencies. Given that brain state manipulations such as priming operate via modulations of neural excitability, this result is consistent with the view that neural excitability, coupled with non-linear neural effects, underlies the behavioral effects of TMS.
Noun compounds, consisting of two nouns (the head and the modifier) that are combined into a single concept, differ in terms of their plausibility: school bus is a more plausible compound than saddle olive. The present study investigates which factors influence the plausibility of attested and novel noun compounds. Distributional Semantic Models (DSMs) are used to obtain formal (vector) representations of word meanings, and compositional methods in DSMs are employed to obtain such representations for noun compounds. From these representations, different plausibility measures are computed. Three of those measures contribute to predicting the plausibility of noun compounds: the relatedness between the meaning of the head noun and the compound (Head Proximity), the relatedness between the meaning of the modifier noun and the compound (Modifier Proximity), and the similarity between the head noun and the modifier noun (Constituent Similarity). We find non-linear interactions between Head Proximity and Modifier Proximity, as well as between Modifier Proximity and Constituent Similarity. Furthermore, Constituent Similarity interacts non-linearly with the familiarity of the compound. These results suggest that a compound is perceived as more plausible if it can be categorized as an instance of the category denoted by the head noun, if the contribution of the modifier to the compound meaning is clear but not redundant, and if the constituents are sufficiently similar in cases where this contribution is not clear. Furthermore, compounds are perceived to be more plausible if they are more familiar, but mostly in cases where the relation between the constituents is less clear.
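The three measures named in the abstract above can be illustrated with a minimal sketch. It assumes that relatedness and similarity are measured as cosine similarity and that the compound vector is obtained by weighted addition of the constituent vectors; the abstract does not state which compositional method was used, so both are assumptions. The toy vectors and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def compose(modifier, head, w_mod=0.5, w_head=0.5):
    """ASSUMPTION: weighted addition stands in for the compositional method."""
    return [w_mod * m + w_head * h for m, h in zip(modifier, head)]


def plausibility_measures(modifier, head):
    """Compute the three measures for one modifier-head pair."""
    compound = compose(modifier, head)
    return {
        "head_proximity": cosine(head, compound),
        "modifier_proximity": cosine(modifier, compound),
        "constituent_similarity": cosine(head, modifier),
    }


# Hypothetical 3-dimensional vectors; real DSM vectors have hundreds of dimensions.
school = [0.9, 0.1, 0.3]
bus = [0.7, 0.4, 0.2]
measures = plausibility_measures(modifier=school, head=bus)
```

With additive composition the compound vector always lies between its constituents, so Head Proximity and Modifier Proximity can never fall below Constituent Similarity; other compositional methods do not share this constraint.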
Scientific studies of language behavior need to grapple with the great diversity of the world's languages and, for reading, with further variability in writing systems. Yet, the ability to form meaningful theories of reading is contingent on the availability of cross-linguistic behavioral data. This paper offers new insights into aspects of reading behavior that are shared and those that vary systematically across languages through an investigation of eye-tracking data from 13 languages recorded during text reading. We begin by reporting a bibliometric analysis of eye-tracking studies showing that the current empirical base is insufficient for cross-linguistic comparisons. We respond to this empirical lacuna by presenting the Multilingual Eye-Movement Corpus (MECO), the product of an international multi-lab collaboration. We examine which behavioral indices differentiate between reading in written languages, and which measures are stable across languages. One of the findings is that readers of different languages vary considerably in their skipping rate (i.e., the likelihood of not fixating on a word even once) and that this variability is explained by cross-linguistic differences in word length distributions. In contrast, if readers do not skip a word, they tend to spend a similar average time viewing it. We outline the implications of these findings for theories of reading. We also describe prospective uses of the publicly available MECO data and plans for its further development.
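The two measures contrasted in the abstract above can be made concrete with a short sketch: skipping rate as the proportion of words that received no fixation, and average viewing time computed over fixated words only. The data layout, values, and function names are hypothetical and do not reflect the actual MECO file format.

```python
def skipping_rate(fixation_counts):
    """Proportion of words that were not fixated even once."""
    if not fixation_counts:
        raise ValueError("need at least one word")
    skipped = sum(1 for n in fixation_counts if n == 0)
    return skipped / len(fixation_counts)


def mean_viewing_time(total_times_ms):
    """Average total viewing time, taken over fixated words only."""
    fixated = [t for t in total_times_ms if t > 0]
    return sum(fixated) / len(fixated) if fixated else 0.0


# Hypothetical data for a six-word sentence read by one participant.
fixations = [1, 0, 2, 1, 0, 1]          # number of fixations per word
times_ms = [210, 0, 430, 250, 0, 190]   # total viewing time per word

rate = skipping_rate(fixations)          # two of six words were skipped
avg_time = mean_viewing_time(times_ms)   # skipped words are excluded
```

Keeping skipped words out of the viewing-time average matters for the comparison in the abstract: otherwise languages with higher skipping rates would show shorter average times purely because of the zeros.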
A largely overlooked side effect in most studies of morphological priming is a consistent main effect of semantic transparency across priming conditions. That is, participants are faster at recognizing stems from transparent sets (e.g., farm) in comparison to stems from opaque sets (e.g., fruit), regardless of the preceding primes. This suggests that semantic transparency may also be consistently associated with some property of the stem word. We propose that this property might be traced back to the consistency, throughout the lexicon, between the orthographic form of a word and its meaning, here named Orthography-Semantics Consistency (OSC), and that an imbalance in OSC scores might explain the "stem transparency" effect. We exploited distributional semantic models to quantitatively characterize OSC, and tested its effect on visual word identification relying on large-scale data taken from the British Lexicon Project (BLP). Results indicated that (a) the "stem transparency" effect is solid and reliable, insofar as it holds in BLP lexical decision times (Experiment 1); (b) an imbalance in terms of OSC can account for it (Experiment 2); and (c) more generally, OSC explains variance in a large item sample from the BLP, proving to be an effective predictor in visual word access (Experiment 3).
The strongest formulations of grounded cognition assume that perceptual intuitions about concepts involve the re-activation of the sensorimotor experience we have had with their referents in the world. Within this framework, concreteness and imageability ratings are of crucial importance because they operationalise the amount of perceptual interaction we have had with objects. Here we tested this assumption by asking whether visual intuitions about concepts are provided accurately even when direct visual experience is absent. To this end, we considered concreteness and imageability intuitions in blind people and tested whether these judgments are predicted by Image-based Frequency (IF, i.e. a data-driven estimate approximating the availability of the word referent in the visual environment). Results indicated that IF predicts perceptual intuitions to a larger extent in sighted than in blind individuals, thus suggesting a role of direct experience in shaping our judgements. However, the effect of IF was significant not only in sighted but also in blind individuals. This indicates that having direct visual experience with objects does not play a critical role in making them concrete and imageable in a person's intuitions: people do not need visual experience to develop intuitions about the availability of things in the external visual environment and to use these intuitions to inform concreteness/imageability judgments. Our findings fit closely with the idea that perceptual judgments are the outcome of introspection/abstraction tasks invoking high-level conceptual knowledge that is not necessarily acquired via direct perceptual experience.
Sophisticated senator and legislative onion. Whether or not you have ever heard of these things, we all have some intuition that one of them makes much less sense than the other. In this paper, we introduce a large dataset of human judgments about novel adjective‐noun phrases. We use these data to test an approach to semantic deviance based on phrase representations derived with compositional distributional semantic methods, that is, methods that derive word meanings from contextual information, and approximate phrase meanings by combining word meanings. We present several simple measures extracted from distributional representations of words and phrases, and we show that they have a significant impact on predicting the acceptability of novel adjective‐noun phrases even when a number of alternative measures classically employed in studies of compound processing and bigram plausibility are taken into account. Our results show that the extent to which an attributive adjective alters the distributional representation of the noun is the most significant factor in modeling the distinction between acceptable and deviant phrases. Our study extends current applications of compositional distributional semantic methods to linguistically and cognitively interesting problems, and it offers a new, quantitatively precise approach to the challenge of predicting when humans will find novel linguistic expressions acceptable and when they will not.