In this article, the author analyses lexemes that denote fishing nets in the dialects of Karelian. The purpose of the article is to determine the motives of nomination in complex designations with the component (-)verkko. The study is relevant because some lexical and thematic groups of Karelian remain insufficiently studied. Its novelty lies in offering the first analysis of motivation principles in designations of fishing gear, taking nets as the example. Altogether, 48 Karelian dialectal expressions designating fishing nets and containing the component (-)verkko were found, and the following motivating features were identified: the name of the fish (for example, haugiverkko 'pike net', lohiverkko 'salmon net', riäpuzverkod 'vendace nets'); the place of fishing (for example, jiäverkko 'ice net', koskiverko 'waterfall net'); the fishing method (for example, kuulleverkko 'dragnet', la.ski.verk. 'gillnet'); and the appearance of the nets (for example, kuplatušverkko 'float net', bruakkuverkko 'nagged net').
While considerable attention has been given to the analysis of texts written by depressed individuals, few studies have sought to evaluate and improve lexical resources that support the detection of signs of depression in text. In this paper, we present a search-based methodology to evaluate existing depression lexica. To this end, we exploit existing resources on depression and language use, and we analyze which elements of the lexicon are most effective at revealing depression symptoms. Furthermore, we propose expansion strategies that further enhance the quality of the lexica.
Word associations have been used widely in psychology, but the validity of their application strongly depends on the number of cues included in the study and the extent to which they probe all associations known by an individual. In this work, we address both issues by introducing a new English word association dataset. We describe the collection of word associations for over 12,000 cue words, currently the largest such English-language resource in the world. Our procedure allowed subjects to provide multiple responses for each cue, which permits us to measure weak associations. We evaluate the utility of the dataset in several different contexts, including lexical decision and semantic categorization. We also show that measures based on a mechanism of spreading activation derived from this new resource are highly predictive of direct judgments of similarity. Finally, a comparison with existing English word association sets further highlights systematic improvements provided through these new norms.
This article presents a general theory of neologisms capable of describing them in their complexity, without restricting the analysis to a single dimension. Within this multidimensional theoretical framework, suited to complex objects such as neologisms, the author develops their linguistic dimension, highlights the explanatory limits of other dimensions (psychological, social, etc.), and specifies the information that speakers associate with each neologism in their mental lexicon.
There is mounting evidence that language users are sensitive to distributional information at many grain-sizes. Much of this research has focused on the distributional properties of words, the units they consist of (morphemes, phonemes), and the syntactic structures they appear in (verb-categorization frames, syntactic constructions). In a series of studies we show that comprehenders are also sensitive to the frequencies of compositional four-word phrases (e.g. don’t have to worry): more frequent phrases are processed faster. The effect is not reducible to the frequency of the individual words or substrings and is observed across the entire frequency range (for low, mid and high frequency phrases). Comprehenders seem to learn and store frequency information about multi-word phrases. These findings call for processing models that can capture and predict phrase-frequency effects and support accounts where linguistic knowledge consists of patterns of varying sizes and levels of abstraction.
Any approach to a language sooner or later calls for an encounter with a grammar manual and a dictionary. These tools, devoted to the systematization of knowledge, have always formed the basis of any linguistic training. That is why, despite sometimes being considered obsolete by younger generations, they can still be likened today to indispensable atavofigures.
The notion that the form of a word bears an arbitrary relation to its meaning accounts only partly for the attested relations between form and meaning in the languages of the world. Recent research suggests a more textured view of vocabulary structure, in which arbitrariness is complemented by iconicity (aspects of form resemble aspects of meaning) and systematicity (statistical regularities in forms predict function). Experimental evidence suggests these form-to-meaning correspondences serve different functions in language processing, development, and communication: systematicity facilitates category learning by means of phonological cues, iconicity facilitates word learning and communication by means of perceptuomotor analogies, and arbitrariness facilitates meaning individuation through distinctive forms. Processes of cultural evolution help to explain how these competing motivations shape vocabulary structure.
• Urdu Sentiment Analysis is performed in multiple domains.
• The Lexicon-based approach and the Supervised Machine Learning approach are compared.
• The Lexicon-based approach achieved high Accuracy, Precision, Recall and F-measure.
• The Lexicon-based approach is also better in terms of economy of time and effort.
The Web enables people to express their views and opinions on different topics through reviews and blogs. These reviews and blogs can be put to effective use by fusing the sentiment knowledge they contain. In this research, Sentiment Analysis of Urdu blogs from multiple domains is performed using two widely used approaches: the Lexicon-based approach and the Supervised Machine Learning approach. Three well-known classifiers, namely Support Vector Machine, Decision Tree and K Nearest Neighbor, are used in the Supervised Machine Learning approach, whereas a wide-coverage Urdu Sentiment Lexicon and an efficient Urdu Sentiment Analyzer are used in the Lexicon-based approach. In both approaches, information is fused from two sources to perform Sentiment Analysis. In the Lexicon-based approach, the two sources are the wide-coverage Urdu Sentiment Lexicon and the efficient Urdu Sentiment Analyzer. In the Supervised Machine Learning approach, the two sources are the un-annotated data and the annotated data along with important attributes. On the basis of the experiments performed in this research, it is concluded that the Lexicon-based approach outperforms the Supervised Machine Learning approach not only in terms of Accuracy, Precision, Recall and F-measure but also in terms of economy of time and effort.
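As a minimal sketch of the general idea behind a lexicon-based classifier and its evaluation, the Python below sums word polarities from a lexicon and then computes Precision, Recall and F-measure for one class. The toy English lexicon, the function names `lexicon_polarity` and `precision_recall_f1`, and the example texts are all illustrative assumptions; they are not the Urdu Sentiment Lexicon or the Urdu Sentiment Analyzer used in the study.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (word -> polarity score); a stand-in, not the Urdu Sentiment Lexicon.
TOY_LEXICON: Dict[str, int] = {"good": 1, "excellent": 2, "bad": -1, "terrible": -2}


def lexicon_polarity(text: str, lexicon: Dict[str, int]) -> str:
    """Label a text by summing the polarity scores of the words found in the lexicon."""
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def precision_recall_f1(gold: List[str], pred: List[str], target: str) -> Tuple[float, float, float]:
    """Compute Precision, Recall and F-measure for a single target class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == target and p == target)
    fp = sum(1 for g, p in zip(gold, pred) if g != target and p == target)
    fn = sum(1 for g, p in zip(gold, pred) if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Invented example data, for illustration only.
texts = ["good excellent service", "terrible food", "bad but good", "excellent"]
gold_labels = ["positive", "negative", "positive", "positive"]
predicted = [lexicon_polarity(t, TOY_LEXICON) for t in texts]
scores = precision_recall_f1(gold_labels, predicted, "positive")
```

A real analyzer would also need language-specific tokenization and handling of negation and intensifiers, which this sketch deliberately leaves out.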
• Lexicon information has been proven to be very useful in the Chinese Named Entity Recognition task.
• Existing methods make insufficient use of lexicon information.
• We propose a Polymorphic Graph Attention Network for fusing lexicon information into characters.
• Our method can be easily combined with pre-trained or sequence encoding models.
• The proposed method supports multi-head attention and has excellent inference speed.
Fusing lexicon information into Chinese characters, which normally have several meanings, has been proven effective for Chinese Named Entity Recognition (NER). However, existing approaches that incorporate a matched Chinese word into its constituent characters treat the word only as a whole (with no subdivision into parts), which fails to capture fine-grained correlations in the word-character space and to make full use of lexicon information. Moreover, existing approaches use fixed (static) weights between words and characters, which limits NER performance. Because the same word-character pair interacts differently in different contexts, the weights of matched word-character pairs should be dynamic rather than fixed. In this paper, we propose a Polymorphic Graph Attention Network (PGAT) that captures the dynamic correlation between characters and matched words along multiple dimensions in order to enhance the character representation. After obtaining the matched words of each character from the lexicon, we map each word-character pair to one of four positions: B (begin), M (middle), E (end) and S (single word). The proposed semantic fusion unit, based on the Graph Attention Network (GAT), dynamically modulates the attention between matched words and characters in the four dimensions B, M, E and S. It can thus explicitly capture the fine-grained correlation between characters and matched words in each dimension. Experiments on four Chinese NER datasets show that PGAT outperforms the baseline models, which demonstrates the significance of the attention capture and fusion capabilities of the proposed polymorphic graph. Furthermore, PGAT operates in the character representation layer, which makes it easy to combine with pre-trained models such as BERT and with other sequence encoding models such as CNN and Transformer.
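The B/M/E/S mapping described above can be illustrated with a short Python sketch that, for each character of a sentence, collects the lexicon words in which that character occurs at the begin, middle, end or single-word position. The function name `match_bmes`, the `max_len` parameter, the toy lexicon and the example sentence are assumptions made for illustration; this is not the authors' implementation, and the graph attention step that weights these matches dynamically is not shown.

```python
from typing import Dict, List, Set


def match_bmes(sentence: str, lexicon: Set[str], max_len: int = 4) -> List[Dict[str, List[str]]]:
    """For every character, list the matched lexicon words by position: B, M, E or S."""
    slots: List[Dict[str, List[str]]] = [
        {"B": [], "M": [], "E": [], "S": []} for _ in sentence
    ]
    n = len(sentence)
    for start in range(n):
        # Try every substring of up to max_len characters beginning at `start`.
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                slots[start]["S"].append(word)       # single-character word
            else:
                slots[start]["B"].append(word)       # character begins the word
                for mid in range(start + 1, end - 1):
                    slots[mid]["M"].append(word)     # character is inside the word
                slots[end - 1]["E"].append(word)     # character ends the word
    return slots


# Hypothetical toy lexicon and sentence, chosen only to show overlapping matches.
toy_lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥", "桥"}
slots = match_bmes("南京市长江大桥", toy_lexicon)
```

In this toy example the last character, 桥, ends two matched words and is also a single-character word, so it receives entries in both the E and S sets. A model with dynamic weights would then decide, per context, how much each of those matches should contribute to the character representation.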