While considerable attention has been given to the analysis of texts written by depressed individuals, few studies have focused on evaluating and improving lexical resources to support the detection of signs of depression in text. In this paper, we present a search-based methodology for evaluating existing depression lexica. To meet this aim, we exploit existing resources on depression and language use, and we analyze which elements of the lexicon are the most effective at revealing depression symptoms. Furthermore, we propose innovative expansion strategies that further enhance the quality of the lexica.
The long-term associations between early receptive/expressive lexical skills and later language/pre-literacy skills require clarification.
To study the associations between, and the predictive value of, early receptive/expressive lexical skills for language/pre-literacy skills at 5;0 years, and to examine the language profiles at 5;0 years of children with weak receptive language/expressive lexical skills at 2;0 years.
The participants were 66 monolingual children. Their lexical skills were measured using the Finnish short-form version of the MacArthur–Bates Communicative Development Inventories at 1;6 and 2;0 years. Receptive language skills were measured at 2;0 years using the Reynell Developmental Language Scales III. A broader assessment at 5;0 years measured lexical, phonological, morphological and pre-literacy skills.
Significant associations between receptive/expressive lexical skills at 1;6 years and language and pre-literacy skills at 5;0 years were found. Both receptive language and expressive lexical development measured at 2;0 years were strongly and relatively evenly associated with language and pre-literacy skills at 5;0 years. Lexicon/language variables at 1;6 and 2;0 years had statistically significant predictive value for general language and pre-literacy scores at 5;0 years. The best models that included early lexical predictors explained 20–34% of later language/literacy outcome. Weak skills at 2;0 years indicated vulnerability in language and pre-literacy skills at 5;0 years.
Language and pre-literacy skills at 5;0 years can to some extent be explained by early receptive language and/or expressive lexical development. Further assessment and/or follow-up is important for children who have had weak language/lexical skills at 2;0 years.
•Significant associations between early lexical skills and different language and pre-literacy skills at 5 years were found.
•The best models that included early lexical predictors explained 20–34% of later language/literacy outcome.
•Most children with weak lexical/language skills at 2 years had weak skills in at least one language domain at 5 years.
•The findings emphasize the importance of screening for both early receptive and expressive lexical/language skills.
The notion that the form of a word bears an arbitrary relation to its meaning accounts only partly for the attested relations between form and meaning in the languages of the world. Recent research suggests a more textured view of vocabulary structure, in which arbitrariness is complemented by iconicity (aspects of form resemble aspects of meaning) and systematicity (statistical regularities in forms predict function). Experimental evidence suggests these form-to-meaning correspondences serve different functions in language processing, development, and communication: systematicity facilitates category learning by means of phonological cues, iconicity facilitates word learning and communication by means of perceptuomotor analogies, and arbitrariness facilitates meaning individuation through distinctive forms. Processes of cultural evolution help to explain how these competing motivations shape vocabulary structure.
This article presents a general theory of neologisms capable of describing them in their full complexity, without limiting the analysis to a single dimension. Within this multidimensional theoretical framework, suited to complex objects such as neologisms, the author develops their linguistic dimension, highlights the explanatory limits of other dimensions (psychological, social, etc.), and specifies the information that speakers associate with each neologism in their mental lexicon.
•Urdu Sentiment Analysis in multiple domains is performed.
•The Lexicon-based approach and the Supervised Machine Learning approach are compared.
•The Lexicon-based approach achieved high Accuracy, Precision, Recall and F-measure.
•The Lexicon-based approach is also more economical in terms of time and effort.
The Web enables people to express their views and opinions on different topics through reviews and blogs. Effective advantages can be reaped from these reviews and blogs by fusing sentiment knowledge. In this research, Sentiment Analysis of Urdu blogs from multiple domains is performed using two widely used approaches: the Lexicon-based approach and the Supervised Machine Learning approach. Three well-known classifiers, Support Vector Machine, Decision Tree and K Nearest Neighbor, are used in the Supervised Machine Learning approach, whereas a wide-coverage Urdu Sentiment Lexicon and an efficient Urdu Sentiment Analyzer are used in the Lexicon-based approach. In both approaches, information is fused from two sources to perform Sentiment Analysis. In the Lexicon-based approach, the two sources are the wide-coverage Urdu Sentiment Lexicon and the efficient Urdu Sentiment Analyzer; in the Supervised Machine Learning approach, they are the un-annotated data and the annotated data along with important attributes. After performing Sentiment Analysis with both approaches and carefully examining the results, the experiments in this research lead to the conclusion that the Lexicon-based approach outperforms the Supervised Machine Learning approach not only in terms of Accuracy, Precision, Recall and F-measure but also in terms of economy of time and effort.
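The abstract above does not specify how its Urdu Sentiment Analyzer works, but the general lexicon-based approach it compares against machine learning can be illustrated with a minimal sketch. The toy English lexicon and the simple negation rule below are illustrative assumptions, not the wide-coverage Urdu Sentiment Lexicon described in the paper:

```python
# Minimal sketch of a lexicon-based sentiment classifier.
# TOY_LEXICON and NEGATORS are illustrative stand-ins for a real
# wide-coverage sentiment lexicon; polarities are hand-assigned.
TOY_LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}
NEGATORS = {"not", "never"}

def lexicon_sentiment(tokens):
    """Sum polarity scores over tokens, flipping the sign of a
    sentiment word that immediately follows a negator."""
    score, negate = 0, False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        polarity = TOY_LEXICON.get(tok, 0)
        score += -polarity if negate else polarity
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A real system would add stemming, intensifiers, and a far larger lexicon, but the core design, summing prior word polarities with local context rules, is what makes the approach fast and training-free compared with supervised classifiers.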
There is mounting evidence that language users are sensitive to distributional information at many grain-sizes. Much of this research has focused on the distributional properties of words, the units they consist of (morphemes, phonemes), and the syntactic structures they appear in (verb-categorization frames, syntactic constructions). In a series of studies we show that comprehenders are also sensitive to the frequencies of compositional four-word phrases (e.g. don’t have to worry): more frequent phrases are processed faster. The effect is not reducible to the frequency of the individual words or substrings and is observed across the entire frequency range (for low, mid and high frequency phrases). Comprehenders seem to learn and store frequency information about multi-word phrases. These findings call for processing models that can capture and predict phrase-frequency effects and support accounts where linguistic knowledge consists of patterns of varying sizes and levels of abstraction.
Any approach to a language cannot fail, sooner or later, to require engagement with a grammar manual and a dictionary. These tools, devoted to the systematization of knowledge, have always constituted the basis of any linguistic training. This is why, despite sometimes being considered obsolete by younger generations, they can still today be regarded as indispensable atavofigures.
•Lexicon information has been proven to be very useful in the Chinese Named Entity Recognition task.
•Existing methods make insufficient use of lexicon information.
•We propose a Polymorphic Graph Attention Network for fusing lexicon information into character representations.
•Our method can easily be combined with pre-trained or sequence encoding models.
•The proposed method supports multi-head attention and has excellent inference speed.
Fusing lexicon information into Chinese characters, which normally have multiple meanings, has been proven effective for Chinese Named Entity Recognition (NER). However, existing approaches to incorporating a matched Chinese word into its component characters treat the word only as a whole (with no subdivision), which fails to capture fine-grained correlations in the word-character space and to make full use of lexicon information. Moreover, existing approaches use fixed (static) weights between words and characters, which limits NER performance. Since the same word-character pair interacts differently in different contexts, the weights of matched word-character pairs should be dynamic rather than fixed. In this paper, we propose a Polymorphic Graph Attention Network (PGAT) that captures dynamic correlations between characters and matched words from multiple dimensions in order to enhance the character representation. After obtaining the words matching each character from the lexicon, we carefully map each word-character pair to one of four positions: B (begin), M (middle), E (end) and S (single word). The proposed semantic fusion unit, based on the Graph Attention Network (GAT), dynamically modulates the attention between matched words and characters across the four dimensions B, M, E and S, and can thus explicitly capture fine-grained correlations between characters and matched words in each dimension. Experiments on four Chinese NER datasets show that PGAT outperforms the baseline models, demonstrating the significance of the attention capture and fusion capabilities of the proposed polymorphic graph. Furthermore, because PGAT operates in the character representation layer, it can easily be combined with pre-trained models such as BERT and with other sequence encoders such as CNNs and Transformers.
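The B/M/E/S bucketing described above can be sketched independently of the attention network itself: for each character, collect the lexicon words covering it, keyed by the character's position inside the word. This is a minimal illustrative reconstruction of the matching step only (function and variable names are my own), not the PGAT model:

```python
def bmes_match(chars, lexicon):
    """For each character, collect lexicon words that cover it, bucketed
    by the character's position in the word: B(egin), M(iddle), E(nd),
    or S(ingle-character word)."""
    n = len(chars)
    buckets = [{"B": [], "M": [], "E": [], "S": []} for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            word = "".join(chars[i:j])
            if word not in lexicon:
                continue
            if j - i == 1:
                buckets[i]["S"].append(word)     # single-character match
            else:
                buckets[i]["B"].append(word)     # word begins here
                for k in range(i + 1, j - 1):
                    buckets[k]["M"].append(word) # character inside word
                buckets[j - 1]["E"].append(word) # word ends here
    return buckets
```

In PGAT, each bucket would then feed a separate graph-attention dimension so that the model can weight the same matched word differently depending on the character's role within it.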
Successful learning involves integrating new material into existing memory networks. A learning procedure known as fast mapping (FM), thought to simulate the word-learning environment of children, has recently been linked to distinct neuroanatomical substrates in adults. This idea has suggested the (never-before tested) hypothesis that FM may promote rapid incorporation into cortical memory networks. We test this hypothesis here in 2 experiments. In our 1st experiment, we introduced 50 participants to 16 unfamiliar animals and names through FM or explicit encoding (EE) and tested participants on the training day, and again after sleep. Learning through EE produced strong declarative memories, without immediate lexical competition, as expected from slow-consolidation models. Learning through FM, however, led to almost immediate lexical competition, which continued to the next day. Additionally, the learned words began to prime related concepts on the day following FM (but not EE) training. In a 2nd experiment, we replicated the lexical integration results and determined that presenting an already-known item during learning was crucial for rapid integration through FM. The findings presented here indicate that learned items can be integrated into cortical memory networks at an accelerated rate through fast mapping. The retrieval of a related known concept, in order to infer the target of the FM question, is critical for this effect.
Sentiment analysis, the task of detecting whether a textual item (e.g., a product review or a blog post) expresses a positive or negative opinion, either in general or about a given entity (e.g., a product, person, or policy), has received increasing attention in recent years and plays an important role in natural language processing. User-generated content, such as tourism reviews, has grown dramatically in recent years, producing a large amount of unstructured data from which it is hard to extract useful information. Owing to variations in textual order, sequence length and complicated logic, predicting the exact sentiment polarity of user reviews remains challenging, especially for fine-grained sentiment classification. In this paper, we first propose sentiment padding, a novel padding method that, unlike zero padding, makes every input sample a consistent size while increasing the proportion of sentiment information in each review. Inspired by recent studies on neural networks, we propose deep learning based sentiment analysis models, a family of lexicon-integrated two-channel CNN–LSTM models that combine CNN and LSTM/BiLSTM branches in a parallel manner. Experiments on several challenging datasets, such as the Stanford Sentiment Treebank, demonstrate that the proposed method outperforms many baseline methods.
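The abstract contrasts sentiment padding with zero padding but does not spell out the mechanism. One plausible reading, sketched below purely as an assumption, is that sequences shorter than the target length are filled by reusing the sentiment-bearing token ids already present in the review instead of zeros, which raises the share of sentiment tokens in the fixed-size input:

```python
def sentiment_pad(token_ids, sentiment_ids, max_len, pad_id=0):
    """Pad (or truncate) a token-id sequence to max_len. Instead of
    plain zero padding, reuse sentiment-bearing ids found in the
    sequence, falling back to pad_id when the review contains none.
    This is an illustrative guess at the scheme, not the paper's code."""
    if len(token_ids) >= max_len:
        return list(token_ids)[:max_len]
    fillers = [t for t in token_ids if t in sentiment_ids] or [pad_id]
    out = list(token_ids)
    i = 0
    while len(out) < max_len:
        out.append(fillers[i % len(fillers)])  # cycle through sentiment ids
        i += 1
    return out
```

Under this reading, a lexicon supplies `sentiment_ids`, and the padded sequences then feed the parallel CNN and LSTM/BiLSTM branches like any fixed-length input.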
•We propose sentiment padding to improve the proportion of sentiment information in each review.
•We present a lexicon-integrated two-channel CNN–BiLSTM model.
•We study the influence of the skip-connection operation on the two-channel deep model.
•Experiments show the superiority of the proposed model in analyzing English and Chinese reviews.