Experiments in research on memory, language, and other areas of cognitive science are increasingly being analyzed using Bayesian methods. This has been facilitated by the development of probabilistic programming languages such as Stan, and easily accessible front-end packages such as brms. The utility of Bayesian methods, however, ultimately depends on the relevance of the Bayesian model, in particular whether it accurately captures the structure of the data and the data analyst's domain expertise. Even with powerful software, the analyst is responsible for verifying the utility of their model. To demonstrate this point, we introduce a principled Bayesian workflow (Betancourt, 2018) to cognitive science. Using a concrete working example, we describe basic questions one should ask about the model: prior predictive checks, computational faithfulness, model sensitivity, and posterior predictive checks. The running example for demonstrating the workflow is data on reading times with a linguistic manipulation of object versus subject relative clause sentences. This principled Bayesian workflow also demonstrates how to use domain knowledge to inform prior distributions. It provides guidelines and checks for valid data analysis, avoiding overfitting complex models to noise, and capturing relevant data structure in a probabilistic model. Given the increasing use of Bayesian methods, we aim to discuss how these methods can be properly employed to obtain robust answers to scientific questions. All data and code accompanying this article are available from https://osf.io/b2vx9/.
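As a minimal sketch of the first step of such a workflow, the following R code draws from the prior predictive distribution with brms. The data frame rc_data, its columns (rt in ms, a sum-coded condition predictor, subject identifier subj), and the specific prior values are assumptions for illustration, not the article's recommendations.

```r
library(brms)

## Illustrative priors on the log-ms scale (assumed values).
priors <- c(
  prior(normal(6, 1.5), class = Intercept),
  prior(normal(0, 0.1), class = b),      # small relative-clause effect
  prior(normal(0, 1),   class = sigma)
)

## sample_prior = "only" ignores the likelihood, so the "fit" contains
## draws from the priors alone rather than from the posterior.
fit_prior <- brm(
  rt ~ 1 + condition + (1 + condition | subj),
  data = rc_data, family = lognormal(),
  prior = priors, sample_prior = "only"
)

## Prior predictive check: do simulated reading times look plausible?
pp_check(fit_prior, type = "dens_overlay")
```

If the simulated reading times include implausible values (say, several minutes per word), the priors encode unrealistic domain assumptions and should be revised before the real data are fit.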
Big Data can bring enormous benefits to psychology. However, many psychological researchers are skeptical about undertaking Big Data research. Psychologists often do not take Big Data into consideration when developing their research projects because they have difficulty imagining how Big Data could help in their specific field of research, imagining themselves as "Big Data scientists," or because they lack the specific knowledge. This article provides an introductory guide to conducting Big Data research for psychologists who are considering using this approach and want a general idea of its processes. Taking the steps of Knowledge Discovery in Databases as the guiding thread, we provide useful indications for finding data suitable for psychological investigations, describe how these data can be preprocessed, and list some techniques to analyze them, as well as programming languages (R and Python) with which all these steps can be carried out. Throughout, we explain the concepts using terminology and examples from psychology; familiarizing themselves with the language of data science is important for psychologists, because it may appear difficult and esoteric at first. As Big Data research is often multidisciplinary, this overview helps build a general insight into the research steps and a common language, facilitating collaboration across different fields.
Translational Abstract
Technological advances have led to an abundance of widely available data on every aspect of life, called Big Data. Big Data provide psychologists with new means to study psychological constructs. Psychologists often show excitement when talking about these new data opportunities. However, this enthusiasm has not yet led to extensive use of Big Data in the psychological community. One reason for this is that psychologists usually do not acquire, during their training, the various skills and competencies that successful Big Data research requires. This work provides an introductory guide to conducting Big Data research for psychologists who are considering using this approach and want a general idea of its processes and techniques. It gives useful indications for finding data suitable for psychological investigations, describes how these data can be preprocessed, and lists some techniques and tools to analyze them.
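To make the Knowledge Discovery in Databases steps concrete, here is a minimal R sketch under assumed inputs: the file posts.csv and its columns (likes, followers) are hypothetical, and the simple linear model stands in for whatever mining technique fits the research question.

```r
## Data selection: a hypothetical CSV of social-media posts.
posts <- read.csv("posts.csv", stringsAsFactors = FALSE)

## Preprocessing: drop incomplete records, tame skewed counts.
posts <- posts[complete.cases(posts), ]
posts$log_likes     <- log1p(posts$likes)
posts$log_followers <- log1p(posts$followers)

## Data mining: a stand-in model relating engagement to audience size.
fit <- lm(log_likes ~ log_followers, data = posts)

## Interpretation/evaluation: inspect the fitted relationship.
summary(fit)
```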
Numerous applications in psychological research require that a pool of elements is partitioned into multiple parts. While many applications seek groups that are well separated, that is, dissimilar from each other, others require the different groups to be as similar as possible. Examples include the assignment of students to parallel courses, assembling stimulus sets in experimental psychology, splitting achievement tests into parts of equal difficulty, and dividing a data set for cross-validation. We present anticlust, an easy-to-use and free software package for solving these problems quickly and in an automated manner. The package anticlust is an open-source extension to the R programming language and implements the methodology of anticlustering. Anticlustering divides elements into similar parts, ensuring similarity between groups by enforcing heterogeneity within groups. Thus, anticlustering is the direct reversal of cluster analysis, which aims to maximize homogeneity within groups and dissimilarity between groups. Our package anticlust implements two anticlustering criteria, reversing the clustering methods k-means and cluster editing, respectively. In a simulation study, we show that anticlustering returns excellent results and outperforms alternative approaches like random assignment and matching. In three example applications, we illustrate how to apply anticlust to real data sets. We demonstrate how to assign experimental stimuli to equivalent sets based on norming data, how to divide a large data set for cross-validation, and how to split a test into parts of equal item difficulty and discrimination.
Translational Abstract
Numerous applications in psychological research require that a pool of elements is partitioned into multiple parts while ensuring that the different parts are as similar as possible. Examples include the assignment of students to parallel courses, assembling stimulus sets in experimental psychology, splitting achievement tests into parts of equal difficulty, and dividing a data set for cross-validation. To solve such problems, researchers usually rely on strategies such as manual assignment, random assignment, or matching. As an improvement over these approaches, we present anticlust, an easy-to-use and free software package that quickly and effectively partitions elements into groups that are as similar as possible. Anticlust is an open-source package written in the R programming language, implementing many established and new anticlustering methods. Anticlustering assembles groups in such a way that heterogeneity within groups is high and similarity between groups is high. Thus, anticlustering reverses the logic of cluster analysis, which strives to maximize homogeneity within groups and dissimilarity between groups. Tests on simulated and real data show that our anticlustering algorithms return excellent results that outperform previous approaches like manual assignment, random assignment, and matching. In three example applications, we demonstrate how to use anticlust to assign experimental stimuli to equivalent sets based on norming data, how to divide a large data set for cross-validation, and how to split a test into parts of equal item difficulty and discrimination.
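A minimal usage sketch in R, assuming the package's anticlustering() interface with objective = "variance" for the k-means criterion; the simulated norming data (standardized frequency and length for 60 stimuli) are invented for illustration.

```r
library(anticlust)

## Invented norming data for 60 experimental stimuli.
set.seed(123)
stimuli <- data.frame(frequency = rnorm(60), length = rnorm(60))

## K-means anticlustering: three sets whose feature means are as equal
## as possible ("variance" reverses the k-means clustering criterion).
sets <- anticlustering(stimuli, K = 3, objective = "variance")

## Balance check: the per-set means should be nearly identical.
by(stimuli, sets, colMeans)
```

Per the abstract, a second criterion reversing cluster editing is also implemented; under the same assumed interface, that would be selected via the objective argument instead.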
Interaction Flow Modeling Language describes how to apply model-driven techniques to the problem of designing the front end of software applications, i.e., the user interaction. The book introduces the reader to the new OMG standard Interaction Flow Modeling Language (IFML). The authors, Marco Brambilla and Piero Fraternali, wrote the IFML standard itself, and they wrote this book to explain the main concepts of the language. They illustrate how IFML can be applied in practice to the specification and implementation of complex web and mobile applications featuring rich interactive interfaces, both browser-based and native, client-side components and widgets, and connections to data sources, business logic components, and services. Interaction Flow Modeling Language provides unique insight into the benefits of engineering web and mobile applications with an agile, model-driven approach. Concepts are explained through intuitive examples drawn from real-world applications. The authors accompany the reader on the journey from visual specification of requirements to design and code production. The book distills more than twenty years of practice and provides a mix of methodological principles and concrete, immediately applicable techniques.
Learn OMG's new IFML standard from the authors of the standard with this approachable reference:
- Introduces IFML concepts step-by-step, with many practical examples and an end-to-end case example
- Shows how to integrate IFML with other OMG standards including UML, BPMN, CWM, SoaML and SysML
- Discusses how to map models into code for a variety of web and mobile platforms, and includes many useful interface modeling patterns and best practices
The two-volume open access book set LNCS 14576 + 14577 constitutes the proceedings of the 33rd European Symposium on Programming, ESOP 2024, which was held during April 6-11, 2024, in Luxembourg, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2024. The 25 full papers and 1 fresh perspective paper presented in these proceedings were carefully reviewed and selected from 72 submissions. The papers were organized in topical sections as follows: Part I: effects and modal types; bidirectional typing and session types; dependent types. Part II: quantum programming and domain-specific languages; verification; program analysis; abstract interpretation.
Identifying source code expertise is useful in several situations. Activities like bug fixing and helping newcomers are best performed by knowledgeable developers. Some studies have proposed repository-mining techniques to identify source code experts. However, there is a gap in understanding which variables are most related to code knowledge and how they can be used to identify expertise.
This study explores models of expertise identification and how these models can be used to improve a Truck Factor algorithm.
First, we built an oracle of developers' knowledge from software projects. Then, we used this oracle to analyze the correlation between measures from the development history and source code knowledge. We investigated the use of linear and machine-learning models to identify file experts. Finally, we used the proposed models to improve a Truck Factor algorithm and analyzed their performance using data from public and private repositories.
First authorship and recency of modification have the highest positive and negative correlations with source code knowledge, respectively. Machine-learning classifiers outperformed the linear techniques (F-score = 71% to 73%) on the largest analyzed dataset, but this advantage is unclear on the smallest one. The Truck Factor algorithm using the proposed models could handle developers missed by the previous expertise model, with a best average F-score of 74%. It was also perceived as more accurate in computing the Truck Factor of an industrial project.
In terms of F-score, the studied models perform similarly. However, machine-learning classifiers achieve higher precision, while linear models obtain higher recall. Choosing the best technique therefore depends on the user's tolerance for false positives and false negatives. Additionally, the proposed models significantly improved the accuracy of a Truck Factor algorithm, confirming their effectiveness in identifying the key developers within software projects.
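To illustrate the idea (not the authors' actual models), here is a hedged R sketch: a toy linear score combines the two most strongly correlated measures reported above, and a greedy loop estimates a Truck Factor as the number of top experts whose removal leaves more than half of the files without an expert. The data, weights, and 50% threshold are all assumptions.

```r
## Invented (file, developer) history with the two key measures.
log <- data.frame(
  file         = c("a.c", "a.c", "b.c", "c.c", "c.c", "d.c"),
  dev          = c("ann", "bob", "ann", "bob", "cal", "ann"),
  first_author = c(1, 0, 1, 1, 0, 1),
  recency_days = c(10, 400, 30, 5, 200, 90)
)

## Toy linear knowledge score mirroring the reported correlation signs:
## first authorship counts positively, recency of modification negatively.
log$score <- 2 * log$first_author - 0.002 * log$recency_days

## Each file's expert is the developer with the highest score for it.
experts <- vapply(split(log, log$file),
                  function(d) d$dev[which.max(d$score)], character(1))

## Greedy Truck Factor: repeatedly remove the developer who is expert
## for the most files until over half of the files are orphaned.
truck_factor <- 0
remaining <- experts
while (length(remaining) > length(experts) / 2) {
  top <- names(sort(table(remaining), decreasing = TRUE))[1]
  remaining <- remaining[remaining != top]
  truck_factor <- truck_factor + 1
}
truck_factor
```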
Several secondary studies have investigated the relationship between internal quality attributes, source code metrics, and external quality attributes, sometimes with contradictory results.
We synthesize evidence on the link between internal quality attributes, source code metrics, and external quality attributes, along with the efficacy of the prediction models used.
We conducted a tertiary review to identify, evaluate, and synthesize secondary studies. We used several characteristics of the secondary studies as indicators of the strength of evidence and considered them when synthesizing the results.
From 711 secondary studies, we identified 15 that investigated the link between source code and external quality. Our results show that: (1) the focus has primarily been on object-oriented systems; (2) maintainability and reliability are most often linked to internal quality attributes and source code metrics, with only one secondary study reporting evidence for security; (3) only a small set of complexity-, coupling-, and size-related source code metrics show a consistent positive link with maintainability and reliability; and (4) prediction models based on the group method of data handling (GMDH) have performed better than other prediction models for maintainability prediction.
Based on our results, lines of code and the coupling, complexity, and cohesion metrics from the Chidamber & Kemerer (CK) suite are good indicators of maintainability, with consistent evidence from high- and moderate-quality secondary studies. Similarly, four CK metrics related to coupling, complexity, and cohesion are good indicators of reliability, while inheritance and certain cohesion metrics show no consistent evidence of links to maintainability and reliability. Further empirical studies are needed to explore the link between internal quality attributes, source code metrics, and other external quality attributes, including functionality, portability, and usability. The results will help researchers and practitioners understand the body of knowledge on the subject and identify future research directions.
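As a hedged illustration of the kind of link these secondary studies evaluate (a plain linear model, not the GMDH approach the review highlights), the following R sketch fits a maintainability predictor on simulated CK-style metrics; all data and coefficients are invented.

```r
## Simulated class-level metrics; the names follow the CK suite.
set.seed(1)
n <- 200
metrics <- data.frame(
  loc  = rpois(n, 300),  # lines of code
  cbo  = rpois(n, 8),    # coupling between objects
  wmc  = rpois(n, 15),   # weighted methods per class (complexity)
  lcom = runif(n)        # lack of cohesion of methods
)

## Invented ground truth: higher size/coupling/complexity/incohesion
## lowers maintainability, matching the direction the review reports.
metrics$maintainability <- 100 - 0.05 * metrics$loc - 1.5 * metrics$cbo -
  0.8 * metrics$wmc - 10 * metrics$lcom + rnorm(n, sd = 5)

## Fit and inspect: the signs of the coefficients carry the "link".
fit <- lm(maintainability ~ loc + cbo + wmc + lcom, data = metrics)
summary(fit)
```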
Matter, meaning and semiotics
O'Halloran, Kay L
Visual Communication (London, England), 02/2023, Vol. 22, No. 1
Journal article, peer-reviewed, open access
We inhabit two worlds: the world of matter and the world of meaning (see Halliday, 'On matter and meaning: The two realms of human experience', 2005). In this article, these two worlds and the physical, biological, social and semiotic systems that connect them are investigated. In this respect, semiotic systems are the most complex because they involve physical systems (the material sign), biological systems (human beings), social systems (society and culture) and meaning itself. Semiotic frameworks need to take these various dimensions into account, as changes in one system reverberate across the meta-system as a whole. With this in mind, the interplay between the material and semiotic worlds is explored from a social semiotic perspective, with a focus on meaning and its significance for human existence. Using examples from various industrial ages, the article explores how semiotic resources (in this case, in mathematics, science and computer programming languages) are organized to structure reality in specific ways, and how semiotic combinations and the technologies arising from those constructions have changed the course of human history. In this discussion, attention is paid to the role of visual communication, both in terms of visual semiotic resources (e.g. graphs, digital images) and the visual aspects of multimodal texts. It thus becomes evident that the functionalities of any one semiotic resource (including language) must be viewed in relation to its collective co-deployment with other semiotic resources. Lastly, the author examines semiosis in the digital age and considers the social implications of the current digital ecosystem. In doing so, she conceptualizes digital technologies as a one-way mirror in which members of society use digital media for every facet of their lives while being watched, analysed and manipulated by those who design and own the digital platforms. It is apparent that semiotics has a major role to play in design, policymaking and activism around future digital technologies.