Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses, such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary or paradata (e.g., response times), or data-driven statistical techniques (e.g., Mahalanobis distance). In the present study, gradient boosted trees, a state-of-the-art machine learning technique, are introduced to identify careless respondents. The performance of the approach was compared with established techniques previously described in the literature (e.g., statistical outlier methods, consistency analyses, and response pattern functions) using simulated data and empirical data from a web-based study, in which diligent versus careless response behavior was experimentally induced. In the simulation study, gradient boosting machines outperformed traditional detection mechanisms in flagging aberrant responses. However, this advantage did not transfer to the empirical study. In terms of precision, the results of both the traditional and the novel detection mechanisms were unsatisfactory, although the latter incorporated response times as additional information. The comparison between the results of the simulation and the online study showed that responses in real-world settings seem to be much more erratic than can be expected from the simulation studies. We critically discuss the generalizability of currently available detection methods and provide an outlook on future research on the detection of aberrant response patterns in survey research.
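As an illustration of one of the traditional data-driven techniques named above (Mahalanobis distance), the following is a minimal sketch, not the procedure used in the study. The function name `mahalanobis_flags`, the `top_share` cutoff rule, and the simulated data are assumptions made for this example only.

```python
import numpy as np

def mahalanobis_flags(responses, top_share=0.05):
    """Squared Mahalanobis distance per respondent plus a boolean flag.

    responses: (n_respondents, n_items) array of item scores.
    top_share: hypothetical rule that flags the most distant share of
               respondents; a chi-square cutoff is a common alternative.
    """
    X = np.asarray(responses, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Pseudo-inverse guards against a singular covariance matrix.
    cov_inv = np.linalg.pinv(cov)
    # Row-wise quadratic form: d2[i] = centered[i] @ cov_inv @ centered[i]
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    cutoff = np.quantile(d2, 1.0 - top_share)
    return d2, d2 > cutoff

# Simulated toy data (not from the study): 200 respondents whose 10 items
# share a common factor, plus 5 respondents answering at random.
rng = np.random.default_rng(0)
factor = rng.normal(size=(200, 1))
diligent = factor + 0.5 * rng.normal(size=(200, 10))
careless = rng.uniform(-3, 3, size=(5, 10))
data = np.vstack([diligent, careless])

d2, flags = mahalanobis_flags(data, top_share=0.05)
print("flagged respondents:", np.flatnonzero(flags))
```

A distance-based flag of this kind only captures respondents whose answer pattern is unusual relative to the sample covariance, which is one reason such indices may miss careless patterns that look statistically ordinary.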
Attrition in longitudinal studies is a major threat to the representativeness of the data and the generalizability of the findings. Typical approaches to address systematic nonresponse are either expensive and unsatisfactory (e.g., oversampling) or rely on the unrealistic assumption of data missing at random (e.g., multiple imputation). Thus, models that effectively predict who is most likely to drop out at subsequent occasions might offer the opportunity to take countermeasures (e.g., incentives). With the current study, we introduce a longitudinal model validation approach and examine whether attrition in two nationally representative longitudinal panel studies can be predicted accurately. We compare the performance of a basic logistic regression model with a more flexible, data-driven machine learning algorithm, gradient boosting machines. Our results show almost no difference in accuracy between the two modeling approaches, which contradicts claims of similar studies on survey attrition. Prediction models could not be generalized across surveys and were less accurate when tested at a later survey wave. We discuss the implications of these findings for survey retention and the use of complex machine learning algorithms, and give some recommendations to deal with study attrition.
The advent of large-scale assessment, but also the more frequent use of longitudinal and multivariate approaches to measurement in psychological, educational, and sociological research, caused an increased demand for psychometrically sound short scales. Shortening scales economizes on valuable administration time, but might result in inadequate measures because reducing an item set could: a) change the internal structure of the measure, b) result in poorer reliability and measurement precision, c) deliver measures that cannot effectively discriminate between persons on the intended ability spectrum, and d) reduce test-criterion relations. Different approaches to abbreviate measures fare differently with respect to the above-mentioned problems. Therefore, we compare the quality and efficiency of three item selection strategies to derive short scales from an existing long version: a Stepwise COnfirmatory Factor Analytical approach (SCOFA) that maximizes factor loadings and two metaheuristics, specifically an Ant Colony Optimization (ACO) with a tailored user-defined optimization function and a Genetic Algorithm (GA) with an unspecific cost-reduction function. SCOFA-compiled short versions were highly reliable, but had poor validity. In contrast, both metaheuristics outperformed SCOFA and produced efficient and psychometrically sound short versions (unidimensional, reliable, sensitive, and valid). We discuss under which circumstances ACO and GA produce equivalent results and provide recommendations for conditions in which it is advisable to use a metaheuristic with an unspecific out-of-the-box optimization function.
• Students with higher interest showed higher achievement in five domains.
• Students showed higher achievement in domains they were more interested in.
• Interest effects were found beyond effects of general cognitive abilities and SES.
• The effects generalized across grades and test scores.
• The relation between achievement and interest was stronger in math than in German.
We examined the incremental effect of academic interest on achievement beyond general cognitive ability and students’ background characteristics in five domains (math, German, biology, chemistry, and physics). We analyzed a nationally representative German dataset of 39,192 ninth-grade students and found a unique effect of interest over and above the other predictors across the five domains, both for class grades and standardized test scores. The effect was present between persons (in a given domain, students with higher interest showed higher achievement) and within persons (the same student showed a higher achievement in domains she/he was more interested in). The effects were stronger for grades than test scores and stronger in math than in other domains. The results emphasize the positive relation between interest and academic achievement in different domains. Furthermore, they expand the literature by emphasizing the role of the achievement measure and the domain as moderators of the interest–achievement relation and by showing that interest can predict both inter- and intraindividual variation in achievement.
Self-regulation is an essential ability of children to cope with various developmental challenges. This study examines the developmental interplay between emotional and behavioral self-regulation during childhood and the relationship with academic achievement using data from the longitudinal Millennium Cohort Study (UK). Using cross-lagged panel analyses, we found that emotional and behavioral self-regulation were separate and stable constructs. In addition, both emotional and behavioral self-regulation had positive cross-lagged effects from ages 3 to 7. At an early developmental stage (ages 3 to 5), emotional regulation affected behavioral regulation more strongly than at later developmental stages. However, the difference between the reciprocal effects was small from ages 5 to 7. Moreover, behavioral regulation during the third year of primary education (age 7) had a substantial and positive effect on teachers’ evaluations of educational achievement during the last year of primary school (age 11). In contrast, emotional self-regulation only had a small indirect and positive effect via behavioral self-regulation. The current study suggests that the structure of self-regulation is multidimensional and that its facets are mutually dependent in the child’s development. In order to gain a complete picture of the development of self-regulation and its effect on educational achievement, the facets of emotional and behavioral regulation should be studied in concert.
Metaheuristics are optimization algorithms that efficiently solve a variety of complex combinatorial problems. In psychological research, metaheuristics have been applied in short-scale construction and model specification search. In the present study, we propose a bee swarm optimization (BSO) algorithm to explore the structure underlying a psychological measurement instrument. The algorithm assigns items to an unknown number of nested factors in a confirmatory bifactor model, while simultaneously selecting items for the final scale. To achieve this, the algorithm follows the biological template of bees’ foraging behavior: Scout bees explore new food sources, whereas onlooker bees search in the vicinity of previously explored, promising food sources. Analogously, scout bees in BSO introduce major changes to a model specification (e.g., adding or removing a specific factor), whereas onlooker bees only make minor changes (e.g., adding an item to a factor or swapping items between specific factors). Through this division of labor in an artificial bee colony, the algorithm aims to strike a balance between two opposing strategies: diversification (or exploration) versus intensification (or exploitation). We demonstrate the usefulness of the algorithm to find the underlying structure in two empirical data sets (Holzinger–Swineford and short dark triad questionnaire, SDQ3). Furthermore, we illustrate the influence of relevant hyperparameters such as the number of bees in the hive, the percentage of scouts to onlookers, and the number of top solutions to be followed. Finally, useful applications of the new algorithm are discussed, as well as limitations and possible future research opportunities.
Students evaluate their achievement in a specific domain in relation to their achievement in other domains and form their self-concepts accordingly. These comparison processes have been termed dimensional comparisons and shown to be an important source of academic self-concepts in addition to social and temporal comparisons. Research on the internal/external frame of reference model (I/E model) has frequently found negative effects of students' achievement on their academic self-concept between different scholastic domains (mathematics and the language of instruction) that are interpreted as contrast effects of dimensional comparisons. There is mixed evidence with regard to whether negative contrast effects or positive assimilation effects occur when students compare their achievement in domains that are more similar. In this study, we extended the original I/E model with 3 science domains (biology, chemistry, and physics). Using structural equation modeling, we analyzed the domain-specific self-concepts, grades, and test scores of a representative sample of 9th-grade students in Germany (N = 20,050) across 5 domains. Mathematics, physics, and chemistry showed contrast effects to German, whereas small assimilation effects were found between mathematics, physics, and chemistry. This effect pattern was present for both grades and test scores. Achievement in mathematics and the language of instruction affected self-concepts in the sciences, whereas achievement in the sciences had no effect on self-concepts in other subjects. The results support the hypotheses derived from dimensional comparison theory that both contrast and assimilation effects can result from dimensional comparisons and that the 3 science subjects are affected differentially by these comparisons.
Unproctored, web-based assessments are frequently compromised by a lack of control over the participants' test-taking behavior. It is likely that participants cheat if personal consequences are high. This meta-analysis summarizes findings on context effects in unproctored and proctored ability assessments and examines mean score differences and correlations between both assessment contexts. As potential moderators, we consider (a) the perceived consequences of the assessment, (b) countermeasures against cheating, (c) the susceptibility to cheating of the measure itself, and (d) the use of different test media. For standardized mean differences, a three-level random-effects meta-analysis based on 109 effect sizes from 49 studies (total N = 100,434) identified a pooled effect of Δ = 0.20, 95% CI [0.10, 0.31], indicating higher scores in unproctored assessments. Moderator analyses revealed significantly smaller effects for measures that are difficult to research on the Internet. These results demonstrate that unproctored ability assessments are biased by cheating. Unproctored assessments may be most suitable for tasks that are difficult to search on the Internet.
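To make the idea of a pooled effect with a confidence interval concrete, here is a minimal sketch of a conventional two-level random-effects pooling (DerSimonian-Laird estimator). This is a simpler model than the three-level meta-analysis reported above, which additionally accounts for several effect sizes nested within the same study. The function name `pool_random_effects` and the toy numbers are assumptions for illustration and are not the study's data.

```python
import numpy as np

def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (two-level simplification).

    effects:   standardized mean differences, one per effect size.
    variances: their sampling variances.
    Returns (pooled effect, 95% CI as a tuple, between-study variance tau2).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect weights
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)              # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # truncated at zero
    w_star = 1.0 / (v + tau2)                     # random-effects weights
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2

# Hypothetical toy input, chosen only to exercise the function.
toy_effects = [0.05, 0.30, 0.22, 0.41]
toy_variances = [0.010, 0.020, 0.015, 0.030]
pooled, ci, tau2 = pool_random_effects(toy_effects, toy_variances)
print(f"pooled = {pooled:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}], tau2 = {tau2:.3f}")
```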
Intelligence has been declared as a necessary but not sufficient condition for creativity, which was subsequently (erroneously) translated into the so-called threshold hypothesis. This hypothesis predicts a change in the correlation between creativity and intelligence at around 1.33 standard deviations above the population mean. A closer inspection of previous inconclusive results suggests that the heterogeneity is mostly due to the use of suboptimal data analytical procedures. Herein, we applied and compared three methods that allowed us to handle intelligence as a continuous variable. In more detail, we examined the threshold of the creativity-intelligence relation with (a) scatterplots and heteroscedasticity analysis, (b) segmented regression analysis, and (c) local structural equation models in two multivariate studies (N = 456; N = 438). We found no evidence for the threshold hypothesis of creativity across different analytical procedures in both studies. Given the problematic history of the threshold hypothesis and its unequivocal rejection with appropriate multivariate methods, we recommend the total abandonment of the threshold.
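Segmented regression, one of the three methods named above, can be sketched as a grid search over candidate breakpoints: for each candidate, a regression with a hinge term is fitted, and the breakpoint with the smallest residual sum of squares is kept. The code below is an illustrative sketch under that assumption and not the authors' analysis; `fit_segmented` and the synthetic data are hypothetical. Under the threshold hypothesis the slope change (the third coefficient) would be clearly nonzero near the proposed threshold, whereas a value close to zero indicates no change in the relation.

```python
import numpy as np

def fit_segmented(x, y, candidates):
    """Fit y = b0 + b1*x + b2*max(x - psi, 0) for each candidate psi.

    Returns (best breakpoint, coefficients [b0, b1, b2], residual SSE).
    b2 is the change in slope above the breakpoint.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for psi in candidates:
        hinge = np.maximum(x - psi, 0.0)
        design = np.column_stack([np.ones_like(x), x, hinge])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if best is None or sse < best[2]:
            best = (float(psi), coef, sse)
    return best

# Synthetic, noise-free toy data with a known kink at x = 1.0
# (slope 0.5 below the kink, 0.1 above it); not data from the studies.
x = np.linspace(-3.0, 3.0, 121)
y = 0.5 * x - 0.4 * np.maximum(x - 1.0, 0.0)

psi_hat, coef, sse = fit_segmented(x, y, np.linspace(-2.0, 2.0, 41))
print("estimated breakpoint:", round(psi_hat, 2))
print("slope change above breakpoint:", round(float(coef[2]), 2))
```

With real, noisy data the estimated breakpoint needs a standard error or a model comparison against the plain linear model before it is interpreted.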
Medical education research has focused almost entirely on the education of future physicians. In comparison, findings on other health-related occupations, such as medical assistants, are scarce. With the current study, we wanted to examine the knowledge-is-power hypothesis in a real-life educational setting and add to the sparse literature on medical assistants. Acquisition of vocational knowledge in vocational education and training (VET) was examined for medical assistant students (N = 448). Differences in domain-specific vocational knowledge were predicted by crystallized and fluid intelligence in the course of VET. A multiple matrix design with 3 year-specific booklets was used for the vocational knowledge tests of the medical assistants. The unique and joint contributions of the predictors were investigated with structural equation modeling. Crystallized intelligence emerged as the strongest predictor of vocational knowledge at every stage of VET, while fluid intelligence only showed weak effects. The present results support the knowledge-is-power hypothesis, even in a broad and more naturalistic setting. This emphasizes the relevance of general knowledge for occupations, such as medical assistants, which are more focused on learning hands-on skills than on the acquisition of academic knowledge.