Latent class analysis often aims to relate the classes to continuous external consequences ("distal outcomes"), but estimating such relationships necessitates distributional assumptions. Lanza, Tan, and Bray (2013) suggested circumventing such assumptions with their LTB approach: a linear logistic regression of latent class membership on each distal outcome is first fitted, after which this estimated relationship is reversed using Bayes' rule. However, the LTB approach currently has 3 drawbacks, which we address in this article. First, LTB interchanges the assumption of normality for one of homoskedasticity, or, equivalently, of linearity of the logistic regression, leading to bias. Fortunately, we show that introducing higher order terms prevents this bias. Second, we improve coverage rates by replacing approximate standard errors with resampling methods. Finally, we introduce a bias-corrected 3-step version of LTB as a practical alternative to standard LTB. The improved LTB methods are validated by a simulation study, and an example application demonstrates their usefulness.
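The Bayes' rule reversal described in this abstract can be illustrated with a minimal sketch. It assumes that the class probabilities P(C = c | Z = z_i) have already been obtained from a fitted logistic regression, and that the marginal distribution of the distal outcome is approximated by its empirical distribution (weight 1/n per observation). The function name and the toy numbers are hypothetical and are not taken from the article.

```python
def class_specific_means(z, class_probs):
    """Turn P(C=c | Z=z_i) into E[Z | C=c] using Bayes' rule.

    z           : list of observed distal outcome values z_i
    class_probs : list of rows, class_probs[i][c] = P(C=c | Z=z_i)
                  (assumed to come from a fitted logistic regression)

    With the empirical distribution for Z, f(z_i | C=c) is proportional
    to P(C=c | z_i), so the class-specific mean is a weighted average.
    """
    n_classes = len(class_probs[0])
    means = []
    for c in range(n_classes):
        weights = [row[c] for row in class_probs]
        total = sum(weights)  # proportional to P(C=c)
        means.append(sum(w * zi for w, zi in zip(weights, z)) / total)
    return means


# Toy illustration with two classes and four observations (made-up values).
z = [0.0, 1.0, 2.0, 3.0]
probs = [[0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.1, 0.9]]
print(class_specific_means(z, probs))
```

In this toy example the first class is weighted toward low outcome values and the second toward high ones, so the two class-specific means fall on opposite sides of the overall mean of 1.5.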
In linear regression problems with many predictors, penalized regression techniques are often used to guard against overfitting and to select variables relevant for predicting an outcome variable. Recently, Bayesian penalization, in which the prior distribution performs a function similar to that of the penalty term in classical penalization, has become increasingly popular. Specifically, the so-called shrinkage priors in Bayesian penalization aim to shrink small effects to zero while maintaining true large effects. Compared to classical penalization techniques, Bayesian penalization techniques perform similarly or sometimes even better, and they offer additional advantages such as readily available uncertainty estimates, automatic estimation of the penalty parameter, and more flexibility in terms of penalties that can be considered. However, many different shrinkage priors exist and the available, often quite technical, literature primarily focuses on presenting one shrinkage prior and often provides comparisons with only one or two other shrinkage priors. This can make it difficult for researchers to navigate through the many prior options and choose a shrinkage prior for the problem at hand. Therefore, the aim of this paper is to provide a comprehensive overview of the literature on Bayesian penalization. We provide a theoretical and conceptual comparison of nine different shrinkage priors and parametrize the priors, if possible, in terms of scale mixture of normal distributions to facilitate comparisons. We illustrate different characteristics and behaviors of the shrinkage priors and compare their performance in terms of prediction and variable selection in a simulation study. Additionally, we provide two empirical examples to illustrate the application of Bayesian penalization.
Finally, an R package bayesreg is available online (https://github.com/sara-vanerp/bayesreg) which allows researchers to perform Bayesian penalized regression with novel shrinkage priors in an easy manner.
•Various shrinkage priors have distinctive theoretical characteristics.
•Most priors have a similar prediction accuracy unless p>n.
•Different shrinkage priors vary in variable selection accuracy.
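The scale mixture of normals parametrization mentioned in the abstract above can be written out as follows. The notation (regression coefficient \beta_j, local variance \tau_j^2, penalty parameter \lambda) is assumed here for illustration, and the Bayesian lasso is used only as a well-known textbook instance; the abstract does not state which nine priors are compared.

```latex
% General scale mixture of normals form of a shrinkage prior
\beta_j \mid \tau_j^2 \sim \mathcal{N}(0, \tau_j^2), \qquad
\tau_j^2 \sim \pi(\tau_j^2), \qquad
p(\beta_j) = \int_0^\infty \mathcal{N}(\beta_j \mid 0, \tau_j^2)\, \pi(\tau_j^2)\, d\tau_j^2 .

% Example: an exponential mixing density with rate \lambda^2 / 2
% yields the Laplace (double exponential) prior of the Bayesian lasso
\tau_j^2 \sim \operatorname{Exponential}\!\left(\tfrac{\lambda^2}{2}\right)
\;\Longrightarrow\;
p(\beta_j) = \frac{\lambda}{2} \exp\!\left(-\lambda \lvert \beta_j \rvert\right).
```

Writing different priors in this common form makes their differences visible in the mixing density \pi(\tau_j^2) alone, which is what facilitates the comparison.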
Abstract
Bayesian structural equation modeling (BSEM) has recently gained popularity because it enables researchers to fit complex models and solve some of the issues often encountered in classical maximum likelihood estimation, such as nonconvergence and inadmissible solutions. An important component of any Bayesian analysis is the prior distribution of the unknown model parameters. Often, researchers rely on default priors, which are constructed in an automatic fashion without requiring substantive prior information. However, the prior can have a serious influence on the estimation of the model parameters, which affects the mean squared error, bias, coverage rates, and quantiles of the estimates. In this article, we investigate the performance of three different default priors: noninformative improper priors, vague proper priors, and empirical Bayes priors, with the latter being novel in the BSEM literature. Based on a simulation study, we find that these three default BSEM methods may perform very differently, especially with small samples. A careful prior sensitivity analysis is therefore needed when performing a default BSEM analysis. For this purpose, we provide a practical step-by-step guide for practitioners on conducting a prior sensitivity analysis in default BSEM. Our recommendations are illustrated using a well-known case study from the structural equation modeling literature, and all code for conducting the prior sensitivity analysis is available in the online supplemental materials.
Translational Abstract
Psychologists and social scientists often ask complex questions regarding group- and individual differences and how these change over time. To answer these questions, researchers generally use structural equation modeling (SEM), a general framework to fit complex models. Traditionally, structural equation models are estimated using maximum likelihood estimation, which uses only the data at hand. An alternative approach is Bayesian SEM, which is becoming increasingly popular because it can solve several problems of maximum likelihood estimation. Bayesian SEM combines the data at hand with a prior distribution of the parameters in the model. This prior distribution can be based on subjective beliefs or previous research. In practice, however, researchers tend to use noninformative "default" priors which are constructed in an automatic fashion without including any substantive prior information. Different default priors can be used for this purpose. Through a simulation study, we show that the exact choice of the default prior can have a serious influence on the estimation of the model parameters, especially when the sample size is small. Because of this finding, we recommend researchers who use default Bayesian SEM to always perform a prior sensitivity analysis to determine how robust the conclusions are across the various analyses. We provide a practical step-by-step guide on how to conduct a prior sensitivity analysis and we illustrate our recommendations with a practical application.
Abstract
To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool to accelerate the step of screening titles and abstracts. For many tasks, including but not limited to systematic reviews and meta-analyses, the scientific literature needs to be checked systematically. Scholars and practitioners currently screen thousands of studies by hand to determine which studies to include in their review or meta-analysis. This is error prone and inefficient because of extremely imbalanced data: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction with machine learning algorithms to deal with the enormous increase of available text. We therefore developed an open source machine learning-aided pipeline applying active learning: ASReview. We demonstrate by means of simulation studies that active learning can yield far more efficient reviewing than manual reviewing while providing high quality. Furthermore, we describe the options of the free and open source research software and present the results from user experience tests. We invite the community to contribute to open source projects such as our own that provide measurable and reproducible improvements over current practice.
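The idea of active learning for screening can be sketched as a loop in which a model is updated after every human decision and the reviewer is shown next the record that the model currently ranks as most likely relevant. The sketch below is a toy illustration only: the word-overlap score stands in for a real classifier, and none of the function names correspond to the ASReview software or its API.

```python
def tokens(text):
    """Split a record into a set of lowercase words."""
    return set(text.lower().split())


def score(record, relevant_seen, irrelevant_seen):
    """Toy relevance score: word overlap with records labeled relevant,
    minus word overlap with records labeled irrelevant."""
    words = tokens(record)
    pos = sum(len(words & tokens(r)) for r in relevant_seen)
    neg = sum(len(words & tokens(r)) for r in irrelevant_seen)
    return pos - neg


def screen(records, is_relevant, seed_index, budget):
    """Label at most `budget` records, starting from one seed record.

    `is_relevant` plays the role of the human reviewer (hypothetical).
    Each round re-scores the unlabeled pool with all labels so far and
    queries the top-ranked record.
    """
    labeled = {seed_index: is_relevant(records[seed_index])}
    while len(labeled) < budget and len(labeled) < len(records):
        rel = [records[i] for i, y in labeled.items() if y]
        irr = [records[i] for i, y in labeled.items() if not y]
        pool = [i for i in range(len(records)) if i not in labeled]
        nxt = max(pool, key=lambda i: score(records[i], rel, irr))
        labeled[nxt] = is_relevant(records[nxt])  # reviewer's decision
    return labeled
```

Because the pool is re-ranked after every decision, relevant records tend to surface early, which is the source of the efficiency gain over screening in arbitrary order.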
Surveys are well known to contain response errors of different types, including acquiescence, social desirability, common method variance and random error simultaneously. Nevertheless, most methods developed to estimate and correct for such errors consider only a single error source at a time in practice. Consequently, estimation of response errors is inefficient, their relative importance is unknown and the optimal question format may not be discoverable. To remedy this situation, we demonstrate how multiple types of errors can be estimated concurrently with the recently introduced ‘multitrait‐multierror’ (MTME) approach. MTME combines the theory of design of experiments with latent variable modelling to estimate response error variances of different error types simultaneously. This allows researchers to evaluate which errors are most impactful, and aids in the discovery of optimal question formats. We apply this approach using representative data from the United Kingdom to six survey items measuring attitudes towards immigrants that are commonly used across public opinion studies.
To gain a better understanding of osteoarthritis (OA) heterogeneity and its predictors for distinguishing OA phenotypes. This could provide the opportunity to tailor prevention and treatment strategies and thus improve care.
Ten-year follow-up data from CHECK (1002 early-OA subjects with first general practitioner visit for complaints ≤6 months before inclusion) were used. Data were collected on WOMAC (pain, function, stiffness), quantitative radiographic tibiofemoral (TF) OA characteristics, and semi-quantitative radiographic patellofemoral (PF) OA characteristics. Using functional data analysis, distinctive sets of trajectories were identified for WOMAC, TF and PF characteristics, based on model fit and clinical interpretation. The probabilities of knee membership to each trajectory were used in hierarchical cluster analyses to derive knee OA phenotypes. The number and composition of potential phenotypes were selected again based on model fit (silhouette score) and clinical interpretation.
Five trajectories representing different constant levels or changing WOMAC scores were identified. For TF and PF OA, eight and six trajectories respectively were identified based on (changes in) joint space narrowing, osteophytes and sclerosis. Combining the probabilities of knees belonging to these different trajectories resulted in six clusters ('phenotypes') of knees with different degrees of functional (WOMAC) and radiographic (PF) parameters; TF parameters were found not to significantly contribute to clustering. Including baseline characteristics as well resulted in eight clusters of knees, dominated by sex, menopausal status and WOMAC scores, with only limited contribution of PF features.
Several stable and progressive trajectories of OA symptoms and radiographic features were identified, resulting in phenotypes with relatively independent symptomatic and radiographic features. Sex and menopausal status may be especially important when phenotyping knee OA patients, while radiographic features contributed less. Possible phenotypes were identified that, after validation, could aid personalized treatments and patient selection.
Text embedding models from Natural Language Processing can map text data (e.g. words, sentences, documents) to meaningful numerical representations (a.k.a. text embeddings). While such models are increasingly applied in social science research, one important issue is often not addressed: the extent to which these embeddings are high-quality representations of the information needed to be encoded. We view this quality evaluation problem from a measurement validity perspective, and propose the use of the classic construct validity framework to evaluate the quality of text embeddings. First, we describe how this framework can be adapted to the opaque and high-dimensional nature of text embeddings. Second, we apply our adapted framework to an example where we compare the validity of survey question representation across text embedding models.
Social and behavioral scientists are increasingly employing technologies such as fMRI, smartphones, and gene sequencing, which yield 'high-dimensional' datasets with more columns than rows. There is increasing interest in, but little substantive theory about, the role the variables in these data play in known processes.
This necessitates exploratory mediation analysis, for which structural equation modeling is the benchmark method. However, this method cannot perform mediation analysis with more variables than observations. One option is to run a series of univariate mediation models, which incorrectly assumes independence of the mediators. Another option is regularization, but the available implementations may lead to high false-positive rates.
In this article, we develop a hybrid approach which uses components of both filter and regularization: the 'Coordinate-wise Mediation Filter'. It performs filtering conditional on the other selected mediators. We show through simulation that it improves performance over existing methods. Finally, we provide an empirical example, showing how our method may be used for epigenetic research.
Temperature is a primary driver of the distribution of biodiversity as well as of ecosystem boundaries. Declining temperature with increasing elevation in montane systems has long been recognized as a major factor shaping plant community biodiversity, metabolic processes, and ecosystem dynamics. Elevational gradients, as thermoclines, also enable prediction of long-term ecological responses to climate warming. One of the most striking manifestations of increasing elevation is the abrupt transitions from forest to treeless alpine tundra. However, whether there are globally consistent above- and belowground responses to these transitions remains an open question. To disentangle the direct and indirect effects of temperature on ecosystem properties, here we evaluate replicate treeline ecotones in seven temperate regions of the world. We find that declining temperatures with increasing elevation did not affect tree leaf nutrient concentrations, but did reduce ground-layer community-weighted plant nitrogen, leading to the strong stoichiometric convergence of ground-layer plant community nitrogen to phosphorus ratios across all regions. Further, elevation-driven changes in plant nutrients were associated with changes in soil organic matter content and quality (carbon to nitrogen ratios) and microbial properties. Combined, our identification of direct and indirect temperature controls over plant communities and soil properties in seven contrasting regions suggests that future warming may disrupt the functional properties of montane ecosystems, particularly where plant community reorganization outpaces treeline advance.
Personal growth initiative (PGI), defined as being proactive about one's personal development, is critical to graduate students' academic success. Prior research has shown that students' PGI can be enhanced through interventions that focus on stimulating developmental activities. Within this study, we aimed to investigate whether an intervention that stimulates development in the area of one's personal strengths (strengths intervention) has more beneficial effects on students' PGI than an intervention that stimulates development in the area of individual deficiencies (deficiency intervention). We conducted 2 longitudinal field experiments to investigate the effects of the 2 interventions on students' PGI (Experiment 1) and the potential mediating role of psychological capital (PsyCap) in this regard (Experiment 2). In Experiment 1, 105 (N = 105) university students participated in either a strengths intervention or a deficiency intervention. Results indicated that the strengths intervention increased the students' PGI in the short but not in the long term, whereas the deficiency intervention did not affect PGI. Ninety students (N = 90) participated in Experiment 2, in which we slightly refined both interventions by putting a stronger emphasis on the ongoing development of strengths (strengths intervention) or correction of deficiencies (deficiency intervention) by adding posttraining assignments. Results suggested that participating in both interventions led to increases in PGI over a 3-month period, but that these increases were bigger for the strengths intervention group. Furthermore, the relationship between the strengths intervention and PGI was mediated by hope as one component of PsyCap.