Although event-related potential (ERP) research on language processing has capitalized on key, theoretically influential components such as the N400 and P600, their measurement properties, especially the variability in their temporal and spatial parameters, have rarely been examined. The current study examined the measurement properties of the N400 and P600 effects elicited by semantic and syntactic anomalies, respectively, during sentence processing. We used a bootstrap resampling procedure to randomly draw many thousands of resamples varying in sample size and stimulus count from a larger sample of 187 participants and 40 stimulus sentences of each type per condition. Our resampling investigation focused on three issues: (a) statistical power; (b) variability in the magnitudes of the effects; and (c) variability in the temporal and spatial profiles of the effects. At the level of grand averages, the N400 and P600 effects were both robust and substantial. However, across resamples, there was a high degree of variability in effect magnitudes, onset times, and scalp distributions, which may be greater than is currently appreciated in the literature, especially for the P600 effects. These results provide a useful basis for designing future studies using these two well-established ERP components. At the same time, they also highlight challenges that need to be addressed in future research (e.g., how best to analyze ERP data without engaging in questionable research practices such as p-hacking).
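The resampling logic described above can be sketched in a few lines. Everything below is a hypothetical stand-in: the effect size, participant/item variance components, and noise level are invented, and a one-sample t-test on participant averages substitutes for the authors' actual analysis pipeline.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical "full" dataset: 187 participants x 40 items of single-trial
# effect values (anomalous minus control, in microvolts). All parameters
# are illustrative, not values from the study.
N_PART, N_ITEMS = 187, 40
true_effect = -2.0                                # assumed grand-mean N400 effect
p_dev = rng.normal(0.0, 1.0, N_PART)[:, None]     # participant deviations
i_dev = rng.normal(0.0, 0.7, N_ITEMS)[None, :]    # item deviations
data = true_effect + p_dev + i_dev + rng.normal(0.0, 8.0, (N_PART, N_ITEMS))

def bootstrap_power(n_part, n_items, n_resamples=500, alpha=0.05):
    """Resample participants and items with replacement; report how often
    the participant-average effect differs significantly from zero."""
    hits = 0
    for _ in range(n_resamples):
        p = rng.choice(N_PART, n_part, replace=True)
        i = rng.choice(N_ITEMS, n_items, replace=True)
        means = data[np.ix_(p, i)].mean(axis=1)   # one value per participant
        hits += stats.ttest_1samp(means, 0.0).pvalue < alpha
    return hits / n_resamples

power_small = bootstrap_power(n_part=12, n_items=20)
power_large = bootstrap_power(n_part=60, n_items=40)
```

Varying `n_part` and `n_items` over a grid, rather than two points, yields the kind of power surface the study reports.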
Effect sizes are the currency of psychological research. They quantify the results of a study to answer the research question and are used to calculate statistical power. The interpretation of effect sizes (when is an effect small, medium, or large?) has been guided by the recommendations Jacob Cohen gave in his pioneering writings starting in 1962: either compare an effect with the effects found in past research or use certain conventional benchmarks. The present analysis shows that neither of these recommendations is currently applicable. From past publications without preregistration, 900 effects were randomly drawn and compared with 93 effects from publications with preregistration, revealing a large difference: effects from the former (median = 0.36) were much larger than effects from the latter (median = 0.16). That is, certain biases, such as publication bias or questionable research practices, have caused a dramatic inflation in published effects, making it difficult to compare an actual effect with the real population effects (as these are unknown). In addition, there were very large differences in the mean effects between psychological sub-disciplines and between different study designs, making it impossible to apply any global benchmarks. Many more pre-registered studies are needed in the future to derive a reliable picture of real population effects.
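The inflation mechanism can be reproduced in a toy simulation. The numbers here (true d = 0.15, n = 50 per group) are invented for illustration, and "only significant positive results get published" is a deliberately crude stand-in for publication bias:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Many two-group studies of a small true effect. Without preregistration,
# assume only significant positive results reach publication; with
# preregistration, all results do.
true_d, n, n_studies = 0.15, 50, 20000
g1 = rng.normal(0.0, 1.0, (n_studies, n))
g2 = rng.normal(true_d, 1.0, (n_studies, n))
res = stats.ttest_ind(g2, g1, axis=1)

pooled_sd = np.sqrt((g1.var(axis=1, ddof=1) + g2.var(axis=1, ddof=1)) / 2)
d_obs = (g2.mean(axis=1) - g1.mean(axis=1)) / pooled_sd   # per-study Cohen's d

median_all = np.median(d_obs)                              # "preregistered" picture
published = d_obs[(res.pvalue < 0.05) & (res.statistic > 0)]
median_published = np.median(published)                    # selection-inflated picture
```

The median of the selected studies lands far above the true effect, while the median over all studies recovers it, mirroring the 0.36 versus 0.16 gap reported above.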
Purpose
Partial least squares structural equation modeling (PLS-SEM) is an important statistical technique in the toolbox of methods that researchers in marketing and other social science disciplines frequently use in their empirical analyses. The purpose of this paper is to shed light on several misconceptions that have emerged as a result of the proposed “new guidelines” for PLS-SEM. The authors discuss various aspects of current debates on when (or when not) to use PLS-SEM and which model evaluation metrics to apply. In addition, this paper summarizes several important methodological extensions of PLS-SEM that researchers can use to improve the quality of their analyses, results and findings.
Design/methodology/approach
The paper merges literature from various disciplines, including marketing, strategic management, information systems, accounting and statistics, to present a state-of-the-art review of PLS-SEM. Based on these findings, the paper offers a point of orientation on how to consider and apply these latest developments when executing or assessing PLS-SEM-based research.
Findings
This paper offers guidance regarding situations that favor the use of PLS-SEM and discusses the need to consider certain model evaluation metrics. It also summarizes how to deal with endogeneity in PLS-SEM, and critically comments on the recent proposal to adjust PLS-SEM estimates to mimic common factor models that are the foundation of covariance-based SEM. Finally, this paper opposes characterizing common concepts and practices of PLS-SEM as “out-of-date” without providing well-substantiated alternatives and solutions.
Research limitations/implications
The paper paves the way for future discussions and suggests a way forward to reach consensus regarding situations that favor PLS-SEM use and its application.
Practical implications
This paper offers guidance on how to consider the latest methodological developments when executing or assessing PLS-SEM-based research.
Originality/value
This paper complements recently proposed “new guidelines” with the aim of offering a counter perspective on some strong claims made in the latest literature on PLS-SEM. It also clarifies some misconceptions regarding the application of PLS-SEM.
Quantifying the uncertainty of renewable energy generation units and loads is critical to ensuring the dynamic security of next-generation power systems. To achieve that goal, time-consuming Monte Carlo simulations are usually used, which are not suitable for online dynamic analysis of large-scale power systems. To circumvent this difficulty, two uncertainty quantification approaches using polynomial-chaos-based methods are proposed and investigated. The first approach is the generalized polynomial chaos method, which is able to reduce the computing time by three orders of magnitude compared with Monte Carlo methods while achieving the same accuracy. We find that this approach is very useful for short-term power system dynamic simulations, but it may produce unreliable results for long-term simulations. To address this weakness, we present the second method, namely the multi-element generalized polynomial chaos method, which is more accurate and more numerically stable than the generalized polynomial chaos method. Since the uncertainties of renewable energy generation units and loads can follow very different distributions, we extend the Stieltjes recursive procedure, which allows us to derive the orthogonal basis functions for any assumed probability distribution of the input random variables. Extensive simulations carried out on the WECC 3-machine 9-bus system and the New England 10-machine 39-bus system reveal that the proposed approaches produce accuracy comparable to the Monte Carlo method while achieving significantly improved computational efficiency for both stable and unstable power system operating conditions.
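A minimal sketch of the generalized polynomial chaos idea, for a single standard-normal input: expand the response in probabilists' Hermite polynomials and read the mean and variance directly off the coefficients. Here `exp(X)` is an invented stand-in for a nonlinear power-system response, and the Stieltjes step for non-Gaussian inputs is not shown.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

ORDER = 8
x, w = hermegauss(40)          # Gauss-Hermite nodes/weights, weight e^{-x^2/2}
w = w / w.sum()                # normalize so sums approximate N(0,1) expectations

f = np.exp(x)                  # model response evaluated at the nodes

# Projection: c_k = E[f(X) He_k(X)] / k!  (He_k are orthogonal with norm k!)
c = np.empty(ORDER + 1)
for k in range(ORDER + 1):
    He_k = hermeval(x, [0.0] * k + [1.0])     # He_k at the quadrature nodes
    c[k] = np.sum(w * f * He_k) / math.factorial(k)

pce_mean = c[0]
pce_var = sum(math.factorial(k) * c[k] ** 2 for k in range(1, ORDER + 1))

exact_mean = math.exp(0.5)                    # E[exp(X)], X ~ N(0,1)
exact_var = math.exp(2.0) - math.exp(1.0)     # Var[exp(X)]
```

Once the coefficients are known, statistics of the response come for free, with no repeated simulation, which is the source of the speedup over Monte Carlo claimed in the abstract.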
Repeated Measures Correlation. Bakdash, Jonathan Z.; Marusich, Laura R. Frontiers in Psychology, April 2017, Volume 8. Journal article; peer reviewed; open access.
Repeated measures correlation (rmcorr) is a statistical technique for determining the common within-individual association for paired measures assessed on two or more occasions for multiple individuals. Simple regression/correlation is often applied to non-independent observations or aggregated data; this may produce biased, specious results due to violation of independence and/or differing patterns between participants versus within participants. Unlike simple regression/correlation, rmcorr does not violate the assumption of independence of observations. Also, rmcorr tends to have much greater statistical power because neither averaging nor aggregation is necessary for an intra-individual research question. Rmcorr estimates the common regression slope, the association shared among individuals. To make rmcorr accessible, we provide background information on its assumptions and equations, visualization, power, and the trade-offs of rmcorr compared with multilevel modeling. We introduce the R package (rmcorr) and demonstrate its use for inferential statistics and visualization with two example datasets. The examples illustrate research questions at different levels of analysis (intra-individual and inter-individual). Rmcorr is well suited for research questions regarding the common linear association in paired repeated measures data. All results are fully reproducible.
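The rmcorr point estimate can be sketched as a Pearson correlation on subject-mean-centered data with the ANCOVA error degrees of freedom, a simplification of the package's full ANCOVA machinery. The simulated dataset (20 subjects, 8 pairs each, slope 0.6, large between-subject offsets) is invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical paired repeated measures: subjects differ in overall level
# (offsets) but share a common positive within-subject slope.
n_subj, n_obs = 20, 8
subj = np.repeat(np.arange(n_subj), n_obs)
x = rng.normal(0.0, 1.0, n_subj * n_obs)
offsets = rng.normal(0.0, 5.0, n_subj)          # between-subject variation
y = offsets[subj] + 0.6 * x + rng.normal(0.0, 1.0, x.size)

def rmcorr(subj, x, y):
    """Common within-subject correlation: Pearson r on subject-mean-centered
    data, tested with ANCOVA error df = N_total - n_subjects - 1."""
    groups = np.unique(subj)
    xc = x - np.array([x[subj == s].mean() for s in groups])[subj]
    yc = y - np.array([y[subj == s].mean() for s in groups])[subj]
    r = np.corrcoef(xc, yc)[0, 1]
    df = len(x) - len(groups) - 1
    t = r * np.sqrt(df / (1.0 - r ** 2))
    return r, 2.0 * stats.t.sf(abs(t), df)

r_rm, p_rm = rmcorr(subj, x, y)
r_naive = np.corrcoef(x, y)[0, 1]   # ignores the repeated-measures structure
```

Because the between-subject offsets swamp the within-subject signal, the naive pooled correlation is strongly attenuated while rmcorr recovers the shared slope.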
The sample size necessary to obtain a desired level of statistical power depends in part on the population value of the effect size, which is, by definition, unknown. A common approach to sample-size planning uses the sample effect size from a prior study as an estimate of the population value of the effect to be detected in the future study. Although this strategy is intuitively appealing, effect-size estimates, taken at face value, are typically not accurate estimates of the population effect size because of publication bias and uncertainty. We show that the use of this approach often results in underpowered studies, sometimes to an alarming degree. We present an alternative approach that adjusts sample effect sizes for bias and uncertainty, and we demonstrate its effectiveness for several experimental designs. Furthermore, we discuss an open-source R package, BUCSS, and user-friendly Web applications that we have made available to researchers so that they can easily implement our suggested methods.
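The underpowering mechanism can be demonstrated with a normal-approximation power formula and a simulated literature; this is a simplified illustration of the problem BUCSS addresses, not the package's correction itself, and all numbers (true d = 0.25, pilot n = 30) are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

true_d = 0.25
z_a, z_b = stats.norm.ppf(0.975), stats.norm.ppf(0.80)

def n_per_group(d):
    """Normal-approximation n per group for a two-group test, 80% power."""
    return int(np.ceil(2 * (z_a + z_b) ** 2 / d ** 2))

def power(d, n):
    """Approximate power of a two-group z-test at true effect d, n per group."""
    return stats.norm.cdf(d * np.sqrt(n / 2) - z_a)

# Simulate pilot studies; keep only significant positive results (a crude
# stand-in for publication bias), then plan the follow-up study on the
# median published effect size.
n_pilot, n_sims = 30, 20000
g1 = rng.normal(0.0, 1.0, (n_sims, n_pilot))
g2 = rng.normal(true_d, 1.0, (n_sims, n_pilot))
res = stats.ttest_ind(g2, g1, axis=1)
d_obs = (g2.mean(1) - g1.mean(1)) / np.sqrt(
    (g1.var(1, ddof=1) + g2.var(1, ddof=1)) / 2)
d_pub = np.median(d_obs[(res.pvalue < 0.05) & (res.statistic > 0)])

planned_n = n_per_group(d_pub)            # looks adequate for the published d...
actual_power = power(true_d, planned_n)   # ...but is underpowered for the true d
```

Planning on the inflated published effect yields a sample size whose real power falls far below the nominal 80%, exactly the failure mode described above.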
Rejoinder. Meyners, Michael; Carr, B. Thomas; Hasted, Anne. Food Quality and Preference, January 2020, Volume 79. Journal article; peer reviewed.
• Responses to main discussion points from commentary authors.
• Additional builds highlighting the need for statistical power calculations.
• Careful distinction between a priori and post hoc power.
• Designs for future studies should not rely on post hoc power.
Five commentaries by widely acknowledged experts were received for our initial letter regarding the use of replications in sensory studies. These commentaries are much appreciated, as they add substantial value to the discussion, covering various relevant aspects. Here, we reply to some of the main aspects and build on these further.
Abstract
The estimation of power in two-level models used to analyze hierarchically structured data is particularly complex because the outcome contains variance at two levels that is regressed on predictors at two levels. Methods for the estimation of power in two-level models have been based on formulas and Monte Carlo simulation. We provide a hands-on tutorial illustrating how a priori and post hoc power analyses for the most frequently used two-level models are conducted. We describe how a population model for the power analysis can be specified by using standardized input parameters and how the power analysis is implemented in SIMR, a very flexible power estimation method based on Monte Carlo simulation. Finally, we provide case-sensitive rules of thumb for deriving sufficient sample sizes as well as minimum detectable effect sizes that yield a power ≥ .80 for the effects and input parameters most frequently analyzed by psychologists. For medium variance components, the results indicate that with lower-level (L1) sample sizes up to 30 and higher-level (L2) sample sizes up to 200, medium and large fixed effects can be detected. However, small L2 direct or cross-level interaction effects cannot be detected with up to 200 clusters. The tutorial and guidelines should be of help to researchers dealing with multilevel study designs such as individuals clustered within groups or repeated measurements clustered within individuals.
Translational Abstract
In psychological research, two-level models are used to analyze data that are hierarchically structured. Such hierarchies in data can occur when participants are clustered within groups or repeated measurements are made for the same participants. Hierarchically structured data lead to quite complex dependencies among variances: (a) the outcome variable contains variance at two different levels, (b) predictor variables at both levels relate to outcome variance at the respective level (direct effects), (c) the size of the effect of a predictor variable on the lower level can vary between clusters at the higher level and (d) this variation can be explained by predictor variables of the higher level (so called cross-level interaction effects). All these variances and their dependencies must be specified to estimate the likelihood of obtaining statistically significant effects in a two-level model-known as the statistical power. We provide a hands-on tutorial illustrating the specification of these parameters and the implementation of a power analysis in the statistical environment R. We also provide rules of thumb for the sample sizes necessary to detect an effect of a certain size with sufficient power.
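The Monte Carlo logic behind such power analyses can be sketched for the simplest case, a level-2 direct effect. This is a deliberate simplification: it tests the cluster-level predictor by regressing cluster means on it, rather than fitting the full two-level model as SIMR does, and all parameter values are invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

def l2_power(n_clusters, n_per_cluster, gamma=0.3, tau=1.0, sigma=1.0,
             n_sims=1000, alpha=0.05):
    """Monte Carlo power for a level-2 (cluster-level) direct effect gamma.
    The outcome has between-cluster SD tau and within-cluster SD sigma;
    each simulated dataset is tested by regressing cluster means on the
    L2 predictor z."""
    hits = 0
    for _ in range(n_sims):
        z = rng.normal(0.0, 1.0, n_clusters)            # L2 predictor
        u = rng.normal(0.0, tau, n_clusters)            # random intercepts
        y = (gamma * z[:, None] + u[:, None]
             + rng.normal(0.0, sigma, (n_clusters, n_per_cluster)))
        res = stats.linregress(z, y.mean(axis=1))       # test on cluster means
        hits += res.pvalue < alpha
    return hits / n_sims

pw_small = l2_power(n_clusters=30, n_per_cluster=10)
pw_large = l2_power(n_clusters=150, n_per_cluster=10)
```

Sweeping `n_clusters` until the estimated power crosses .80 reproduces, in miniature, the a priori sample-size rules of thumb described above; note that for an L2 effect, adding clusters helps far more than adding observations per cluster.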
Abstract
In this article we address a number of important issues that arise in the analysis of nonindependent data. Such data are common in studies in which predictors vary within "units" (e.g., within subjects, within classrooms). Most researchers analyze categorical within-unit predictors with repeated-measures ANOVAs, but continuous within-unit predictors with linear mixed-effects models (LMEMs). We show that both types of predictor variables can be analyzed within the LMEM framework. We discuss designs with multiple sources of nonindependence, for example, studies in which the same subjects rate the same set of items or in which students nested in classrooms provide multiple answers. We provide clear guidelines about the types of random effects that should be included in the analysis of such designs. We also present a number of corrective steps that researchers can take when convergence fails in LMEMs with too many parameters. We end with a brief discussion of the trade-off between power and generalizability in designs with within-unit predictors.
Translational Abstract
Researchers and practitioners sometimes want to analyze data that are "nonindependent." Data are said to be nonindependent when the study is designed such that certain data points can be expected to be on average more similar to each other than other data points. This is usually the case when each subject provides multiple data points (so-called within-subject designs), when subjects belonging to higher-order units influence each other (e.g., students clustered in classrooms, employees clustered in teams), or when subjects react to or evaluate the same set of items (e.g., pictures, words, sentences, products, art works, target individuals). In the present article, we propose that all types of nonindependent data can be analyzed with the same statistical technique called "linear mixed-effects models." Compared to standard statistical tests belonging to the family of "General Linear Models" (e.g., ANOVA, regression), linear mixed-effects models have a "complex error term," i.e., the data analyst has to explicitly include all possible reasons for why the predictions of the statistical model may be wrong (these possible reasons are called "random effects"). It is not always obvious how to identify all possible sources of error. In this article, we provide clear guidelines on the type of random effects that researchers and practitioners should include when estimating linear mixed-effects models. Failure to include the appropriate random effects leads to an unacceptable false positive rate (or "type I error rate"), i.e., a high proportion of statistically significant results for effects that do not exist in reality.
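The Type I error inflation described in the last sentence can be shown without fitting an LMEM at all: simulate a null condition effect with clustered trials and compare a naive trial-level test against one that first aggregates to subject means. The design (between-subject condition, shared subject intercepts) and all parameter values are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def false_positive_rates(n_subj=20, n_trials=30, n_sims=2000, alpha=0.05):
    """Null effect of a between-subject condition; each subject contributes
    n_trials correlated trials (shared random intercept). Returns the false
    positive rate of a naive trial-level t-test versus a test on subject
    means, which respects the nonindependence."""
    naive_fp = agg_fp = 0
    half = n_subj // 2
    for _ in range(n_sims):
        intercepts = rng.normal(0.0, 1.0, n_subj)    # random subject effects
        y = intercepts[:, None] + rng.normal(0.0, 1.0, (n_subj, n_trials))
        a, b = y[:half], y[half:]                    # two "conditions", no true effect
        naive_fp += stats.ttest_ind(a.ravel(), b.ravel()).pvalue < alpha
        agg_fp += stats.ttest_ind(a.mean(1), b.mean(1)).pvalue < alpha
    return naive_fp / n_sims, agg_fp / n_sims

naive_rate, agg_rate = false_positive_rates()
```

Treating the 600 trials as independent inflates the false positive rate far above the nominal 5%, while the aggregated test stays calibrated; an LMEM with a random subject intercept achieves the same calibration without discarding trial-level information.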
RNA-seq is now the technology of choice for genome-wide differential gene expression experiments, but it is not clear how many biological replicates are needed to ensure valid biological interpretation of the results, or which statistical tools are best for analyzing the data. An RNA-seq experiment with 48 biological replicates in each of two conditions was performed to answer these questions and provide guidelines for experimental design. With three biological replicates, nine of the 11 tools evaluated found only 20%-40% of the significantly differentially expressed (SDE) genes identified with the full set of 42 clean replicates. This rises to >85% for the subset of SDE genes changing in expression by more than fourfold. To achieve >85% for all SDE genes regardless of fold change requires more than 20 biological replicates. The same nine tools successfully control their false discovery rate (FDR) at ≲5% for all numbers of replicates, while the remaining two tools fail to control their FDR adequately, particularly for low numbers of replicates. For future RNA-seq experiments, these results suggest that at least six biological replicates should be used, rising to at least 12 when it is important to identify SDE genes for all fold changes. If fewer than 12 replicates are used, a superior combination of true-positive and false-positive performance makes edgeR and DESeq2 the leading tools. For higher replicate numbers, minimizing false positives is more important and DESeq marginally outperforms the other tools.
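The qualitative pattern, small fold changes needing many more replicates than large ones, can be reproduced with a toy model: Gaussian log2 expression with per-gene t-tests stands in for negative-binomial count models and the actual DE tools evaluated, and the biological variability, thresholds, and fold changes are all invented.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def detection_rate(n_reps, log2_fc, bio_sd=0.4, n_genes=2000, alpha=0.01):
    """Fraction of truly changed genes detected by a per-gene t-test on
    log2 expression, given n_reps replicates per condition. bio_sd is the
    assumed biological SD on the log2 scale; alpha is a crude stand-in for
    an FDR-adjusted significance threshold."""
    a = rng.normal(0.0, bio_sd, (n_genes, n_reps))
    b = rng.normal(log2_fc, bio_sd, (n_genes, n_reps))
    p = stats.ttest_ind(a, b, axis=1).pvalue
    return float(np.mean(p < alpha))

rate_3_small = detection_rate(n_reps=3, log2_fc=0.5)    # ~1.4-fold change
rate_3_big = detection_rate(n_reps=3, log2_fc=2.0)      # 4-fold change
rate_12_small = detection_rate(n_reps=12, log2_fc=0.5)
```

With three replicates, large fold changes are mostly recovered while subtle ones are largely missed; moving to twelve replicates rescues the small fold changes, echoing the replicate recommendations above.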