Selecting the number of different classes that will be assumed to exist in the population is an important step in latent class analysis (LCA). The bootstrap likelihood ratio test (BLRT) provides a data-driven way to evaluate the relative adequacy of a (K - 1)-class model compared to a K-class model. However, very little is known about how to predict the power or the required sample size for the BLRT in LCA. Based on extensive Monte Carlo simulations, we provide practical effect size measures and power curves that can be used to predict power for the BLRT in LCA given a proposed sample size and a set of hypothesized population parameters. Estimated power curves and tables provide guidance for researchers wishing to size a study to have sufficient power to detect hypothesized underlying latent classes.
Post-hoc power estimates (power calculated for hypothesis tests after performing them) are sometimes requested by reviewers in an attempt to promote more rigorous designs. However, they should never be requested or reported, because they have been shown to be logically invalid and practically misleading. We review the problems associated with post-hoc power, particularly the fact that the resulting calculated power is a monotone function of the p value and therefore contains no additional helpful information. We then discuss some situations that seem at first to call for post-hoc power analysis, such as attempts to decide on the practical implications of a null finding, or attempts to determine whether the sample size of a secondary data analysis is adequate for a proposed analysis, and consider possible approaches to achieving these goals. We make recommendations for practice in situations in which clear recommendations can be made, and point out other situations where further methodological research and discussion are required.
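The monotone relationship can be illustrated directly. For a two-sided z-test, the "observed power" obtained by plugging the sample estimate back in as the true effect is determined entirely by the p value. A minimal sketch (a hypothetical illustration of the standard result, not code from the article):

```python
from statistics import NormalDist

def posthoc_power(p_value, alpha=0.05):
    """'Observed power' for a two-sided z-test, treating the sample
    estimate as the true effect. Depends only on the p value."""
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value / 2)    # |z| implied by the p value
    z_crit = nd.inv_cdf(1 - alpha / 2)
    # probability that a replication's |z| would exceed the critical value
    return (1 - nd.cdf(z_crit - z_obs)) + nd.cdf(-z_crit - z_obs)

# A result exactly at p = .05 always yields post-hoc power of about 0.50,
# regardless of the study -- the power estimate adds nothing beyond p.
```

Because `posthoc_power` is a strictly decreasing function of the p value, reporting it alongside the p value conveys no new information.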
Information criteria (ICs) based on penalized likelihood, such as Akaike’s information criterion (AIC), the Bayesian information criterion (BIC) and sample-size-adjusted versions of them, are widely used for model selection in health and biological research. However, different criteria sometimes support different models, leading to discussions about which is the most trustworthy. Some researchers and fields of study habitually use one or the other, often without a clearly stated justification. They may not realize that the criteria may disagree. Others try to compare models using multiple criteria but encounter ambiguity when different criteria lead to substantively different answers, leading to questions about which criterion is best. In this paper we present an alternative perspective on these criteria that can help in interpreting their practical implications. Specifically, in some cases the comparison of two models using ICs can be viewed as equivalent to a likelihood ratio test, with the different criteria representing different alpha levels and BIC being a more conservative test than AIC. This perspective may lead to insights about how to interpret the ICs in more complex situations. For example, AIC or BIC could be preferable, depending on the relative importance one assigns to sensitivity versus specificity. Understanding the differences and similarities among the ICs can make it easier to compare their results and to use them to make informed decisions.
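The equivalence can be made concrete for two nested models differing by one parameter: AIC prefers the larger model when the likelihood-ratio statistic exceeds 2, and BIC when it exceeds ln(n), so each criterion behaves like a chi-square test with one degree of freedom at an implied alpha level. A sketch of this standard correspondence (illustrative numbers, not calculations from the paper):

```python
from math import erf, log, sqrt

def chi2_1_sf(x):
    """Survival function of a chi-square with 1 df:
    P(X > x) = 2 * (1 - Phi(sqrt(x)))."""
    phi = 0.5 * (1 + erf(sqrt(x) / sqrt(2)))
    return 2 * (1 - phi)

# Implied alpha level when the compared models differ by one parameter:
alpha_aic = chi2_1_sf(2.0)         # AIC threshold: LRT statistic > 2
alpha_bic = chi2_1_sf(log(1000))   # BIC threshold at n = 1000: > ln(n)

print(round(alpha_aic, 3))  # about 0.157
print(round(alpha_bic, 3))  # about 0.009 -- BIC is the more conservative test
```

Because the BIC threshold grows with ln(n), its implied alpha shrinks as the sample grows, while AIC's implied alpha stays fixed at roughly .157 per extra parameter.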
An understanding of the individual and combined effects of a set of intervention components is important for moving the science of preventive medicine interventions forward. This understanding can often be achieved in an efficient and economical way via a factorial experiment, in which two or more independent variables are manipulated. The factorial experiment is a complement to the RCT; the two designs address different research questions.
To offer an introduction to factorial experiments aimed at investigators trained primarily in the RCT.
The factorial experiment is compared and contrasted with other experimental designs used commonly in intervention science to highlight where each is most efficient and appropriate.
Several points are made: factorial experiments make very efficient use of experimental subjects when the data are properly analyzed; a factorial experiment can have excellent statistical power even if it has relatively few subjects per experimental condition; and when conducting research to select components for inclusion in a multicomponent intervention, interactions should be studied rather than avoided.
Investigators in preventive medicine and related areas should begin considering factorial experiments alongside other approaches. Experimental designs should be chosen from a resource management perspective, which states that the best experimental design is the one that provides the greatest scientific benefit without exceeding available resources.
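The efficiency claim can be checked with a standard power calculation: in a balanced 2×2 factorial, each main effect is a contrast that uses every subject, so its power matches that of a two-arm trial of the same total size even though the per-cell counts are small. A sketch using the usual normal-approximation power formula (hypothetical numbers, not the article's own calculations):

```python
from math import sqrt
from statistics import NormalDist

def main_effect_power(n_total, d, alpha=0.05):
    """Normal-approximation power for a standardized main effect of size d
    in a balanced two-level factorial: all n_total subjects contribute."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    se = 2 / sqrt(n_total)   # SE of a +1/-1 contrast on means (unit SD)
    return 1 - nd.cdf(z_crit - d / se)

# With 200 subjects total, a 2x2 factorial has only 50 per cell, yet each
# main effect is tested with the full n = 200:
power = main_effect_power(200, d=0.4)   # roughly 0.80
```

This is why "subjects per condition" is a misleading way to judge the power of a factorial design: what matters is the total sample behind each effect estimate.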
Dynamic treatment regimens (DTRs), also known as treatment algorithms or adaptive interventions, play an increasingly important role in many health domains. DTRs aim to address the unique and changing needs of individuals by delivering the type of treatment needed, when needed, while minimizing unnecessary treatment. Practically, a DTR is a sequence of decision rules that specify, for each of several points in time, how available information about the individual's status and progress should be used in practice to decide which treatment (e.g. type or intensity) to deliver. The sequential multiple assignment randomized trial (SMART) is an experimental design widely used to empirically inform the development of DTRs. Sample size planning resources for SMARTs have been developed for continuous, binary, and survival outcomes. However, an important gap exists in sample size estimation methodology for SMARTs with longitudinal count outcomes. Furthermore, in many health domains, count data are overdispersed—having variance greater than their mean. We propose a Monte Carlo-based approach to sample size estimation applicable to many types of longitudinal outcomes and provide a case study with longitudinal overdispersed count outcomes. A SMART for engaging alcohol and cocaine-dependent patients in treatment is used as motivation.
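The Monte Carlo idea can be sketched in a few lines: simulate overdispersed counts from a gamma-Poisson (negative binomial) mixture under hypothesized group means, run the planned test on each replicate, and take the rejection rate as the power estimate. The sketch below is a simplified single-time-point illustration, not the paper's method, which handles longitudinal outcomes and SMART-specific comparisons:

```python
import math
import random
import statistics

def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def neg_binom(mean, disp, rng):
    """Overdispersed count via a gamma-Poisson mixture:
    Var = mean + mean**2 / disp > mean."""
    return poisson(rng.gammavariate(disp, mean / disp), rng)

def mc_power(n_per_arm, mean_a, mean_b, disp, n_sims=400, alpha=0.05, seed=7):
    """Monte Carlo power: fraction of simulated trials rejecting H0 of equal means."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        a = [neg_binom(mean_a, disp, rng) for _ in range(n_per_arm)]
        b = [neg_binom(mean_b, disp, rng) for _ in range(n_per_arm)]
        se = math.sqrt(statistics.variance(a) / n_per_arm
                       + statistics.variance(b) / n_per_arm)
        rejections += abs(statistics.mean(b) - statistics.mean(a)) / se > z_crit
    return rejections / n_sims
```

Increasing `n_per_arm` until `mc_power` crosses the target (e.g., 0.80) yields the required sample size; the same loop generalizes to longitudinal outcomes by simulating repeated measures and fitting the planned longitudinal model to each replicate.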
Purpose: The purpose of this study was to describe historical trends in rates of recent substance use and associations between marijuana and other substances, among U.S. high school seniors by race and gender. Methods: Data from Monitoring the Future (1976–2013; N = 599,109) were used to estimate historical trends in alcohol use, heavy episodic drinking (HED), cigarette use, and marijuana use. We used time-varying effect models to flexibly estimate changes in associations of substance use behaviors. Results: Past-month marijuana use rates peaked in the 1970s, declined through 1990, then rose again to reach levels of use of more than 20% for both black and white participants. Recent years show increasing disparities across groups such that males, and in particular black youth, are on a trajectory toward higher use. This rise in marijuana use is particularly concerning among black youth, with rates far exceeding those for cigarette use and HED. The association of marijuana use with both cigarette use and HED is particularly high in recent years among black adolescents. Conclusions: Substance use recently declined among high school seniors, except for marijuana use, particularly among black youth. The increasing association between marijuana and other substances among black adolescents suggests future amplification in critical health disparities.
An investigator who plans to conduct an experiment with multiple independent variables must decide whether to use a complete or reduced factorial design. This article advocates a resource management perspective on making this decision, in which the investigator seeks a strategic balance between service to scientific objectives and economy. Considerations in making design decisions include whether research questions are framed as main effects or simple effects; whether and which effects are aliased (confounded) in a particular design; the number of experimental conditions that must be implemented in a particular design and the number of experimental subjects the design requires to maintain the desired level of statistical power; and the costs associated with implementing experimental conditions and obtaining experimental subjects. In this article 4 design options are compared: complete factorial, individual experiments, single factor, and fractional factorial. Complete and fractional factorial designs and single-factor designs are generally more economical than conducting individual experiments on each factor. Although relatively unfamiliar to behavioral scientists, fractional factorial designs merit serious consideration because of their economy and versatility.
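Aliasing can be seen directly by constructing a small fractional design. In a half fraction of a 2³ design with generator C = AB (defining relation I = ABC), the column for factor C is identical to the A×B interaction column, so the two effects cannot be separated. A sketch of this standard textbook construction (not code from the article):

```python
from itertools import product

# Half fraction of a 2^3 design: factor C's level is set to the product
# of A and B (levels coded -1/+1), i.e., generator C = AB.
design = [(a, b, a * b) for a, b in product([-1, 1], repeat=2)]

for a, b, c in design:
    # The C column equals the AB interaction column in every run,
    # so the main effect of C is aliased with the A x B interaction.
    assert c == a * b
```

The payoff is economy: this half fraction estimates three main effects in 4 runs instead of the 8 a complete 2³ factorial requires, at the cost of the aliasing shown above.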
Sequential Multiple-Assignment Randomized Trials (SMARTs) play an increasingly important role in psychological and behavioral health research. This experimental approach enables researchers to answer scientific questions about how to sequence and match interventions to the unique, changing needs of individuals. A variety of sample size planning resources for SMART studies have been developed, enabling researchers to plan SMARTs for addressing different types of scientific questions. However, relatively limited attention has been given to planning SMARTs with binary (dichotomous) outcomes, which often require higher sample sizes relative to continuous outcomes. Existing resources for estimating sample size requirements for SMARTs with binary outcomes do not consider the potential to improve power by including a baseline measurement and/or multiple repeated outcome measurements. The current paper addresses this issue by providing sample size planning simulation procedures and approximate formulas for two-wave repeated measures binary outcomes (i.e., two measurement times for the outcome variable, before and after intervention delivery). The simulation results agree well with the formulas. We also discuss how to use simulations to calculate power for studies with more than two outcome measurement occasions. Results show that having at least one repeated measurement of the outcome can substantially improve power under certain conditions.
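As a rough benchmark for why binary outcomes demand larger samples, the standard two-proportion sample-size formula (a generic calculation, not the paper's SMART-specific formulas, which further exploit baseline and repeated measurements) already gives large n for modest differences in proportions:

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Standard (unpooled) two-proportion z-test sample size per arm."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1 - alpha / 2)
    z_b = nd.inv_cdf(power)
    num = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(num / (p1 - p2) ** 2)

n = n_per_arm(0.50, 0.60)   # roughly 385 participants per arm
```

Adding a baseline measurement or repeated post-intervention measurements, as the paper's procedures do, shrinks the residual variance and can bring these requirements down substantially.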
Factorial experimental designs have many applications in the behavioral sciences. In the context of intervention development, factorial experiments play a critical role in building and optimizing high-quality, multicomponent behavioral interventions. One challenge in implementing factorial experiments in the behavioral sciences is that individuals are often clustered in social or administrative units and may be more similar to each other than to individuals in other clusters. This means that data are dependent within clusters. Power planning resources are available for factorial experiments in which the multilevel structure of the data is due to individuals' membership in groups that existed before experimentation. However, in many cases clusters are generated in the course of the study itself. Such experiment-induced clustering (EIC) requires different data analysis models and power planning resources from those available for multilevel experimental designs in which clusters exist prior to experimentation. Despite the common occurrence of both experimental designs with EIC and factorial designs, a bridge has yet to be built between EIC and factorial designs. Therefore, resources are limited or nonexistent for planning factorial experiments that involve EIC. This article seeks to bridge this gap by extending prior models for EIC, developed for single-factor experiments, to factorial experiments involving various types of EIC. We also offer power formulas to help investigators decide whether a particular experimental design involving EIC is feasible. We demonstrate that factorial experiments can be powerful and feasible even with EIC. We discuss design considerations and directions for future research.
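A first-order sense of how clustering erodes power comes from the classic design effect, 1 + (m − 1)ρ, which inflates the variance of estimates when subjects sit in clusters of size m with intraclass correlation ρ. A sketch of this generic calculation (the article's EIC-specific power formulas differ in detail):

```python
def design_effect(m, icc):
    """Variance inflation for clusters of size m with intraclass correlation icc."""
    return 1 + (m - 1) * icc

def n_required(n_independent, m, icc):
    """Clustered sample size matching the power of n_independent unclustered subjects."""
    return n_independent * design_effect(m, icc)

# Experiment-induced therapy groups of 8 with icc = 0.05 inflate the
# requirement by 35%: 200 independent subjects become 270 clustered ones.
deff = design_effect(8, 0.05)   # 1.35
```

Because the design effect grows linearly in both cluster size and ICC, even modest experiment-induced dependence can matter; the article's formulas let investigators check whether a specific factorial design with EIC remains feasible.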
Psychological interventions, especially those leveraging mobile and wireless technologies, often include multiple components that are delivered and adapted on multiple timescales (e.g., coaching sessions adapted monthly based on clinical progress, combined with motivational messages from a mobile device adapted daily based on the person’s daily emotional state). The hybrid experimental design (HED) is a new experimental approach that enables researchers to answer scientific questions about the construction of psychological interventions in which components are delivered and adapted on different timescales. These designs involve sequential randomizations of study participants to intervention components, each at an appropriate timescale (e.g., monthly randomization to different intensities of coaching sessions and daily randomization to different forms of motivational messages). The goal of the current manuscript is twofold. The first is to highlight the flexibility of the HED by conceptualizing this experimental approach as a special form of a factorial design in which different factors are introduced at multiple timescales. We also discuss how the structure of the HED can vary depending on the scientific question(s) motivating the study. The second goal is to explain how data from various types of HEDs can be analyzed to answer a variety of scientific questions about the development of multicomponent psychological interventions. For illustration, we use a completed HED to inform the development of a technology-based weight loss intervention that integrates components that are delivered and adapted on multiple timescales.