Abstract Pilot studies represent a fundamental phase of the research process. The purpose of conducting a pilot study is to examine the feasibility of an approach that is intended to be used in a larger-scale study. The roles and limitations of pilot studies are described here using a clinical trial as an example. A pilot study can be used to evaluate the feasibility of recruitment, randomization, retention, assessment procedures, new methods, and implementation of the novel intervention. A pilot study is not a hypothesis-testing study. Safety, efficacy, and effectiveness are not evaluated in a pilot. Contrary to tradition, a pilot study does not provide a meaningful effect size estimate for planning subsequent studies due to the imprecision inherent in data from small samples. Feasibility results do not necessarily generalize beyond the inclusion and exclusion criteria of the pilot design. A pilot study is a requisite initial step in exploring a novel intervention or an innovative application of an intervention. Pilot results can inform feasibility and identify modifications needed in the design of a larger, ensuing hypothesis-testing study. Investigators should be forthright in stating these objectives of a pilot study. Grant reviewers and other stakeholders should expect no more.
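The imprecision of effect sizes from small samples can be illustrated with a short calculation. The following is a minimal sketch, not taken from the article: it assumes two equal-sized groups, an observed Cohen's d of 0.5 (a hypothetical value), and the usual large-sample normal approximation for the standard error of d. The function name `cohens_d_ci` is illustrative only.

```python
import math


def cohens_d_ci(d, n1, n2, z=1.96):
    """Approximate 95% confidence interval for Cohen's d.

    Uses the common large-sample approximation
    SE(d) = sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))).
    This is an assumption of the sketch, not a method stated in the text.
    """
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d - z * se, d + z * se


if __name__ == "__main__":
    # Hypothetical observed effect size; group sizes range from pilot-sized to large.
    for n in (10, 20, 200):
        lo, hi = cohens_d_ci(0.5, n, n)
        print(f"n per group = {n:3d}: approx. 95% CI for d = ({lo:.2f}, {hi:.2f})")
```

With 10 participants per group the interval spans zero and extends well beyond conventionally "large" effects, which is why a pilot estimate of this kind is a poor basis for a power calculation.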
Objective: Many EEG studies have reported that ADHD is characterized by elevated Theta/Beta ratio (TBR). In this study we conducted a meta-analysis on the TBR in ADHD. Method: TBR data during Eyes Open from location Cz were analyzed from children/adolescents 6-18 years of age with and without ADHD. Results: Nine studies were identified with a total of 1253 children/adolescents with and 517 without ADHD. The grand-mean effect size (ES) for the 6-13 year-olds was 0.75 and for the 6-18 year-olds was 0.62. However, the test for heterogeneity remained significant; therefore these ESs are misleading and considered an overestimation. Post-hoc analysis found a decreasing difference in TBR across years, explained by an increasing TBR for the non-ADHD groups. Conclusion: Excessive TBR cannot be considered a reliable diagnostic measure of ADHD; however, a substantial sub-group of ADHD patients do deviate on this measure and TBR has prognostic value in this sub-group, warranting its use as a prognostic measure rather than a diagnostic measure.
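To make the role of the heterogeneity test concrete, the sketch below computes an inverse-variance pooled effect size together with Cochran's Q and the I² statistic. It assumes a fixed-effect model, which the abstract does not specify, and the study-level effect sizes and variances are hypothetical, not the TBR data. The function name `pooled_effect_and_q` is illustrative.

```python
def pooled_effect_and_q(effects, variances):
    """Inverse-variance pooled effect size with Cochran's Q and I^2.

    effects:   per-study effect sizes
    variances: per-study sampling variances of those effect sizes
    Returns (pooled_effect, Q, degrees_of_freedom, I2).
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q sums the weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return pooled, q, df, i2


if __name__ == "__main__":
    # Hypothetical studies whose effects shrink over time.
    effects = [1.2, 0.9, 0.3, 0.1]
    variances = [0.04, 0.05, 0.03, 0.04]
    pooled, q, df, i2 = pooled_effect_and_q(effects, variances)
    print(f"pooled ES = {pooled:.2f}, Q = {q:.2f} on {df} df, I^2 = {i2:.0%}")
```

When Q is large relative to its degrees of freedom, the studies are not estimating a single common effect, and a single pooled ES summarizes them poorly.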
The DSM-5 Field Trials were designed to obtain precise (standard error <0.1) estimates of the intraclass kappa as a measure of the degree to which two clinicians could independently agree on the presence or absence of selected DSM-5 diagnoses when the same patient was interviewed on separate occasions, in clinical settings, and evaluated with usual clinical interview methods.
Eleven academic centers in the United States and Canada were selected, and each was assigned several target diagnoses frequently treated in that setting. Consecutive patients visiting a site during the study were screened and stratified on the basis of DSM-IV diagnoses or symptomatic presentations. Patients were randomly assigned to two clinicians for a diagnostic interview; clinicians were blind to any previous diagnosis. All data were entered directly via an Internet-based software system to a secure central server. Detailed research design and statistical methods are presented in an accompanying article.
There were a total of 15 adult and eight child/adolescent diagnoses for which adequate sample sizes were obtained to report adequately precise estimates of the intraclass kappa. Overall, five diagnoses were in the very good range (kappa = 0.60–0.79), nine in the good range (kappa = 0.40–0.59), six in the questionable range (kappa = 0.20–0.39), and three in the unacceptable range (kappa values <0.20). Eight diagnoses had insufficient sample sizes to generate precise kappa estimates at any site.
Most diagnoses adequately tested had good to very good reliability with these representative clinical populations assessed with usual clinical interview methods. Some diagnoses that were revised to encompass a broader spectrum of symptom expression or had a more dimensional approach tested in the good to very good range.
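As an illustration of the statistic reported above, the sketch below computes an intraclass kappa for a binary diagnosis from a 2×2 table of two interviews and maps it to the ranges named in the abstract. It is a simplified, unweighted version: the field trials used a stratified design with sampling weights, which this sketch ignores. The function names and the counts are hypothetical.

```python
def intraclass_kappa(both_pos, first_only, second_only, both_neg):
    """Intraclass kappa for a binary diagnosis rated at two interviews.

    Chance agreement uses a single pooled prevalence for both interviews
    (unweighted sketch; no sampling weights).
    """
    n = both_pos + first_only + second_only + both_neg
    p = (2 * both_pos + first_only + second_only) / (2 * n)  # pooled prevalence
    observed = (both_pos + both_neg) / n
    expected = p ** 2 + (1 - p) ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when prevalence is 0 or 1")
    return (observed - expected) / (1 - expected)


def kappa_range(kappa):
    """Label a kappa using the ranges quoted in the abstract."""
    if kappa >= 0.80:
        return "above the very good range"  # this band is not named in the abstract
    if kappa >= 0.60:
        return "very good"
    if kappa >= 0.40:
        return "good"
    if kappa >= 0.20:
        return "questionable"
    return "unacceptable"


if __name__ == "__main__":
    # Hypothetical counts: 40 diagnosed at both interviews, 10 discordant, 50 at neither.
    k = intraclass_kappa(40, 5, 5, 50)
    print(f"kappa = {k:.2f} ({kappa_range(k)})")
```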
Identifying treatment moderators may help mental health practitioners arrive at more precise treatment selection for individual patients and can focus clinical research on subpopulations that differ in treatment response.
To demonstrate a novel exploratory approach to moderation analysis in randomized clinical trials.
A total of 291 adults from a randomized clinical trial that compared an empirically supported psychotherapy with selective serotonin reuptake inhibitor (SSRI) pharmacotherapy as treatments for depression.
We selected 8 relatively independent individual moderators out of 32 possible variables. A combined moderator, M*, was developed as a weighted combination of the 8 selected individual moderators. M* was then used to identify individuals for whom psychotherapy may be preferred to SSRI pharmacotherapy or vice versa.
Among individual moderators, psychomotor activation had the largest moderator effect size (0.12; 95% CI, <.01 to 0.24). The combined moderator, M*, had a larger moderator effect size than any individual moderator (0.31; 95% CI, 0.15 to 0.46). Although the original analyses demonstrated no overall difference in treatment response, M* divided the study population into 2 subpopulations, with each showing a clinically significant difference in response to psychotherapy vs SSRI pharmacotherapy.
Our results suggest that the strongest determinations for personalized treatment selection will likely require simultaneous consideration of multiple moderators, emphasizing the value of the methods presented here. After validation in a randomized clinical trial, a mental health practitioner could input a patient's relevant baseline values into a handheld computer programmed with the weights needed to calculate M*. The device could then output the patient's M* value and suggested treatment, thereby allowing the mental health practitioner to select the treatment that would offer the greatest likelihood of success for each patient.
Objective:
In recognition of the increasingly important role of moderators and mediators in clinical research, clear definitions are sought of the two terms to avoid inconsistent, ambiguous, and possibly misleading results across clinical research studies.
Design:
The criteria used to define moderators and mediators proposed by the Baron & Kenny approach, which have been long used in social/behavioral research, are directly compared to the criteria proposed by the recent MacArthur approach, which modified the Baron & Kenny criteria.
Results:
After clarifying the differences in criteria between approaches, the rationale for the modifications is clarified and the implications for the design and interpretation of future studies considered.
Conclusions:
Researchers may find modifications introduced in the MacArthur approach more appropriate to their research objectives, particularly if their research might have a direct influence on decision making.
Background
The Multimodal Treatment Study (MTA) began as a 14‐month randomized clinical trial of behavioral and pharmacological treatments of 579 children (7–10 years of age) diagnosed with attention‐deficit/hyperactivity disorder (ADHD)‐combined type. It transitioned into an observational long‐term follow‐up of 515 cases consented for continuation and 289 classmates (258 without ADHD) added as a local normative comparison group (LNCG), with assessments 2–16 years after baseline.
Methods
Primary (symptom severity) and secondary (adult height) outcomes in adulthood were specified. Treatment was monitored to age 18, and naturalistic subgroups were formed based on three patterns of long‐term use of stimulant medication (Consistent, Inconsistent, and Negligible). For the follow‐up, hypothesis‐generating analyses were performed on outcomes in early adulthood (at 25 years of age). Planned comparisons were used to estimate ADHD‐LNCG differences reflecting persistence of symptoms and naturalistic subgroup differences reflecting benefit (symptom reduction) and cost (height suppression) associated with extended use of medication.
Results
For ratings of symptom severity, the ADHD‐LNCG comparison was statistically significant for the parent/self‐report average (0.51 ± 0.04, p < .0001, d = 1.11), documenting symptom persistence, and for the parent/self‐report difference (0.21 ± 0.04, p < .0001, d = .60), documenting source discrepancy, but the comparisons of naturalistic subgroups reflecting medication effects were not significant. For adult height, the ADHD group was 1.29 ± 0.55 cm shorter than the LNCG (p < .01, d = .21), and the comparisons of the naturalistic subgroups were significant: the treated group with the Consistent or Inconsistent pattern was 2.55 ± 0.73 cm shorter than the subgroup with the Negligible pattern (p < .0005, d = .42), and within the treated group, the subgroup with the Consistent pattern was 2.36 ± 1.13 cm shorter than the subgroup with the Inconsistent pattern (p < .04, d = .38).
Conclusions
In the MTA follow‐up into adulthood, the ADHD group showed symptom persistence compared to local norms from the LNCG. Within naturalistic subgroups of ADHD cases, extended use of medication was associated with suppression of adult height but not with reduction of symptom severity.
Read the Commentary on this article at doi: 10.1111/jcpp.12758
The goal of moderator/mediator research in treatment evaluation is to provide guidance to clinicians to choose the best treatment for each patient with a disorder (moderators), and to advise on its optimal protocol or implementation (mediators): personalized/precision medicine. McClure et al. report a systematic review of studies addressing moderators/mediators of the treatment effect of digital interventions for eating disorders, finding no robust moderators or mediators. They attribute this failure to methodological problems, an assessment with which I concur. The focus of this discussion is to clarify which methodological approaches are not likely to be successful, and to envision a research strategy encompassing both hypothesis-generating (exploratory) and hypothesis-testing approaches likely to produce better results not only for eating disorders, but also for all medical treatments.
The authors sought to document, in adult and pediatric patient populations, the development, descriptive statistics, and test-retest reliability of cross-cutting symptom measures proposed for inclusion in DSM-5.
Data were collected as part of the multisite DSM-5 Field Trials in large academic settings. There were seven sites focusing on adult patients and four sites focusing on child and adolescent patients. Cross-cutting symptom measures were self-completed by the patient or an informant before the test and the retest interviews, which were conducted from 4 hours to 2 weeks apart. Clinician-report measures were completed during or after the clinical diagnostic interviews. Informants included adult patients, child patients age 11 and older, parents of all child patients age 6 and older, and legal guardians for adult patients unable to self-complete the measures. Study patients were sampled in a stratified design, and sampling weights were used in data analyses. The mean scores and standard deviations were computed and pooled across adult and child sites. Reliabilities were reported as pooled intraclass correlation coefficients (ICCs) with 95% confidence intervals.
In adults, test-retest reliabilities of the cross-cutting symptom items generally were good to excellent. At the child and adolescent sites, parents were also reliable reporters of their children’s symptoms, with few exceptions. Reliabilities were not as uniformly good for child respondents, and ICCs for several items fell into the questionable range in this age group. Clinicians rated psychosis with good reliability in adult patients but were less reliable in assessing clinical domains related to psychosis in children and to suicide in all age groups.
These results show promising test-retest reliability for this group of assessments, many of which are newly developed or have not been previously tested in psychiatric populations.
The most pervasive and damaging myth in clinical research is that the smaller the p‐value, the stronger the hypothesis. In reality, the p‐value primarily reflects the quality of research design decisions. The most common proposal to avoid misleading conclusions from clinical research requires the appropriate use of effect sizes, but which effect size, used when and how, is an open question. A solution is proposed for perhaps the most common problem in clinical research, the comparison between two populations, for example, comparison of two treatments in a randomized clinical trial or comparison of high risk versus low risk individuals in an epidemiological study: the success rate difference or equivalently the number needed to treat/take (NNT).
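The relationship between the success rate difference (SRD) and the NNT can be sketched in a few lines. The example below is not from the article: it assumes NNT = 1/SRD, that for a binary outcome the SRD is the difference between the two groups' success rates, and that for a continuous outcome with normal, equal-variance distributions the SRD follows from Cohen's d as 2Φ(d/√2) − 1. All function names and input values are hypothetical.

```python
import math


def srd_from_rates(success_rate_1, success_rate_2):
    """SRD for a binary outcome: difference between the two success rates."""
    return success_rate_1 - success_rate_2


def srd_from_cohens_d(d):
    """SRD implied by Cohen's d, assuming normal outcomes with equal variances.

    SRD = 2 * Phi(d / sqrt(2)) - 1, which simplifies to erf(d / 2).
    """
    return math.erf(d / 2.0)


def nnt(srd):
    """Number needed to treat/take as the reciprocal of the SRD."""
    if srd == 0:
        return math.inf  # no difference between groups: no finite NNT
    return 1.0 / srd


if __name__ == "__main__":
    srd = srd_from_cohens_d(0.5)  # hypothetical effect size
    print(f"d = 0.5 -> SRD = {srd:.3f}, NNT = {nnt(srd):.1f}")
    srd = srd_from_rates(0.60, 0.40)  # hypothetical success rates
    print(f"60% vs 40% success -> SRD = {srd:.2f}, NNT = {nnt(srd):.1f}")
```

Unlike a p-value, neither quantity shrinks or grows with sample size alone, which is what makes them usable as measures of clinical impact.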