To develop methods guidance to support the conduct of rapid reviews (RRs) produced within Cochrane and beyond, in response to requests for timely evidence syntheses for decision-making purposes, including urgent health issues of high priority.
Interim recommendations were informed by a scoping review of the underlying evidence, primary methods studies conducted, and a survey sent to 119 representatives from 20 Cochrane entities, who were asked to rate and rank RR methods across stages of review conduct. Discussions among those with expertise in RR methods further informed the list of recommendations with accompanying rationales provided.
Based on survey results from 63 respondents (53% response rate), 26 RR methods recommendations are presented: those for which there was a high or moderate level of agreement, or which scored highest in the absence of such agreement. Where possible, how recommendations align with Cochrane methods guidance for systematic reviews is highlighted.
The Cochrane Rapid Reviews Methods Group offers new, interim guidance to support the conduct of RRs. Because best practice is limited by the lack of currently available evidence for some RR methods shortcuts taken, this guidance will need to be updated as additional abbreviated methods are evaluated.
The aim of this systematic review was to assess the performance of anthropometric tools to determine obesity in the general population (CRD42018086888). Our review included 32 studies. To detect obesity with body mass index (BMI), the meta-analyses rendered a sensitivity of 51.4% (95% CI 38.5-64.2%) and a specificity of 95.4% (95% CI 90.7-97.8%) in women, and 49.6% (95% CI 34.8-64.5%) and 97.3% (95% CI 92.1-99.1%), respectively, in men. For waist circumference (WC), the summary estimates for the sensitivity were 62.4% (95% CI 49.2-73.9%) and 88.1% for the specificity (95% CI 77.0-94.2%) in men, and 57.0% (95% CI 32.2-79.0%) and 94.8% (95% CI 85.8-98.2%), respectively, in women. The data were insufficient to pool the results for waist-to-hip ratio (WHR) and waist-to-height ratio (WHtR) but were similar to BMI and WC. In conclusion, BMI and WC have serious limitations for use as obesity screening tools in clinical practice despite their widespread use. No evidence supports that WHR and WHtR are more suitable than BMI or WC to assess body fat. However, due to the lack of more accurate and feasible alternatives, BMI and WC might still have a role as initial tools for assessing individuals for excess adiposity until new evidence emerges.
Objective: To clarify the GRADE (grading of recommendations assessment, development and evaluation) definition of certainty of evidence and suggest possible approaches to rating certainty of the evidence for systematic reviews, health technology assessments and guidelines.
Study Design and Setting: This work was carried out by a project group within the GRADE Working Group, through brainstorming and iterative refinement of ideas, using input from workshops, presentations, and discussions at GRADE Working Group meetings to produce this document, which constitutes official GRADE guidance.
Results: Certainty of evidence is best considered as the certainty that a true effect lies on one side of a specified threshold, or within a chosen range. We define possible approaches for choosing threshold or range. For guidelines, what we call a fully contextualized approach requires simultaneously considering all critical outcomes and their relative value. Less contextualized approaches, more appropriate for systematic reviews and health technology assessments, include using specified ranges of magnitude of effect, e.g. ranges of what we might consider no effect, trivial, small, moderate, or large effects.
Conclusion: It is desirable for systematic review authors, guideline panelists, and health technology assessors to specify the threshold or ranges they are using when rating the certainty in evidence.
We aimed to assess the impact of timing of surgery in elderly patients with acute hip fracture on morbidity and mortality. We systematically searched MEDLINE, the Cochrane Library, Embase, PubMed, and trial registries from 01/1997 to 05/2017, searched reference lists of relevant reviews and archives of orthopaedic conferences, and contacted experts. Eligible studies had to be randomised controlled trials (RCTs) or prospective cohort studies, including patients 60 years or older with acute hip fracture. Two authors independently assessed study eligibility, abstracted data, and critically appraised study quality. We conducted meta-analyses using the generic inverse variance model. We included 28 prospective observational studies reporting data of 31,242 patients. Patients operated on within 48 hours had a 20% lower risk of dying within 12 months (risk ratio (RR) 0.80, 95% confidence interval (CI) 0.66-0.97). No statistically significant difference in mortality risk was observed when comparing patients operated on within or after 24 hours (RR 0.82, 95% CI 0.67-1.01). Adjusted data demonstrated fewer complications (8% vs. 17%) in patients who had early surgery, and another study showed an increasing risk of pressure ulcers with longer delay. Early hip surgery within 48 hours was associated with lower mortality risk and fewer perioperative complications.
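As a rough illustration of generic inverse variance pooling, the sketch below combines risk ratios on the log scale, recovering each standard error from a reported 95% CI. The function names and the two input studies are hypothetical and are not data from the included studies; a fixed-effect model is assumed here for brevity.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def log_rr_and_se(rr, ci_low, ci_high):
    """Return log(RR) and its standard error, derived from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(rr), se


def pool_inverse_variance(studies):
    """Fixed-effect inverse-variance pooling of (rr, ci_low, ci_high) tuples."""
    total_weight, weighted_sum = 0.0, 0.0
    for rr, ci_low, ci_high in studies:
        log_rr, se = log_rr_and_se(rr, ci_low, ci_high)
        weight = 1.0 / se ** 2  # more precise studies get more weight
        total_weight += weight
        weighted_sum += weight * log_rr
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled),
            math.exp(pooled - Z95 * pooled_se),
            math.exp(pooled + Z95 * pooled_se))


# Hypothetical study results, for illustration only.
example = [(0.75, 0.55, 1.02), (0.85, 0.68, 1.06)]
rr, low, high = pool_inverse_variance(example)
print(f"pooled RR {rr:.2f} (95% CI {low:.2f}-{high:.2f})")
```

A pooled RR of 0.80 corresponds to a 20% lower relative risk (1 - 0.80), which is how the mortality result above is expressed.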
Background
Coronavirus disease 2019 (COVID‐19) is caused by the novel betacoronavirus, severe acute respiratory syndrome coronavirus‐2 (SARS‐CoV‐2). Most people infected with SARS‐CoV‐2 have mild disease with unspecific symptoms, but about 5% become critically ill with respiratory failure, septic shock and multiple organ failure. An unknown proportion of infected individuals never experience COVID‐19 symptoms although they are infectious, that is, they remain asymptomatic. Those who develop the disease go through a presymptomatic period during which they are infectious. Universal screening for SARS‐CoV‐2 infections to detect individuals who are infected before they present clinically could therefore be an important measure to contain the spread of the disease.
Objectives
We conducted a rapid review to assess (1) the effectiveness of universal screening for SARS‐CoV‐2 infection compared with no screening and (2) the accuracy of universal screening in people who have not presented to clinical care for symptoms of COVID‐19.
Search methods
An information specialist searched Ovid MEDLINE and the Centers for Disease Control (CDC) COVID‐19 Research Articles Downloadable Database up to 26 May 2020. We searched Embase.com, the CENTRAL, and the Cochrane Covid‐19 Study Register on 14 April 2020. We searched LitCovid to 4 April 2020. The World Health Organization (WHO) provided records from daily searches in Chinese databases and in PubMed up to 15 April 2020. We also searched three model repositories (Covid‐Analytics, Models of Infectious Disease Agent Study MIDAS, and Society for Medical Decision Making) on 8 April 2020.
Selection criteria
Trials, observational studies, or mathematical modelling studies assessing screening effectiveness or screening accuracy among general populations in which the prevalence of SARS‐CoV‐2 is unknown.
Data collection and analysis
After pilot testing review forms, one review author screened titles and abstracts. Two review authors independently screened the full text of studies and resolved any disagreements by discussion with a third review author. Abstracts excluded by a first review author were reviewed by a second review author prior to exclusion. One review author independently extracted data, which were checked by a second review author for completeness and accuracy. Two review authors independently rated the quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS‐2) tool for diagnostic accuracy studies and a modified form designed originally for economic evaluations for modelling studies. We resolved differences by consensus. We synthesized the evidence in narrative and tabular formats. We rated the certainty of evidence for days to outbreak, transmission, cases missed and detected, diagnostic accuracy (i.e. true positives, false positives, true negatives, false negatives) using the GRADE approach.
Main results
We included 22 publications. Two modelling studies reported on effectiveness of universal screening. Twenty studies (17 cohort studies and 3 modelling studies) reported on screening test accuracy.
Effectiveness of screening
We included two modelling studies. One study suggests that symptom screening at travel hubs, such as airports, may slightly slow but not stop the importation of infected cases (assuming 10 or 100 infected travellers per week reduced the delay in a local outbreak to 8 days or 1 day, respectively). We assessed risk of bias as minor or no concerns, and certainty of evidence was low, downgraded for very serious indirectness. The second modelling study provides very low‐certainty evidence that screening of healthcare workers in emergency departments using laboratory tests may reduce transmission to patients and other healthcare workers (assuming a transmission constant of 1.2 new infections per 10,000 people, weekly screening reduced infections by 5.1% within 30 days). The certainty of evidence was very low, downgraded for high risk of bias (major concerns) and indirectness. No modelling studies reported on harms of screening.
Screening test accuracy
All 17 cohort studies compared an index screening strategy to a reference reverse transcriptase polymerase chain reaction (RT‐PCR) test. All but one study reported on the accuracy of single point‐in‐time screening and varied widely in prevalence of SARS‐CoV‐2, settings, and methods of measurement.
We assessed the overall risk of bias as unclear in 16 out of 17 studies, mainly due to limited information on the index test and reference standard. We rated one study as being at high risk of bias due to the inclusion of two separate populations with likely different prevalences. For several screening strategies, the estimates of sensitivity came from small samples.
For single point‐in‐time strategies, for symptom assessment, the sensitivity from 12 cohorts (524 people) ranged from 0.00 to 0.60 (very low‐certainty evidence) and the specificity from 12 cohorts (16,165 people) ranged from 0.66 to 1.00 (low‐certainty evidence). For screening using direct temperature measurement (3 cohorts, 822 people), international travel history (2 cohorts, 13,080 people), or exposure to known infected people (3 cohorts, 13,205 people) or suspected infected people (2 cohorts, 954 people), sensitivity ranged from 0.00 to 0.23 (very low‐ to low‐certainty evidence) and specificity ranged from 0.90 to 1.00 (low‐ to moderate‐certainty evidence). For symptom assessment plus direct temperature measurement (2 cohorts, 779 people), sensitivity ranged from 0.12 to 0.69 (very low‐certainty evidence) and specificity from 0.90 to 1.00 (low‐certainty evidence). For rapid PCR test (1 cohort, 21 people), sensitivity was 0.80 (95% confidence interval (CI) 0.44 to 0.96; very low‐certainty evidence) and specificity was 0.73 (95% CI 0.39 to 0.94; very low‐certainty evidence). One cohort (76 people) reported on repeated screening with symptom assessment and demonstrates a sensitivity of 0.44 (95% CI 0.29 to 0.59; very low‐certainty evidence) and specificity of 0.62 (95% CI 0.42 to 0.79; low‐certainty evidence).
Three modelling studies evaluated the accuracy of screening at airports. The main outcomes measured were cases missed or detected by entry or exit screening, or both, at airports. One study suggests very low sensitivity at 0.30 (95% CI 0.1 to 0.53), missing 70% of infected travellers. Another study described an unrealistic scenario to achieve a 90% detection rate, requiring 0% asymptomatic infections. The final study provides very uncertain evidence due to low methodological quality.
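The accuracy measures reported in this section follow from a 2x2 comparison of the screening result with the reference test. The sketch below shows that arithmetic, including how a sensitivity of 0.30 implies that 70% of infected people are missed; the function names and counts are hypothetical and are not data from the included studies.

```python
def accuracy_from_counts(tp, fp, fn, tn):
    """Sensitivity and specificity from a 2x2 table against the reference test."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def proportion_missed(sensitivity):
    """Share of truly infected people that screening fails to detect."""
    return 1.0 - sensitivity


# Hypothetical counts, for illustration only.
sens, spec = accuracy_from_counts(tp=3, fp=5, fn=7, tn=85)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}, "
      f"missed {proportion_missed(sens):.0%}")
```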
Authors' conclusions
The evidence base for the effectiveness of screening comes from two mathematical modelling studies and is limited by their assumptions. Low‐certainty evidence suggests that screening at travel hubs may slightly slow the importation of infected cases. This review highlights the uncertainty and variation in accuracy of screening strategies. A high proportion of infected individuals may be missed and go on to infect others, and some healthy individuals may be falsely identified as positive, requiring confirmatory testing and potentially leading to the unnecessary isolation of these individuals. Further studies need to evaluate the utility of rapid laboratory tests, combined screening, and repeated screening. More research is also needed on reference standards with greater accuracy than RT‐PCR.
Given the poor sensitivity of existing approaches, our findings point to the need for greater emphasis on other ways that may prevent transmission such as face coverings, physical distancing, quarantine, and adequate personal protective equipment for frontline workers.
Objective: This article aims to establish recommendations for conducting quantitative synthesis, or meta-analysis, using study-level data in comparative effectiveness reviews (CERs) for the Evidence-based Practice Center (EPC) program of the Agency for Healthcare Research and Quality.
Study Design and Setting: We focused on recurrent issues in the EPC program and the recommendations were developed using group discussion and consensus based on current knowledge in the literature.
Results: We first discussed considerations for deciding whether to combine studies, followed by discussions on indirect comparison and incorporation of indirect evidence. Then, we described our recommendations on choosing effect measures and statistical models, giving special attention to combining studies with rare events; and on testing and exploring heterogeneity. Finally, we briefly presented recommendations on combining studies of mixed design and on sensitivity analysis.
Conclusion: Quantitative synthesis should be conducted in a transparent and consistent way. Inclusion of multiple alternative interventions in CERs increases the complexity of quantitative synthesis, whereas the basic issues in quantitative synthesis remain crucial considerations in quantitative synthesis for a CER. We will cover more issues in future versions and update and improve recommendations with the accumulation of new research to advance the goal for transparency and consistency.
Study question: What are the benefits and harms of second generation antidepressants and cognitive behavioral therapies (CBTs) in the initial treatment of a current episode of major depressive disorder in adults?
Methods: This was a systematic review including qualitative assessment and meta-analyses using random and fixed effects models. Medline, Embase, the Cochrane Library, the Allied and Complementary Medicine Database, PsycINFO, and the Cumulative Index to Nursing and Allied Health Literature were searched from January 1990 through January 2015. The 11 randomized controlled trials included compared a second generation antidepressant with CBT. Ten trials compared antidepressant monotherapy with CBT alone; three compared antidepressant monotherapy with antidepressant plus CBT.
Summary answer and limitations: Meta-analyses found no statistically significant difference in effectiveness between second generation antidepressants and CBT for response (risk ratio 0.91, 0.77 to 1.07), remission (0.98, 0.73 to 1.32), or change in 17 item Hamilton Rating Scale for Depression score (weighted mean difference, −0.38, −2.87 to 2.10). Similarly, no significant differences were found in rates of overall study discontinuation (risk ratio 0.90, 0.49 to 1.65) or discontinuation attributable to lack of efficacy (0.40, 0.05 to 2.91). Although more patients treated with a second generation antidepressant than receiving CBT withdrew from studies because of adverse events, the difference was not statistically significant (risk ratio 3.29, 0.42 to 25.72). No conclusions could be drawn about other outcomes because of lack of evidence. Results should be interpreted cautiously given the low strength of evidence for most outcomes. The scope of this review was limited to trials that enrolled adult patients with major depressive disorder and compared a second generation antidepressant with CBT, and many of the included trials had methodological shortcomings that may limit confidence in some of the findings.
What this study adds: Second generation antidepressants and CBT have evidence bases of benefits and harms in major depressive disorder. Available evidence suggests no difference in treatment effects of second generation antidepressants and CBT, either alone or in combination, although small numbers may preclude detection of small but clinically meaningful differences.
Funding, competing interests, data sharing: This project was funded under contract from the Agency for Healthcare Research and Quality by the RTI-UNC Evidence-based Practice Center. Detailed methods and additional information are available in the full report, available at http://effectivehealthcare.ahrq.gov/.
Therapeutic strategies with immune checkpoint inhibitors (ICIs) counteract the immunosuppressive effects of programmed cell death protein-1 (PD-1) and ligand-1 (PD-L1). ICI treatment has emerged in first- and second-line therapy of non-small cell lung cancer (NSCLC). As immunotherapeutic treatment with ICIs is a dynamic field where new drugs and combinations are constantly evaluated, we conducted an up-to-date systematic review on comparative efficacy and safety in patients with advanced NSCLC.
We searched PubMed up to February 2020 and Embase, CENTRAL, and clinical trial registries up to August 2018. Additionally, we checked reference lists. We dually screened titles, abstracts and, subsequently, full-texts for eligibility. Two reviewers assessed the risk of bias and graded the certainty of evidence following GRADE (Grading of Recommendations Assessment, Development and Evaluation). For second-line therapy, we performed random-effects meta-analyses. Due to considerable clinical heterogeneity, we reported first-line results narratively.
Of 1497 references, we identified 22 relevant publications of 16 studies. For first-line therapy, a combination of an ICI with chemotherapy improved progression-free survival and overall survival compared to chemotherapy but increased the risk of serious adverse events. Single-agent pembrolizumab increased overall and progression-free survival in patients with PD-L1 expression of ≥50% and resulted in fewer treatment-related adverse events (TRAE) than chemotherapy. Compared to placebo, maintenance therapy with durvalumab increased overall and progression-free survival at the cost of a higher risk of TRAE. For second-line therapy, a random-effects meta-analysis yielded a statistically significantly improved overall survival (OS) and progression-free survival (PFS) for ICIs compared to docetaxel (HR 0.69; 95% CI: 0.63-0.75 for OS; HR 0.85; 95% CI: 0.77-0.93 for PFS; 6 studies, 3478 patients; median OS benefit in months: 2.4 to 4.2). In meta-analysis, the risk of TRAE of any grade was lower for ICI than docetaxel as second-line therapy (RR 0.76, 95% CI: 0.73-0.79; 6 studies, 3763 patients).
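For readers unfamiliar with random-effects pooling, the sketch below shows one common estimator, DerSimonian-Laird, applied to log hazard ratios. The choice of estimator is an assumption, since the text above does not name the one used, and the function name and inputs are hypothetical rather than data from the included trials.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling of log-scale estimates (DerSimonian-Laird).

    Returns the pooled estimate, its standard error, and tau^2
    (the estimated between-study variance).
    """
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures how far studies scatter around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2


# Hypothetical log hazard ratios and standard errors, for illustration only.
log_hrs = [math.log(0.62), math.log(0.73), math.log(0.70)]
ses = [0.10, 0.08, 0.12]
pooled, se, tau2 = dersimonian_laird(log_hrs, ses)
print(f"pooled HR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(pooled - Z95 * se):.2f}-{math.exp(pooled + Z95 * se):.2f}), "
      f"tau^2 {tau2:.4f}")
```

When tau^2 is estimated as zero, the result coincides with fixed-effect inverse-variance pooling.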
In first-line therapy of patients with advanced NSCLC, ICI is effective when combined with chemotherapy irrespective of PD-L1 expression, or as monotherapy in tumors with high PD-L1 expression. For second-line therapy, single-agent ICI improves efficacy and safety compared to docetaxel.
Objective: We evaluated the inter-rater reliability (IRR) of assessing the quality of evidence (QoE) using the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) approach.
Study Design and Setting: On completing two training exercises, participants worked independently as individual raters to assess the QoE of 16 outcomes. After recording their initial impression using a global rating, raters graded the QoE following the GRADE approach. Subsequently, randomly paired raters submitted a consensus rating.
Results: The IRR without using the GRADE approach for two individual raters was 0.31 (95% confidence interval [CI] = 0.21–0.42) among Health Research Methodology students (n = 10) and 0.27 (95% CI = 0.19–0.37) among the GRADE working group members (n = 15). The corresponding IRR of the GRADE approach in assessing the QoE was significantly higher, that is, 0.66 (95% CI = 0.56–0.75) and 0.72 (95% CI = 0.61–0.79), respectively. The IRR further increased for three (0.80 [95% CI = 0.73–0.86] and 0.74 [95% CI = 0.65–0.81]) or four raters (0.84 [95% CI = 0.78–0.89] and 0.79 [95% CI = 0.71–0.85]). The IRR did not improve when QoE was assessed through a consensus rating.
Conclusion: Our findings suggest that use of the GRADE approach by trained individuals improves reliability in comparison to intuitive judgments about the QoE and that two individual raters can reliably assess the QoE using the GRADE system.
The aim of this study was to investigate the prevalence and incidence of atrial fibrillation (AF) and to describe the clinical characteristics, risk profiles, types of anticoagulant therapy for stroke prevention, and clinical outcomes in persons admitted to a long-term care hospital. We conducted a retrospective cohort study using data from the electronic medical records of patients aged 65 years or older living in two long-term care hospitals between January 1, 2014 and October 31, 2017. Overall, data from 1148 patients (mean age 84.1 ± 7.9 years, 74.2% women) were analyzed. At baseline, the median CHA₂DS₂-VASc score was 4 (IQR 3-5) and the HAS-BLED score 2 (IQR 2-3). We observed patients over a median period of 3.7 years. The point prevalence of AF was 29.6% (95% CI 25.8-33.7) on January 1, 2014. The 1-year cumulative incidence of de novo AF was 4.0% (2.8-5.6). Oral anticoagulants were prescribed in 48% of patients with AF. The cumulative incidence at 1 year for a composite outcome of TIA, stroke, or systemic arterial embolism was 0.6% (0.1-3.1) and 1.7% (0.5-4.6) and for bleeding 2.6% (0.9-6.2) and 1.8% (0.5-4.8) in patients with AF and oral anticoagulants or no oral anticoagulants, respectively. In long-term care hospital patients, we observed a high burden of AF. However, only about half of patients with AF received oral anticoagulation for stroke prevention.