Dengue hemorrhagic fever (DHF), a severe manifestation of dengue viral infection that can cause severe bleeding, organ impairment, and even death, affects between 15,000 and 105,000 people each year in Thailand. While all Thai provinces experience at least one DHF case most years, the distribution of cases shifts regionally from year to year. Accurately forecasting where DHF outbreaks occur before the dengue season could help public health officials prioritize public health activities. We develop statistical models that use biologically plausible covariates, observed by April each year, to forecast the cumulative DHF incidence for the remainder of the year. We perform cross-validation during the training phase (2000–2009) to select the covariates for these models. A parsimonious model based on preseason incidence outperforms the 10-year median for 65% of province-level annual forecasts, reduces the mean absolute error by 19%, and successfully forecasts outbreaks (area under the receiver operating characteristic curve = 0.84) over the testing period (2010–2014). We find that functions of past incidence contribute most strongly to model performance, whereas the importance of environmental covariates varies regionally. This work illustrates that accurate forecasts of dengue risk are possible in a policy-relevant timeframe.
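The comparison against a 10-year median baseline described above can be sketched as follows. This is a minimal illustration only, not the authors' code: the function name `compare_to_baseline` and all numbers are hypothetical, and the sketch assumes one forecast, one baseline value, and one observed value per province-year.

```python
from statistics import mean


def compare_to_baseline(model_pred, baseline_pred, observed):
    """Compare model forecasts with a baseline (e.g. a 10-year median).

    Returns the fraction of forecasts whose absolute error is strictly
    smaller than the baseline's, and the relative reduction in mean
    absolute error (MAE). All inputs are equal-length lists of numbers.
    """
    if not (len(model_pred) == len(baseline_pred) == len(observed)):
        raise ValueError("inputs must have the same length")
    model_err = [abs(m - o) for m, o in zip(model_pred, observed)]
    base_err = [abs(b - o) for b, o in zip(baseline_pred, observed)]
    base_mae = mean(base_err)
    if base_mae == 0:
        raise ValueError("baseline MAE is zero; relative reduction undefined")
    wins = sum(m < b for m, b in zip(model_err, base_err))
    return {
        "fraction_better": wins / len(observed),
        "mae_reduction": 1 - mean(model_err) / base_mae,
    }


# Made-up incidence values for four province-years (illustration only).
observed = [100, 50, 200, 80]
model = [90, 60, 150, 85]
baseline = [120, 55, 120, 60]
summary = compare_to_baseline(model, baseline, observed)
```

With these invented numbers the model beats the baseline in three of four province-years and lowers the MAE by 40%; the abstract's reported figures (65% and 19%) come from the real test period, not from this sketch.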
The COVID-19 pandemic has made it clear that epidemic models play an important role in how governments and the public respond to infectious disease crises. Early in the pandemic, models were used to estimate the true number of infections. Later, they estimated key parameters, generated short-term forecasts of outbreak trends, and quantified possible effects of interventions on the unfolding epidemic.1,2 In contrast to the coordinating role played by major national or international agencies in weather-related emergencies, pandemic modeling efforts were initially scattered across many research institutions. Differences in modeling approaches led to contrasting results, contributing to confusion in public perception of the pandemic. Efforts to coordinate modeling efforts in so-called "hubs" have provided governments, healthcare agencies, and the public with assessments and forecasts that reflect the consensus in the modeling community.3-6 This has been achieved by openly synthesizing uncertainties across different modeling approaches and facilitating comparisons between them.
Since 2013, the Centers for Disease Control and Prevention (CDC) has hosted an annual influenza season forecasting challenge. The 2015-2016 challenge consisted of weekly probabilistic forecasts of multiple targets, including fourteen models submitted by eleven teams. Forecast skill was evaluated using a modified logarithmic score. We averaged submitted forecasts into a mean ensemble model and compared them against predictions based on historical trends. Forecast skill was highest for seasonal peak intensity and short-term forecasts, while forecast skill for timing of season onset and peak week was generally low. Higher forecast skill was associated with team participation in previous influenza forecasting challenges and utilization of ensemble forecasting techniques. The mean ensemble consistently performed well and outperformed historical trend predictions. CDC and contributing teams will continue to advance influenza forecasting and work to improve the accuracy and reliability of forecasts to facilitate increased incorporation into public health response efforts.
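A mean ensemble and a windowed logarithmic score can be sketched as below. This is an illustration under stated assumptions, not the challenge's official scoring code: it assumes each forecast is a list of probabilities over ordered bins, and the `window` and `floor` parameters (credit for neighboring bins, and a lower bound on the score) are hypothetical choices made for this example.

```python
import math


def mean_ensemble(forecasts):
    """Average several binned probability forecasts bin by bin."""
    n_models = len(forecasts)
    n_bins = len(forecasts[0])
    if any(len(f) != n_bins for f in forecasts):
        raise ValueError("all forecasts must use the same bins")
    return [sum(f[b] for f in forecasts) / n_models for b in range(n_bins)]


def windowed_log_score(probs, true_bin, window=1, floor=-10.0):
    """Log of the probability assigned to bins near the observed bin.

    window=0 gives the ordinary log score for the observed bin;
    window>0 also credits neighboring bins (an assumed modification).
    Scores below `floor` (including log(0)) are set to `floor`.
    """
    lo = max(0, true_bin - window)
    hi = min(len(probs), true_bin + window + 1)
    mass = sum(probs[lo:hi])
    if mass <= 0.0:
        return floor
    return max(math.log(mass), floor)


# Two made-up forecasts over four bins (illustration only).
model_a = [0.1, 0.6, 0.2, 0.1]
model_b = [0.3, 0.3, 0.3, 0.1]
ensemble = mean_ensemble([model_a, model_b])
score = windowed_log_score(ensemble, true_bin=1, window=0)
```

Because a window wider than zero sums probability over several bins, the resulting score can only be as high as or higher than the single-bin log score for the same forecast.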
Identifying data streams that can consistently improve the accuracy of epidemiological forecasting models is challenging. Using models designed to predict daily state-level hospital admissions due to COVID-19 in California and Massachusetts, we investigated whether incorporating COVID-19 case data systematically improved forecast accuracy. Additionally, we considered whether using case data aggregated by date of test or by date of report from a surveillance system made a difference to the forecast accuracy. Evaluating forecast accuracy in a test period, after first having selected the best-performing methods in a validation period, we found that overall the difference in accuracy between approaches was small, especially at forecast horizons of less than two weeks. However, forecasts from models using cases aggregated by test date showed lower accuracy at longer horizons and at key moments in the pandemic, such as the peak of the Omicron wave in January 2022. Overall, these results highlight the challenge of finding a modeling approach that can generate accurate forecasts of outbreak trends both during periods of relative stability and during periods that show rapid growth or decay of transmission rates. While COVID-19 case counts seem to be a natural choice to help predict COVID-19 hospitalizations, in practice any benefits we observed were small and inconsistent.
The U.S. COVID-19 Forecast Hub aggregates forecasts of the short-term burden of COVID-19 in the United States from many contributing teams. We study methods for building an ensemble that combines forecasts from these teams. These experiments have informed the ensemble methods used by the Hub. To be most useful to policymakers, ensemble forecasts must have stable performance in the presence of two key characteristics of the component forecasts: (1) occasional misalignment with the reported data, and (2) instability in the relative performance of component forecasters over time. Our results indicate that in the presence of these challenges, an untrained and robust approach to ensembling using an equally weighted median of all component forecasts is a good choice to support public health decision-makers. In settings where some contributing forecasters have a stable record of good performance, trained ensembles that give those forecasters higher weight can also be helpful.
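An equally weighted median ensemble can be sketched as follows. This is a minimal illustration, not the Hub's implementation: it assumes each component forecast is a dictionary mapping quantile levels to predicted values, and the function name `median_ensemble` and the team data are hypothetical.

```python
from statistics import median


def median_ensemble(component_forecasts):
    """Combine quantile forecasts by taking the median at each level.

    Each component forecast is a dict {quantile_level: predicted_value}.
    Every component gets equal weight and no training data is needed.
    """
    levels = sorted(component_forecasts[0])
    for forecast in component_forecasts:
        if sorted(forecast) != levels:
            raise ValueError("all forecasts must share the same quantile levels")
    return {q: median(f[q] for f in component_forecasts) for q in levels}


# Made-up weekly forecasts from three teams; team_c is badly misaligned.
team_a = {0.25: 90, 0.5: 100, 0.75: 110}
team_b = {0.25: 95, 0.5: 105, 0.75: 120}
team_c = {0.25: 400, 0.5: 500, 0.75: 600}
ensemble = median_ensemble([team_a, team_b, team_c])
```

In this invented example the ensemble equals team_b's forecast at every level, so the misaligned team_c has no influence on the result, whereas an equally weighted mean would be pulled far upward by it.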
A person's physical activity has important health implications, so it is important to be able to measure aspects of physical activity objectively. One approach to doing that is to use data from an accelerometer to classify physical activity according to activity type (e.g., lying down, sitting, standing, or walking) or intensity (e.g., sedentary, light, moderate, or vigorous). This can be formulated as a labeled classification problem, where the model relates a feature vector summarizing the accelerometer signal in a window of time to the activity type or intensity in that window. These data exhibit two key characteristics: (1) the activity classes in different time windows are not independent, and (2) the accelerometer features have moderately high dimension and follow complex distributions. Through a simulation study and applications to three datasets, we demonstrate that a model's classification performance is related to how it addresses these aspects of the data. Dynamic methods that account for temporal dependence achieve better performance than static methods that do not. Generative methods that explicitly model the distribution of the accelerometer signal features do not perform as well as methods that take a discriminative approach to establishing the relationship between the accelerometer signal and the activity class. Specifically, Conditional Random Fields consistently have better performance than commonly employed methods that ignore temporal dependence or attempt to model the accelerometer features.
Forecasting has emerged as an important component of informed, data-driven decision-making in a wide array of fields. We introduce a new data model for probabilistic predictions that encompasses a wide range of forecasting settings. This framework clearly defines the constituent parts of a probabilistic forecast and proposes one approach for representing these data elements. The data model is implemented in Zoltar, a new software application that stores forecasts using the data model and provides standardized API access to the data. In one real-time case study, an instance of the Zoltar web application was used to store, provide access to, and evaluate real-time forecast data on the order of 10^8 rows, provided by over 40 international research teams from academia and industry making forecasts of the COVID-19 outbreak in the US. Tools and data infrastructure for probabilistic forecasts, such as those introduced here, will play an increasingly important role in ensuring that future forecasting research adheres to a strict set of rigorous and reproducible standards.
REPLY TO BRACHER. Reich, Nicholas G.; Osthus, Dave; Ray, Evan L. ...
Proceedings of the National Academy of Sciences - PNAS, 10/2019, Volume 116, Issue 42
Journal Article. Peer reviewed. Open access.
Evaluating probabilistic forecasts in the context of a real-time public health surveillance system is a complicated business. We agree with Bracher’s (1) observations that the scores established by the US Centers for Disease Control and Prevention (CDC) and used to evaluate our forecasts of seasonal influenza in the United States are not “proper” by definition (2). We thank him for raising this important issue.
Heart disease has been the leading cause of death in the United States since 1910 and cancer the second leading cause of death since 1933. However, cancer emerged recently as the leading cause of death in many US states. The objective of this study was to provide an in-depth analysis of age-standardized annual state-specific mortality rates for heart disease and cancer.
We used population-based mortality data from 1999 through 2016 to compare 2 underlying cause-of-death categories: diseases of heart (International Classification of Diseases, 10th Revision [ICD-10] codes I00-I09, I11, I13, and I20-I51) and malignant neoplasms (ICD-10 codes C00-C97). We calculated age-standardized annual state-specific mortality rate ratios (MRRs) as heart disease mortality rate divided by cancer mortality rate.
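The MRR calculation can be sketched as below. This is an illustration only: the state names and age-standardized rates (deaths per 100,000) are made up, and the helper `mortality_rate_ratio` is a hypothetical name, not part of the study's code.

```python
from statistics import median


def mortality_rate_ratio(heart_rate, cancer_rate):
    """MRR = heart disease mortality rate / cancer mortality rate."""
    if cancer_rate <= 0:
        raise ValueError("cancer mortality rate must be positive")
    return heart_rate / cancer_rate


# Made-up age-standardized rates per 100,000: (heart disease, cancer).
rates = {
    "State A": (250.0, 200.0),
    "State B": (180.0, 190.0),
    "State C": (210.0, 200.0),
}

mrr_by_state = {
    state: mortality_rate_ratio(heart, cancer)
    for state, (heart, cancer) in rates.items()
}
median_mrr = median(mrr_by_state.values())

# An MRR below 1 means cancer mortality exceeds heart disease mortality.
cancer_leading = [state for state, mrr in mrr_by_state.items() if mrr < 1.0]
```

An MRR above 1 indicates that heart disease mortality predominates, and an MRR below 1 indicates that cancer mortality predominates; a median MRR near 1 means the two causes lead in roughly equal numbers of states.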
In 1999, age-standardized heart disease mortality exceeded that for cancer in all 50 states. Median state-specific MRR in 1999 was 1.26 (interquartile range [IQR], 1.17-1.34; range, 1.03-1.56), indicating predominance of heart disease mortality nationwide. Median state-specific MRR decreased annually through 2010, reaching a low of 1.00 (IQR, 0.95-1.07; range, 0.71-1.25), indicating that predominance of heart disease mortality prevailed in approximately half of states. Median state-specific MRR increased to 1.03 (IQR, 0.97-1.12; range, 0.77-1.31) in 2016. In 2016, age-standardized cancer mortality exceeded that for heart disease in 19 states. State-level transitions were most apparent for people aged 65 to 84 and affected men, women, and all racial/ethnic groups.
State-level data indicated heterogeneity across US states in the predominance of heart disease mortality relative to cancer mortality. Timing and magnitude of transitions toward cancer mortality predominance varied by state.
The mechanistic pathways linking genetic polymorphisms and complex disease traits remain largely uncharacterized. At the same time, expansive new transcriptome data resources offer unprecedented opportunity to unravel the mechanistic underpinnings of complex disease associations. Two-stage strategies involving conditioning on a single, penalized regression imputation for transcriptome association analysis have been described for cross-sectional traits. In this manuscript, we propose an alternative two-stage approach based on stochastic regression imputation that additionally incorporates error in the predictive model. Application of a bootstrap procedure offers flexibility when a closed form predictive distribution is not available. The two-stage strategy is also generalized to longitudinally measured traits, using a linear mixed effects modeling framework and a composite test statistic to evaluate whether the genetic component of gene-level expression modifies the biomarker trajectory over time. Simulation studies are performed to evaluate relative performance with respect to type-1 error rates, coverage, estimation error, and power under a range of conditions. A case study is presented to investigate the association between whole blood expression for each of five inflammasome genes with inflammatory response over time after endotoxin challenge.