Serological antibody levels are a sensitive marker of pathogen exposure, and advances in multiplex assays have created enormous potential for large-scale, integrated infectious disease surveillance. Most methods to analyze antibody measurements reduce quantitative antibody levels to seropositive and seronegative groups, but this can be difficult for many pathogens and may provide lower resolution information than quantitative levels. Analysis methods have predominantly maintained a single disease focus, yet integrated surveillance platforms would benefit from methodologies that work across diverse pathogens included in multiplex assays.
We developed an approach to measure changes in transmission from quantitative antibody levels that can be applied to diverse pathogens of global importance. We compared age-dependent immunoglobulin G curves in repeated cross-sectional surveys between populations with differences in transmission for multiple pathogens, including: lymphatic filariasis (Wuchereria bancrofti) measured before and after mass drug administration on Mauke, Cook Islands; malaria (Plasmodium falciparum) before and after a combined insecticide and mass drug administration intervention in the Garki project, Nigeria; and enteric protozoans (Cryptosporidium parvum, Giardia intestinalis, Entamoeba histolytica), bacteria (enterotoxigenic Escherichia coli, Salmonella spp.), and viruses (norovirus groups I and II) in children living in Haiti and the USA. Age-dependent antibody curves fit with ensemble machine learning followed a characteristic shape across pathogens that aligned with predictions from basic mechanisms of humoral immunity. Differences in pathogen transmission led to shifts in fitted antibody curves that were remarkably consistent across pathogens, assays, and populations. Mean antibody levels correlated strongly with traditional measures of transmission intensity, such as the entomological inoculation rate for P. falciparum (Spearman's rho = 0.75). In both high- and low-transmission settings, mean antibody curves revealed changes in population mean antibody levels that were masked by seroprevalence measures because changes took place above or below the seropositivity cutoff.
Age-dependent antibody curves and summary means provided a robust and sensitive measure of changes in transmission, with greatest sensitivity among young children. The method generalizes to pathogens that can be measured in high-throughput, multiplex serological assays, and scales to surveillance activities that require high spatiotemporal resolution. Our results suggest quantitative antibody levels will be particularly useful to measure differences in exposure for pathogens that elicit a transient antibody response or for monitoring populations with very high or very low transmission, when seroprevalence is less informative. The approach represents a new opportunity to conduct integrated serological surveillance for neglected tropical diseases, malaria, and other infectious diseases with well-defined antigen targets.
Poor nutrition and exposure to faecal contamination are associated with diarrhoea and growth faltering, both of which have long-term consequences for child health. We aimed to assess whether water, sanitation, handwashing, and nutrition interventions reduced diarrhoea or growth faltering.
The WASH Benefits cluster-randomised trial enrolled pregnant women from villages in rural Kenya and evaluated outcomes at 1 year and 2 years of follow-up. Geographically-adjacent clusters were block-randomised to active control (household visits to measure mid-upper-arm circumference), passive control (data collection only), or compound-level interventions including household visits to promote target behaviours: drinking chlorinated water (water); safe sanitation consisting of disposing faeces in an improved latrine (sanitation); handwashing with soap (handwashing); combined water, sanitation, and handwashing; counselling on appropriate maternal, infant, and young child feeding plus small-quantity lipid-based nutrient supplements from 6–24 months (nutrition); and combined water, sanitation, handwashing, and nutrition. Primary outcomes were caregiver-reported diarrhoea in the past 7 days and length-for-age Z score at year 2 in index children born to the enrolled pregnant women. Masking was not possible for data collection, but analyses were masked. Analysis was by intention to treat. This trial is registered with ClinicalTrials.gov, number NCT01704105.
Between Nov 27, 2012, and May 21, 2014, 8246 women in 702 clusters were enrolled and randomly assigned to an intervention or control group. 1919 women were assigned to the active control group; 938 to passive control; 904 to water; 892 to sanitation; 917 to handwashing; 912 to combined water, sanitation, and handwashing; 843 to nutrition; and 921 to combined water, sanitation, handwashing, and nutrition. Data on diarrhoea at year 1 or year 2 were available for 6494 children and data on length-for-age Z score in year 2 were available for 6583 children (86% of living children were measured at year 2). Adherence indicators for sanitation, handwashing, and nutrition were more than 70% at year 1, handwashing fell to less than 25% at year 2, and for water was less than 45% at year 1 and less than 25% at year 2; combined groups were comparable to single groups. None of the interventions reduced diarrhoea prevalence compared with the active control. Compared with active control (length-for-age Z score −1·54), children in nutrition and combined water, sanitation, handwashing, and nutrition were taller by year 2 (mean difference 0·13 [95% CI 0·01–0·25] in the nutrition group; 0·16 [0·05–0·27] in the combined water, sanitation, handwashing, and nutrition group). The individual water, sanitation, and handwashing groups, and combined water, sanitation, and handwashing group had no effect on linear growth.
Behaviour change messaging combined with technologically simple interventions such as water treatment, household sanitation upgrades from unimproved to improved latrines, and handwashing stations did not reduce childhood diarrhoea or improve growth, even when adherence was at least as high as has been achieved by other programmes. Counselling and supplementation in the nutrition group and combined water, sanitation, handwashing, and nutrition interventions led to small growth benefits, but there was no advantage to integrating water, sanitation, and handwashing with nutrition. The interventions might have been more efficacious with higher adherence or in an environment with lower baseline sanitation coverage, especially in this context of high diarrhoea prevalence.
Bill & Melinda Gates Foundation, United States Agency for International Development.
As computational power improves, the application of more advanced machine learning techniques to the analysis of large genome-wide association (GWA) datasets becomes possible. While most traditional statistical methods can only elucidate main effects of genetic variants on risk for disease, certain machine learning approaches are particularly suited to discover higher order and non-linear effects. One such approach is the Random Forests (RF) algorithm. The use of RF for SNP discovery related to human disease has grown in recent years; however, most work has focused on small datasets or simulation studies which are limited.
Using a multiple sclerosis (MS) case-control dataset comprised of 300 K SNP genotypes across the genome, we outline an approach and some considerations for optimally tuning the RF algorithm based on the empirical dataset. Importantly, results show that typical default parameter values are not appropriate for large GWA datasets. Furthermore, gains can be made by sub-sampling the data, pruning based on linkage disequilibrium (LD), and removing strong effects from RF analyses. The new RF results are compared to findings from the original MS GWA study and demonstrate overlap. In addition, four new interesting candidate MS genes are identified, MPHOSPH9, CTNNA3, PHACTR2 and IL7, by RF analysis and warrant further follow-up in independent studies.
This study presents one of the first illustrations of successfully analyzing GWA data with a machine learning algorithm. It is shown that RF is computationally feasible for GWA data and the results obtained make biologic sense based on previous studies. More importantly, new genes were identified as potentially being associated with MS, suggesting new avenues of investigation for this complex disease.
Machine learning (ML) and artificial intelligence (AI) algorithms have the potential to derive insights from clinical data and improve patient outcomes. However, these highly complex systems are sensitive to changes in the environment and liable to performance decay. Even after their successful integration into clinical practice, ML/AI algorithms should be continuously monitored and updated to ensure their long-term safety and effectiveness. To bring AI into maturity in clinical care, we advocate for the creation of hospital units responsible for quality assurance and improvement of these algorithms, which we refer to as "AI-QI" units. We discuss how tools that have long been used in hospital quality assurance and quality improvement can be adapted to monitor static ML algorithms. On the other hand, procedures for continual model updating are still nascent. We highlight key considerations when choosing between existing methods and opportunities for methodological innovation.
In this work we introduce the personalized online super learner (POSL), an online personalizable ensemble machine learning algorithm for streaming data. POSL optimizes predictions with respect to baseline covariates, so personalization can vary from completely individualized, that is, optimization with respect to subject ID, to many individuals, that is, optimization with respect to common baseline covariates. As an online algorithm, POSL learns in real time. As a super learner, POSL is grounded in statistical optimality theory and can leverage a diversity of candidate algorithms, including online algorithms with different training and update times, fixed/offline algorithms that are not updated during POSL's fitting procedure, pooled algorithms that learn from many individuals' time series, and individualized algorithms that learn from within a single time series. POSL's ensembling of the candidates can depend on the amount of data collected, the stationarity of the time series, and the mutual characteristics of a group of time series. Depending on the underlying data‐generating process and the information available in the data, POSL is able to adapt to learning across samples, through time, or both. For a range of simulations that reflect realistic forecasting scenarios and in a medical application, we examine the performance of POSL relative to other current ensembling and online learning methods. We show that POSL is able to provide reliable predictions for both short and long time series, and it is able to adjust to changing data‐generating environments. We further cultivate POSL's practicality by extending it to settings where time series dynamically enter and exit.
Inhibition of DAF-2 (insulin-like growth factor 1 [IGF-1] receptor) or RSKS-1 (S6K), key molecules in the insulin/IGF-1 signaling (IIS) and target of rapamycin (TOR) pathways, respectively, extends lifespan in Caenorhabditis elegans. However, it has not been clear how and in which tissues they interact with each other to modulate longevity. Here, we demonstrate that a combination of mutations in daf-2 and rsks-1 produces a nearly 5-fold increase in longevity that is much greater than the sum of single mutations. This synergistic lifespan extension requires positive feedback regulation of DAF-16 (FOXO) via the AMP-activated protein kinase (AMPK) complex. Furthermore, we identify germline as the key tissue for this synergistic longevity. Moreover, germline-specific inhibition of rsks-1 activates DAF-16 in the intestine. Together, our findings highlight the importance of the germline in the significantly increased longevity produced by daf-2 rsks-1, which has important implications for interactions between the two major conserved longevity pathways in more complex organisms.
• The daf-2 rsks-1 double mutant shows synergistic lifespan extension in C. elegans
• AMPK mediates positive feedback regulation of DAF-16 in daf-2 rsks-1
• Germline tissue is a key tissue in modulating this synergistic longevity
• Inhibiting rsks-1 in the germline leads to cell-nonautonomous activation of DAF-16
The evolutionarily conserved insulin/insulin-like growth factor 1 (IGF-1) signaling and ribosomal S6 kinase (S6K) play a critical role in aging. Here, Chen, Kapahi, and colleagues show that simultaneous inhibition of DAF-2 (insulin/IGF-1 receptor) and RSKS-1 (S6K) leads to a nearly 5-fold synergistic lifespan extension in Caenorhabditis elegans. The mechanism of this exceptional longevity involves a positive feedback regulation of DAF-16 (FOXO transcription factor) via AMP-activated protein kinase (AMPK). Further studies highlight the germ line as a critical tissue for the daf-2 rsks-1-mediated synergistic longevity.
Super Learner
van der Laan, Mark J.; Polley, Eric C.; Hubbard, Alan E.
Statistical Applications in Genetics and Molecular Biology, 9/2007, Volume 6, Issue 1
Journal Article, Peer reviewed
When trying to learn a model for the prediction of an outcome given a set of covariates, a statistician has many estimation procedures in their toolbox. A few examples of these candidate learners are: least squares, least angle regression, random forests, and spline regression. Previous articles (van der Laan and Dudoit (2003); van der Laan et al. (2006); Sinisi et al. (2007)) theoretically validated the use of cross validation to select an optimal learner among many candidate learners. Motivated by this use of cross validation, we propose a new prediction method for creating a weighted combination of many candidate learners to build the super learner. This article proposes a fast algorithm for constructing a super learner in prediction which uses V-fold cross-validation to select weights to combine an initial set of candidate learners. In addition, this paper contains a practical demonstration of the adaptivity of this so-called super learner to various true data generating distributions. This approach for construction of a super learner generalizes to any parameter which can be defined as a minimizer of a loss function.
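The idea of using V-fold cross-validation to choose weights for a combination of candidate learners can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three toy learners (`fit_mean`, `fit_linear`, `fit_knn`), the simulated data, squared-error loss, and the coarse grid search over convex weights, which stands in for whatever optimizer a real implementation would use.

```python
import math
import random

# Hypothetical candidate learners: each takes training data and returns a predictor.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=5):
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

LEARNERS = [fit_mean, fit_linear, fit_knn]

def cv_predictions(xs, ys, v):
    """Return one row per observation holding each learner's held-out prediction."""
    n = len(xs)
    z = [[0.0] * len(LEARNERS) for _ in range(n)]
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        test = [i for i in range(n) if i % v == fold]
        fits = [fit([xs[i] for i in train], [ys[i] for i in train]) for fit in LEARNERS]
        for i in test:
            for j, f in enumerate(fits):
                z[i][j] = f(xs[i])
    return z

def simplex_grid(k, steps):
    """Yield all k-tuples of non-negative integers that sum to `steps`."""
    if k == 1:
        yield (steps,)
        return
    for first in range(steps + 1):
        for rest in simplex_grid(k - 1, steps - first):
            yield (first,) + rest

def super_learner(xs, ys, v=5, steps=20):
    z = cv_predictions(xs, ys, v)

    def cv_risk(w):
        # Cross-validated mean squared error of the weighted combination.
        return sum((y - sum(wj * zj for wj, zj in zip(w, row))) ** 2
                   for row, y in zip(z, ys)) / len(ys)

    candidates = (tuple(c / steps for c in counts)
                  for counts in simplex_grid(len(LEARNERS), steps))
    weights = min(candidates, key=cv_risk)
    # Refit every learner on all data and combine with the selected weights.
    fits = [fit(xs, ys) for fit in LEARNERS]
    return weights, (lambda x: sum(w * f(x) for w, f in zip(weights, fits)))

# Simulated data (assumption): a nonlinear signal plus Gaussian noise.
random.seed(0)
xs = [random.uniform(-3, 3) for _ in range(200)]
ys = [math.sin(x) + random.gauss(0, 0.3) for x in xs]

weights, predict = super_learner(xs, ys)
print("weights (mean, linear, kNN):", weights)
print("prediction at x = 1.0:", round(predict(1.0), 3))
```

Because the grid includes the corner points that put all weight on a single learner, the selected combination has cross-validated risk no larger than that of the best single candidate in this sketch.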
Diarrhea and acute respiratory infection (ARI) are leading causes of death in children. The WASH Benefits Bangladesh trial implemented a multicomponent sanitation intervention that led to a 39% reduction in the prevalence of diarrhea among children and a 25% reduction for ARI, measured 1 to 2 years after intervention implementation. We measured longer-term intervention effects on these outcomes between 1 and 3.5 years after intervention implementation, including periods with differing intensity of behavioral promotion. WASH Benefits Bangladesh was a cluster-randomized controlled trial of water, sanitation, hygiene, and nutrition interventions (NCT01590095). The sanitation intervention included provision of or upgrades to improved latrines, sani-scoops for feces removal, children's potties, and in-person behavioral promotion. Promotion was intensive up to 2 years after intervention initiation, decreased in intensity between years 2 and 3, and stopped after 3 years. Access to and reported use of latrines was high in both arms, and latrine quality was significantly improved by the intervention, while use of child feces management tools was low. We enrolled a random subset of households from the sanitation and control arms into a longitudinal substudy, which measured child health with quarterly visits between 1 and 3.5 years after intervention implementation. The study period therefore included approximately 1 year of high-intensity promotion, 1 year of low-intensity promotion, and 6 months with no promotion. We assessed intervention effects on diarrhea and ARI prevalence among children <5 years through intention-to-treat analysis using generalized linear models with robust standard errors. Masking was not possible during data collection, but data analysis was masked. We enrolled 720 households (360 per arm) from the parent trial and made 9,800 child observations between June 2014 and December 2016.
Over the entire study period, diarrheal prevalence was lower among children in the sanitation arm (11.9%) compared to the control arm (14.5%) (prevalence ratio [PR] = 0.81, 95% CI 0.66, 1.00, p = 0.05; prevalence difference [PD] = -0.027, 95% CI -0.053, 0, p = 0.05). ARI prevalence did not differ between sanitation (21.3%) and control (22.7%) arms (PR = 0.93, 95% CI 0.82, 1.05, p = 0.23; PD = -0.016, 95% CI -0.043, 0.010, p = 0.23). There were no significant differences in intervention effects between periods with high-intensity versus low-intensity/no promotion. Study limitations include use of caregiver-reported symptoms to define health outcomes and limited data collected after promotion ceased. The observed effect of the WASH Benefits Bangladesh sanitation intervention on diarrhea in children appeared to be sustained for at least 3.5 years after implementation, including 1.5 years after heavy promotion ceased. Existing latrine access was high in the study setting, suggesting that improving on-site latrine quality can deliver health benefits when latrine use practices are in place. Further work is needed to understand how latrine adoption can be achieved and sustained in settings with low existing access and how sanitation programs can adopt transformative approaches of excreta management, including safe disposal of child and animal feces, to generate a hygienic home environment.
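As a rough illustration of how a prevalence ratio and a prevalence difference relate to the arm-level prevalences quoted above, the following sketch computes crude (unadjusted) values from the rounded percentages. This is only arithmetic on the published summary figures, not the trial's analysis: the reported estimates come from generalized linear models with robust standard errors, so the crude values need not match them exactly. The names `reported` and `crude_effects` are hypothetical.

```python
# Prevalences quoted in the abstract, expressed as proportions of child observations.
reported = {
    "diarrhea": {"sanitation": 0.119, "control": 0.145},
    "ARI": {"sanitation": 0.213, "control": 0.227},
}

def crude_effects(p_treat, p_control):
    """Return (prevalence ratio, prevalence difference) for treatment versus control."""
    return p_treat / p_control, p_treat - p_control

crude = {
    outcome: crude_effects(p["sanitation"], p["control"])
    for outcome, p in reported.items()
}

for outcome, (pr, pd) in crude.items():
    print(f"{outcome}: crude PR = {pr:.2f}, crude PD = {pd:.3f}")
```

For diarrhea the crude ratio is about 0.82 and the crude difference about -0.026, close to the reported model-based PR of 0.81 and PD of -0.027; small gaps of this kind are plausible given rounding of the published prevalences and the model-based estimation.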
Two modeling approaches are commonly used to estimate the associations between neighborhood characteristics and individual-level health outcomes in multilevel studies (subjects within neighborhoods). Random effects models (or mixed models) use maximum likelihood estimation. Population average models typically use a generalized estimating equation (GEE) approach. These methods are used in place of basic regression approaches because the health of residents in the same neighborhood may be correlated, thus violating independence assumptions made by traditional regression procedures. This violation is particularly relevant to estimates of the variability of estimates. Though the literature appears to favor the mixed-model approach, little theoretical guidance has been offered to justify this choice. In this paper, we review the assumptions behind the estimates and inference provided by these 2 approaches. We propose a perspective that treats regression models for what they are in most circumstances: reasonable approximations of some true underlying relationship. We argue in general that mixed models involve unverifiable assumptions on the data-generating distribution, which lead to potentially misleading estimates and biased inference. We conclude that the estimation-equation approach of population average models provides a more useful approximation of the truth.
Background: To date little conclusive evidence exists on the seasonality of rotavirus incidence in the tropics. We present a systematic review and meta-analysis on the seasonal epidemiology of rotavirus in the tropics, including 26 studies reporting continuous monthly rotavirus incidence for which corresponding climatological data were available. Methods: Using linear regression models that account for serial correlation between months, monthly rotavirus incidence was significantly negatively correlated with temperature, rainfall and relative humidity in 65%, 55% and 60% of studies, respectively. We carried out pooled analyses using a generalized estimating equation (GEE) that accounts for correlation from between-study variation and serial correlation between months within a given study. Results: For every 1°C (1.8°F) increase in mean temperature, 1 cm (0.39 in.) increase in mean monthly rainfall, and 1% increase in relative humidity (22%) this analysis showed reductions in rotavirus incidence of 10% (95% CI: 6–13%), 1% (95% CI: 0–1%), and 3% (95% CI: 0–5%), respectively. Conclusions: On the basis of the evidence, we conclude that rotavirus responds to changes in climate in the tropics, with the highest number of infections found at the colder and drier times of the year.