Abstract
Summary
A common goal of microbiome studies is the elucidation of community composition and member interactions using counts of taxonomic units extracted from sequence data. Inference of interaction networks from sparse and compositional data requires specialized statistical approaches. A popular solution is SparCC; however, its performance limits the calculation of interaction networks for very high-dimensional datasets. Here we introduce FastSpar, an efficient and parallelizable implementation of the SparCC algorithm which rapidly infers correlation networks and calculates P-values using an unbiased estimator. We further demonstrate that FastSpar reduces network inference wall time by 2-3 orders of magnitude compared to SparCC.
Availability and implementation
FastSpar source code, precompiled binaries and platform packages are freely available on GitHub: github.com/scwatts/FastSpar
Supplementary information
Supplementary data are available at Bioinformatics online.
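As a rough illustration of why compositional count data call for specialized methods, the sketch below computes the variance of a log-ratio between two taxa, the kind of quantity that SparCC-style approaches build on. The counts and function names are hypothetical and are not taken from FastSpar; the point is only that a log-ratio variance is unchanged when counts are rescaled to relative abundances, whereas an ordinary correlation of the fractions would not be.

```python
import math
from statistics import pvariance


def log_ratio_variance(samples, i, j):
    """Variance across samples of log(x_i / x_j) for components i and j."""
    return pvariance([math.log(s[i] / s[j]) for s in samples])


def close(sample):
    """Rescale one sample so its components sum to 1 (relative abundances)."""
    total = sum(sample)
    return [x / total for x in sample]


# Hypothetical counts: rows are samples, columns are three taxa.
# Zero counts would need handling before taking logs; none occur here.
counts = [
    [120, 30, 50],
    [200, 45, 80],
    [90, 60, 40],
    [150, 20, 70],
]
fractions = [close(s) for s in counts]

t_counts = log_ratio_variance(counts, 0, 1)
t_fractions = log_ratio_variance(fractions, 0, 1)

# The sequencing depth cancels inside the ratio, so both values agree
# up to floating-point rounding.
print(t_counts, t_fractions)
```

Because the per-sample total cancels inside each ratio, the statistic depends only on the relative abundances, which is what makes it usable on compositional data.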
Summary
Dengue viruses cause more human morbidity and mortality than any other arthropod-borne virus. Dengue prevention relies mainly on vector control; however, the failure of traditional methods has promoted the development of novel entomological approaches. Although use of the intracellular bacterium Wolbachia to control mosquito populations was proposed 50 years ago, only in the past decade has its use as a potential agent of dengue control gained substantial interest. Here, we review evidence that supports a practical approach for dengue reduction through field release of Wolbachia-infected mosquitoes and discuss the additional studies that have to be done before the strategy can be validated and implemented. A crucial next step is to assess the efficacy of Wolbachia in reducing dengue virus transmission. We argue that a cluster randomised trial is at this time premature because choice of Wolbachia strain for release and deployment strategies are still being optimised. We therefore present a pragmatic approach to acquiring preliminary evidence of efficacy through various complementary methods including a prospective cohort study, a geographical cluster investigation, virus phylogenetic analysis, virus surveillance in mosquitoes, and vector competence assays. This multipronged approach could provide valuable intermediate evidence of efficacy to justify a future cluster randomised trial.
Abstract
Investigation of the genetic architecture of gene expression traits has aided interpretation of disease and trait-associated genetic variants; however, key aspects of expression quantitative trait loci (eQTL) study design and analysis remain understudied. We used extensive, empirically driven simulations to explore eQTL study design and the performance of various analysis strategies. Across multiple testing correction methods, false discoveries of genes with eQTLs (eGenes) were substantially inflated when false discovery rate (FDR) control was applied to all tests and only appropriately controlled using hierarchical procedures. All multiple testing correction procedures had low power and inflated FDR for eGenes whose causal SNPs had small allele frequencies using small sample sizes (e.g. frequency <10% in 100 samples), indicating that even moderately low frequency eQTL SNPs (eSNPs) in these studies are enriched for false discoveries. In scenarios with ≥80% power, the top eSNP was the true simulated eSNP 90% of the time, but substantially less frequently for very common eSNPs (minor allele frequencies >25%). Overestimation of eQTL effect sizes, so-called 'Winner's Curse', was common in low and moderate power settings. To address this, we developed a bootstrap method (BootstrapQTL) that led to more accurate effect size estimation. These insights provide a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.
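The idea of a hierarchical correction can be sketched as two stages: first correct within each gene across its tested SNPs, then control the FDR across genes. The example below is a minimal sketch under assumed choices (Bonferroni within a gene, Benjamini-Hochberg across genes); the abstract does not say which procedures were evaluated, and the function names and p-values are hypothetical rather than taken from BootstrapQTL.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda k: pvalues[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def hierarchical_egenes(pvalues_by_gene, alpha=0.05):
    """Two-stage sketch: Bonferroni within each gene, then BH across genes."""
    genes = list(pvalues_by_gene)
    # Stage 1 (assumed): one p-value per gene, the Bonferroni-corrected minimum.
    gene_level = [
        min(1.0, min(pvalues_by_gene[g]) * len(pvalues_by_gene[g]))
        for g in genes
    ]
    # Stage 2: FDR control across genes rather than across all SNP-gene tests.
    adjusted = benjamini_hochberg(gene_level)
    return {g: q for g, q in zip(genes, adjusted) if q <= alpha}


# Hypothetical SNP-level p-values for three genes.
toy = {
    "geneA": [0.0001, 0.2, 0.5, 0.9],
    "geneB": [0.02, 0.6],
    "geneC": [0.3, 0.4, 0.8],
}
egenes = hierarchical_egenes(toy)
print(egenes)
```

In this toy input only geneA is retained: geneB's smallest p-value of 0.02 becomes 0.04 after the within-gene correction and 0.06 after the across-gene step, so it misses the 0.05 threshold.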
Tissue-resident CD8+ memory T (TRM) cells are immune cells that permanently reside at tissue sites where they play an important role in providing rapid protection against reinfection. They are not only phenotypically and functionally distinct from their circulating memory counterparts, but also exhibit a unique transcriptional profile. To date, the local tissue signals required for their development and long-term residency are not well understood. So far, the best-characterised tissue-derived signal is transforming growth factor-β (TGF-β), which has been shown to promote the development of these cells within tissues. In this study, we aimed to determine to what extent the transcriptional signatures of TRM cells from multiple tissues reflect TGF-β imprinting. We activated murine CD8+ T cells, stimulated them in vitro with TGF-β, and profiled their transcriptomes using RNA-seq. Upon comparison, we identified a TGF-β-induced signature of differentially expressed genes between TGF-β-stimulated and -unstimulated cells. Next, we linked this in vitro TGF-β-induced signature to a previously identified in vivo TRM-specific gene set and found considerable (>50%) overlap between the two gene sets, thus showing that a substantial part of the TRM signature can be attributed to TGF-β signalling. Finally, gene set enrichment analysis further revealed that the altered gene signature following TGF-β exposure reflected transcriptional signatures found in TRM cells from both epithelial and non-epithelial tissues. In summary, these findings show that TGF-β has a broad footprint in establishing the residency-specific transcriptional profile of TRM cells, which is detectable in TRM cells from diverse tissues. They further suggest that constitutive TGF-β signalling might be involved in their long-term persistence at tissue sites.
Human genetic variation affects the gut microbiota through a complex combination of environmental and host factors. Here we characterize genetic variations associated with microbial abundances in a single large-scale population-based cohort of 5,959 genotyped individuals with matched gut microbial metagenomes, and dietary and health records (prevalent and follow-up). We identified 567 independent SNP-taxon associations. Variants at the LCT locus associated with Bifidobacterium and other taxa, but they differed according to dairy intake. Furthermore, levels of Faecalicatena lactaris associated with ABO, and suggested preferential utilization of secreted blood antigens as an energy source in the gut. Enterococcus faecalis levels associated with variants in the MED13L locus, which has been linked to colorectal cancer. Mendelian randomization analysis indicated a potential causal effect of Morganella on major depressive disorder, consistent with observational incident disease analysis. Overall, we identify and characterize the intricate nature of host-microbiota interactions and their association with disease.
Polygenic risk scores (PRSs) can stratify populations into cardiovascular disease (CVD) risk groups. We aimed to quantify the potential advantage of adding information on PRSs to conventional risk factors in the primary prevention of CVD.
Using data from UK Biobank on 306,654 individuals without a history of CVD and not on lipid-lowering treatments (mean age ± SD: 56.0 ± 8.0 years; females: 57%; median follow-up: 8.1 years), we calculated measures of risk discrimination and reclassification upon addition of PRSs to risk factors in a conventional risk prediction model (i.e., age, sex, systolic blood pressure, smoking status, history of diabetes, and total and high-density lipoprotein cholesterol). We then modelled the implications of initiating guideline-recommended statin therapy in a primary care setting using incidence rates from 2.1 million individuals from the Clinical Practice Research Datalink. The C-index, a measure of risk discrimination, was 0.710 (95% CI 0.703-0.717) for a CVD prediction model containing conventional risk predictors alone. Addition of information on PRSs increased the C-index by 0.012 (95% CI 0.009-0.015), and resulted in continuous net reclassification improvements of about 10% and 12% in cases and non-cases, respectively. If a PRS were assessed in the entire UK primary care population aged 40-75 years, assuming that statin therapy would be initiated in accordance with the UK National Institute for Health and Care Excellence guidelines (i.e., for persons with a predicted risk of ≥10% and for those with certain other risk factors, such as diabetes, irrespective of their 10-year predicted risk), then it could help prevent 1 additional CVD event for approximately every 5,750 individuals screened. By contrast, targeted assessment only among people at intermediate (i.e., 5% to <10%) 10-year CVD risk could help prevent 1 additional CVD event for approximately every 340 individuals screened. Such a targeted strategy could help prevent 7% more CVD events than conventional risk prediction alone.
Potential gains afforded by assessment of PRSs on top of conventional risk factors would be about 1.5-fold greater than those provided by assessment of C-reactive protein, a plasma biomarker included in some risk prediction guidelines. Potential limitations of this study include its restriction to European ancestry participants and a lack of health economic evaluation.
Our results suggest that addition of PRSs to conventional risk factors can modestly enhance prediction of first-onset CVD and could translate into population health benefits if used at scale.
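To make the C-index reported above more concrete, the sketch below implements a basic version of Harrell's concordance index for right-censored follow-up data and compares two risk scores on a tiny made-up cohort. The data, the function name and the two score vectors are hypothetical, and the toy improvement is deliberately exaggerated; it is not a reconstruction of the UK Biobank analysis.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk person fails first.

    A pair is comparable when the person with the shorter follow-up time
    had an observed event. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: follow-up in years, event flag (1 = CVD event,
# 0 = censored), and predicted risks from two models.
times = [2, 4, 5, 7, 9]
events = [1, 1, 0, 1, 0]
conventional = [0.9, 0.3, 0.5, 0.4, 0.1]
with_prs = [0.9, 0.6, 0.5, 0.4, 0.1]

c_conventional = harrell_c_index(times, events, conventional)
c_with_prs = harrell_c_index(times, events, with_prs)
print(c_conventional, c_with_prs, c_with_prs - c_conventional)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is why a change of about 0.01 in a large cohort can still be described as a modest but measurable gain in discrimination.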
Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy in approximately 121,000 UK Biobank participants has recently been released as a community resource, comprising absolute concentrations and ratios of 249 circulating metabolites, lipids, and lipoprotein sub-fractions. Here we identify and characterise additional sources of unwanted technical variation influencing individual biomarkers in the data available to download from UK Biobank. These included sample preparation time, shipping plate well, spectrometer batch effects, drift over time within spectrometer, and outlier shipping plates. We developed a procedure for removing this unwanted technical variation, and demonstrate that it increases signal for genetic and epidemiological studies of the NMR metabolic biomarker data in UK Biobank. We subsequently developed an R package, ukbnmr, which we make available to the wider research community to enhance the utility of the UK Biobank NMR metabolic biomarker data and to facilitate rapid analysis.
Pseudomonas aeruginosa is an opportunistic pathogen and an important cause of infection, particularly amongst cystic fibrosis (CF) patients. While specific strains capable of patient-to-patient transmission are known, many infections appear to be caused by unique and unrelated strains. There is a need to understand the relationship between strains capable of colonising the CF lung and the broader set of P. aeruginosa isolates found in natural environments. Here we report the results of a multilocus sequence typing (MLST)-based study designed to understand the genetic diversity and population structure of an extensive regional sample of P. aeruginosa isolates from South East Queensland, Australia. The analysis is based on 501 P. aeruginosa isolates obtained from environmental, animal and human (CF and non-CF) sources with particular emphasis on isolates from the Lower Brisbane River and isolates from CF patients obtained from the same geographical region. Overall, MLST identified 274 different sequence types, of which 53 were shared between two or more ecological settings. Our analysis revealed a limited association between genotype and environment and evidence of frequent recombination. We also found that genetic diversity of P. aeruginosa in Queensland, Australia was indistinguishable from that of the global P. aeruginosa population. Several CF strains were encountered frequently in multiple ecological settings; however, the most frequently encountered CF strains were confined to CF patients. Overall, our data confirm a non-clonal epidemic structure and indicate that most CF strains are a random sample of the broader P. aeruginosa population. The increased abundance of some CF strains in different geographical regions is a likely product of chance colonisation events followed by adaptation to the CF lung and horizontal transmission among patients.
Corrosion is a ubiquitous failure mode of materials. Often, the progression of localized corrosion is accompanied by the evolution of porosity in materials previously reported to be either three-dimensional or two-dimensional. However, using new tools and analysis techniques, we have realized that a more localized form of corrosion, which we call 1D wormhole corrosion, has previously been miscategorized in some situations. Using electron tomography, we show multiple examples of this 1D and percolating morphology. To understand the origin of this mechanism in a Ni-Cr alloy corroded by molten salt, we combined energy-filtered four-dimensional scanning transmission electron microscopy and ab initio density functional theory calculations to develop a vacancy mapping method with nanometer resolution, identifying a remarkably high vacancy concentration in the diffusion-induced grain boundary migration zone, up to 100 times the equilibrium value at the melting point. Deciphering the origins of 1D corrosion is an important step towards designing structural materials with enhanced corrosion resistance.