Bisulfite sequencing is a powerful tool for profiling genomic methylation, an epigenetic modification critical in the understanding of cancer, psychiatric disorders, and many other conditions. Raw data generated by whole genome bisulfite sequencing (WGBS) requires several computational steps before it is ready for statistical analysis, and particular care is required to process data in a timely and memory-efficient manner. Alignment to a reference genome is one of the most computationally demanding steps in a WGBS workflow, taking several hours or even days with commonly used WGBS-specific alignment software. This naturally motivates the creation of computational workflows that can utilize GPU-based alignment software to greatly speed up the bottleneck step. In addition, WGBS produces raw data that is large and often unwieldy; a lack of memory-efficient representation of data by existing pipelines renders WGBS impractical or impossible for many researchers. We present BiocMAP, a Bioconductor-friendly methylation analysis pipeline consisting of two modules, to address the above concerns. The first module performs computationally intensive read alignment using Arioc, a GPU-accelerated short-read aligner. Since GPUs are not always available on the same computing environments where traditional CPU-based analyses are convenient, the second module may be run in a GPU-free environment. This module extracts and merges DNA methylation proportions, the fractions of methylated cytosines across all cells in a sample at a given genomic site. Bioconductor-based output objects in R utilize an on-disk data representation to drastically reduce required main memory and make WGBS projects computationally feasible for more researchers. BiocMAP is implemented using Nextflow and available at http://research.libd.org/BiocMAP/.
To enable reproducible analysis across a variety of typical computing environments, BiocMAP can be containerized with Docker or Singularity, and executed locally or with the SLURM or SGE scheduling engines. By providing Bioconductor objects, BiocMAP's output can be integrated with powerful analytical open source software for analyzing methylation data.
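The methylation proportion described in this abstract can be illustrated with a small sketch. It assumes that, at each cytosine, the proportion is estimated from read counts as methylated reads divided by total coverage, and that merging means collecting per-sample proportions into one table keyed by genomic site. The function names, the input layout, and the toy counts are hypothetical and are not taken from BiocMAP itself.

```python
import math


def methylation_proportion(methylated_reads, unmethylated_reads):
    """Estimate the methylation proportion at one cytosine from read counts."""
    coverage = methylated_reads + unmethylated_reads
    if coverage == 0:
        # No reads cover this site, so the proportion is undefined.
        return math.nan
    return methylated_reads / coverage


def merge_samples(samples):
    """Merge per-sample counts into {site: {sample: proportion}}.

    `samples` maps a sample name to {(chrom, pos): (methylated, unmethylated)}.
    Sites missing from a sample are simply absent from that site's inner dict.
    """
    merged = {}
    for sample, sites in samples.items():
        for site, (meth, unmeth) in sites.items():
            merged.setdefault(site, {})[sample] = methylation_proportion(meth, unmeth)
    return merged


# Made-up counts for two samples (illustration only).
toy_counts = {
    "sampleA": {("chr1", 100): (8, 2), ("chr1", 250): (0, 5)},
    "sampleB": {("chr1", 100): (3, 9)},
}
merged = merge_samples(toy_counts)
for site in sorted(merged):
    print(site, merged[site])
```

An in-memory dictionary like this does not scale to whole-genome data, which is the motivation the abstract gives for on-disk representations in the pipeline's output objects.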
DNA methylation (DNAm) is an epigenetic regulator of gene expression and a hallmark of gene-environment interaction. Using whole-genome bisulfite sequencing, we have surveyed DNAm in 344 samples of human postmortem brain tissue from neurotypical subjects and individuals with schizophrenia. We identify genetic influence on local methylation levels throughout the genome, both at CpG sites and CpH sites, with 86% of SNPs and 55% of CpGs being part of methylation quantitative trait loci (meQTLs). These associations can further be clustered into regions that are differentially methylated by a given SNP, highlighting the genes and regions with which these loci are epigenetically associated. These findings can be used to better characterize schizophrenia GWAS-identified variants as epigenetic risk variants. Regions differentially methylated by schizophrenia risk-SNPs explain much of the heritability associated with risk loci, despite covering only a fraction of the genomic space. We provide a comprehensive, single base resolution view of association between genetic variation and genomic methylation, and implicate schizophrenia GWAS-associated variants as influencing the epigenetic plasticity of the brain.
RNA sequencing (RNA-seq) is a common and widespread biological assay, and an increasing amount of data is generated with it. In practice, there are a large number of individual steps a researcher must perform before raw RNA-seq reads yield directly valuable information, such as differential gene expression data. Existing software tools are typically specialized, performing only one step of a larger workflow, such as alignment of reads to a reference genome. The demand for a more comprehensive and reproducible workflow has led to the production of a number of publicly available RNA-seq pipelines. However, we have found that most require computational expertise to set up or share among several users, are not actively maintained, or lack features we have found to be important in our own analyses.
In response to these concerns, we have developed a Scalable Pipeline for Expression Analysis and Quantification (SPEAQeasy), which is easy to install and share, and provides a bridge towards R/Bioconductor downstream analysis solutions. SPEAQeasy is portable across computational frameworks (SGE, SLURM, local, Docker integration) and different configuration files are provided (http://research.libd.org/SPEAQeasy/).
SPEAQeasy is user-friendly and lowers the computational-domain entry barrier for biologists and clinicians to RNA-seq data processing as the main input file is a table with sample names and their corresponding FASTQ files. The goal is to provide a flexible pipeline that is immediately usable by researchers, regardless of their technical background or computing environment.
Neurons derived from human induced pluripotent stem cells (hiPSCs) have been used to model basic cellular aspects of neuropsychiatric disorders, but the relationship between the emergent phenotypes and the clinical characteristics of donor individuals has been unclear. We analyzed RNA expression and indices of cellular function in hiPSC-derived neural progenitors and cortical neurons generated from 13 individuals with high polygenic risk scores (PRSs) for schizophrenia (SCZ) and a clinical diagnosis of SCZ, along with 15 neurotypical individuals with low PRS. We identified electrophysiological measures in the patient-derived neurons that implicated altered Na⁺ channel function, action potential interspike interval, and gamma-aminobutyric acid-ergic neurotransmission. Importantly, electrophysiological measures predicted cardinal clinical and cognitive features found in these SCZ patients. The identification of basic neuronal physiological properties related to core clinical characteristics of illness is a potentially critical step in generating leads for novel therapeutics.
DNA methylation (DNAm) is a key epigenetic regulator of gene expression across development. The developing prenatal brain is a highly dynamic tissue, but our understanding of key drivers of epigenetic variability across development is limited. We, therefore, assessed genomic methylation at over 39 million sites in the prenatal cortex using whole-genome bisulfite sequencing and found loci and regions in which methylation levels are dynamic across development. We saw that DNAm at these loci was associated with nearby gene expression and enriched for enhancer chromatin states in prenatal brain tissue. Additionally, these loci were enriched for genes associated with neuropsychiatric disorders and genes involved with neurogenesis. We also found autosomal differences in DNAm between the sexes during prenatal development, though these have less clear functional consequences. We lastly confirmed that the dynamic methylation at this critical period is specifically CpG methylation, with generally low levels of CpH methylation. Our findings provide detailed insight into prenatal brain development as well as clues to the pathogenesis of psychiatric traits seen later in life.
Antipsychotic drugs are the current first line of treatment for schizophrenia and other psychotic conditions. However, their molecular effects on the human brain are poorly studied, due to difficulty of tissue access and confounders associated with disease status. Here we examine differences in gene expression and DNA methylation associated with positive antipsychotic drug toxicology status in the human caudate nucleus. We find no genome-wide significant differences in DNA methylation, but abundant differences in gene expression. These gene expression differences are overall quite similar to gene expression differences between schizophrenia cases and controls. Interestingly, gene expression differences based on antipsychotic toxicology are different between brain regions, potentially due to affected cell type differences. We finally assess similarities with effects in a mouse model and find some overlapping effects but many differences as well. As a first look at the molecular effects of antipsychotics in the human brain, the lack of epigenetic effects is unexpected, possibly because long term treatment effects may be relatively stable for extended periods.
High-resolution and multiplexed imaging techniques are giving us an increasingly detailed observation of a biological system. However, sharing, exploring, and customizing the visualization of large multidimensional images can be a challenge. Here, we introduce Samui, a performant and interactive image visualization tool that runs completely in the web browser. Samui is specifically designed for fast image visualization and annotation and enables users to browse through large images and their selected features within seconds of receiving a link. We demonstrate the broad utility of Samui with images generated with two platforms: Vizgen MERFISH and 10x Genomics Visium Spatial Gene Expression. Samui along with example datasets is available at https://samuibrowser.com.
Summary
Background Oesophageal adenocarcinoma represents one of the fastest rising cancers in high-income countries. Barrett's oesophagus is the premalignant precursor of oesophageal adenocarcinoma. However, only a few patients with Barrett's oesophagus develop adenocarcinoma, which complicates clinical management in the absence of valid predictors. Within an international consortium investigating the genetics of Barrett's oesophagus and oesophageal adenocarcinoma, we aimed to identify novel genetic risk variants for the development of Barrett's oesophagus and oesophageal adenocarcinoma.
Methods We did a meta-analysis of all genome-wide association studies of Barrett's oesophagus and oesophageal adenocarcinoma available in PubMed up to Feb 29, 2016; all patients were of European ancestry and disease was confirmed histopathologically. All participants were from four separate studies within Europe, North America, and Australia and were genotyped on high-density single nucleotide polymorphism (SNP) arrays. Meta-analysis was done with a fixed-effects inverse variance-weighting approach and with a standard genome-wide significance threshold (p<5 × 10⁻⁸). We also did an association analysis after reweighting of loci with an approach that investigates annotation enrichment among genome-wide significant loci. Furthermore, the entire dataset was analysed with bioinformatics approaches, including functional annotation databases and gene-based and pathway-based methods, to identify pathophysiologically relevant cellular mechanisms.
Findings Our sample comprised 6167 patients with Barrett's oesophagus and 4112 individuals with oesophageal adenocarcinoma, in addition to 17 159 representative controls from four genome-wide association studies in Europe, North America, and Australia. We identified eight new risk loci associated with either Barrett's oesophagus or oesophageal adenocarcinoma, within or near the genes CFTR (rs17451754; p=4·8 × 10⁻¹⁰), MSRA (rs17749155; p=5·2 × 10⁻¹⁰), LINC00208 and BLK (rs10108511; p=2·1 × 10⁻⁹), KHDRBS2 (rs62423175; p=3·0 × 10⁻⁹), TPPP and CEP72 (rs9918259; p=3·2 × 10⁻⁹), TMOD1 (rs7852462; p=1·5 × 10⁻⁸), SATB2 (rs139606545; p=2·0 × 10⁻⁸), and HTR3C and ABCC5 (rs9823696; p=1·6 × 10⁻⁸). The locus identified near HTR3C and ABCC5 (rs9823696) was associated specifically with oesophageal adenocarcinoma (p=1·6 × 10⁻⁸) and was independent of Barrett's oesophagus development (p=0·45). A ninth novel risk locus was identified within the gene LPA (rs12207195; posterior probability 0·925) after reweighting with significantly enriched annotations. The strongest disease pathways identified (p<10⁻⁶) belonged to muscle cell differentiation and to mesenchyme development and differentiation.
Interpretation Our meta-analysis of genome-wide association studies doubled the number of known risk loci for Barrett's oesophagus and oesophageal adenocarcinoma and revealed new insights into causes of these diseases. Furthermore, the specific association between oesophageal adenocarcinoma and the locus near HTR3C and ABCC5 might constitute a novel genetic marker for prediction of the transition from Barrett's oesophagus to oesophageal adenocarcinoma. Fine-mapping and functional studies of new risk loci could lead to identification of key molecules in the development of Barrett's oesophagus and oesophageal adenocarcinoma, which might encourage development of advanced prevention and intervention strategies.
Funding US National Cancer Institute, US National Institutes of Health, National Health and Medical Research Council of Australia, Swedish Cancer Society, Medical Research Council UK, Cambridge NIHR Biomedical Research Centre, Cambridge Experimental Cancer Medicine Centre, Else Kröner Fresenius Stiftung, Wellcome Trust, Cancer Research UK, AstraZeneca UK, University Hospitals of Leicester, University of Oxford, Australian Research Council.
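The fixed-effects inverse variance-weighting approach named in the Methods of this abstract can be sketched for a single variant as follows. This is a generic textbook version and not the consortium's actual analysis code: each study contributes an effect estimate (for example a log odds ratio) weighted by the inverse of its squared standard error, and the pooled estimate is tested against the genome-wide threshold of 5 × 10⁻⁸ quoted above. The function name and the four study-level values are invented for illustration.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # threshold quoted in the abstract


def fixed_effects_meta(betas, standard_errors):
    """Pool per-study effects with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical log odds ratios and standard errors from four studies.
example_betas = [0.15, 0.12, 0.18, 0.14]
example_ses = [0.04, 0.05, 0.06, 0.03]

beta, se, z, p = fixed_effects_meta(example_betas, example_ses)
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
print("genome-wide significant:", p < GENOME_WIDE_ALPHA)
```

Because the weights are inverse variances, larger and more precise studies dominate the pooled estimate, and the pooled standard error is never larger than that of the most precise single study.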