How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA sequencing data from ~6700 samples across 29 normal tissues revealed multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We found that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, which suggests that environmental factors can promote somatic mosaicism. Mutation burden was associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over both time and number of cell divisions. Finally, normal tissues were found to harbor mutations in known cancer genes and hotspots. This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as a foundation for associating clonal expansion with environmental factors, aging, and risk of disease.
Defining the transcriptional dynamics of a temporal process such as cell differentiation is challenging owing to the high variability in gene expression between individual cells. Time-series gene expression analyses of bulk cells have difficulty distinguishing early and late phases of a transcriptional cascade or identifying rare subpopulations of cells, and single-cell proteomic methods rely on a priori knowledge of key distinguishing markers. Here we describe Monocle, an unsupervised algorithm that increases the temporal resolution of transcriptome dynamics using single-cell RNA-Seq data collected at multiple time points. Applied to the differentiation of primary human myoblasts, Monocle revealed switch-like changes in expression of key regulatory factors, sequential waves of gene regulation, and expression of regulators that were not known to act in differentiation. We validated some of these predicted regulators in a loss-of-function screen. Monocle can in principle be used to recover single-cell gene expression kinetics from a wide array of cellular processes, including differentiation, proliferation and oncogenic transformation.
Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.
Glioblastoma (GBM) is a prototypical heterogeneous brain tumor refractory to conventional therapy. A small residual population of cells escapes surgery and chemoradiation, resulting in a typically fatal tumor recurrence ∼7 mo after diagnosis. Understanding the molecular architecture of this residual population is critical for the development of successful therapies. We used whole-genome sequencing and whole-exome sequencing of multiple sectors from primary and paired recurrent GBM tumors to reconstruct the genomic profile of residual, therapy-resistant tumor-initiating cells. We found that genetic alteration of the p53 pathway is a primary molecular event predictive of a high number of subclonal mutations in glioblastoma. The genomic road leading to recurrence is highly idiosyncratic but can be broadly classified into linear recurrences, which share extensive genetic similarity with the primary tumor and can be directly traced to one of its specific sectors, and divergent recurrences, which share few genetic alterations with the primary tumor and originate from cells that branched off early during tumorigenesis. Our study provides mechanistic insights into how genetic alterations in primary tumors impact the ensuing evolution of tumor cells and the emergence of subclonal heterogeneity.
Deep sequencing of targeted genomic regions is becoming a common tool for understanding the dynamics and complexity of Plasmodium infections, but its lower limit of detection is currently unknown. Here, a new amplicon analysis tool, the Parallel Amplicon Sequencing Error Correction (PASEC) pipeline, is used to evaluate the performance of amplicon sequencing on low-density Plasmodium DNA samples. Illumina-based sequencing of two Plasmodium falciparum genomic regions (CSP and SERA2) was performed on two types of samples: in vitro DNA mixtures mimicking low-density infections (1-200 genomes/μl) and extracted blood spots from a combination of symptomatic and asymptomatic individuals (44-653,080 parasites/μl). Three additional analysis tools (DADA2, HaplotypR, and SeekDeep) were applied to both datasets and the precision and sensitivity of each tool were evaluated.
Amplicon sequencing can contend with low-density samples, showing reasonable detection accuracy down to a concentration of 5 Plasmodium genomes/μl. Due to increased stochasticity and background noise, however, all four tools showed reduced sensitivity and precision on samples with very low parasitaemia (< 5 copies/μl) or low read count (< 100 reads per amplicon). PASEC could distinguish major from minor haplotypes with an accuracy of 90% in samples with at least 30 Plasmodium genomes/μl, but only 61% at low Plasmodium concentrations (< 5 genomes/μl) and 46% at very low read counts (< 25 reads per amplicon). The four tools were additionally used on a panel of extracted parasite-positive blood spots from natural malaria infections. While all four identified concordant patterns of complexity of infection (COI) across four sub-Saharan African countries, COI values obtained for individual samples differed in some cases.
Amplicon deep sequencing can be used to determine the complexity and diversity of low-density Plasmodium infections. Despite differences in their approach, four state-of-the-art tools resolved known haplotype mixtures with similar sensitivity and precision. Researchers can therefore choose from multiple robust approaches for analysing amplicon data; however, error filtration approaches should not be uniformly applied across samples of varying parasitaemia. Samples with very low parasitaemia and very low read count have higher false positive rates and call for read count thresholds that are higher than current default recommendations.
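The interplay between read count thresholds, precision, and sensitivity described above can be illustrated with a minimal Python sketch. It does not reproduce PASEC or any of the other three tools; the function names, haplotype labels, read counts, and threshold values are all hypothetical and chosen only to show how a stricter threshold trades sensitivity for precision on a known mixture.

```python
def filter_haplotypes(read_counts, min_reads):
    """Keep haplotypes supported by at least `min_reads` reads (hypothetical filter)."""
    return {hap for hap, n in read_counts.items() if n >= min_reads}


def precision_sensitivity(called, expected):
    """Compare called haplotypes against the known composition of a mock mixture.

    precision   = true positives / all called haplotypes
    sensitivity = true positives / all expected haplotypes
    """
    true_pos = len(called & expected)
    precision = true_pos / len(called) if called else 0.0
    sensitivity = true_pos / len(expected) if expected else 0.0
    return precision, sensitivity


# Invented example: three true haplotypes plus one erroneous read cluster (hapX).
expected = {"hapA", "hapB", "hapC"}
read_counts = {"hapA": 120, "hapB": 35, "hapC": 4, "hapX": 3}

lenient = filter_haplotypes(read_counts, min_reads=2)   # keeps the false positive hapX
strict = filter_haplotypes(read_counts, min_reads=10)   # drops hapX but also the true hapC

lenient_scores = precision_sensitivity(lenient, expected)
strict_scores = precision_sensitivity(strict, expected)
```

In this toy case the lenient threshold recovers every true haplotype at the cost of one false positive, while the strict threshold removes the false positive but loses a genuine low-abundance haplotype, which is the kind of trade-off that motivates sample-specific thresholds.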
Although fundamental to the study of invasion mechanisms, the relationship between mode of reproduction and plant invasion is not well understood. Fallopia japonica (Japanese knotweed), a highly aggressive invasive plant in both Europe and North America, serves as a model species for examining this relationship. In Britain, F. japonica var. japonica is a single female clone reproducing solely through vegetative growth or obligate hybridization with other Fallopia spp. In the U.S., however, there is more evidence for sexual reproduction. Here, simple sequence repeat (SSR) markers were developed, and three Massachusetts populations were sampled at regular intervals. The amount of sexual and clonal reproduction in each population was determined based on within-population genetic diversity. Clonal growth was apparent, but the populations together contained 26 genotypes and had evidence of sexual reproduction. One genotype that was present in all populations matched the single aggressive British clone of F. japonica var. japonica. Also, a potentially diagnostic marker for the F. sachalinensis genome provided evidence of inter- and intraspecific sexual reproduction and introgression. These differences observed in U.S. populations compared to European populations have significant implications for management of Fallopia spp. in the U.S. and underscore the importance of regional studies of invasive species.
Hemoglobin A1c (HbA1c) levels diagnose diabetes, predict mortality and are associated with ten single nucleotide polymorphisms (SNPs) in white individuals. Genetic associations in other race groups are not known. We tested the hypotheses that there is race-ethnic variation in 1) HbA1c-associated risk allele frequencies (RAFs) for SNPs near SPTA1, HFE, ANK1, HK1, ATP11A, FN3K, TMPRSS6, G6PC2, GCK, MTNR1B; 2) association of SNPs with HbA1c and 3) association of SNPs with mortality.
We studied 3,041 non-diabetic individuals in the NHANES (National Health and Nutrition Examination Survey) III. We stratified the analysis by race/ethnicity (NHW: non-Hispanic white; NHB: non-Hispanic black; MA: Mexican American) to calculate RAF, calculated a genotype score by adding risk SNPs, and tested associations with SNPs and the genotype score using an additive genetic model, with type 1 error = 0.05.
RAFs varied widely and at six loci race-ethnic differences in RAF were significant (p < 0.0002), with NHB usually the most divergent. For instance, at ATP11A, the SNP RAF was 54% in NHB, 18% in MA and 14% in NHW (p < .0001). The mean genotype score differed by race-ethnicity (NHW: 10.4, NHB: 11.0, MA: 10.7, p < .0001), and was associated with increase in HbA1c in NHW (β = 0.012 HbA1c increase per risk allele, p = 0.04) and MA (β = 0.021, p = 0.005) but not NHB (β = 0.007, p = 0.39). The genotype score was not associated with mortality in any group (NHW: OR (per risk allele increase in mortality) = 1.07, p = 0.09; NHB: OR = 1.04, p = 0.39; MA: OR = 1.03, p = 0.71).
At many HbA1c loci in NHANES III there is substantial RAF race-ethnic heterogeneity. The combined impact of common HbA1c-associated variants on HbA1c levels varied by race-ethnicity, but did not influence mortality.
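As a rough illustration of the two quantities used above, the following Python sketch computes a risk allele frequency (RAF) per race/ethnicity group and an unweighted genotype score per individual. It assumes each genotype is coded as the number of risk alleles carried (0, 1, or 2) and that the score is the simple sum of those counts across SNPs; this coding, the function names, and the toy data are assumptions for illustration and are not taken from the NHANES III analysis itself.

```python
from collections import defaultdict


def risk_allele_frequency(allele_counts):
    """RAF at one SNP: risk alleles observed / total alleles (2 per individual)."""
    return sum(allele_counts) / (2 * len(allele_counts))


def genotype_score(alleles):
    """Unweighted score: total risk alleles an individual carries across all SNPs."""
    return sum(alleles.values())


def raf_by_group(individuals, snp):
    """Stratify individuals by group label and compute the RAF at `snp` in each group."""
    grouped = defaultdict(list)
    for person in individuals:
        grouped[person["group"]].append(person["alleles"][snp])
    return {group: risk_allele_frequency(counts) for group, counts in grouped.items()}


# Invented toy data: two SNPs, four individuals, risk-allele counts of 0, 1, or 2.
individuals = [
    {"group": "NHW", "alleles": {"HK1": 0, "ATP11A": 1}},
    {"group": "NHW", "alleles": {"HK1": 1, "ATP11A": 0}},
    {"group": "NHB", "alleles": {"HK1": 2, "ATP11A": 1}},
    {"group": "NHB", "alleles": {"HK1": 1, "ATP11A": 2}},
]

atp11a_raf = raf_by_group(individuals, "ATP11A")
scores = [genotype_score(person["alleles"]) for person in individuals]
```

Under this coding, ten SNPs give a score between 0 and 20, which is consistent with the reported group means of roughly 10 to 11. An additive association test would then regress HbA1c on this score within each group.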
In many mammals, including humans, removal of one lung (pneumonectomy) results in the compensatory growth of the remaining lung. Compensatory growth involves not only an increase in lung size, but also an increase in the number of alveoli in the peripheral lung; however, the process of compensatory neoalveolarization remains poorly understood. Here, we show that the expression of α-smooth muscle actin (SMA), a cytoplasmic protein characteristic of myofibroblasts, is induced in the pleura following pneumonectomy. SMA induction appears to be dependent on pleural deformation (stretch) as induction is prevented by plombage or phrenic nerve transection (P < 0.001). Within 3 days of pneumonectomy, the frequency of SMA+ cells in subpleural alveolar ducts was significantly increased (P < 0.01). To determine the functional activity of these SMA+ cells, we isolated regenerating alveolar ducts by laser microdissection and analyzed individual cells using microfluidic single-cell quantitative PCR. Single cells expressing the SMA (Acta2) gene demonstrated significantly greater transcriptional activity than endothelial cells or other discrete cell populations in the alveolar duct (P < 0.05). The transcriptional activity of the Acta2+ cells, including expression of TGF signaling as well as repair-related genes, suggests that these myofibroblast-like cells contribute to compensatory lung growth.
Fallopia japonica (Japanese knotweed, Polygonaceae) is a well-known East Asian perennial that is established throughout the U.S. and Europe. Another congener, F. sachalinensis, and their hybrid, F. ×bohemica, also persist on both continents. Their invasive success is primarily attributed to their ability to spread via clonal growth. However, mounting evidence suggests invasion history and dynamics differ between continents and that sexual reproduction is more common than previously assumed. We used published morphological traits designed to distinguish the three taxa to characterize their distribution in 24 New England towns. We found continuous variation of all five traits, with 84% of our 81 individuals having at least one trait outside parental limits. Hierarchical cluster analysis, along with two chloroplast and one nuclear species-specific markers, suggests the presence of intercrossing, segregating hybrids, and likely introgression between F1 hybrids and F. japonica. Our markers also show the first evidence of bidirectional hybridization between parental taxa in the U.S., emphasizing the complex structure of populations in our region. This study is a first step toward unraveling the evolutionary forces that have made these taxa such aggressive invaders in the U.S. The data may also affect management strategies originally designed for largely monomorphic, clonal populations.
We isolated and characterized the entire coding sequence of a human gene encoding a protein that interacts with RPGR, a protein that is absent or mutant in many cases of X-linked retinitis pigmentosa. The newly identified gene, called “RPGRIP1” for RPGR-interacting protein (MIM 605446), is located within 14q11, and it encodes a protein predicted to contain 1,259 amino acids. Previously published work showed that both proteins, RPGR and RPGRIP1, are present in the ciliary structure that connects the inner and outer segments of rod and cone photoreceptors. We surveyed 57 unrelated patients who had Leber congenital amaurosis for mutations in RPGRIP1 and found recessive mutations involving both RPGRIP1 alleles in 3 (6%) patients. The mutations all create premature termination codons and are likely to be null alleles. Patients with RPGRIP1 mutations have a degeneration of both rod and cone photoreceptors, and, early in life, they experience a severe loss of central acuity, which leads to nystagmus.