Individual instances of cancer are primarily a result of a combination of a small number of genetic mutations (hits). Knowing the number of such mutations is a prerequisite for identifying specific combinations of carcinogenic mutations and understanding the etiology of cancer. We present a mathematical model for estimating the number of hits based on the distribution of somatic mutations. The model is fundamentally different from previous approaches, which are based on cancer incidence by age. Our somatic-mutation-based model is likely to be more robust than age-based models since it does not require knowing or accounting for the highly variable mutation rate, which can vary by over three orders of magnitude. In fact, we find that the number of somatic mutations at diagnosis is only weakly correlated with age at cancer diagnosis, most likely due to the extreme variability in mutation rates between individuals. Comparing the distribution of somatic mutations predicted by our model to the actual distribution from 6904 tumor samples, we estimate the number of hits required for carcinogenesis for 17 cancer types. We find that different cancer types exhibit distinct somatic mutational profiles corresponding to different numbers of hits. Why might different cancer types require different numbers of hits for carcinogenesis? The answer may provide insight into the unique etiology of different cancer types.
Space-filling curves have been used for decades to study the folding principles of globular proteins, compact polymers, and chromatin. Formally, space-filling curves trace a single circuit through a set of points (x,y,z); informally, they correspond to a polymer melt. Although not quite a melt, human chromatin has folding principles that are likened to the Hilbert curve, a type of space-filling curve. Hilbert-like curves in general make biologically compelling models of chromatin; in particular, they lack knots, which facilitates chromatin folding, unfolding, and easy access to genes. Knot complexity has been intensely studied with the aid of Alexander polynomials; however, the approach does not generalize well to cases of more than one chromosome. Crossing complexity is an understudied alternative better suited for quantifying entanglement between chromosomes. Do Hilbert-like configurations limit crossing complexity between chromosomes? How does crossing complexity for Hilbert-like configurations compare to equilibrium configurations? To address these questions, we extend the Mansfield algorithm to enable sampling of Hilbert-like space-filling curves on a simple cubic lattice. We use the extended algorithm to generate equilibrium, intermediate, and Hilbert-like configurational ensembles and compute crossing complexity between curves (chromosomes) in each configurational snapshot. Our main results are twofold: (a) Hilbert-like configurations limit entanglement between chromosomes and (b) Hilbert-like configurations do not limit entanglement in a model of S-phase DNA. Our second result is particularly surprising yet easily rationalized with a geometric argument. We explore ergodicity of the extended algorithm and discuss our results in the context of more sophisticated models of chromatin.
Human lung epithelial cells are likely among the first targets to encounter invading severe acute respiratory syndrome-associated coronavirus (SARS-CoV). Not only can these cells support the growth of SARS-CoV infection, but they are also capable of secreting inflammatory cytokines to initiate and, eventually, aggravate host innate inflammatory responses, causing detrimental immune-mediated pathology within the lungs. Thus, a comprehensive evaluation of the complex epithelial signaling to SARS-CoV is crucial for paving the way to better understand SARS pathogenesis. Using microarray-based functional genomics, we report here the global gene response of 2B4 cells, a cloned bronchial epithelial cell line derived from Calu-3 cells. Specifically, we found a temporal and spatial activation of nuclear factor (NF)-kappaB, activator protein (AP)-1, and interferon regulatory factor (IRF)-3/7 in infected 2B4 cells at 12, 24, and 48 hrs post infection (p.i.), resulting in the activation of many antiviral genes, including interferon (IFN)-beta, -lambdas, inflammatory mediators, and many IFN-stimulated genes (ISGs). We also showed, for the first time, that IFN-beta and IFN-lambdas were capable of exerting previously unrecognized, non-redundant, and complementary abilities to limit SARS-CoV replication, even though their expression could not be detected in infected 2B4 bronchial epithelial cells until 48 hrs p.i. Collectively, our results highlight the mechanics of the sequential events of the antiviral signaling pathway(s) triggered by SARS-CoV in bronchial epithelial cells and identify novel cellular targets for future studies aimed at advancing strategies against SARS.
Aortic valve calcification is the most common form of valvular heart disease, but the mechanisms of calcific aortic valve disease (CAVD) are unknown. NOTCH1 mutations are associated with aortic valve malformations and adult-onset calcification in families with inherited disease. The Notch signaling pathway is critical for multiple cell differentiation processes, but its role in the development of CAVD is not well understood. The aim of this study was to investigate the molecular changes that occur with inhibition of Notch signaling in the aortic valve. Notch signaling pathway members are expressed in adult aortic valve cusps, and examination of diseased human aortic valves revealed decreased expression of NOTCH1 in areas of calcium deposition. To identify downstream mediators of Notch1, we examined gene expression changes that occur with chemical inhibition of Notch signaling in rat aortic valve interstitial cells (AVICs). We found significant downregulation of Sox9 along with several cartilage-specific genes that were direct targets of the transcription factor Sox9. Loss of Sox9 expression has been reported to be associated with aortic valve calcification. In an in vitro porcine aortic valve calcification model system, inhibition of Notch activity resulted in accelerated calcification, while stimulation of Notch signaling attenuated the calcific process. Finally, the addition of Sox9 was able to prevent the calcification of porcine AVICs that occurs with Notch inhibition. In conclusion, loss of Notch signaling contributes to aortic valve calcification via a Sox9-dependent mechanism.
Mutations in cis-regulatory sequences have been implicated as the predominant source of variation in morphological evolution. We offer a hypothesis that gene-associated tandem repeat expansions and contractions are a major source of phenotypic variation in evolution. Here, we describe a comparative genomic study of repetitive elements in developmental genes of 92 breeds of dogs. We find evidence for selection for divergence at coding repeat loci in the form of both elevated purity and extensive length polymorphism among different breeds. Variations in the number of repeats in the coding regions of the Alx-4 (aristaless-like 4) and Runx-2 (runt-related transcription factor 2) genes were quantitatively associated with significant differences in limb and skull morphology. We identified similar repeat length variation in the coding repeats of Runx-2, Twist, and Dlx-2 in several other species. The high frequency and incremental effects of repeat length mutations provide molecular explanations for swift, yet topologically conservative morphological evolution.
Abiotic environmental factors play a fundamental role in determining the distribution, abundance and adaptive diversification of species. Empowered by new technologies enabling rapid and increasingly accurate examination of genomic variation in populations, researchers may gain new insights into the genomic background of adaptive radiation and stress resistance. We investigated genomic variation across generations of large-scale experimental selection regimes originating from a single founder population of Drosophila melanogaster, diverging in response to ecologically relevant environmental stressors: heat shock, heat knock down, cold shock, desiccation and starvation. When compared to the founder population, and to parallel unselected controls, there were more than 100,000 single nucleotide polymorphisms (SNPs) displaying consistent allelic changes in response to selective pressures across generations. These SNPs were found in both coding and noncoding sequences, with the highest density in promoter regions, and involved a broad range of functionalities, including molecular chaperoning by heat-shock proteins. The SNP patterns were highly stressor-specific despite considerable variation among line replicates within each selection regime, as reflected by a principal component analysis, and co-occurred with selective sweep regions. Only ~15% of SNPs with putatively adaptive changes were shared by at least two selective regimes, while less than 1% of SNPs diverged in opposite directions. Divergent stressors driving evolution in the experimental system of adaptive radiation left distinct genomic signatures, most pronounced in starvation and heat-shock selection regimes.
Facioscapulohumeral muscular dystrophy (FSHD) is caused by an unusual deletion with neomorphic activity. This deletion derepresses genes in cis; however, which candidate gene causes the FSHD phenotype, and through what mechanism, is unknown. We describe a novel genetic tool, inducible cassette exchange, enabling rapid generation of isogenetically modified cells with conditional and variable transgene expression. We compare the effects of expressing variable levels of each FSHD candidate gene on myoblasts. This screen identified only one gene with overt toxicity: DUX4 (double homeobox, chromosome 4), a protein with two homeodomains, each similar in sequence to Pax3 and Pax7. DUX4 expression recapitulates key features of the FSHD molecular phenotype, including repression of MyoD and its target genes, diminished myogenic differentiation, repression of glutathione redox pathway components, and sensitivity to oxidative stress. We further demonstrate competition between DUX4 and Pax3/Pax7: when either Pax3 or Pax7 is expressed at high levels, DUX4 is no longer toxic. We propose a hypothesis for FSHD in which DUX4 expression interferes with Pax7 in satellite cells, and inappropriately regulates Pax targets, including myogenic regulatory factors, during regeneration.
Microsatellites, a type of short tandem repeat (STR), have been used for decades as putatively neutral markers to study the genetic structure of diverse human populations. However, recent studies have demonstrated that some microsatellites contribute to gene expression, cis heritability, and phenotype. As a corollary, some microsatellites may contribute to differential gene expression and RNA/protein structure stability in distinct human populations. To test this hypothesis, we investigate genotype frequencies, functional relevance, and adaptive potential of microsatellites in five super-populations (ethnicities) drawn from the 1000 Genomes Project. We discover 3,984 ethnically-biased microsatellite loci (EBML); for each EBML at least one ethnicity has genotype frequencies statistically different from the remaining four. South Asian, East Asian, European, and American EBML show significant overlap; in contrast, the set of African EBML is mostly unique. We cross-reference the 3,984 EBML with 2,060 previously identified expression STRs (eSTRs); repeats known to affect gene expression (64 total) are over-represented. The most significant pathway enrichments are those associated with the matrisome: a broad collection of genes encoding the extracellular matrix and its associated proteins. At least 14 of the EBML have established links to human disease. Analysis of the 3,984 EBML with respect to known selective sweep regions in the genome shows that allelic variation in some of them is likely associated with adaptive evolution.
The human genome harbors an abundance of repetitive DNA; however, its function continues to be debated. Microsatellites, a class of short tandem repeat, are established as an important source of genetic variation. Array length variants are common among microsatellites and affect gene expression, but efforts to understand the role and diversity of microsatellite variation have been hampered by several challenges. Without adequate depth, both long-read and short-read sequencing may not detect the variants present in a sample; additionally, large sample sizes are needed to reveal the degree of population-level polymorphism. To address these challenges we present the Comparative Analysis of Germline Microsatellites (CAGm): a database of germline microsatellites from 2529 individuals in the 1000 Genomes Project. A key novelty of CAGm is the ability to aggregate microsatellite variation by population, ethnicity (super population) and gender. The database provides advanced searching for microsatellites embedded in genes and functional elements. All data can be downloaded as Microsoft Excel spreadsheets. Two use-case scenarios are presented to demonstrate its utility: a mononucleotide (A) microsatellite at the BAT-26 locus and a dinucleotide (CA) microsatellite in the coding region of FGFRL1. CAGm is freely available at http://www.cagmdb.org/.
Susceptibility to autoimmunity in B6.Sle1b mice is associated with extensive polymorphisms between two divergent haplotypes of the SLAM/CD2 family of genes. The B6.Sle1b-derived SLAM/CD2 family haplotype is found in many other laboratory mouse strains but only causes autoimmunity in the context of the C57Bl/6 (B6) genome. Phenotypic analyses have revealed variations in the structure and expression of several members of the SLAM/CD2 family in T and B lymphocytes from B6.Sle1b mice. T lymphocytes from B6.Sle1b mice have modified signaling responses to stimulation at 4–6 weeks of age. While autoimmunity may be mediated by a combination of genes in the SLAM/CD2 family cluster, the strongest candidate is Ly108, a specific isoform of which is constitutively upregulated in B6.Sle1b lymphocytes.