Human genes governing innate immunity provide a valuable tool for the study of the selective pressure imposed by microorganisms on host genomes. A comprehensive, genome-wide study of how selective constraints and adaptations have driven the evolution of innate immunity genes is missing. Using full-genome sequence variation from the 1000 Genomes Project, we first show that innate immunity genes have globally evolved under stronger purifying selection than the remainder of protein-coding genes. We identify a gene set under the strongest selective constraints, mutations in which are likely to predispose individuals to life-threatening disease, as illustrated by STAT1 and TRAF3. We then evaluate the occurrence of local adaptation and detect 57 high-scoring signals of positive selection at innate immunity genes, variation in which has been associated with susceptibility to common infectious or autoimmune diseases. Furthermore, we show that most adaptations targeting coding variation have occurred in the last 6,000–13,000 years, the period during which populations shifted from hunting and gathering to farming. Finally, we show that innate immunity genes present higher Neandertal introgression than the remainder of the coding genome. Notably, among the genes presenting the highest Neandertal ancestry, we find the TLR6-TLR1-TLR10 cluster, which also contains functional adaptive variation in Europeans. This study identifies highly constrained genes that fulfill essential, non-redundant functions in host survival and reveals others that are more permissive to change (containing variation acquired from archaic hominins or adaptive variants in specific populations), improving our understanding of the relative biological importance of innate immunity pathways in natural conditions.
Genome-wide association studies (GWASes) have identified many noncoding germline single-nucleotide polymorphisms (SNPs) that are associated with an increased risk of developing cancer. However, how these SNPs affect cancer risk is still largely unknown.
We used a systems biology approach to analyse the regulatory role of cancer-risk SNPs in thirteen tissues. By using data from the Genotype-Tissue Expression (GTEx) project, we performed an expression quantitative trait locus (eQTL) analysis. We represented both significant cis- and trans-eQTLs as edges in tissue-specific eQTL bipartite networks.
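The network representation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the SNP and gene identifiers, the FDR values, and the 0.05 cutoff are all hypothetical assumptions chosen for the example. Each significant SNP-gene association becomes one edge between a SNP node and a gene node.

```python
from collections import defaultdict

# Hypothetical eQTL results: (SNP id, gene, FDR-adjusted p-value).
# Identifiers and values are invented for illustration only.
eqtls = [
    ("rs1", "GENE_A", 0.01),
    ("rs1", "GENE_B", 0.03),
    ("rs2", "GENE_B", 0.20),
    ("rs3", "GENE_C", 0.04),
]


def build_bipartite(associations, fdr_cutoff=0.05):
    """Return SNP->genes and gene->SNPs adjacency maps for significant eQTLs.

    The cutoff is an assumed threshold, not one taken from the study.
    """
    snp_to_genes = defaultdict(set)
    gene_to_snps = defaultdict(set)
    for snp, gene, fdr in associations:
        if fdr < fdr_cutoff:  # keep only significant associations as edges
            snp_to_genes[snp].add(gene)
            gene_to_snps[gene].add(snp)
    return dict(snp_to_genes), dict(gene_to_snps)


snp_to_genes, gene_to_snps = build_bipartite(eqtls)
```

Because edges only ever connect a SNP to a gene, the two adjacency maps together describe a bipartite graph; a SNP with many neighbours in `snp_to_genes` is one that is associated with the expression of many genes.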
Each tissue-specific eQTL network is organised into communities that group sets of SNPs and functionally related genes. When mapping cancer-risk SNPs to these networks, we find that in each tissue, these SNPs are significantly overrepresented in communities enriched for immune response processes, as well as tissue-specific functions. Moreover, cancer-risk SNPs are more likely to be 'cores' of their communities, influencing the expression of many genes within the same biological processes. Finally, cancer-risk SNPs preferentially target oncogenes and tumour-suppressor genes, suggesting that they may alter the expression of these key cancer genes.
This approach provides a new way of understanding genetic effects on cancer risk and provides a biological context for interpreting the results of GWAS cancer studies.
Although all human tissues carry out common processes, tissues are distinguished by gene expression patterns, implying that distinct regulatory programs control tissue specificity. In this study, we investigate gene expression and regulation across 38 tissues profiled in the Genotype-Tissue Expression project. We find that network edges (transcription factor to target gene connections) have higher tissue specificity than network nodes (genes) and that regulating nodes (transcription factors) are less likely to be expressed in a tissue-specific manner than their targets (genes). Gene set enrichment analysis of network targeting also indicates that the regulation of tissue-specific function is largely independent of transcription factor expression. In addition, tissue-specific genes are not highly targeted in their corresponding tissue network. However, they do assume bottleneck positions due to variability in transcription factor targeting and the influence of non-canonical regulatory interactions. These results suggest that tissue specificity is driven by context-dependent regulatory paths, providing transcriptional control of tissue-specific processes.
• Regulatory network connections are more tissue specific than nodes (genes and transcription factors)
• Tissue-specific function is not solely regulated by transcription factor expression
• Tissue-specific genes assume bottleneck positions in their corresponding networks
• Tissue specificity is driven by context-dependent, non-canonical regulatory paths
Understanding gene regulation is important for many fields in biology and medicine. Sonawane et al. reconstruct and investigate regulatory networks for 38 human tissues. They find that regulation of tissue-specific function is largely independent of transcription factor expression and that tissue specificity appears to be mediated by tissue-specific regulatory network paths.
Epigenetic biomarkers of aging (the "epigenetic clock") have the potential to address puzzling findings surrounding mortality rates and incidence of cardio-metabolic disease such as: (1) women consistently exhibiting lower mortality than men despite having higher levels of morbidity; (2) racial/ethnic groups having different mortality rates even after adjusting for socioeconomic differences; (3) the black/white mortality cross-over effect in late adulthood; and (4) Hispanics in the United States having a longer life expectancy than Caucasians despite having a higher burden of traditional cardio-metabolic risk factors.
We analyzed blood, saliva, and brain samples from seven different racial/ethnic groups. We assessed the intrinsic epigenetic age acceleration of blood (which is independent of blood cell counts) and the extrinsic epigenetic aging rates of blood (which depend on blood cell counts and track the age of the immune system). In blood, Hispanics and Tsimane Amerindians have lower intrinsic but higher extrinsic epigenetic aging rates than Caucasians. African-Americans have lower extrinsic epigenetic aging rates than Caucasians and Hispanics, but no differences were found for the intrinsic measure. Men have higher epigenetic aging rates than women in blood, saliva, and brain tissue.
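As a rough illustration of what an age acceleration measure can look like, the sketch below takes the residual from a simple linear regression of epigenetic age on chronological age. This is an assumption made for illustration: the abstract does not define its intrinsic and extrinsic measures, which additionally involve blood cell counts, and the ages used here are invented.

```python
def age_acceleration(chron_age, epi_age):
    """Residuals of epigenetic age regressed on chronological age.

    A positive residual means a sample looks epigenetically older than
    expected for its chronological age. Simplified, hypothetical
    definition; it ignores the blood cell count adjustment.
    """
    n = len(chron_age)
    mean_x = sum(chron_age) / n
    mean_y = sum(epi_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chron_age)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chron_age, epi_age))
    slope = sxy / sxx                      # ordinary least-squares slope
    intercept = mean_y - slope * mean_x    # ordinary least-squares intercept
    return [y - (intercept + slope * x) for x, y in zip(chron_age, epi_age)]


# Invented example data: chronological and estimated epigenetic ages (years).
chron = [30, 40, 50, 60]
epi = [32, 39, 52, 59]
residuals = age_acceleration(chron, epi)
```

Group comparisons such as those reported above would then compare these residuals between groups rather than the raw epigenetic ages, so that differences in chronological age do not drive the result.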
Epigenetic aging rates are significantly associated with sex, race/ethnicity, and, to a lesser extent, CHD risk factors, but not with incident CHD outcomes. These results may help elucidate the lower-than-expected mortality rates observed in Hispanics, older African-Americans, and women.
DNA methylation is influenced by both environmental and genetic factors and is increasingly thought to affect variation in complex traits and diseases. Yet, the extent of ancestry-related differences in DNA methylation, their genetic determinants, and their respective causal impact on immune gene regulation remain elusive.
We report extensive population differences in DNA methylation between 156 individuals of African and European descent, detected in primary monocytes that are used as a model of a major innate immunity cell type. Most of these differences (~70%) are driven by DNA sequence variants near CpG sites, which account for ~60% of the variance in DNA methylation. We also identify several master regulators of DNA methylation variation in trans, including a regulatory hub near the transcription factor-encoding CTCF gene, which contributes markedly to ancestry-related differences in DNA methylation. Furthermore, we establish that variation in DNA methylation is associated with varying gene expression levels following mostly, but not exclusively, a canonical model of negative associations, particularly in enhancer regions. Specifically, we find that DNA methylation highly correlates with transcriptional activity of 811 and 230 genes, at the basal state and upon immune stimulation, respectively. Finally, using a Bayesian approach, we estimate causal mediation effects of DNA methylation on gene expression in ~20% of the studied cases, indicating that DNA methylation can play an active role in immune gene regulation.
Using a system-level approach, our study reveals substantial ancestry-related differences in DNA methylation and provides evidence for their causal impact on immune gene regulation.
MicroRNAs (miRs) play an important role in the pathogenesis of chronic inflammatory diseases. This study is the first to investigate miR expression profiles in purified CD4+ T lymphocytes and CD14+ monocytes from patients with axial spondyloarthritis (axSpA) using a high-throughput qPCR approach.
A total of 81 axSpA patients fulfilling the 2009 ASAS classification criteria and 55 controls were recruited from October 2014 to July 2017. CD14+ monocytes and CD4+ T lymphocytes were isolated from peripheral blood mononuclear cells. MiR expression was investigated by qPCR using the Exiqon Human MiRnome panel I analyzing 372 miRNAs. Differentially expressed miRNAs identified in the discovery cohort were validated in the replication cohort.
We found a major difference in miR expression patterns between T lymphocytes and monocytes regardless of the patient or control status. Comparing disease-specific differentially expressed miRs, 13 miRs were found consistently deregulated in CD14+ cells in both cohorts, with miR-361-3p, miR-223-3p, miR-484, and miR-16-5p being the most differentially expressed. In CD4+ T cells, 11 miRs were differentially expressed between patients and controls, with miR-16-1-3p, miR-28-5p, miR-199a-5p, and miR-126-3p being the most strongly upregulated miRs among patients. These miRs are involved in disease-relevant pathways such as inflammation, intestinal permeability, or bone formation. MiR-146a-5p levels correlated inversely with the degree of inflammation in axSpA patients.
We demonstrate a consistent deregulation of miRs in both monocytes and CD4+ T cells from axSpA patients, which could contribute to the pathophysiology of the disease with potential interest from a therapeutic perspective.
Although ultrahigh-throughput RNA-Sequencing has become the dominant technology for genome-wide transcriptional profiling, the vast majority of RNA-Seq studies typically profile only tens of samples, and most analytical pipelines are optimized for these smaller studies. However, projects are generating ever-larger data sets comprising RNA-Seq data from hundreds or thousands of samples, often collected at multiple centers and from diverse tissues. These complex data sets present significant analytical challenges due to batch and tissue effects, but provide the opportunity to revisit the assumptions and methods that we use to preprocess, normalize, and filter RNA-Seq data, which are critical first steps for any subsequent analysis.
We find that analysis of large RNA-Seq data sets requires both careful quality control and accounting for sparsity due to the heterogeneity intrinsic in multi-group studies. We developed the Yet Another RNA Normalization software pipeline (YARN), which includes quality control and preprocessing, gene filtering, and normalization steps designed to facilitate downstream analysis of large, heterogeneous RNA-Seq data sets, and we demonstrate its use with data from the Genotype-Tissue Expression (GTEx) project.
An R package instantiating YARN is available at http://bioconductor.org/packages/yarn .
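To illustrate why gene filtering needs to account for heterogeneity across groups, the sketch below keeps a gene if it is expressed in at least as many samples as the smallest group contains, so that genes expressed in only one tissue are not discarded. This is a hypothetical illustration and not the YARN implementation or API; the function name, the count threshold of 10, and the toy data are assumptions.

```python
def filter_genes(counts, groups, min_count=10):
    """Keep genes expressed in at least as many samples as the smallest group.

    counts: dict mapping gene name -> list of per-sample read counts
    groups: list of group (e.g. tissue) labels, one per sample
    min_count: assumed count at or above which a gene is called expressed
    """
    smallest = min(groups.count(g) for g in set(groups))
    kept = {}
    for gene, values in counts.items():
        n_expressed = sum(1 for v in values if v >= min_count)
        if n_expressed >= smallest:  # enough samples to cover one whole group
            kept[gene] = values
    return kept


# Toy data: three blood samples followed by two skin samples.
groups = ["blood", "blood", "blood", "skin", "skin"]
counts = {
    "GENE_X": [0, 1, 0, 50, 80],        # expressed only in skin
    "GENE_Y": [0, 2, 1, 0, 3],          # essentially unexpressed everywhere
    "GENE_Z": [120, 90, 200, 150, 95],  # expressed in every sample
}
kept = filter_genes(counts, groups)
```

A filter requiring expression in, say, half of all samples would drop `GENE_X` here even though it is clearly expressed in skin, which is the kind of sparsity problem that arises in multi-group data.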
Cell lines are an indispensable tool in biomedical research and often used as surrogates for tissues. Although there are recognized important cellular and transcriptomic differences between cell lines and tissues, a systematic overview of the differences between the regulatory processes of a cell line and those of its tissue of origin has not been conducted. The RNA-Seq data generated by the GTEx project is the first available data resource in which it is possible to perform a large-scale transcriptional and regulatory network analysis comparing cell lines with their tissues of origin.
We compared 127 paired Epstein-Barr virus transformed lymphoblastoid cell lines (LCLs) and whole blood samples, and 244 paired primary fibroblast cell lines and skin samples. While gene expression analysis confirms that these cell lines carry the expression signatures of their primary tissues, albeit at reduced levels, network analysis indicates that expression changes are the cumulative result of many previously unreported alterations in transcription factor (TF) regulation. More specifically, cell cycle genes are over-expressed in cell lines compared to primary tissues, and this alteration in expression is a result of less repressive TF targeting. We confirmed these regulatory changes for four TFs, including SMAD5, using independent ChIP-seq data from ENCODE.
Our results provide novel insights into the regulatory mechanisms controlling the expression differences between cell lines and tissues. The strong changes in TF regulation that we observe suggest that network changes, in addition to transcriptional levels, should be considered when using cell lines as models for tissues.