Abstract
Advances in high-throughput genomics have revealed much about the human genome, highlighting the importance of variation between individuals and its contribution to disease. Even though numerous software tools have been developed to make sense of large genomics datasets, a major shortcoming of these has been their inability to cope with repetitive regions, specifically to validate structural variants and accordingly assess their role in disease. Here we describe our program STEAK, massively parallel software designed to detect chimeric reads in high-throughput sequencing data for a broad range of applications, such as identifying the presence or absence, as well as the discovery, of transposable elements (TEs) and retroviral integrations. We highlight the capabilities of STEAK by comparing its efficacy in locating HERV-K HML-2 in clinical whole genome projects, target enrichment sequences, and the 1000 Genomes CEU Trio with the performance of other TE and virus detection tools. We show that STEAK outperforms other software in terms of computational efficiency, sensitivity, and specificity. We demonstrate that STEAK is a robust tool that allows analysts to flexibly detect and evaluate TE and retroviral integrations in a diverse range of sequencing projects for both research and clinical purposes.
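To make the notion of a chimeric read concrete, the following is a minimal sketch and not STEAK's actual algorithm. It assumes a read counts as chimeric when its 3' end matches the first bases of a TE reference exactly while the rest of the read is non-TE flanking sequence. The function name `split_chimeric`, the `min_match` threshold, and the example sequences are hypothetical.

```python
def split_chimeric(read, te_edge, min_match=10):
    """Split a read at a host -> TE junction, if one is present.

    Returns (flank, te_part) when the read ends with a prefix of `te_edge`
    that is at least `min_match` bases long and some flanking sequence
    remains; returns None otherwise. Exact matching only (an assumption;
    real tools tolerate mismatches).
    """
    # Longest overlap first, always leaving at least one flanking base.
    max_len = min(len(read) - 1, len(te_edge))
    for k in range(max_len, min_match - 1, -1):
        if read.endswith(te_edge[:k]):
            return read[:-k], read[-k:]
    return None


# Hypothetical TE edge and reads, for illustration only.
te_edge = "TGTGGGGAAAAGCAAGAGA"
junction_read = "ACGTACGTAC" + te_edge[:12]
host_only_read = "ACGTACGTACGTACGTACGT"

print(split_chimeric(junction_read, te_edge))
print(split_chimeric(host_only_read, te_edge))
```

The flanking part returned by such a split is what would then be mapped back to the host reference to place the integration site.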
Summary
Transfusion‐dependent myelodysplastic syndrome (MDS) patients are prone to iron overload. We evaluated 43 transfused MDS patients with T2* magnetic resonance imaging scans. 81% had liver and 16·8% cardiac iron overload. Liver R2* (1000/T2*), but not cardiac R2*, was correlated with number of units transfused (r = 0·72, P < 0·0001) and ferritin (r = 0·53, P < 0·0001). The area under the curve of a time‐ferritin plot was found to be much greater in patients with cardiac iron loading (median 53·7 × 10⁵ Megaunits vs. 12·2 × 10⁵ Megaunits, P = 0·002). HFE, HFE2, HAMP or SLC40A1 genotypes were not predictors of iron overload in these patients.
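The summary does not say how the area under the time‐ferritin curve was computed. As an illustration only, the sketch below assumes the trapezoidal rule over serial ferritin measurements; the function name `ferritin_auc` and the example values are hypothetical.

```python
def ferritin_auc(times, ferritin):
    """Trapezoidal area under a ferritin-versus-time curve.

    `times` must be in increasing order; the result is in
    (ferritin unit) x (time unit).
    """
    if len(times) != len(ferritin) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt < 0:
            raise ValueError("times must be sorted in increasing order")
        # Area of one trapezoid between consecutive measurements.
        area += dt * (ferritin[i] + ferritin[i - 1]) / 2
    return area


# Hypothetical patient: ferritin measured at months 0, 6 and 12.
example_auc = ferritin_auc([0, 6, 12], [1000, 2000, 3000])
print(example_auc)
```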
Background: Angiotensin I converting enzyme 2 (ACE2) is a receptor for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and differences in its expression may affect susceptibility to infection.
Methods: We performed a genome-wide expression quantitative trait loci (eQTL) analysis using hepatitis C virus-infected liver tissue from 190 individuals.
Results: We discovered that a polymorphism in a type III interferon gene (IFNL4), which eliminates IFN-λ4 production, is associated with a two-fold increase in ACE2 RNA expression. Conversely, among genes negatively correlated with ACE2 expression, IFN-signalling pathways were highly enriched and ACE2 was downregulated after IFN-α treatment. Negative correlation was also found in the gastrointestinal tract, where inflammation-driven IFN-stimulated genes were negatively correlated with ACE2 expression, and in lung tissue from a murine model of SARS-CoV-1 infection, suggesting conserved regulation of ACE2 across tissues and species.
Conclusions: We conclude that ACE2 is likely a negatively regulated interferon-stimulated gene (ISG) and that carriage of IFNL4 gene alleles, which modulate ISG expression in viral infection, may play a role in SARS-CoV-2 pathogenesis, with implications for therapeutic interventions.
Abstract 3381
It is well established that the level of gene expression can vary significantly between normal individuals, and that the majority of this variation is due to naturally occurring genomic variability caused by single nucleotide polymorphisms (SNPs). Therefore, identifying functional cis-regulatory polymorphisms and understanding how they influence gene expression is an important new task in many areas of medical research, including molecular hematology. We have previously shown that an entirely new form of alpha-thalassemia is caused by a gain-of-function regulatory SNP in an unremarkable non-coding region in the alpha-globin cluster. This SNP creates a novel, functional GATA site, which recruits a tissue-specific transcription factor (TF) complex. This creates a new promoter-like element, which interferes with activation of the globin genes (De Gobbi et al. Science 2006,312:1215–1257). Here, to investigate the extent and the impact of this class of regulatory SNP, we used ChIP-Seq to characterize differences in the occupancy of Scl/TAL-1 (a tissue-specific TF critical for erythroid maturation) in the erythroblasts of two individuals from the same ethnic background (Caucasian 1, C1, and Caucasian 2, C2).
Sequence reads from two biological replicates of each individual were merged and aligned to the human reference genome (NCBI36/hg18) and a total of 2936 Scl/TAL-1 bound regions were identified. Using two de novo motif finding algorithms (MEME and DREME), we identified GATA (WGATAR) and E-box (CAGMTG) sites as the preferred sequences associated with in vivo binding of Scl/TAL-1. In addition, other motifs were enriched at the Scl/TAL-1 targets; among these were binding sites for known TFs (Sp1/Klf, RUNX1 and NFE2).
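The consensus sites above are written in IUPAC nucleotide codes (W = A or T, R = A or G, M = A or C). As a small illustration, and not a substitute for the MEME or DREME analysis described, the sketch below expands such a consensus into a regular expression and scans the forward strand of a sequence. The function name `find_motif` and the example sequences are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(sequence, consensus):
    """Return (start, matched_site) for every forward-strand occurrence.

    A lookahead is used so that overlapping sites are also reported.
    The reverse strand is not scanned in this sketch.
    """
    body = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequences for illustration.
print(find_motif("CCAGATAAGG", "WGATAR"))   # GATA site
print(find_motif("TTCAGCTGAA", "CAGMTG"))   # E-box
```

A SNP that changes, for example, the M position of CAGMTG to G would no longer match, which is the kind of deviation from consensus discussed in the following paragraph.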
To identify differentially bound regions between C1 and C2, a two-class paired Rank Product analysis (500 permutations, FDR<0.2) was performed with MeV4.6 TM4 Software. About 1% (25/2936) of these sites showed differential binding. Differences were mostly associated with SNPs directly affecting or lying adjacent to known TF consensus binding sites, and deviations from the GATA or E-box consensus motifs corresponded to the inability of the sequence to bind Scl/TAL-1.
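The idea behind a rank product test can be sketched as follows. This is a simplified illustration and not the MeV implementation: it assumes each replicate supplies one C1-versus-C2 log ratio per region, ranks regions within each replicate, takes the geometric mean of the ranks, and estimates significance by shuffling values within replicates. All names and the toy data are hypothetical, and FDR estimation is omitted.

```python
import random
from math import prod


def ranks_desc(values):
    """Rank values so that the largest gets rank 1 (ties broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_products(replicates):
    """Geometric mean of each region's ranks across replicates."""
    k = len(replicates)
    per_replicate = [ranks_desc(rep) for rep in replicates]
    n = len(replicates[0])
    return [prod(r[i] for r in per_replicate) ** (1.0 / k) for i in range(n)]


def permutation_pvalues(replicates, n_perm=500, seed=0):
    """Fraction of permuted rank products at least as small as each observed one."""
    rng = random.Random(seed)
    observed = rank_products(replicates)
    n = len(observed)
    counts = [0] * n
    for _ in range(n_perm):
        # Shuffle each replicate independently to break region identity.
        shuffled = [rng.sample(rep, len(rep)) for rep in replicates]
        null = rank_products(shuffled)
        for i, obs in enumerate(observed):
            counts[i] += sum(1 for value in null if value <= obs)
    return [c / (n_perm * n) for c in counts]


# Toy data: 4 regions, 2 replicates; region 0 is consistently more bound in C1.
toy = [[3.0, 0.1, -0.2, 0.0], [2.5, -0.1, 0.2, 0.05]]
rp = rank_products(toy)
pvals = permutation_pvalues(toy, n_perm=500, seed=0)
print(rp)
```

Regions with a small rank product are those ranked near the top in every replicate, which is why the statistic suits experiments with few replicates.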
Since it has been previously shown that the function of active transcriptional elements can be predicted on the basis of chromatin signatures (e.g. enhancers marked by H3K4me1 and promoters marked by H3K4me3), to further characterize the Scl/TAL-1 differentially occupied sites, we asked which chromatin signatures are associated with these regions. H3K4me1 and H3K4me3 ChIP-Seq experiments, together with analyses of publicly available data sets, showed that most of the SNPs responsible for variation in the recruitment of Scl/TAL-1 (23/25) lie in DNA sequences that have chromatin signatures predictive of enhancer elements, suggesting a potential long-range function in modulating gene expression.
Finally, Scl/TAL-1 ChIP-Seq analysis of erythroblasts from a third individual of a different ethnic background (African-Caribbean, A3) revealed more distinctive targets, including a well-known regulatory SNP at the promoter of the DARC gene (encoding the Duffy blood group), which alters a GATA binding motif in A3, conferring the malaria-resistant Duffy-null phenotype.
Given the exponential growth in genome-wide association studies by which numerous SNPs are being either associated with hematological parameters, or implicated in the etiology of hematologic disorders, this study elucidates molecular mechanisms which might account for phenotypic diversity and highlights the importance of carrying out functional characterization of non-coding polymorphisms found to be associated with disease risk.
No relevant conflicts of interest to declare.
Patterns of transcriptional activity along circular prokaryotic chromosomes such as those of Escherichia coli and Bacillus subtilis have previously been characterized through microarray-based analysis of gene expression. However, patterns across linear prokaryotic chromosomes are yet to be investigated. Given the different topological constraints imposed upon a linear chromosome compared with circular DNA, the derived transcriptional patterns would be expected to differ. Here we explore transcriptional activity along the linear chromosome of Streptomyces coelicolor using expression data from 139 microarrays and genomic features. Representing total chromosomal expression as a spatial series of transcript abundances, we observe both short- and long-range periodicities of transcription along the S. coelicolor chromosome with data derived from different microarray technologies (spotted and ink-jet in situ-synthesized). Application of the autocorrelation function confirmed these periodicities as significant in both two-channel and single-channel analyses and allowed further analysis of chromosomal properties: codon adaptation index, GC content, secondary structure and gene length. Another signal processing technique, namely wavelet analysis, has aided the visualization of the periodic signals along the chromosome, identifying ‘hot-spots’ of activity related to specific responses. The ranges of periodicity we observe for the linear chromosome of S. coelicolor indicate a chromosomal organization different from that of circular chromosomes; the short periodicity of 5 genes we observe cannot be accounted for by the currently proposed E. coli and B. subtilis models. We suggest a chromosome organization/transcriptional model that takes into consideration the transcriptional and genomic organization trends we observe.
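To show how an autocorrelation function exposes periodicity in a spatial series, the following is a minimal sketch on synthetic data, not the analysis pipeline used in the study. It assumes one abundance value per gene in chromosomal order; the function names and the toy series with a built-in period of 5 genes are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        raise ValueError("series is constant")
    covariance_sum = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


def dominant_lag(series, max_lag):
    """Lag in 1..max_lag with the highest autocorrelation."""
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(series, lag))


# Synthetic series: one highly expressed gene in every block of five.
toy_expression = [1, 0, 0, 0, 0] * 20
acf_lag5 = autocorrelation(toy_expression, 5)
acf_lag2 = autocorrelation(toy_expression, 2)
best = dominant_lag(toy_expression, 8)
print(best, acf_lag5, acf_lag2)
```

On real expression data the peaks are far less sharp, so their significance would need to be assessed against a null model, for example by shuffling gene order.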