Significance
Meiotic recombination is known to vary over 1,000-fold in many eukaryotic organisms, including maize. This regional genomic variation has enormous consequences for plant breeders, who rely on meiotic cross-overs to fine-map quantitative traits and introgress favorable alleles. Deleterious mutations are also predicted to accumulate preferentially within low-recombination regions, particularly within historically outcrossing species, such as maize. Here, we show that meiotic recombination is predictable across diverse crosses based on several genomic features of the reference genome. We demonstrate that the extant patterns of recombination are historically stable and tied to variation in the number of deleterious mutations. The ability of plant breeders to exploit recombination to purge segregating deleterious alleles will determine the efficacy of future crop improvement.
Among the fundamental evolutionary forces, recombination arguably has the largest impact on the practical work of plant breeders. Varying over 1,000-fold across the maize genome, the local meiotic recombination rate limits the resolving power of quantitative trait mapping and the precision of favorable allele introgression. The consequences of low recombination also theoretically extend to the species-wide scale by decreasing the power of selection relative to genetic drift, and thereby hindering the purging of deleterious mutations. In this study, we used genotyping-by-sequencing (GBS) to identify 136,000 recombination breakpoints at high resolution within US and Chinese maize nested association mapping populations. We find that the pattern of cross-overs is highly predictable on the broad scale, following the distribution of gene density and CpG methylation. Several large inversions also suppress recombination in distinct regions of several families. We also identify recombination hotspots ranging in size from 1 kb to 30 kb. We find these hotspots to be historically stable and, compared with similar regions with low recombination, to have strongly differentiated patterns of DNA methylation and GC content. We also provide evidence for the historical action of GC-biased gene conversion in recombination hotspots. Finally, using genomic evolutionary rate profiling (GERP) to identify putative deleterious polymorphisms, we find evidence for reduced genetic load in hotspot regions, a phenomenon that may have considerable practical importance for breeding programs worldwide.
Folate, an important B-complex vitamin in our diet, plays an important role not only in the synthesis of DNA but also in the maintenance of methylation reactions in cells. Folate metabolism is influenced by several factors, especially dietary intake and polymorphisms of the associated genes. Aberrant folate metabolism therefore affects both methylation and DNA synthesis, both of which have been implicated in the development of various diseases. This paper reviews current knowledge of the processes involved in folate metabolism and the consequences of deviant folate metabolism. Particular emphasis is given to the polymorphic genes that have been implicated in the development of various human diseases, such as vascular diseases, Down's syndrome, neural tube defects, psychiatric disorders and cancers.
• MTHFR, DHFR, TS, MTR, MTRR etc. regulate the active folate levels in cells.
• Cellular folate status influences the DNA stability and integrity.
• Methylation patterns of DNA in some tissues also depend on cellular folate.
• Polymorphisms of genes like MTHFR, TS, and MTR result in various diseases.
• DNA biosynthesis and methylation are very crucial in relation to carcinogenesis.
Metabolic resistance to insecticides such as pyrethroids in mosquito vectors threatens control of malaria in Africa. Unless it is managed, recent gains in reducing malaria transmission could be lost. To improve monitoring and assess the impact of insecticide resistance on malaria control interventions, we elucidated the molecular basis of pyrethroid resistance in the major African malaria vector,
We showed that a single cytochrome P450 allele (CYP6P9a_R) in
reduced the efficacy of insecticide-treated bednets for preventing transmission of malaria in southern Africa. Expression of key insecticide resistance genes was detected in populations of this mosquito vector throughout Africa but varied according to the region. Signatures of selection and adaptive evolutionary traits including structural polymorphisms and cis-regulatory transcription factor binding sites were detected with evidence of selection due to the scale-up of insecticide-treated bednet use. A cis-regulatory polymorphism driving the overexpression of the major resistance gene
allowed us to design a DNA-based assay for cytochrome P450-mediated resistance to pyrethroid insecticides. Using this assay, we tracked the spread of pyrethroid resistance and found that it was almost fixed in mosquitoes from southern Africa but was absent from mosquitoes collected elsewhere in Africa. Furthermore, a field study in experimental huts in Cameroon demonstrated that mosquitoes carrying the resistance CYP6P9a_R allele survived and succeeded in blood feeding more often than did mosquitoes that lacked this allele. Our findings highlight the need to introduce a new generation of insecticide-treated bednets for malaria control that do not rely on pyrethroid insecticides.
Chromosomal inversions play an important role in local adaptation. Inversions can capture multiple locally adaptive functional variants in a linked block by repressing recombination. However, this recombination suppression makes it difficult to identify the genetic mechanisms underlying an inversion's role in adaptation. In this study, we used large-scale transcriptomic data to dissect the functional importance of a 13 Mb inversion locus (Inv4m) found almost exclusively in highland populations of maize (Zea mays ssp. mays). Inv4m was introgressed into highland maize from the wild relative Zea mays ssp. mexicana, also present in the highlands of Mexico, and is thought to be important for the adaptation of these populations to cultivation in highland environments. However, the specific genetic variants and traits that underlie this adaptation are not known. We created two families segregating for the standard and inverted haplotypes of Inv4m in a common genetic background and measured gene expression effects associated with the inversion across 9 tissues in two experimental conditions. With these data, we quantified both the global transcriptomic effects of the highland Inv4m haplotype, and the local cis-regulatory variation present within the locus. We found diverse physiological effects of Inv4m across the 9 tissues, including a strong effect on the expression of genes involved in photosynthesis and chloroplast physiology. Although we could not confidently identify the causal alleles within Inv4m, this research accelerates progress towards understanding this inversion and will guide future research on these important genomic features.
Natural killer (NK) cells are innate lymphocytes that eliminate infected and transformed cells. They discriminate healthy from diseased tissue through killer cell Ig-like receptor (KIR) recognition of HLA class I ligands. Directly impacting NK cell function,
polymorphism associates with infection control and multiple autoimmune and pregnancy syndromes. Here we analyze
diversity of 241 individuals from five groups of Iranians. These five populations represent Baloch, Kurd, and Lur, together comprising 15% of the ethnically diverse Iranian population. We identified 159
alleles, including 11 not previously characterized. We also identified 170 centromeric and 94 telomeric haplotypes, and 15 different
haplotypes carrying either a deletion or duplication encompassing one or more complete
genes. As expected, comparing our data with those representing major worldwide populations revealed the greatest similarity between Iranians and Europeans. Despite this similarity we observed higher frequencies of
in Iran than in any other population, and the highest frequency of HLA-B*51, a Bw4-containing allotype that acts as a strong educator of
NK cells. Compared to Europeans, the Iranians we studied also have a reduced frequency of
, which encodes an allotype that is not expressed at the NK cell surface. Concurrent with the resulting high frequency of strong viable interactions between inhibitory KIR and polymorphic HLA class I, the majority of
haplotypes characterized do not express a functional activating receptor. By contrast, the most frequent
haplotype in Iran expresses only one functional inhibitory KIR and the maximum number of activating KIR. This first complete, high-resolution, characterization of the
locus of Iranians will form a valuable reference for future clinical and population studies.
Abstract
Motivation
Despite significant efforts in expert curation, the clinical relevance of most of the 154 million dbSNP reference variants (RS) remains unknown. However, a wealth of knowledge about the variant biological function/disease impact is buried in unstructured literature data. Previous studies have attempted to harvest and unlock such information with text-mining techniques but are of limited use because their mutation extraction results are not standardized or integrated with curated data.
Results
We propose an automatic method to extract and normalize variant mentions to unique identifiers (dbSNP RSIDs). Our method, in benchmarking results, demonstrates a high F-measure of ∼90% and compares favorably to the state of the art. Next, we applied our approach to the entire PubMed and validated the results by verifying that each extracted variant-gene pair matched the dbSNP annotation based on mapped genomic position, and by analyzing variants curated in ClinVar. We then determined which text-mined variants and genes constituted novel discoveries. Our analysis reveals 41 889 RS numbers (associated with 9151 genes) not found in ClinVar. Moreover, we obtained a rich set worth further review: 12 462 rare variants (MAF ≤ 0.01) in 3849 genes which are presumed to be deleterious and not frequently found in the general population. To our knowledge, this is the first large-scale study to analyze and integrate text-mined variant data with curated knowledge in existing databases. Our results suggest that databases can be significantly enriched by text mining and that the combined information can greatly assist human efforts in evaluating/prioritizing variants in genomic research.
Availability and implementation
The tmVar 2.0 source code and corpus are freely available at https://www.ncbi.nlm.nih.gov/research/bionlp/Tools/tmvar/
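The F-measure quoted in the Results can be illustrated with a minimal sketch. This is not code from tmVar; the function name `f_measure` and the counts in the example are hypothetical and were chosen only so that precision and recall both equal 0.9. It assumes the standard definition of the F-measure as the (weighted) harmonic mean of precision and recall.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true-positive, false-positive and false-negative counts.

    With beta = 1 this is the balanced F-measure (harmonic mean of
    precision and recall).
    """
    if tp == 0:
        # No correct extractions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical benchmark: 90 correct variant mentions, 10 spurious, 10 missed.
score = f_measure(tp=90, fp=10, fn=10)
print(f"F-measure: {score:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a high F-measure requires both precision and recall to be high at the same time.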
A majority of mitochondrial DNA (mtDNA) mutations reported to be implicated in diseases are heteroplasmic, a status with coexisting mtDNA variants in a single cell. Quantifying the prevalence of mitochondrial heteroplasmy and its pathogenic effect in healthy individuals could further our understanding of its possible roles in various diseases. A total of 1,085 human individuals from 14 global populations have been sequenced by the 1000 Genomes Project to a mean coverage of ∼2,000× on mtDNA. Using a combination of stringent thresholds and a maximum-likelihood method to define heteroplasmy, we demonstrated that ∼90% of the individuals carry at least one heteroplasmy. At least 20% of individuals harbor heteroplasmies reported to be implicated in disease. Mitochondrial heteroplasmies tend to show high pathogenicity and are significantly overrepresented in disease-associated loci. Consistent with their deleterious effect, heteroplasmies with a derived allele frequency larger than 60% within an individual show a significant reduction in pathogenicity, indicating the action of purifying selection. Purifying selection on heteroplasmies can also be inferred from the comparison of nonsynonymous and synonymous heteroplasmies and from the unfolded site frequency spectra for different functional sites in mtDNA. Nevertheless, in comparison with population polymorphic mtDNA mutations, purifying selection is much less efficient in removing heteroplasmic mutations. The prevalence of mitochondrial heteroplasmy with high pathogenic potential in healthy individuals, along with the possibility of these mutations drifting to high frequency inside a subpopulation of cells across the lifespan, emphasizes the importance of managing mitochondrial heteroplasmy to prevent disease progression.
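The threshold step of heteroplasmy calling can be sketched as follows. This is an illustration only, not the authors' pipeline: the abstract does not state the thresholds used, so `min_fraction` and `min_depth` are hypothetical values, the function names are invented here, and the maximum-likelihood step is omitted entirely.

```python
def minor_allele_fraction(counts: dict[str, int]) -> float:
    """Fraction of reads supporting the second most common allele at one site."""
    depth = sum(counts.values())
    if depth == 0:
        return 0.0
    ranked = sorted(counts.values(), reverse=True)
    second = ranked[1] if len(ranked) > 1 else 0
    return second / depth


def is_heteroplasmic(counts: dict[str, int],
                     min_fraction: float = 0.01,  # hypothetical threshold
                     min_depth: int = 100) -> bool:  # hypothetical threshold
    """Flag a site when depth is sufficient and the minor allele is frequent enough."""
    depth = sum(counts.values())
    return depth >= min_depth and minor_allele_fraction(counts) >= min_fraction


# Hypothetical read counts at two mtDNA sites sequenced to about 2,000x.
site_a = {"A": 1900, "G": 100}  # minor allele supported by 5% of reads
site_b = {"A": 1995, "G": 5}    # minor allele supported by 0.25% of reads
print(is_heteroplasmic(site_a), is_heteroplasmic(site_b))
```

High coverage matters here because a low-frequency minor allele can only be separated from sequencing error when enough reads cover the site.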
• SARS-CoV-2 exhibits intra-host small- and large-scale genomic variability.
• SNVs are colocalized with probes and primers used in molecular diagnostic assays.
• The SARS-CoV-2 Spike (S) gene hosts a potential recombination hot-spot.
In December 2019, an outbreak of atypical pneumonia (coronavirus disease 2019, COVID-19) associated with a novel coronavirus (SARS-CoV-2) was reported in Wuhan city, Hubei province, China. The outbreak was traced to a seafood wholesale market and human-to-human transmission was confirmed. The rapid spread and the death toll of the new epidemic warrant immediate intervention. The intra-host genomic variability of SARS-CoV-2 plays a pivotal role in the development of effective antiviral agents and vaccines, as well as in the design of accurate diagnostics.
We analyzed NGS data derived from clinical samples of three Chinese patients infected with SARS-CoV-2 in order to identify small- and large-scale intra-host variations in the viral genome. We identified tens of low- or higher-frequency single nucleotide variations (SNVs) with variable density across the viral genome, affecting 7 out of 10 protein-coding viral genes. The majority of these SNVs (72/104) corresponded to missense changes. Annotation of the identified SNVs, as well as of all currently circulating strain variations, revealed colocalization of intra-host and strain-specific SNVs with primers and probes currently used in molecular diagnostic assays. Moreover, we assembled the viral genome de novo in order to isolate and validate intra-host structural variations and recombination breakpoints. The bioinformatics analysis disclosed genomic rearrangements over poly-A / poly-U regions located in ORF1ab and the spike (S) gene, including a potential recombination hot-spot within the S gene.
Our results highlight the intra-host genomic diversity and plasticity of SARS-CoV-2, pointing out genomic regions that are prone to alterations. The isolated SNVs and genomic rearrangements reflect the intra-patient capacity of the polymorphic quasispecies, which may arise rapidly during the outbreak, allowing immunological escape of the virus, offering resistance to anti-viral drugs and affecting the sensitivity of the molecular diagnostics assays.
Data on genetic susceptibility to sporadic gastric carcinoma have been published at a growing pace, but to date no comprehensive overview and quantitative summary has been available.
We conducted a systematic review and meta-analysis of the evidence on the association between DNA variation and risk of developing stomach cancer. To assess result credibility, summary evidence was graded according to the Venice criteria and false positive report probability (FPRP) was calculated to further validate result noteworthiness. Meta-analysis was also conducted for subgroups, which were defined by ethnicity (Asian vs Caucasian), tumour histology (intestinal vs diffuse), tumour site (cardia vs non-cardia) and Helicobacter pylori infection status (positive vs negative).
Literature search identified 824 eligible studies comprising 2 530 706 subjects (cases: 261 386 (10.3%)) and investigating 2841 polymorphisms involving 952 distinct genes. Overall, we performed 456 primary and subgroup meta-analyses on 156 variants involving 101 genes. We identified 11 variants significantly associated with disease risk and assessed to have a high level of summary evidence: MUC1 rs2070803 at 1q22 (diffuse carcinoma subgroup), MTX1 rs2075570 at 1q22 (diffuse), PSCA rs2294008 at 8q24.2 (non-cardia), PRKAA1 rs13361707 5p13 (non-cardia), PLCE1 rs2274223 10q23 (cardia), TGFBR2 rs3087465 3p22 (Asian), PKLR rs3762272 1q22 (diffuse), PSCA rs2976392 (intestinal), GSTP1 rs1695 11q13 (Asian), CASP8 rs3834129 2q33 (mixed) and TNF rs1799724 6p21.3 (mixed), with the first nine variants characterised by a low FPRP. We also identified polymorphisms with lower quality significant associations (n=110).
We have identified several high-quality biomarkers of gastric cancer susceptibility. These data will form the backbone of an annually updated online resource that will be integral to the study of gastric carcinoma genetics and may inform future screening programmes.
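The false positive report probability (FPRP) used to judge noteworthiness can be sketched under its usual definition: the probability that a statistically significant association is in fact null, given the observed p-value, the statistical power and the prior probability of a true association. The function name `fprp` and all input values below are hypothetical; the abstract does not report the priors, power or FPRP cut-off that were applied.

```python
def fprp(p_value: float, power: float, prior: float) -> float:
    """False positive report probability.

    FPRP = alpha * (1 - prior) / (alpha * (1 - prior) + power * prior),
    where alpha is the observed p-value, power is the probability of
    detecting a true association, and prior is the probability that the
    association is real before seeing the data.
    """
    if not (0 < prior < 1 and 0 < power <= 1 and 0 <= p_value <= 1):
        raise ValueError("p_value, power and prior must be valid probabilities")
    false_positive_mass = p_value * (1 - prior)
    true_positive_mass = power * prior
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# Hypothetical inputs: p = 0.001, 80% power, 1% prior probability.
value = fprp(p_value=0.001, power=0.8, prior=0.01)
print(f"FPRP: {value:.3f}")
```

The same p-value yields a very different FPRP under a different prior, which is why an FPRP is normally reported together with the prior that was assumed.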
Summary
Rice grain size and weight are major determinants of grain quality and yield and so have been under rigorous selection since domestication. However, the genetic basis for contrasting grain size/weight traits among Indian germplasms and their association with domestication‐driven evolution is not well understood. In this study, two long grain (LGG) and two short grain (SGG) genotypes were resequenced. LGG (LGR and PB 1121) differentiated from SGG (Sonasal and Bindli) by 504 439 single nucleotide polymorphisms (SNPs) and 78 166 insertion‐and‐deletion polymorphisms. The LRK gene cluster was different and a truncation mutation in the LRK8 kinase domain was associated with LGG. Phylogeny with 3000 diverse rice accessions revealed that the four sequenced genotypes belonged to the japonica group and were at the edge of the clades, indicating them to be a potential source of the genetic diversity available in Indian rice germplasm. Six SNPs were significantly associated with grain size/weight and the top four of these could be validated in a mapping population, suggesting this study as a valuable resource for high‐throughput genotyping. A contiguous long low‐diversity region (LDR) of approximately 6 Mb carrying a major grain weight quantitative trait locus (harbouring the OsTOR gene) was identified on Chromosome 5. This LDR was identified as an evolutionarily important site with significant positive selection and multiple selective sweeps, and showed association with many domestication‐related traits, including grain size/weight. The aus population retained more allelic variation in the LDR than the japonica and indica populations, suggesting it to be one of the divergence loci. All the data and analyses can be accessed from the RiceSzWtBase database.
Significance Statement
As rice grain size/weight is an important trait, it has been under rigorous selection since domestication. In this study, a link between this trait and domestication‐driven evolution has been indicated. In addition to characterization of novel grain size/weight‐associated single nucleotide polymorphisms, an approximately 6 Mb low‐diversity region harbouring grain weight quantitative trait loci was identified on Chromosome 5, which turned out to be an evolutionarily important site with multiple selective sweeps and introgression events, and significantly correlated with domestication‐related traits.