There has been an explosion of data describing newly recognized structural variants in the human genome. In the flurry of reporting, there has been no standard approach to collecting the data, assessing its quality or describing identified features. This risks becoming a rampant problem, in particular with respect to surveys of copy number variation and their application to disease studies. Here, we consider the challenges in characterizing and documenting genomic structural variants. From this, we derive recommendations for standards to be adopted, with the aim of ensuring the accurate presentation of this form of genetic variation to facilitate ongoing research.
Meiotic recombination between highly similar duplicated sequences (nonallelic homologous recombination, NAHR) generates deletions, duplications, inversions and translocations, and it is responsible for genetic diseases known as 'genomic disorders', most of which are caused by altered copy number of dosage-sensitive genes. NAHR hot spots have been identified within some duplicated sequences. We have developed sperm-based assays to measure the de novo rate of reciprocal deletions and duplications at four NAHR hot spots. We used these assays to dissect the relative rates of NAHR between different pairs of duplicated sequences. We show that (i) these NAHR hot spots are specific to meiosis, (ii) deletions are generated at a higher rate than their reciprocal duplications in the male germline and (iii) some of these genomic disorders are likely to have been underascertained clinically, most notably that resulting from the duplication of 7q11, the reciprocal of the deletion causing Williams-Beuren syndrome.
Copy number variants (CNVs) account for the majority of human genomic diversity in terms of base coverage. Here, we have developed and applied a new method to combine high-resolution array comparative genomic hybridization (CGH) data with whole-genome DNA sequencing data to obtain a comprehensive catalog of common CNVs in Asian individuals. The genomes of 30 individuals from three Asian populations (Korean, Chinese and Japanese) were interrogated with an ultra-high-resolution array CGH platform containing 24 million probes. Whole-genome sequencing data from a reference genome (NA10851, with 28.3× coverage) and two Asian genomes (AK1, with 27.8× coverage and AK2, with 32.0× coverage) were used to transform the relative copy number information obtained from array CGH experiments into absolute copy number values. We discovered 5,177 CNVs, of which 3,547 were putative Asian-specific CNVs. These common CNVs in Asian populations will be a useful resource for subsequent genetic studies in these populations, and the new method of calling absolute CNVs will be essential for applying CNV data to personalized medicine.
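As a rough illustration of the relative-to-absolute conversion mentioned in the abstract above, the sketch below uses the textbook relationship between an array CGH log2 ratio and copy number: the test sample's copy number is approximately the reference sample's copy number multiplied by 2 raised to the log2 ratio. This is an assumption made for illustration only; the abstract does not state the authors' actual formula, and the function name `absolute_copy_number` and its parameters are hypothetical.

```python
def absolute_copy_number(log2_ratio: float, reference_copy_number: int) -> int:
    """Estimate a test sample's integer copy number at one locus.

    Assumption (not taken from the abstract): the array CGH signal is an
    idealized log2(test / reference) ratio, and the reference sample's
    absolute copy number at the locus is already known, for example from
    whole-genome sequencing read depth.
    """
    if reference_copy_number <= 0:
        # A ratio against zero reference copies is undefined, so the
        # relative signal cannot be converted at such a locus.
        raise ValueError("reference copy number must be positive")

    # Undo the log transform, then scale by the reference copy number.
    estimate = reference_copy_number * 2.0 ** log2_ratio

    # Copy number is a non-negative integer; round to the nearest one.
    return max(0, round(estimate))


if __name__ == "__main__":
    # Hypothetical loci: (log2 ratio, reference copy number)
    for ratio, ref_cn in [(0.0, 2), (-1.0, 2), (1.0, 1)]:
        print(ratio, ref_cn, absolute_copy_number(ratio, ref_cn))
```

Real array data are noisy and typically need segmentation and normalization before any such conversion, which this sketch deliberately leaves out.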
Using a positional cloning approach supported by comparative genomics, we have identified a previously unreported gene, EYS, at the RP25 locus on chromosome 6q12 commonly mutated in autosomal recessive retinitis pigmentosa. Spanning over 2 Mb, this is the largest eye-specific gene identified so far. EYS is independently disrupted in four other mammalian lineages, including that of rodents, but is well conserved from Drosophila to man and is likely to have a role in the modeling of retinal architecture.
• DT160 and DT56v shared an estimated date of common ancestor between 1769 and 1821.
• DT160 replicated at a faster rate than DT56v in vitro.
• Neither DT160 nor DT56v were lysed by phage released by the other strain.
• A linear plasmid was detected in a DT56v isolate representing the first reported pBSSB1 plasmid isolated from S. Typhimurium.
• DT160 contained the pSLT virulence plasmid, and the sseJ and sseK2 genes that possibly contributed to its higher prevalence.
Salmonella enterica serovar Typhimurium DT160 was the predominant cause of notified human salmonellosis cases in New Zealand from 2000 to 2010, before it was superseded by another S. Typhimurium strain, DT56 variant (DT56v). Whole genome sequencing and phenotypic testing were used to compare 109 DT160 isolates with eight DT56v isolates from New Zealand animal and human sources. Phylogenetic analysis provided evidence that DT160 and DT56v strains were distantly related with an estimated date of common ancestor between 1769 and 1821. The strains replicated at different rates but had similar antimicrobial susceptibility profiles. Both strains were resistant to the phage expressed from the chromosome of the other strain, which may have contributed to the emergence of DT56v. DT160 contained the pSLT virulence plasmid, and the sseJ and sseK2 genes that may have contributed to the higher reported prevalence compared to DT56v. A linear pBSSB1-family plasmid was also found in one of the DT56v isolates, but there was no evidence that this plasmid affected bacterial replication or antimicrobial susceptibility. One of the DT56v isolates was also sequenced using long-read technology and found to contain an uncommon chromosome arrangement for a Typhimurium isolate. This study demonstrates how comparative genomics and phenotypic testing can help identify strain-specific elements and factors that may have influenced the emergence and supersession of bacterial strains of public health importance.
The human UGT2B17 gene varies in copy number from zero to two per individual and also differs in mean number between populations from Africa, Europe, and East Asia. We show that such a high degree of geographical variation is unusual and investigate its evolutionary history. This required first reinterpreting the reference sequence in this region of the genome, which is misassembled from the two different alleles separated by an artifactual gap. A corrected assembly identifies the polymorphism as a 117 kb deletion arising by nonallelic homologous recombination between ∼4.9 kb segmental duplications and allows the deletion breakpoint to be identified. We resequenced ∼12 kb of DNA spanning the breakpoint in 91 humans from three HapMap and one extended HapMap populations and one chimpanzee. Diversity was unusually high and the time to the most recent common ancestor was estimated at ∼2.4 or ∼3.0 million years by two different methods, with evidence of balancing selection in Europe. In contrast, diversity was low in East Asia where a single haplotype predominated, suggesting positive selection for the deletion in this part of the world.
Alveolar capillary dysplasia with misalignment of pulmonary veins (ACD/MPV) is a rare, neonatally lethal developmental disorder of the lung with defining histologic abnormalities typically associated with multiple congenital anomalies (MCA). Using array CGH analysis, we have identified six overlapping microdeletions encompassing the FOX transcription factor gene cluster in chromosome 16q24.1q24.2 in patients with ACD/MPV and MCA. Subsequently, we have identified four different heterozygous mutations (frameshift, nonsense, and no-stop) in the candidate FOXF1 gene in unrelated patients with sporadic ACD/MPV and MCA. Custom-designed, high-resolution microarray analysis of additional ACD/MPV samples revealed one microdeletion harboring FOXF1 and two distinct microdeletions upstream of FOXF1, implicating a position effect. DNA sequence analysis revealed that in six of nine deletions, both breakpoints occurred in the portions of Alu elements showing eight to 43 base pairs of perfect microhomology, suggesting a replication-error mechanism, Microhomology-Mediated Break-Induced Replication (MMBIR)/Fork Stalling and Template Switching (FoSTeS), as the basis of their formation. In contrast to the association of point mutations in FOXF1 with bowel malrotation, microdeletions of FOXF1 were associated with hypoplastic left heart syndrome and gastrointestinal atresias, probably due to haploinsufficiency for the neighboring FOXC2 and FOXL1 genes. These differences reveal the phenotypic consequences of gene alterations in cis.
Medulloblastoma is the most common malignant brain tumor in children. Despite multimodal aggressive treatment, nearly half of the patients die as a result of this tumor. Identification of molecular markers for prognosis and development of novel pathogenesis-based therapies depends crucially on a better understanding of medulloblastoma pathomechanisms.
We performed genome-wide analysis of DNA copy number imbalances in 47 medulloblastomas using comparative genomic hybridization to large insert DNA microarrays (matrix-CGH). The expression of selected candidate genes identified by matrix-CGH was analyzed immunohistochemically on tissue microarrays representing medulloblastomas from 189 clinically well-documented patients. To identify novel prognostic markers, genomic findings and protein expression data were correlated to patient survival.
Matrix-CGH analysis revealed frequent DNA copy number alterations of several novel candidate regions. Among these, gains at 17q23.2-qter (P < .01) and losses at 17p13.1 to 17p13.3 (P = .04) were significantly correlated to poor prognosis. Within 17q23.2-qter and 7q21.2, two of the most frequently gained chromosomal regions, confined amplicons were identified that contained the PPM1D and CDK6 genes, respectively. Immunohistochemistry revealed strong expression of PPM1D in 148 (88%) of 168 and CDK6 in 50 (30%) of 169 medulloblastomas. Overexpression of CDK6 correlated significantly with poor prognosis (P < .01) and represented an independent prognostic marker of overall survival on multivariate analysis (P = .02).
We identified CDK6 as a novel molecular marker that can be determined by immunohistochemistry on routinely processed tissue specimens and may facilitate the prognostic assessment of medulloblastoma patients. Furthermore, increased protein levels of PPM1D and CDK6 may link the TP53 and RB1 tumor suppressor pathways to medulloblastoma pathomechanisms.
Schizophrenia is a major psychiatric disease with strong evidence of genetic risk factors. Recent genome-wide studies of copy number variations (CNVs) have detected novel recurrent submicroscopic copy number changes, including recurrent deletions at 1q21.11, 15q11.3, 15q13.3, and the recurrent CNV at the 2p16.3 neurexin 1 locus. These schizophrenia susceptibility CNV loci demonstrate that schizophrenia is, at least in part, genetic in origin and provide the basis for further investigation of mutations associated with the disease. The studies combined have also established the role of rare and, in sporadic cases, de novo variants in schizophrenia. Furthermore, neuronal-related genes and genetic pathways are starting to emerge from the CNV loci associated with schizophrenia. Here, we review the major findings in the recent literature, which begin to unravel the genetic and biological architecture of this complex human neuropsychiatric disorder.
Heterogeneity in the genome copy number of tissues is of particular importance in solid tumor biology. Furthermore, many clinical applications such as pre-implantation and non-invasive prenatal diagnosis would benefit from the ability to characterize individual single cells. As the amount of DNA from single cells is so small, several PCR protocols have been developed in an attempt to achieve unbiased amplification. Many of these approaches are suitable for subsequent cytogenetic analyses using conventional methodologies such as comparative genomic hybridization (CGH) to metaphase spreads. However, attempts to harness array-CGH for single-cell analysis to provide improved resolution have been disappointing. Here we describe a strategy that combines single-cell amplification using GenomePlex library technology (GenomePlex® Single Cell Whole Genome Amplification Kit, Sigma-Aldrich, UK) and detailed analysis of genomic copy number changes by high-resolution array-CGH. We show that single copy changes as small as 8.3 Mb in single cells are detected reliably with single cells derived from various tumor cell lines as well as patients presenting with trisomy 21 and Prader–Willi syndrome. Our results demonstrate the potential of this technology for studies of tumor biology and for clinical diagnostics.