Background
Some clinically important genetic variants are not easily evaluated with next‐generation sequencing (NGS) methods due to technical challenges arising from high‐similarity copies (e.g., PMS2, SMN1/SMN2, GBA1, HBA1/HBA2, CYP21A2), repetitive short sequences (e.g., ARX polyalanine repeats, FMR1 AGG interruptions in CGG repeats, CFTR poly‐T/TG repeats), and other complexities (e.g., MSH2 Boland inversions).
Methods
We customized our NGS processes to detect the technically challenging variants mentioned above with adaptations including target enrichment and bioinformatic masking of similar sequences. Adaptations were validated with samples of known genotypes.
Results
Our adaptations provided high‐sensitivity and high‐specificity detection for most of the variants and provided a high‐sensitivity primary assay to be followed with orthogonal disambiguation for the others. The sensitivity of the NGS adaptations was 100% for all of the technically challenging variants. Specificity was 100% for those in PMS2, GBA1, SMN1/SMN2, and HBA1/HBA2, and for the MSH2 Boland inversion; 97.8%–100% for CYP21A2 variants; and 85.7% for ARX polyalanine repeats.
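As a reminder of how such figures are derived, the sketch below computes sensitivity and specificity from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; the abstract does not report the underlying sample numbers, so they should not be read as the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive samples that the assay calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative samples that the assay calls negative."""
    return tn / (tn + fp)


# Hypothetical validation counts (assumed, not taken from the study).
print(f"sensitivity: {sensitivity(tp=20, fn=0):.1%}")  # no missed positives
print(f"specificity: {specificity(tn=6, fp=1):.1%}")   # one false positive in seven negatives
```

With small validation sets, a single false positive can move specificity by many percentage points, which is one reason a high-sensitivity primary assay may be paired with orthogonal confirmation.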
Conclusions
NGS assays can detect technically challenging variants when chemistries and bioinformatics are jointly refined. The adaptations described support a scalable, cost‐effective path to identifying all clinically relevant variants within a single sample.
Some clinically important genes and variants are not easily detected with standard next‐generation sequencing (NGS) methods due to technical challenges arising from high‐similarity copies, repetitive short sequences, and other complexities. When the chemistries and bioinformatics of NGS are jointly refined, even technically challenging genes and variants can be evaluated, including the Gaucher disease‐associated GBA, which has a high‐similarity pseudogene.
DNA variants that arise after conception can show mosaicism, varying in presence and extent among tissues. Mosaic variants have been reported in Mendelian diseases, but further investigation is necessary to broadly understand their incidence, transmission, and clinical impact. A mosaic pathogenic variant in a disease-related gene may cause an atypical phenotype in terms of severity, clinical features, or timing of disease onset. Using high-depth sequencing, we studied results from one million unrelated individuals referred for genetic testing for almost 1,900 disease-related genes. We observed 5,939 mosaic sequence or intragenic copy number variants distributed across 509 genes in nearly 5,700 individuals, constituting approximately 2% of molecular diagnoses in the cohort. Cancer-related genes had the most mosaic variants and showed age-specific enrichment, in part reflecting clonal hematopoiesis in older individuals. We also observed many mosaic variants in genes related to early-onset conditions. Additional mosaic variants were observed in genes analyzed for reproductive carrier screening or associated with dominant disorders with low penetrance, posing challenges for interpreting their clinical significance. When we controlled for the potential involvement of clonal hematopoiesis, most mosaic variants were enriched in younger individuals and were present at higher levels than in older individuals. Furthermore, individuals with mosaicism showed later disease onset or milder phenotypes than individuals with non-mosaic variants in the same genes. Collectively, the large compendium of variants, disease correlations, and age-specific results identified in this study expand our understanding of the implications of mosaic DNA variation for diagnosis and genetic counseling.
Truty et al. describe mosaic sequence and copy number variants identified through genetic testing. Nearly 6,000 variants across >500 genes contributed to ∼2% of molecular diagnoses. Mosaic variants were mostly in cancer-related genes, at higher levels in younger individuals, and appeared to correlate with later disease onset or milder phenotypes.
Cyclic peptides capable of activating the erythropoietin receptor (EPOR) were isolated from phage display libraries by screening with a novel EPOR-IgG fusion protein reagent. A parental clone, ERB1 (EPO Receptor Binder 1), was first isolated from a phage display library displaying 38 random amino acids as an N-terminal fusion with the M13 minor capsid protein, pIII. An evolved library was then produced from the parental sequence using an oligonucleotide saturation mutagenesis strategy, which yielded EPOR-binding sequences with 20 times the relative affinity of ERB1. Two synthetic peptides were constructed from these sequences, both of which bind the EPO receptor in a specific ELISA and act as full agonists in EPO-dependent cell proliferation assays. These peptides are 18 amino acids in length, disulfide-bonded, and have a minimum consensus sequence of CXXGWVGXCXXW, where X represents positions tolerant of several amino acids.
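The consensus CXXGWVGXCXXW can be expressed as a pattern for scanning candidate sequences. The sketch below is only an illustration: it treats each X as any of the 20 standard amino acids, which is more permissive than "tolerant of several amino acids", and the test peptide is a made-up 18-mer, not one of the peptides described above.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: every X position accepts any standard residue.
CONSENSUS = "CXXGWVGXCXXW"
MOTIF = re.compile(CONSENSUS.replace("X", f"[{AMINO_ACIDS}]"))


def find_motif(peptide: str):
    """Return the 0-based start of the first consensus match, or None."""
    match = MOTIF.search(peptide.upper())
    return match.start() if match else None


# Hypothetical 18-residue peptide containing the motif at offset 2.
print(find_motif("AACPYGWVGTCQEWKKGG"))
# Same peptide with the fixed V changed to A: no match.
print(find_motif("AACPYGWAGTCQEWKKGG"))
```

The two fixed cysteines in the pattern correspond to the positions that could form the disulfide bond; the pattern itself says nothing about whether the bond forms.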
We tested the value of a new library mutagenesis approach, called library enzymatic inverse PCR (LEIPCR), for expression-level enhancement of antibody Fv fragments produced in Escherichia coli. The production level of active, metal chelate-specific antibody from our constructs is limited by a low expression level of the second, heavy-chain cistron. To increase the production level, LEIPCR was applied to the wobble bases of the second cistron leader peptide. In LEIPCR mutagenesis, the entire plasmid is amplified using mutagenic primers with class-IIS restriction endonuclease (ENase) sites at their 5' ends. The PCR product is digested with the class-IIS ENase (here, BsaI; GGTCTCN↓NNNN↑), which removes its own recognition sequence, and the ends are self-ligated. Thus, LEIPCR can be used to make plasmid mutant libraries regardless of the nucleotide sequence, and independent of available ENase sites. The resulting library of 10^7 wobble mutants was screened for active Fv by a colony filter lift. A selected mutant was shown to produce between four- and elevenfold more active Fv than the wild type (wt), and fivefold more heavy chain. Mutations outside of the leader peptide were shown not to be involved. The mutated areas of the mRNAs of two different up-mutants may have less secondary structure than the wt. Thus, the sequence of the mRNA of the second leader peptide was limiting to the expression level of heavy-chain and active Fv.
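The way a class-IIS enzyme removes its own recognition sequence can be sketched in a few lines. BsaI recognizes GGTCTC and cuts one nucleotide downstream on the top strand and five on the bottom strand, leaving a 4-nt 5' overhang. The function below models only the top strand of a forward-oriented site, and the primer sequence is hypothetical, so it is an illustration rather than the authors' design.

```python
BSAI_SITE = "GGTCTC"


def bsai_top_strand_cut(seq: str):
    """Split seq at the BsaI top-strand cut (1 nt past the site).

    Returns (upstream, downstream, overhang), where overhang is the
    4-nt 5' overhang that begins the downstream fragment.
    Only the first forward-oriented site is considered.
    """
    seq = seq.upper()
    idx = seq.find(BSAI_SITE)
    if idx == -1:
        raise ValueError("no forward BsaI site found")
    cut = idx + len(BSAI_SITE) + 1  # site (6 nt) plus one spacer nucleotide
    return seq[:cut], seq[cut:], seq[cut:cut + 4]


# Hypothetical primer 5' end: clamp + site + spacer + overhang + template.
upstream, downstream, overhang = bsai_top_strand_cut("TTGGTCTCAACGTGGGCCC")
print(upstream)    # fragment carrying the recognition site, discarded
print(downstream)  # retained fragment, free of the site
print(overhang)    # sticky end available for self-ligation
```

Because the cut falls outside the recognition sequence, the overhang can be any four nucleotides, which is why the approach does not depend on restriction sites already present in the plasmid.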
We have developed a cloning vector for the expression of type I cytokine receptor, NO, extracellular domain (ECD)–mouse IgG1 Fc fusion proteins. The vector has a versatile polylinker that allows in-frame cloning of the receptor ECD with the mouse IgG1 sequence to encode a receptor ECD–IgG1 fusion construct. The receptor–IgG1 fusion proteins are transiently expressed in useful amounts following transfection of the expression vector into COS7 cells and G418 selection. The mouse IgG1 portion of the fusion protein provides a universal handle for purification on an affinity matrix and detection by anti-mouse IgG antibodies in ELISA or Western blot formats. The expressed receptor ECD–IgG1 fusion proteins bind their cognate ligands. To demonstrate that the fusion proteins have ligand binding affinities similar to those of the native receptors, the affinity constants (Kd) for EPOR, TNFR, IL-4R, and IL-6R–IgG1 fusion proteins were measured by surface plasmon resonance and shown to be in good agreement with published values. The TNFR–IgG1 fusion protein was employed in a demonstration of a novel ELISA format for detecting cytokine receptor binding to cytokine.