Extremely rare diseases are increasingly recognized owing to widespread, inexpensive genomic sequencing. Understanding the incidence of rare disease is important for appreciating its health impact and allocating resources for research. However, estimating the incidence of rare disease is challenging because the individual contributory alleles are themselves extremely rare. We propose a new method to determine the incidence of rare, severe, recessive disease in non-consanguineous populations that uses known allele frequencies, estimates the combined allele frequency of observed alleles, and estimates the number of causative alleles that are thus far unobserved in a disease cohort. Experiments on simulated and real data show that this approach is a feasible method to estimate the incidence of rare disease in European populations; however, owing to several limitations in our ability to assess the full spectrum of pathogenic mutations, it is best used to provide a lower bound on disease incidence.
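The core arithmetic behind an allele-frequency-based incidence estimate can be sketched as follows. This is a minimal illustration, not the authors' estimator: it assumes Hardy-Weinberg proportions (random mating, consistent with the non-consanguineous setting above) and full penetrance, and it takes the frequency of not-yet-observed alleles as a plain input rather than estimating it. The function name `recessive_incidence` and the example frequencies are hypothetical.

```python
def recessive_incidence(observed_allele_freqs, unobserved_freq=0.0):
    """Expected birth incidence of an autosomal recessive disease.

    Assumes Hardy-Weinberg proportions: if q is the combined frequency of
    all pathogenic alleles, affected individuals (two pathogenic alleles)
    occur with probability q ** 2.
    """
    q = sum(observed_allele_freqs) + unobserved_freq
    return q * q


# Hypothetical pathogenic allele frequencies, for illustration only.
observed = [1e-4, 5e-5, 2e-5]

lower_bound = recessive_incidence(observed)
with_unobserved = recessive_incidence(observed, unobserved_freq=3e-5)

print(f"observed alleles only: 1 in {1 / lower_bound:,.0f} births")
print(f"with unobserved alleles: 1 in {1 / with_unobserved:,.0f} births")
```

Because any allele left out of the sum can only lower q, an estimate built from observed alleles alone behaves as a lower bound, which matches the caveat in the abstract.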
Abstract
A new and automated method is presented for the analysis of high-resolution absorption spectra. Three established numerical methods are unified into one ‘artificial intelligence’ process: a genetic algorithm (Genetic Voigt Profile FIT, gvpfit); non-linear least-squares with parameter constraints (vpfit); and Bayesian model averaging (BMA). The method has broad application but here we apply it specifically to the problem of measuring the fine structure constant at high redshift. For this we need objectivity and reproducibility. gvpfit is also motivated by the importance of obtaining a large statistical sample of measurements of Δα/α. Interactive analyses are both time consuming and complex and automation makes obtaining a large sample feasible. In contrast to previous methodologies, we use BMA to derive results using a large set of models and show that this procedure is more robust than a human picking a single preferred model since BMA avoids the systematic uncertainties associated with model choice. Numerical simulations provide stringent tests of the whole process and we show using both real and simulated spectra that the unified automated fitting procedure out-performs a human interactive analysis. The method should be invaluable in the context of future instrumentation like ESPRESSO on the VLT and indeed future ELTs. We apply the method to the z_abs = 1.8389 absorber towards the z_em = 2.145 quasar J110325−264515. The derived constraint of Δα/α = (3.3 ± 2.9) × 10⁻⁶ is consistent with no variation and also consistent with the tentative spatial variation reported in Webb et al. and King et al.
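The idea of averaging over many candidate models, rather than picking one, can be sketched as below. This is an illustrative approximation only: it assumes each model supplies an estimate, a standard error, and an information-criterion value (for example AICc or BIC), and it converts the criterion differences into weights. The abstract does not say how the model weights are computed, so the weighting rule and the name `model_average` are assumptions.

```python
import math


def model_average(estimates, std_errors, ic_values):
    """Combine per-model estimates using information-criterion weights.

    Each weight is proportional to exp(-0.5 * (IC_k - IC_min)). The
    returned uncertainty includes both the within-model variance and the
    scatter of the estimates between models.
    """
    best = min(ic_values)
    raw = [math.exp(-0.5 * (ic - best)) for ic in ic_values]
    total = sum(raw)
    weights = [r / total for r in raw]

    mean = sum(w * x for w, x in zip(weights, estimates))
    variance = sum(
        w * (s ** 2 + (x - mean) ** 2)
        for w, x, s in zip(weights, estimates, std_errors)
    )
    return mean, math.sqrt(variance), weights


# Hypothetical fits of the same absorber with different velocity structures.
mean, sigma, weights = model_average(
    estimates=[3.0, 3.6, 2.8],
    std_errors=[2.9, 3.1, 3.0],
    ic_values=[1000.0, 1001.5, 1004.0],
)
print(mean, sigma, weights)
```

The between-model term `(x - mean) ** 2` is what folds model-choice uncertainty into the final error bar, instead of leaving it as an unquantified systematic.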
This study was undertaken to conduct a comprehensive investigation of the role of DNA damage repair (DDR) defects in poor-outcome ER+ disease.
Expression and mutational status of DDR genes in ER+ breast tumors were correlated with proliferative response in neoadjuvant aromatase inhibitor therapy trials (discovery dataset), with outcomes in METABRIC, TCGA, and Loi datasets (validation datasets), and in patient-derived xenografts. A causal relationship between candidate DDR genes and endocrine treatment response, and the underlying mechanism, was then tested in ER+ breast cancer cell lines.
Loss of expression of three genes, two from the nucleotide excision repair (NER) pathway (P < 0.001 and P = 0.01) and one from the base excision repair (BER) pathway (P = 0.04), was associated with endocrine treatment resistance in the discovery dataset and subsequently validated in independent patient cohorts. Complementary mutation analysis supported associations between mutations in NER and BER genes and reduced endocrine treatment response. A causal role for loss of the candidate genes in intrinsic endocrine resistance was experimentally validated in ER+ breast cancer cell lines and in ER+ patient-derived xenograft models. Loss of any of these genes induced endocrine treatment resistance by dysregulating the G1-S transition and, as a result, increased sensitivity to CDK4/6 inhibitors. A combined DDR signature score was developed that predicted poor outcome in multiple patient cohorts.
This report identifies DDR defects as a new class of endocrine treatment resistance drivers and indicates new avenues for predicting efficacy of CDK4/6 inhibition in the adjuvant treatment setting.
Breast cancer is one of the most commonly diagnosed cancers in women. While there are several effective therapies for breast cancer and important single gene prognostic/predictive markers, more than 40,000 women die from this disease every year. The increasing availability of large-scale genomic datasets provides opportunities for identifying factors that influence breast cancer survival in smaller, well-defined subsets. The purpose of this study was to investigate the genomic landscape of various breast cancer subtypes and its potential associations with clinical outcomes. We used statistical analysis of sequence data generated by the Cancer Genome Atlas initiative including somatic mutation load (SML) analysis, Kaplan–Meier survival curves, gene mutational frequency, and mutational enrichment evaluation to study the genomic landscape of breast cancer. We show that ER+, but not ER−, tumors with high SML associate with poor overall survival (HR = 2.02). Further, these high mutation load tumors are enriched for coincident mutations in both DNA damage repair and ER signature genes. While it is known that somatic mutations in specific genes affect breast cancer survival, this study is the first to identify that SML may constitute an important global signature for a subset of ER+ tumors prone to high mortality. Moreover, although somatic mutations in individual DNA damage genes affect clinical outcome, our results indicate that coincident mutations in DNA damage response and signature ER genes may prove more informative for ER+ breast cancer survival. Next generation sequencing may prove an essential tool for identifying pathways underlying poor outcomes and for tailoring therapeutic strategies.
Quasar absorption lines provide a precise test of whether the fine-structure constant, α, is the same in different places and through cosmological time. We present a new analysis of a large sample of quasar absorption-line spectra obtained using the Ultraviolet and Visual Echelle Spectrograph (UVES) on the Very Large Telescope (VLT) in Chile. We apply the many-multiplet method to derive values of Δα/α ≡ (α_z − α_0)/α_0 from 154 absorbers, and combine these values with 141 values from previous observations at the Keck Observatory in Hawaii. In the VLT sample, we find evidence that α increases with increasing cosmological distance from Earth. However, as previously shown, the Keck sample provided evidence for a smaller α in the distant absorption clouds. Upon combining the samples, an apparent variation of α across the sky emerges which is well represented by an angular dipole model pointing in the direction RA = 17.3 ± 1.0 h and Dec. = −61° ± 10°, with amplitude
. The dipole model is required at the 4.1σ statistical significance level over a simple monopole model where α is the same across the sky (but possibly different from the current laboratory value). The data sets reveal remarkable consistencies: (i) the directions of dipoles fitted to the VLT and Keck samples separately agree; (ii) the directions of dipoles fitted to z < 1.6 and z > 1.6 cuts of the combined VLT+Keck samples agree; and (iii) in the equatorial region of the dipole, where both the Keck and VLT samples contribute a significant number of absorbers, there is no evidence for inconsistency between Keck and VLT. The amplitude of the dipole is clearly larger at higher redshift. Assuming a dipole-only (i.e. no-monopole) model whose amplitude grows proportionally with 'lookback-time distance' (r = ct, where t is the lookback time), the amplitude is (1.1 ± 0.2) × 10⁻⁶ GLyr⁻¹ and the model is significant at the 4.2σ confidence level over the null model (Δα/α ≡ 0). We apply robustness checks and demonstrate that the dipole effect does not originate from a small subset of the absorbers or spectra. We present an analysis of systematic effects, and are unable to identify any single systematic effect which can emulate the observed variation in α. To the best of our knowledge, this result is not in conflict with any other observational or experimental result.
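An angular dipole-plus-monopole model of the kind described above can be written as Δα/α = A cos Θ + m, where Θ is the angle between a sight line and the dipole direction. The sketch below evaluates that form for the quoted pole direction (RA = 17.3 h, Dec. = −61°). The amplitude A is left as a caller-supplied parameter because its value does not appear in the text here; the function names are hypothetical, and the exact parameterization used in the paper is an assumption.

```python
import math


def angular_separation(ra1_h, dec1_deg, ra2_h, dec2_deg):
    """Angle in radians between two sky positions (RA in hours, Dec in degrees)."""
    ra1, ra2 = math.radians(15.0 * ra1_h), math.radians(15.0 * ra2_h)
    dec1, dec2 = math.radians(dec1_deg), math.radians(dec2_deg)
    cos_theta = (
        math.sin(dec1) * math.sin(dec2)
        + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2)
    )
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def dipole_model(ra_h, dec_deg, amplitude, monopole=0.0,
                 pole_ra_h=17.3, pole_dec_deg=-61.0):
    """Predicted value of A * cos(theta) + m toward a given sight line."""
    theta = angular_separation(ra_h, dec_deg, pole_ra_h, pole_dec_deg)
    return amplitude * math.cos(theta) + monopole


# Unit amplitude is a placeholder, not a measured value.
print(dipole_model(17.3, -61.0, amplitude=1.0))  # toward the pole
print(dipole_model(5.3, 61.0, amplitude=1.0))    # toward the anti-pole
print(dipole_model(17.3, 29.0, amplitude=1.0))   # 90 degrees from the pole
```

Sight lines near the pole and anti-pole carry the largest predicted signal of opposite sign, while those near the dipole equator predict a value close to the monopole term alone.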
Whole-exome sequencing is a diagnostic approach for the identification of molecular defects in patients with suspected genetic disorders.
We developed technical, bioinformatic, interpretive, and validation pipelines for whole-exome sequencing in a certified clinical laboratory to identify sequence variants underlying disease phenotypes in patients.
We present data on the first 250 probands for whom referring physicians ordered whole-exome sequencing. Patients presented with a range of phenotypes suggesting potential genetic causes. Approximately 80% were children with neurologic phenotypes. Insurance coverage was similar to that for established genetic tests. We identified 86 mutated alleles that were highly likely to be causative in 62 of the 250 patients, achieving a 25% molecular diagnostic rate (95% confidence interval, 20 to 31). Among the 62 patients, 33 had autosomal dominant disease, 16 had autosomal recessive disease, and 9 had X-linked disease. A total of 4 probands received two nonoverlapping molecular diagnoses, which potentially challenged the clinical diagnosis that had been made on the basis of history and physical examination. A total of 83% of the autosomal dominant mutant alleles and 40% of the X-linked mutant alleles occurred de novo. Recurrent clinical phenotypes occurred in patients with mutations that were highly likely to be causative in the same genes and in different genes responsible for genetically heterogeneous disorders.
Whole-exome sequencing identified the underlying genetic defect in 25% of consecutive patients referred for evaluation of a possible genetic condition. (Funded by the National Human Genome Research Institute.)
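The reported diagnostic rate (62 of 250 patients) and its confidence interval can be checked with a standard binomial interval. The sketch below uses the Wilson score interval, which is one common choice; the abstract does not state which interval method was used, so this is an assumption, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default ~95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


low, high = wilson_interval(62, 250)
print(f"rate = {62 / 250:.1%}, interval = {low:.1%} to {high:.1%}")
```

For 62 of 250 this gives an interval of roughly 19.9% to 30.5%, close to the reported interval of 20 to 31.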
Genetic disorders are a leading cause of morbidity and mortality in infants. Rapid whole-genome sequencing (rWGS) can diagnose genetic disorders in time to change acute medical or surgical management (clinical utility) and improve outcomes in acutely ill infants. We report a retrospective cohort study of acutely ill inpatient infants in a regional children's hospital from July 2016 to March 2017. Forty-two families received rWGS for etiologic diagnosis of genetic disorders. Probands also received standard genetic testing as clinically indicated. Primary end-points were rate of diagnosis, clinical utility, and healthcare utilization. The latter was modelled in six infants by comparing actual utilization with matched historical controls and/or counterfactual utilization had rWGS been performed at different time points. The diagnostic sensitivity of rWGS was 43% (eighteen of 42 infants) and 10% (four of 42 infants) for standard genetic tests (P = .0005). The rate of clinical utility of rWGS (31%, thirteen of 42 infants) was significantly greater than for standard genetic tests (2%, one of 42; P = .0015). Eleven (26%) infants with diagnostic rWGS avoided morbidity, one had a 43% reduction in likelihood of mortality, and one started palliative care. In six of the eleven infants, the changes in management reduced inpatient cost by $800,000-$2,000,000. These findings replicate a prior study of the clinical utility of rWGS in acutely ill inpatient infants, and demonstrate improved outcomes and net healthcare savings. rWGS merits consideration as a first tier test in this setting.
Charcot-Marie-Tooth (CMT) disease is a clinically and genetically heterogeneous distal symmetric polyneuropathy. Whole-exome sequencing (WES) of 40 individuals from 37 unrelated families with CMT-like peripheral neuropathy refractory to molecular diagnosis identified apparent causal mutations in ∼45% (17/37) of families. Three candidate disease genes are proposed, supported by a combination of genetic and in vivo studies. Aggregate analysis of mutation data revealed a significantly increased number of rare variants across 58 neuropathy-associated genes in subjects versus controls, confirmed in a second ethnically discrete neuropathy cohort, suggesting that mutation burden potentially contributes to phenotypic variability. Neuropathy genes shown to have highly penetrant Mendelizing variants (HPMVs) and implicated by burden in families were shown to interact genetically in a zebrafish assay exacerbating the phenotype established by the suppression of single genes. Our findings suggest that the combinatorial effect of rare variants contributes to disease burden and variable expressivity.
• WES of a neuropathy cohort identifies causal variants in ∼45% of patients
• Three candidate disease genes associated with peripheral neuropathy are proposed
• Evidence for genetic mutation burden is found in two independent cohorts
• Variant combinatorial effects may contribute to clinical variability and expressivity
Peripheral neuropathy is a clinically variable and genetically heterogeneous disease. In a cohort of patients, Gonzaga-Jauregui et al. have identified causative variants in ∼45% of the families studied, proposed candidate disease genes for an additional three families, and recognized a significant mutation burden in patients versus controls that likely contributes to phenotypic variability.
NGLY1 Deficiency is an ultra-rare, multisystemic disease caused by biallelic pathogenic NGLY1 variants. The aims of this study were to (1) characterize the variants and clinical features of the largest cohort of NGLY1 Deficiency patients reported to date, and (2) estimate the incidence of this disorder.
The Grace Science Foundation collected genotypic data from 74 NGLY1 Deficiency patients, of which 37 also provided phenotypic data. We analyzed NGLY1 variants and clinical features and estimated NGLY1 disease incidence in the United States (U.S.).
Analysis of patient genotypes, including 10 previously unreported NGLY1 variants, showed strong statistical enrichment for missense variants in the transglutaminase-like domain of NGLY1 (p < 1.96E-11). Caregivers reported global developmental delay, movement disorder, and alacrima in over 85% of patients. Some phenotypic differences were noted between males and females. Regression was reported for all patients over 14 years old by their caregivers. The calculated U.S. incidence of NGLY1 Deficiency was ~12 individuals born per year.
The estimated U.S. incidence of NGLY1 Deficiency indicates the disease may be more common than the number of patients reported in the literature suggests. Given the low frequency of most variants and the proportion of compound heterozygotes, genotype/phenotype correlations were not distinguishable.
Enrichment of loci by DNA hybridization-capture, followed by high-throughput sequencing, is an important tool in modern genetics. Currently, the most common targets for enrichment are the protein-coding exons represented by the consensus coding DNA sequence (CCDS). The CCDS, however, excludes many actual or computationally predicted coding exons present in other databases, such as RefSeq and Vega, and non-coding functional elements such as untranslated and regulatory regions. The number of variants per base pair (variant density) and our ability to interrogate regions outside of the CCDS regions are consequently less well understood.
We examine capture sequence data from outside of the CCDS regions and find that extremes of GC content that are present in different subregions of the genome can reduce the local capture sequence coverage to less than 50% relative to the CCDS. This effect is due to biases inherent in both the Illumina and SOLiD sequencing platforms that are exacerbated by the capture process. Interestingly, for two subregion types, microRNA and predicted exons, the capture process yields higher than expected coverage when compared to whole genome sequencing. Lastly, we examine the variation present in non-CCDS regions and find that predicted exons, as well as exonic regions specific to RefSeq and Vega, show much higher variant densities than the CCDS.
We show that regions outside of the CCDS perform less efficiently in capture sequence experiments. Further, we show that the variant density in computationally predicted exons is more than 2.5-times higher than that observed in the CCDS.