Abstract
The Drug-Gene Interaction Database (DGIdb, www.dgidb.org) is a web resource that provides information on drug-gene interactions and druggable genes from publications, databases, and other web-based sources. Drug, gene, and interaction data are normalized and merged into conceptual groups. The information contained in this resource is available to users through a straightforward search interface, an application programming interface (API), and TSV data downloads. DGIdb 4.0 is the latest major version release of this database. A primary focus of this update was integration with crowdsourced efforts, leveraging the Drug Target Commons for community-contributed interaction data, Wikidata to facilitate term normalization, and export to NDEx for drug-gene interaction network representations. Seven new sources have been added since the last major version release, bringing the total number of sources included to 41. Of the previously aggregated sources, 15 have been updated. DGIdb 4.0 also includes improvements to the process of drug normalization and grouping of imported sources. Other notable updates include the introduction of a more sophisticated Query Score for interaction search results, an updated Interaction Score, the inclusion of interaction directionality, and several additional improvements to search features, data releases, licensing documentation and the application framework.
The Open Regulatory Annotation database (ORegAnno) is a resource for curated regulatory annotation. It contains information about regulatory regions, transcription factor binding sites, RNA binding sites, regulatory variants, haplotypes, and other regulatory elements. ORegAnno differentiates itself from other regulatory resources by facilitating crowd-sourced interpretation and annotation of regulatory observations from the literature and highly curated resources. It contains a comprehensive annotation scheme that aims to describe both the elements and outcomes of regulatory events. Moreover, ORegAnno assembles these disparate data sources and annotations into a single, high quality catalogue of curated regulatory information. The current release is an update of the database previously featured in the NAR Database Issue, and now contains 1 948 307 records, across 18 species, with a combined coverage of 334 215 080 bp. Complete records, annotation, and other associated data are available for browsing and download at http://www.oreganno.org/.
Somatic mutations within non-coding regions and even exons may have unidentified regulatory consequences that are often overlooked in analysis workflows. Here we present RegTools (www.regtools.org), a computationally efficient, free, and open-source software package designed to integrate somatic variants from genomic data with splice junctions from bulk or single cell transcriptomic data to identify variants that may cause aberrant splicing. We apply RegTools to over 9000 tumor samples with both tumor DNA and RNA sequence data. RegTools discovers 235,778 events where a splice-associated variant significantly increases the splicing of a particular junction, across 158,200 unique variants and 131,212 unique junctions. To characterize these somatic variants and their associated splice isoforms, we annotate them with the Variant Effect Predictor, SpliceAI, and Genotype-Tissue Expression junction counts and compare our results to other tools that integrate genomic and transcriptomic data. While many events are corroborated by the aforementioned tools, the flexibility of RegTools also allows us to identify splice-associated variants in known cancer drivers, such as TP53, CDKN2A, and B2M, and other genes.
Circulating tumor DNA (ctDNA) in peripheral blood has been used to predict prognosis and therapeutic response for triple-negative breast cancer (TNBC) patients. However, previous approaches typically use large comprehensive panels of genes commonly mutated across all breast cancers. Given the reduction in sequencing costs and decreased turnaround times associated with panel generation, the objective of this study was to assess the use of custom micro-panels for tracking disease and predicting clinical outcomes for patients with TNBC. Paired tumor-normal samples from patients with TNBC were obtained at diagnosis (T0) and whole exome sequencing (WES) was performed to identify somatic variants associated with individual tumors. Custom micro-panels of 4-6 variants were created for each individual enrolled in the study. Peripheral blood was obtained at baseline, during Cycle 1 Day 3, at time of surgery, and at 3-6 month intervals after surgery to assess variant allele fraction (VAF) at different timepoints during disease course. The VAF was compared to clinical outcomes to evaluate the ability of custom micro-panels to predict pathological response, disease-free intervals, and patient relapse. A cohort of 50 individuals was evaluated for up to 48 months post-diagnosis of TNBC. In total, 33 patients did not achieve pathological complete response (pCR) and seven patients developed clinical relapse. For all patients who developed clinical relapse and had peripheral blood obtained ≤ 6 months prior to relapse (n = 4), the custom ctDNA micro-panels identified molecular relapse at an average of 4.3 months prior to clinical relapse. The custom ctDNA panel results were moderately associated with pCR such that during disease monitoring, only 11% of patients with pCR had a molecular relapse, whereas 47% of patients without pCR had a molecular relapse (Chi-Square; p-value = 0.10).
In this study, we show that a custom micro-panel of 4-6 markers can be effectively used to predict outcomes and monitor remission for patients with TNBC. These custom micro-panels show high sensitivity for detecting molecular relapse in advance of clinical relapse. The use of these panels could improve patient outcomes through early detection of relapse with preemptive intervention prior to symptom onset.
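The variant allele fraction tracked at each blood draw is a simple ratio of variant-supporting reads to total reads at a site. The following is a minimal sketch of that calculation for one micro-panel at one timepoint. The `PanelVariant` class, the variant labels, and all read counts are hypothetical and chosen only for illustration; the study's actual calling thresholds and pipeline are not described here.

```python
from dataclasses import dataclass


@dataclass
class PanelVariant:
    name: str          # hypothetical label for one micro-panel variant
    alt_reads: int     # reads supporting the somatic variant
    total_reads: int   # all reads covering the site


def vaf(variant: PanelVariant) -> float:
    """Variant allele fraction: variant-supporting reads / total reads."""
    if variant.total_reads == 0:
        return 0.0  # no coverage, so no evidence for the variant
    return variant.alt_reads / variant.total_reads


def panel_vafs(panel: list[PanelVariant]) -> dict[str, float]:
    """VAF for every variant in one patient's micro-panel at one timepoint."""
    return {v.name: vaf(v) for v in panel}


# Made-up read counts for a four-variant panel at a single blood draw.
example_panel = [
    PanelVariant("variant_1", 12, 4000),
    PanelVariant("variant_2", 0, 3500),
    PanelVariant("variant_3", 5, 5000),
    PanelVariant("variant_4", 0, 0),
]
print(panel_vafs(example_panel))
```

Repeating this per timepoint gives the VAF trajectory that is compared against clinical outcomes; any rule for declaring molecular relapse from those values would need a detection threshold that is not stated in the text.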
Following automated variant calling, manual review of aligned read sequences is required to identify a high-quality list of somatic variants. Despite widespread use in analyzing sequence data, methods to standardize manual review have not been described, resulting in high inter- and intralab variability.
This manual review standard operating procedure (SOP) consists of methods to annotate variants with four different calls and 19 tags. The calls indicate a reviewer's confidence in each variant and the tags indicate commonly observed sequencing patterns and artifacts that inform the manual review call. Four individuals were asked to classify variants prior to, and after, reading the SOP and accuracy was assessed by comparing reviewer calls with orthogonal validation sequencing.
After reading the SOP, average accuracy in somatic variant identification increased by 16.7% (p value = 0.0298) and average interreviewer agreement increased by 12.7% (p value < 0.001). Manual review conducted after reading the SOP did not significantly increase reviewer time.
This SOP supports and enhances manual somatic variant detection by improving reviewer accuracy while reducing the interreviewer variability for variant calling and annotation.
• Non-cirrhotic HCC genomically resembles cirrhotic HCC
• Comprehensive genome- and transcriptome-wide profiling allows detection of novel structural variants, fusions, and undiagnosed viral infections
• NR1H4 fusions may represent a novel mechanism for tumorigenesis in HCC
• Non-cirrhotic HCC is characterized by genotoxic mutational signatures and dysregulated liver metabolism
• Clinical history and comprehensive omic profiling incompletely explain underlying etiologies for non-cirrhotic HCC, highlighting the need for further research
Neoantigens are tumor-specific peptide sequences resulting from sources such as somatic DNA mutations. Upon loading onto major histocompatibility complex (MHC) molecules, they can trigger recognition by T cells. Accurate neoantigen identification is thus critical for both designing cancer vaccines and predicting response to immunotherapies. Neoantigen identification and prioritization relies on correctly predicting whether the presenting peptide sequence can successfully induce an immune response. Because most somatic mutations are single-nucleotide variants, changes between wild-type and mutated peptides are typically subtle and require cautious interpretation. A potentially underappreciated variable in neoantigen prediction pipelines is the mutation position within the peptide relative to its anchor positions for the patient's specific MHC molecules. Whereas a subset of peptide positions are presented to the T cell receptor for recognition, others are responsible for anchoring to the MHC, making these positional considerations critical for predicting T cell responses. We computationally predicted anchor positions for different peptide lengths for 328 common HLA alleles and identified unique anchoring patterns among them. Analysis of 923 tumor samples shows that 6 to 38% of neoantigen candidates are potentially misclassified and can be rescued using allele-specific knowledge of anchor positions. A subset of anchor results were orthogonally validated using protein crystallography structures. Representative anchor trends were experimentally validated using peptide-MHC stability assays and competition binding assays. By incorporating our anchor prediction results into neoantigen prediction pipelines, we hope to formalize, streamline, and improve the identification process for relevant clinical studies.
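To make the positional reasoning concrete, here is a minimal sketch of one plausible way allele-specific anchor positions could enter a candidate filter. It is not taken from the text: the decision rule, the `ANCHORS` lookup, the allele labels, the fallback anchor set, and the 500 nM binding cutoff are all assumptions made for illustration. The idea is that a mutation at a non-anchor position faces the T cell receptor, whereas a mutation at an anchor position is only likely to look novel to T cells if the wild-type peptide was not itself presented.

```python
# Hypothetical anchor lookup: (allele label, peptide length) -> 1-based anchor positions.
ANCHORS = {
    ("ALLELE_A", 9): {2, 9},
    ("ALLELE_B", 9): {1, 2, 9},
}
# Assumed fallback when an allele/length pair has no entry.
DEFAULT_ANCHORS = {2, 9}


def keep_candidate(allele: str, peptide_len: int, mut_pos: int,
                   mt_ic50: float, wt_ic50: float,
                   binding_cutoff: float = 500.0) -> bool:
    """Return True if the candidate passes this illustrative anchor-aware filter.

    mt_ic50 / wt_ic50 are predicted binding affinities (nM) of the mutant and
    wild-type peptides; lower means stronger binding. The cutoff is an assumption.
    """
    mt_binds = mt_ic50 < binding_cutoff
    wt_binds = wt_ic50 < binding_cutoff
    if not mt_binds:
        return False  # the mutant peptide is unlikely to be presented at all
    anchors = ANCHORS.get((allele, peptide_len), DEFAULT_ANCHORS)
    if mut_pos not in anchors:
        return True   # mutation is at a position exposed to the T cell receptor
    # Mutation sits at an anchor: keep only if the wild-type peptide binds poorly.
    return not wt_binds


# Same peptide-level predictions, different alleles, different outcomes.
print(keep_candidate("ALLELE_A", 9, mut_pos=1, mt_ic50=120.0, wt_ic50=150.0))
print(keep_candidate("ALLELE_B", 9, mut_pos=1, mt_ic50=120.0, wt_ic50=150.0))
```

Under these assumptions the same mutation at position 1 is kept for an allele where position 1 is not an anchor and rejected for an allele where it is, which is the kind of allele-dependent reclassification the text describes.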
• B-ALL development in the setting of lenalidomide treatment for MM is a distinct primary malignancy with high incidence of TP53 mutations.
• Chronic lenalidomide therapy appears to be capable of expanding rare hematopoietic cells with acquired TP53 mutations.
Patients with multiple myeloma (MM) who are treated with lenalidomide rarely develop a secondary B-cell acute lymphoblastic leukemia (B-ALL). The clonal and biological relationship between these sequential malignancies is not yet clear. We identified 17 patients with MM treated with lenalidomide, who subsequently developed B-ALL. Patient samples were evaluated through sequencing, cytogenetics/fluorescence in situ hybridization (FISH), immunohistochemical (IHC) staining, and immunoglobulin heavy chain (IgH) clonality assessment. Samples were assessed for shared mutations and recurrently mutated genes. Through whole exome sequencing and cytogenetics/FISH analysis of 7 paired samples (MM vs matched B-ALL), no mutational overlap between samples was observed. Unique dominant IgH clonotypes between the tumors were observed in 5 paired MM/B-ALL samples. Across all 17 B-ALL samples, 14 (83%) had a TP53 variant detected. Three MM samples with sufficient sequencing depth (>500×) revealed rare cells (average of 0.6% variant allele frequency, or 1.2% of cells) with the same TP53 variant identified in the subsequent B-ALL sample. A lack of mutational overlap between MM and B-ALL samples shows that B-ALL developed as a second malignancy arising from a founding population of cells that likely represented unrelated clonal hematopoiesis caused by a TP53 mutation. The recurrent variants in TP53 in the B-ALL samples suggest a common path for malignant transformation that may be similar to that of TP53-mutant, treatment-related acute myeloid leukemia. The presence of rare cells containing TP53 variants in bone marrow at the initiation of lenalidomide treatment suggests that cellular populations containing TP53 variants expand in the presence of lenalidomide to increase the likelihood of B-ALL development.
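The conversion from 0.6% variant allele frequency to 1.2% of cells follows if each mutant cell carries the variant on one of its two TP53 alleles. A minimal sketch of that arithmetic is below; the function name and the default of one mutant copy in a diploid cell are assumptions for illustration, and copy-number changes at the locus would alter the result.

```python
def cell_fraction_from_vaf(vaf: float, mutant_copies: int = 1, ploidy: int = 2) -> float:
    """Estimate the fraction of cells carrying a variant from its allele frequency.

    Assumes every cell has `ploidy` copies of the locus and each mutant cell
    carries the variant on `mutant_copies` of them (heterozygous by default).
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if mutant_copies < 1 or mutant_copies > ploidy:
        raise ValueError("mutant_copies must be between 1 and ploidy")
    # Each mutant cell contributes mutant_copies / ploidy of its alleles as variant.
    return min(1.0, vaf * ploidy / mutant_copies)


# 0.6% VAF for a heterozygous variant in diploid cells.
print(f"{cell_fraction_from_vaf(0.006):.1%}")  # prints 1.2%
```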