Differences in chromatin organization are key to the multiplicity of cell states that arise from a single genetic background, yet the landscapes of in vivo tissues remain largely uncharted. Here, we mapped chromatin genome-wide in a large and diverse collection of human tissues and stem cells. The maps yield unprecedented annotations of functional genomic elements and their regulation across developmental stages, lineages, and cellular environments. They also reveal global features of the epigenome, related to nuclear architecture, that also vary across cellular phenotypes. Specifically, developmental specification is accompanied by progressive chromatin restriction as the default state transitions from dynamic remodeling to generalized compaction. Exposure to serum in vitro triggers a distinct transition that involves de novo establishment of domains with features of constitutive heterochromatin. We describe how these global chromatin state transitions relate to chromosome and nuclear architecture, and discuss their implications for lineage fidelity, cellular senescence, and reprogramming.
► A resource of chromatin state maps for phenotypically diverse human tissues ► Annotation of regulatory elements across developmental stages and environments ► Developmental specification is accompanied by progressive chromatin restriction ► Chromatin architecture changes in cultured cells have implications for reprogramming
A large collection of chromatin state maps, representing human cells and tissues in vivo, reveals tissue-specific enhancer-like elements as well as repressive chromatin domains that arise during development or in response to nonphysiologic cellular environments and may present a hindrance to cellular reprogramming.
Artificial intelligence (AI) has been developed for echocardiography, although it has not yet been tested with blinding and randomization. Here we designed a blinded, randomized non-inferiority clinical trial (ClinicalTrials.gov ID: NCT05140642; no outside funding) of AI versus sonographer initial assessment of left ventricular ejection fraction (LVEF) to evaluate the impact of AI in the interpretation workflow. The primary end point was the change in the LVEF between initial AI or sonographer assessment and final cardiologist assessment, evaluated by the proportion of studies with substantial change (more than 5% change). From 3,769 echocardiographic studies screened, 274 studies were excluded owing to poor image quality. The proportion of studies substantially changed was 16.8% in the AI group and 27.2% in the sonographer group (difference of -10.4%, 95% confidence interval: -13.2% to -7.7%, P < 0.001 for non-inferiority, P < 0.001 for superiority). The mean absolute difference between final cardiologist assessment and independent previous cardiologist assessment was 6.29% in the AI group and 7.23% in the sonographer group (difference of -0.96%, 95% confidence interval: -1.34% to -0.54%, P < 0.001 for superiority). The AI-guided workflow saved time for both sonographers and cardiologists, and cardiologists were not able to distinguish between the initial assessments by AI versus the sonographer (blinding index of 0.088). For patients undergoing echocardiographic quantification of cardiac function, initial assessment of LVEF by AI was non-inferior to assessment by sonographers.
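The non-inferiority logic reported above can be illustrated with a minimal sketch. It assumes a Wald (normal-approximation) interval for the difference of two proportions; the abstract does not state which interval method, group sizes, or non-inferiority margin the trial used, so the counts and the `margin` value below are hypothetical and the function names are illustrative only.

```python
import math


def diff_in_proportions_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Wald confidence interval for p_a - p_b (two independent proportions)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


def is_non_inferior(upper_bound, margin):
    """Lower is better here, so non-inferiority holds when the upper
    bound of the CI for (new - reference) stays below the margin."""
    return upper_bound < margin


# Hypothetical counts, NOT the trial data: 170/1000 studies substantially
# changed in the new arm versus 270/1000 in the reference arm.
diff, lo, hi = diff_in_proportions_ci(170, 1000, 270, 1000)

# Hypothetical margin of 5 percentage points (an assumption).
non_inferior = is_non_inferior(hi, margin=0.05)
# Superiority additionally requires the whole interval to lie below zero.
superior = hi < 0.0

print(f"difference = {diff:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
print(f"non-inferior: {non_inferior}, superior: {superior}")
```

With an interval that lies entirely below zero, as in the reported result, both the non-inferiority and the superiority conditions are met under this reading.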
Mammalian gene regulation is dependent on tissue-specific enhancers that can act across large distances to influence transcriptional activity. Mapping experiments have identified hundreds of thousands of putative enhancers whose functionality is supported by cell type-specific chromatin signatures and striking enrichments for disease-associated sequence variants. However, these studies did not address the in vivo functions of the putative elements or their chromatin states and did not determine which genes, if any, a given enhancer regulates. Here we present a strategy to investigate endogenous regulatory elements by selectively altering their chromatin state using programmable reagents. Transcription activator-like (TAL) effector repeat domains fused to the LSD1 histone demethylase efficiently remove enhancer-associated chromatin modifications from target loci, without affecting control regions. We find that inactivation of enhancer chromatin by these fusion proteins frequently causes downregulation of proximal genes, revealing enhancer target genes. Our study demonstrates the potential of epigenome editing tools to characterize an important class of functional genomic elements.
Proteolysis is a major posttranslational regulator of biology inside and outside of cells. Broad identification of optimal cleavage sites and natural substrates of proteases is critical for drug discovery and to understand protease biology. Here, we present a method that employs two genetically encoded substrate phage display libraries coupled with next generation sequencing (SPD-NGS) that allows up to 10,000-fold deeper sequence coverage of the typical six- to eight-residue protease cleavage sites compared to state-of-the-art synthetic peptide libraries or proteomics. We applied SPD-NGS to two classes of proteases, the intracellular caspases, and the ectodomains of the sheddases, ADAMs 10 and 17. The first library (Lib 10AA) allowed us to identify 10⁴ to 10⁵ unique cleavage sites over a 1,000-fold dynamic range of NGS counts and produced consensus and optimal cleavage motifs based on position-specific scoring matrices. A second SPD-NGS library (Lib hP), which displayed virtually the entire human proteome tiled in contiguous 49 amino acid sequences with 25 amino acid overlaps, enabled us to identify candidate human proteome sequences. We identified up to 10⁴ natural linear cut sites, depending on the protease, and captured most of the examples previously identified by proteomics and predicted 10- to 100-fold more. Structural bioinformatics was used to facilitate the identification of candidate natural protein substrates. SPD-NGS is rapid, reproducible, simple to perform and analyze, inexpensive, and renewable, with unprecedented depth of coverage for substrate sequences, and is an important tool for protease biologists interested in protease specificity for specific assays and inhibitors and to facilitate identification of natural protein substrates.
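The tiling scheme described for Lib hP (49-residue windows with 25-residue overlaps, hence a stride of 24) can be sketched as follows. This is an illustration only: the function name `tile_sequence` is hypothetical, the input is a toy sequence, and the handling of the C-terminal remainder (adding one final tile flush with the end) is an assumption, since the abstract does not say how sequence ends were treated.

```python
def tile_sequence(seq, tile_len=49, overlap=25):
    """Split a protein sequence into overlapping tiles.

    Returns a list of (start_index, tile) pairs. Consecutive tiles
    share `overlap` residues, so the stride is tile_len - overlap.
    """
    step = tile_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than tile_len")
    if len(seq) <= tile_len:
        # Short proteins yield a single (possibly shorter) tile.
        return [(0, seq)]

    tiles = [
        (start, seq[start:start + tile_len])
        for start in range(0, len(seq) - tile_len + 1, step)
    ]
    # Assumption: cover any residues left over at the C-terminus with
    # one extra tile aligned to the end of the sequence.
    last_start = len(seq) - tile_len
    if tiles[-1][0] != last_start:
        tiles.append((last_start, seq[last_start:]))
    return tiles


# Toy 100-residue sequence (not a real protein).
toy_protein = "ACDEFGHIKLMNPQRSTVWY" * 5
tiles = tile_sequence(toy_protein)
for start, tile in tiles:
    print(start, tile)
```

Because every residue falls inside at least one tile and neighbouring tiles share 25 residues, a typical six- to eight-residue cleavage site is displayed whole in at least one tile.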
The large volume of data used in cancer diagnosis presents a unique opportunity for deep learning algorithms, which improve in predictive performance with increasing data. When applying deep learning to cancer diagnosis, the goal is often to learn how to classify an input sample (such as images or biomarkers) into predefined categories (such as benign or cancerous). In this article, we examine examples of how deep learning algorithms have been implemented to make predictions related to cancer diagnosis using clinical, radiological, and pathological image data. We present a systematic approach for evaluating the development and application of clinical deep learning algorithms. Based on these examples and the current state of deep learning in medicine, we discuss the future possibilities in this space and outline a roadmap for implementations of deep learning in cancer diagnosis.
Accurate assessment of cardiac function is crucial for the diagnosis of cardiovascular disease, screening for cardiotoxicity and decisions regarding the clinical management of patients with a critical illness. However, human assessment of cardiac function focuses on a limited sampling of cardiac cycles and has considerable inter-observer variability despite years of training. Here, to overcome this challenge, we present a video-based deep learning algorithm, EchoNet-Dynamic, that surpasses the performance of human experts in the critical tasks of segmenting the left ventricle, estimating ejection fraction and assessing cardiomyopathy. Trained on echocardiogram videos, our model accurately segments the left ventricle with a Dice similarity coefficient of 0.92, predicts ejection fraction with a mean absolute error of 4.1% and reliably classifies heart failure with reduced ejection fraction (area under the curve of 0.97). In an external dataset from another healthcare system, EchoNet-Dynamic predicts the ejection fraction with a mean absolute error of 6.0% and classifies heart failure with reduced ejection fraction with an area under the curve of 0.96. Prospective evaluation with repeated human measurements confirms that the model has variance that is comparable to or less than that of human experts. By leveraging information across multiple cardiac cycles, our model can rapidly identify subtle changes in ejection fraction, is more reproducible than human evaluation and lays the foundation for precise diagnosis of cardiovascular disease in real time. As a resource to promote further innovation, we also make publicly available a large dataset of 10,030 annotated echocardiogram videos.
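The two headline metrics above, the Dice similarity coefficient for segmentation and the mean absolute error for ejection fraction, have standard definitions that can be sketched in a few lines. The masks and ejection fraction values below are toy inputs invented for illustration, not data from the study, and the function names are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary 2D masks."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired measurements."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy left-ventricle masks (1 = inside the ventricle).
predicted_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
reference_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
dice = dice_coefficient(predicted_mask, reference_mask)

# Toy ejection fractions in percentage points.
predicted_ef = [55.0, 40.0, 62.0]
reference_ef = [58.0, 35.0, 60.0]
mae = mean_absolute_error(predicted_ef, reference_ef)

print(f"Dice = {dice:.2f}, MAE = {mae:.2f} percentage points")
```

A Dice value of 1 means the predicted and reference masks coincide exactly, while the mean absolute error is expressed in the same units as ejection fraction itself.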
Echocardiography uses ultrasound technology to capture high temporal and spatial resolution images of the heart and surrounding structures, and is the most common imaging modality in cardiovascular medicine. Using convolutional neural networks on a large new dataset, we show that deep learning applied to echocardiography can identify local cardiac structures, estimate cardiac function, and predict systemic phenotypes that modify cardiovascular risk but are not readily identifiable to human interpretation. Our deep learning model, EchoNet, accurately identified the presence of pacemaker leads (AUC = 0.89), enlarged left atrium (AUC = 0.86), left ventricular hypertrophy (AUC = 0.75), left ventricular end systolic and diastolic volumes (R² = 0.74 and R² = 0.70), and ejection fraction (R² = 0.50), as well as predicted systemic phenotypes of age (R² = 0.46), sex (AUC = 0.88), weight (R² = 0.56), and height (R² = 0.33). Interpretation analysis validates that EchoNet shows appropriate attention to key cardiac structures when performing human-explainable tasks and highlights hypothesis-generating regions of interest when predicting systemic phenotypes difficult for human interpretation. Machine learning on echocardiography images can streamline repetitive tasks in the clinical workflow, provide preliminary interpretation in areas with insufficient qualified cardiologists, and predict phenotypes challenging for human evaluation.
The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a diffusion-based generative model that generates protein backbone structures via a procedure inspired by the natural folding process. We describe a protein backbone structure as a sequence of angles capturing the relative orientation of the constituent backbone atoms, and generate structures by denoising from a random, unfolded state towards a stable folded structure. Not only does this mirror how proteins natively twist into energetically favorable conformations, the inherent shift and rotational invariance of this representation crucially alleviates the need for more complex equivariant networks. We train a denoising diffusion probabilistic model with a simple transformer backbone and demonstrate that our resulting model unconditionally generates highly realistic protein structures with complexity and structural patterns akin to those of naturally-occurring proteins. As a useful resource, we release an open-source codebase and trained models for protein structure diffusion.
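The invariance property mentioned above can be checked numerically. The sketch below computes a single dihedral angle from four points and confirms that it is unchanged by a rotation followed by a translation. It is not the model's actual featurization: the coordinates are toy values, all helper names are hypothetical, and only one of the several angle types such a representation would use is shown.

```python
import math


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def scale(u, s):
    return (u[0] * s, u[1] * s, u[2] * s)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) about the p1 -> p2 bond."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    b1 = scale(b1, 1.0 / math.sqrt(dot(b1, b1)))
    # Project the outer bonds onto the plane perpendicular to b1.
    v = sub(b0, scale(b1, dot(b0, b1)))
    w = sub(b2, scale(b1, dot(b2, b1)))
    return math.atan2(dot(cross(b1, v), w), dot(v, w))


def rigid_transform(p, angle_z, angle_x, shift):
    """Rotate about z, then about x, then translate."""
    x, y, z = p
    x, y = (x * math.cos(angle_z) - y * math.sin(angle_z),
            x * math.sin(angle_z) + y * math.cos(angle_z))
    y, z = (y * math.cos(angle_x) - z * math.sin(angle_x),
            y * math.sin(angle_x) + z * math.cos(angle_x))
    return (x + shift[0], y + shift[1], z + shift[2])


# Toy coordinates for four consecutive atoms (not from a real protein).
atoms = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
moved = [rigid_transform(p, 0.7, -1.3, (5.0, -2.0, 3.5)) for p in atoms]

angle_before = dihedral(*atoms)
angle_after = dihedral(*moved)
print(f"before: {angle_before:.6f} rad, after: {angle_after:.6f} rad")
```

Because the angle depends only on the relative geometry of the atoms, a network operating on such angles never sees the global pose, which is the reason given above for not needing an equivariant architecture.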
The three-dimensional structure of RNA molecules plays a critical role in a wide range of cellular processes encompassing functions from riboswitches to epigenetic regulation. These RNA structures are incredibly dynamic and can indeed be described aptly as an ensemble of structures that shifts in distribution depending on different cellular conditions. Thus, the computational prediction of RNA structure poses a unique challenge, even as computational protein folding has seen great advances. In this review, we focus on a variety of machine learning-based methods that have been developed to predict RNA molecules’ secondary structure, as well as more complex tertiary structures. We survey commonly used modeling strategies, and how many are inspired by or incorporate thermodynamic principles. We discuss the shortcomings that various design decisions entail and propose future directions that could build off these methods to yield more robust, accurate RNA structure predictions.
Significant interobserver and interstudy variability occurs for left ventricular (LV) functional indices despite standardization of measurement techniques. Artificial intelligence models trained on adult echocardiograms are not likely to be applicable to a pediatric population. We present EchoNet-Peds, a video-based deep learning algorithm, which matches human expert performance of LV segmentation and ejection fraction (EF).
A large pediatric data set of 4,467 echocardiograms was used to develop EchoNet-Peds. EchoNet-Peds was trained on 80% of the data for segmentation of the left ventricle and estimation of LVEF. The remaining 20% was used to fine-tune and validate the algorithm.
In both apical 4-chamber and parasternal short-axis views, EchoNet-Peds segments the left ventricle with a Dice similarity coefficient of 0.89. EchoNet-Peds estimates EF with a mean absolute error of 3.66% and can routinely identify pediatric patients with systolic dysfunction (area under the curve of 0.95). Trained on pediatric echocardiograms, EchoNet-Peds estimated EF significantly better (P < .001) than an adult model applied to the same data.
Accurate, rapid automation of EF assessment and recognition of systolic dysfunction in a pediatric population are feasible using EchoNet-Peds with the potential for far-reaching clinical impact. In addition, the first large pediatric data set of annotated echocardiograms is now publicly available for efforts to develop pediatric-specific artificial intelligence algorithms.