The volume and complexity of diagnostic imaging are increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases, and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.
The early prediction of deterioration could play an important role in supporting healthcare professionals, as an estimated 11% of deaths in hospital follow a failure to promptly recognize and treat deteriorating patients. Achieving this goal requires predictions of patient risk that are continuously updated and accurate, and delivered at an individual level with sufficient context and enough time to act. Here we develop a deep learning approach for the continuous risk prediction of future deterioration in patients, building on recent work that models adverse events from electronic health records and using acute kidney injury, a common and potentially life-threatening condition, as an exemplar. Our model was developed on a large, longitudinal dataset of electronic health records covering diverse clinical environments, comprising 703,782 adult patients across 172 inpatient and 1,062 outpatient sites. Our model predicts 55.8% of all inpatient episodes of acute kidney injury, and 90.2% of all acute kidney injuries that required subsequent administration of dialysis, with a lead time of up to 48 h and a ratio of two false alerts for every true alert. In addition to predicting future acute kidney injury, our model provides confidence assessments and a list of the clinical features most salient to each prediction, alongside predicted future trajectories for clinically relevant blood tests. Although the recognition and prompt treatment of acute kidney injury is known to be challenging, our approach may offer opportunities for identifying patients at risk within a time window that enables early treatment.
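The reported operating point (55.8% sensitivity with two false alerts for every true alert) can be sanity-checked with a few lines of arithmetic. The counts below are illustrative placeholders, not the study's data; the sketch only shows how the three quantities relate:

```python
def alert_stats(tp: int, fp: int, fn: int) -> dict:
    """Operating-point statistics for a binary alerting system.

    tp: true alerts, fp: false alerts, fn: missed episodes.
    """
    return {
        "sensitivity": tp / (tp + fn),       # fraction of true episodes caught
        "false_alert_ratio": fp / tp,        # false alerts per true alert
        "precision": tp / (tp + fp),         # positive predictive value
    }

# A 2:1 false-alert ratio implies a precision of 1/3 regardless of scale.
stats = alert_stats(tp=100, fp=200, fn=79)
print(stats)  # sensitivity ≈ 0.559, false_alert_ratio = 2.0, precision ≈ 0.333
```

Note that sensitivity and the false-alert ratio are independent axes: the first depends on missed episodes, the second only on the alert stream itself.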
We describe the operation and improvement of AlphaFold, the system entered by the team AlphaFold2 in the “human” category of the 14th Critical Assessment of Protein Structure Prediction (CASP14). The AlphaFold system entered in CASP14 is entirely different from the one entered in CASP13. It used a novel end-to-end deep neural network trained to produce protein structures from amino acid sequences, multiple sequence alignments, and homologous proteins. In the assessors' ranking by summed z-scores (>2.0), AlphaFold scored 244.0, compared with 90.8 for the next best group. The predictions made by AlphaFold had a median domain GDT_TS of 92.4; this is the first time this level of average accuracy has been achieved during CASP, especially on the more difficult Free Modeling targets, and it represents a significant improvement in the state of the art in protein structure prediction. We report how AlphaFold was run as a human team during CASP14 and subsequently improved such that it now achieves an equivalent level of performance without intervention, opening the door to highly accurate large-scale structure prediction.
Progression to exudative 'wet' age-related macular degeneration (exAMD) is a major cause of visual deterioration. In patients diagnosed with exAMD in one eye, we introduce an artificial intelligence (AI) system to predict progression to exAMD in the second eye. By combining models based on three-dimensional (3D) optical coherence tomography images and corresponding automatic tissue maps, our system predicts conversion to exAMD within a clinically actionable 6-month time window, achieving a per-volumetric-scan sensitivity of 80% at 55% specificity, and 34% sensitivity at 90% specificity. This level of performance corresponds to true positives in 78% and 41% of individual eyes, and false positives in 56% and 17% of individual eyes, at the high-sensitivity and high-specificity points, respectively. Moreover, we show that automatic tissue segmentation can identify anatomical changes before conversion, as well as high-risk subgroups. This AI system overcomes substantial interobserver variability in expert predictions, performing better than five out of six experts, and demonstrates the potential of using AI to predict disease progression.
Background
Over half a million individuals are diagnosed with head and neck cancer each year globally. Radiotherapy is an important curative treatment for this disease, but it requires time-consuming manual delineation of radiosensitive organs at risk. This planning process can delay treatment while also introducing interoperator variability, resulting in downstream differences in radiation dose. Although auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying, and achieving expert performance remain.
Objective
Adopting a deep learning approach, we aim to demonstrate a 3D U-Net architecture that achieves expert-level performance in delineating 21 distinct head and neck organs at risk commonly segmented in clinical practice.
Methods
The model was trained on a data set of 663 deidentified computed tomography scans acquired in routine clinical practice, with segmentations both taken from clinical practice and created by experienced radiographers as part of this research, all in accordance with consensus organ-at-risk definitions.
Results
We demonstrated the model's clinical applicability by assessing its performance on a test set of 21 computed tomography scans from clinical practice, each with 21 organs at risk segmented by 2 independent experts. We also introduced the surface Dice similarity coefficient, a new metric for the comparison of organ delineation that quantifies the deviation between organ-at-risk surface contours rather than volumes, better reflecting the clinical task of correcting errors in automated organ segmentations. The model's generalizability was then demonstrated on 2 distinct open-source data sets, reflecting centers and countries different from those used in model training.
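To give an intuition for the surface Dice similarity coefficient, the sketch below computes, for two binary masks, the fraction of surface points that lie within a chosen tolerance of the other mask's surface. This is a simplified, brute-force 2D illustration of the idea only, not the metric's reference implementation, and the mask shapes and voxel spacing are invented for the example:

```python
import numpy as np
from scipy import ndimage

def surface_points(mask):
    """Coordinates of boundary voxels: the mask minus its binary erosion."""
    mask = mask.astype(bool)
    return np.argwhere(mask & ~ndimage.binary_erosion(mask))

def surface_dice(mask_a, mask_b, tolerance_mm=1.0, spacing_mm=1.0):
    """Fraction of both surfaces lying within `tolerance_mm` of the other surface."""
    pts_a = surface_points(mask_a) * spacing_mm
    pts_b = surface_points(mask_b) * spacing_mm
    # Brute-force nearest-surface distances (fine for small toy masks).
    d_ab = np.linalg.norm(pts_a[:, None] - pts_b[None], axis=-1).min(axis=1)
    d_ba = np.linalg.norm(pts_b[:, None] - pts_a[None], axis=-1).min(axis=1)
    close = (d_ab <= tolerance_mm).sum() + (d_ba <= tolerance_mm).sum()
    return close / (len(pts_a) + len(pts_b))

# Two 6x6 squares shifted by one voxel agree perfectly at a 1 mm tolerance,
# even though their volumetric Dice overlap is imperfect.
a = np.zeros((12, 12), dtype=bool); a[2:8, 2:8] = True
b = np.zeros((12, 12), dtype=bool); b[2:8, 3:9] = True
print(surface_dice(a, b, tolerance_mm=1.0))  # 1.0
```

The example illustrates why a surface-based metric suits the clinical task: a small rigid shift that a clinician would not bother correcting scores perfectly once it falls inside the tolerance, whereas a volume-overlap metric would still penalize it.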
Conclusions
Deep learning is an effective and clinically applicable technique for the segmentation of the head and neck anatomy for radiotherapy. With appropriate validation studies and regulatory approvals, this system could improve the efficiency, consistency, and safety of radiotherapy pathways.
Deep learning has the potential to transform health care; however, substantial expertise is required to train such models. We sought to evaluate the utility of automated deep learning software for the development of medical image diagnostic classifiers by health-care professionals with no coding and no deep learning expertise.
We used five publicly available open-source datasets: retinal fundus images (MESSIDOR); optical coherence tomography (OCT) images (Guangzhou Medical University and Shiley Eye Institute, version 3); images of skin lesions (Human Against Machine, HAM10000); and both paediatric and adult chest x-ray (CXR) images (Guangzhou Medical University and Shiley Eye Institute, version 3, and the National Institutes of Health (NIH) dataset, respectively). Each dataset was fed separately into a neural architecture search framework, hosted through Google Cloud AutoML, that automatically developed a deep learning architecture to classify common diseases. Sensitivity (recall), specificity, and positive predictive value (precision) were used to evaluate the diagnostic properties of the models, and discriminative performance was assessed using the area under the precision-recall curve (AUPRC). For the deep learning model developed on a subset of the HAM10000 dataset, we performed external validation using the Edinburgh Dermofit Library dataset.
Diagnostic properties and discriminative performance from internal validations were high in the binary classification tasks (sensitivity 73·3-97·0%; specificity 67-100%; AUPRC 0·87-1·00). In the multiple classification tasks, the diagnostic properties ranged from 38% to 100% for sensitivity and from 67% to 100% for specificity. The discriminative performance in terms of AUPRC ranged from 0·57 to 1·00 in the five automated deep learning models. In an external validation using the Edinburgh Dermofit Library dataset, the automated deep learning model showed an AUPRC of 0·47, with a sensitivity of 49% and a positive predictive value of 52%.
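The headline metrics here (sensitivity, specificity, positive predictive value, and AUPRC) can be computed from first principles. The sketch below uses toy labels and scores from a hypothetical classifier, and estimates the AUPRC as average precision, one common estimator of the area under the precision-recall curve:

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity and PPV (precision) from binary labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum(not t and not p for t, p in zip(y_true, y_pred))
    fp = sum(not t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp)}

def average_precision(y_true, y_score):
    """AUPRC estimated as average precision over ranked predictions."""
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    tp = fp = ap = 0
    for i in order:
        if y_true[i]:
            tp += 1
            ap += tp / (tp + fp)   # precision at each recall step
        else:
            fp += 1
    return ap / sum(y_true)

# Toy example: 8 cases, 4 diseased, scores from a hypothetical classifier.
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]
y_pred  = [int(s >= 0.5) for s in y_score]
print(diagnostic_metrics(y_true, y_pred))   # sensitivity/specificity/PPV all 0.75
print(average_precision(y_true, y_score))   # ≈ 0.917
```

Unlike the area under the ROC curve, the AUPRC depends on disease prevalence in the evaluation set, which is one reason the same model can score very differently on internal and external validation data.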
All models, except the automated deep learning model trained on the multilabel classification task of the NIH CXR14 dataset, showed discriminative performance and diagnostic properties comparable to state-of-the-art deep learning algorithms, whereas performance in the external validation study was low. The quality of the open-access datasets (including insufficient information about patient flow and demographics) and the absence of measures of precision, such as confidence intervals, constituted the major limitations of this study. The availability of automated deep learning platforms provides an opportunity for the medical community to enhance its understanding of model development and evaluation. Although deriving classification models without a deep understanding of the mathematical, statistical, and programming principles involved is attractive, comparable performance to expertly designed models is limited to more elementary classification tasks. Furthermore, care should be taken to adhere to ethical principles when using these automated models, to avoid discrimination and harm. Future studies should compare several application programming interfaces on thoroughly curated datasets.
National Institute for Health Research and Moorfields Eye Charity.
There are almost two million people in the United Kingdom living with sight loss, including around 360,000 who are registered as blind or partially sighted. Sight-threatening diseases such as diabetic retinopathy and age-related macular degeneration have contributed to the 40% increase in outpatient attendances over the last decade, but are amenable to early detection and monitoring. With early and appropriate intervention, blindness may be prevented in many cases.
Ophthalmic imaging provides a way to diagnose and objectively assess the progression of a number of pathologies, including neovascular ("wet") age-related macular degeneration (wet AMD) and diabetic retinopathy. Two methods of imaging are commonly used: digital photographs of the fundus (the 'back' of the eye) and optical coherence tomography (OCT, a modality that uses light waves in a similar way to how ultrasound uses sound waves). Changes in population demographics and expectations, together with the changing pattern of chronic diseases, create a rising demand for such imaging. Meanwhile, interrogation of such images is time-consuming, costly, and prone to human error. The application of novel analysis methods may provide a solution to these challenges.
This research will focus on applying novel machine learning algorithms to automatic analysis of both digital fundus photographs and OCT in Moorfields Eye Hospital NHS Foundation Trust patients.
Through analysis of the images used in ophthalmology, along with relevant clinical and demographic information, Google DeepMind Health will investigate the feasibility of automated grading of digital fundus photographs and OCT, and will provide novel quantitative measures for specific disease features and for monitoring therapeutic success.
Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort, the structures of around 100,000 unique proteins have been determined, but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence, the structure prediction component of the 'protein folding problem', has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural-network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multiple sequence alignments, into the design of the deep learning algorithm.
Acute kidney injury (AKI), an abrupt deterioration in kidney function, is defined by changes in urine output or serum creatinine. AKI is common (affecting up to 20% of acute hospital admissions in the United Kingdom), associated with significant morbidity and mortality, and expensive (excess costs to the National Health Service in England alone may exceed £1 billion per year). NHS England has mandated the implementation of an automated algorithm that detects AKI from changes in serum creatinine and alerts clinicians. It is uncertain, however, whether 'alerting' alone improves care quality.
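The core of a creatinine-based detection rule can be sketched in a few lines. This is a simplified KDIGO-style rule for illustration only, using the widely published thresholds (a rise of at least 26 µmol/L within 48 h, or at least 1.5 times the patient's baseline); the actual NHS England algorithm includes baseline-selection and reporting logic not shown here:

```python
def flags_aki(current_umol_l: float, baseline_umol_l: float,
              rise_within_48h_umol_l: float) -> bool:
    """Simplified KDIGO-style AKI flag from serum creatinine.

    Flags AKI if creatinine rose by >= 26 umol/L within 48 h, or the
    current value is >= 1.5x the patient's baseline. Illustrative only;
    the NHS England algorithm has additional baseline-selection rules.
    """
    return (rise_within_48h_umol_l >= 26.0
            or current_umol_l >= 1.5 * baseline_umol_l)

print(flags_aki(current_umol_l=150, baseline_umol_l=80,
                rise_within_48h_umol_l=10))   # True: 1.875x baseline
print(flags_aki(current_umol_l=100, baseline_umol_l=90,
                rise_within_48h_umol_l=5))    # False: neither criterion met
```

Either criterion alone can fire: a fast absolute rise catches AKI in patients with low baseline creatinine, while the relative criterion catches slower deterioration from a known baseline.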
We have thus developed a digitally-enabled care pathway as a clinical service to inpatients in the Royal Free Hospital (RFH), a large London hospital. This pathway incorporates a mobile software application - the "Streams-AKI" app, developed by DeepMind Health - that applies the NHS AKI algorithm to routinely collected serum creatinine data in hospital inpatients. Streams-AKI alerts clinicians to potential AKI cases, furnishing them with a trend view of kidney function alongside other relevant data, in real-time, on a mobile device. A clinical response team comprising nephrologists and critical care nurses responds to these AKI alerts by reviewing individual patients and administering interventions according to existing clinical practice guidelines.
We propose a mixed methods service evaluation of the implementation of this care pathway. This evaluation will assess how the care pathway meets the health and care needs of service users (RFH inpatients), in terms of clinical outcome, processes of care, and NHS costs. It will also seek to assess acceptance of the pathway by members of the response team and wider hospital community. All analyses will be undertaken by the service evaluation team from UCL (Department of Applied Health Research) and St George's, University of London (Population Health Research Institute).