Existing screening tools for early detection of autism are expensive, cumbersome, time- intensive, and sometimes fall short in predictive value. In this work, we sought to apply Machine Learning (ML) ...to gold standard clinical data obtained across thousands of children at-risk for autism spectrum disorder to create a low-cost, quick, and easy to apply autism screening tool.
Two algorithms are trained to identify autism, one based on short, structured parent-reported questionnaires and the other on tagging key behaviors from short, semi-structured home videos of children. A combination algorithm is then used to combine the results into a single assessment of higher accuracy. To overcome the scarcity, sparsity, and imbalance of training data, we apply novel feature selection, feature engineering, and feature encoding techniques. We allow for inconclusive determination where appropriate in order to boost screening accuracy when conclusive. The performance is then validated in a controlled clinical study.
A multi-center clinical study of n = 162 children is performed to ascertain the performance of these algorithms and their combination. We demonstrate a significant accuracy improvement over standard screening tools in measurements of AUC, sensitivity, and specificity.
These findings suggest that a mobile, machine learning process is a reliable method for detection of autism outside of clinical settings. A variety of confounding factors in the clinical analysis are discussed along with the solutions engineered into the algorithms. Final results are statistically limited and will benefit from future clinical studies to extend the sample size.
Recent research has uncovered an important role for de novo variation in neurodevelopmental disorders. Using aggregated data from 9,246 families with autism spectrum disorder, intellectual ...disability, or developmental delay, we found that ∼1/3 of de novo variants are independently present as standing variation in the Exome Aggregation Consortium's cohort of 60,706 adults, and these de novo variants do not contribute to neurodevelopmental risk. We further used a loss-of-function (LoF)-intolerance metric, pLI, to identify a subset of LoF-intolerant genes containing the observed signal of associated de novo protein-truncating variants (PTVs) in neurodevelopmental disorders. LoF-intolerant genes also carry a modest excess of inherited PTVs, although the strongest de novo-affected genes contribute little to this excess, thus suggesting that the excess of inherited risk resides in lower-penetrant genes. These findings illustrate the importance of population-based reference cohorts for the interpretation of candidate pathogenic variants, even for analyses of complex diseases and de novo variation.
The Mobile Autism Risk Assessment (MARA) is a new, electronically administered, 7-question autism spectrum disorder (ASD) screen to triage those at highest risk for ASD. Children 16 months–17 years ...(N = 222) were screened during their first visit in a developmental-behavioral pediatric clinic. MARA scores were compared to diagnosis from the clinical encounter. Participant median age was 5.8 years, 76.1 % were male, and most participants had an intelligence/developmental quotient score >85; 69 of the participants (31 %) received a clinical diagnosis of ASD. The sensitivity of the MARA in detecting ASD was 89.9 % 95 % CI = 82.7–97; the specificity was 79.7 % 95 % CI = 73.4–86.1. In a high-risk clinical setting, the MARA shows promise as a screen to distinguish ASD from other developmental/behavioral disorders.
The burden of comorbidity in Autism Spectrum Disorder (ASD) is substantial. The symptoms of autism overlap with many other human conditions, reflecting common molecular pathologies suggesting that ...cross-disorder analysis will help prioritize autism gene candidates. Genes in the intersection between autism and related conditions may represent nonspecific indicators of dysregulation while genes unique to autism may play a more causal role. Thorough literature review allowed us to extract 125 ICD-9 codes comorbid to ASD that we mapped to 30 specific human disorders. In the present work, we performed an automated extraction of genes associated with ASD and its comorbid disorders, and found 1031 genes involved in ASD, among which 262 are involved in ASD only, with the remaining 779 involved in ASD and at least one comorbid disorder. A pathway analysis revealed 13 pathways not involved in any other comorbid disorders and therefore unique to ASD, all associated with basal cellular functions. These pathways differ from the pathways associated with both ASD and its comorbid conditions, with the latter being more specific to neural function. To determine whether the sequence of these genes have been subjected to differential evolutionary constraints, we studied long term constraints by looking into Genomic Evolutionary Rate Profiling, and showed that genes involved in several comorbid disorders seem to have undergone more purifying selection than the genes involved in ASD only. This result was corroborated by a higher dN/dS ratio for genes unique to ASD as compare to those that are shared between ASD and its comorbid disorders. Short-term evolutionary constraints showed the same trend as the pN/pS ratio indicates that genes unique to ASD were under significantly less evolutionary constraint than the genes associated with all other disorders.
Spontaneously arising (de novo) mutations have an important role in medical genetics. For diseases with extensive locus heterogeneity, such as autism spectrum disorders (ASDs), the signal from de ...novo mutations is distributed across many genes, making it difficult to distinguish disease-relevant mutations from background variation. Here we provide a statistical framework for the analysis of excesses in de novo mutation per gene and gene set by calibrating a model of de novo mutation. We applied this framework to de novo mutations collected from 1,078 ASD family trios, and, whereas we affirmed a significant role for loss-of-function mutations, we found no excess of de novo loss-of-function mutations in cases with IQ above 100, suggesting that the role of de novo mutations in ASDs might reside in fundamental neurodevelopmental processes. We also used our model to identify ∼1,000 genes that are significantly lacking in functional coding variation in non-ASD samples and are enriched for de novo loss-of-function mutations identified in ASD cases.
Autism has become a pressing healthcare challenge. The instruments used to aid diagnosis are time and labor expensive and require trained clinicians to administer, leading to long wait times for ...at-risk children. We present a multi-modular, machine learning-based assessment of autism comprising three complementary modules for a unified outcome of diagnostic-grade reliability: A 4-minute, parent-report questionnaire delivered via a mobile app, a list of key behaviors identified from 2-minute, semi-structured home videos of children, and a 2-minute questionnaire presented to the clinician at the time of clinical assessment. We demonstrate the assessment reliability in a blinded, multi-site clinical study on children 18-72 months of age (n = 375) in the United States. It outperforms baseline screeners administered to children by 0.35 (90% CI: 0.26 to 0.43) in AUC and 0.69 (90% CI: 0.58 to 0.81) in specificity when operating at 90% sensitivity. Compared to the baseline screeners evaluated on children less than 48 months of age, our assessment outperforms the most accurate by 0.18 (90% CI: 0.08 to 0.29 at 90%) in AUC and 0.30 (90% CI: 0.11 to 0.50) in specificity when operating at 90% sensitivity.
The Autism Diagnostic Interview-Revised (ADI-R) is one of the most commonly used instruments for assisting in the behavioral diagnosis of autism. The exam consists of 93 questions that must be ...answered by a care provider within a focused session that often spans 2.5 hours. We used machine learning techniques to study the complete sets of answers to the ADI-R available at the Autism Genetic Research Exchange (AGRE) for 891 individuals diagnosed with autism and 75 individuals who did not meet the criteria for an autism diagnosis. Our analysis showed that 7 of the 93 items contained in the ADI-R were sufficient to classify autism with 99.9% statistical accuracy. We further tested the accuracy of this 7-question classifier against complete sets of answers from two independent sources, a collection of 1654 individuals with autism from the Simons Foundation and a collection of 322 individuals with autism from the Boston Autism Consortium. In both cases, our classifier performed with nearly 100% statistical accuracy, properly categorizing all but one of the individuals from these two resources who previously had been diagnosed with autism through the standard ADI-R. Our ability to measure specificity was limited by the small numbers of non-spectrum cases in the research data used, however, both real and simulated data demonstrated a range in specificity from 99% to 93.8%. With incidence rates rising, the capacity to diagnose autism quickly and effectively requires careful design of behavioral assessment methods. Ours is an initial attempt to retrospectively analyze large data repositories to derive an accurate, but significantly abbreviated approach that may be used for rapid detection and clinical prioritization of individuals likely to have an autism spectrum disorder. Such a tool could assist in streamlining the clinical diagnostic process overall, leading to faster screening and earlier treatment of individuals with autism.
Autism spectrum disorder (autism) is a neurodevelopmental delay that affects at least 1 in 44 children. Like many neurological disorder phenotypes, the diagnostic features are observable, can be ...tracked over time, and can be managed or even eliminated through proper therapy and treatments. However, there are major bottlenecks in the diagnostic, therapeutic, and longitudinal tracking pipelines for autism and related neurodevelopmental delays, creating an opportunity for novel data science solutions to augment and transform existing workflows and provide increased access to services for affected families. Several efforts previously conducted by a multitude of research labs have spawned great progress toward improved digital diagnostics and digital therapies for children with autism. We review the literature on digital health methods for autism behavior quantification and beneficial therapies using data science. We describe both case-control studies and classification systems for digital phenotyping. We then discuss digital diagnostics and therapeutics that integrate machine learning models of autism-related behaviors, including the factors that must be addressed for translational use. Finally, we describe ongoing challenges and potential opportunities for the field of autism data science. Given the heterogeneous nature of autism and the complexities of the relevant behaviors, this review contains insights that are relevant to neurological behavior analysis and digital psychiatry more broadly.
Autism behavioral therapy is effective but expensive and difficult to access. While mobile technology-based therapy can alleviate wait-lists and scale for increasing demand, few clinical trials exist ...to support its use for autism spectrum disorder (ASD) care.
To evaluate the efficacy of Superpower Glass, an artificial intelligence-driven wearable behavioral intervention for improving social outcomes of children with ASD.
A randomized clinical trial in which participants received the Superpower Glass intervention plus standard of care applied behavioral analysis therapy and control participants received only applied behavioral analysis therapy. Assessments were completed at the Stanford University Medical School, and enrolled participants used the Superpower Glass intervention in their homes. Children aged 6 to 12 years with a formal ASD diagnosis who were currently receiving applied behavioral analysis therapy were included. Families were recruited between June 2016 and December 2017. The first participant was enrolled on November 1, 2016, and the last appointment was completed on April 11, 2018. Data analysis was conducted between April and October 2018.
The Superpower Glass intervention, deployed via Google Glass (worn by the child) and a smartphone app, promotes facial engagement and emotion recognition by detecting facial expressions and providing reinforcing social cues. Families were asked to conduct 20-minute sessions at home 4 times per week for 6 weeks.
Four socialization measures were assessed using an intention-to-treat analysis with a Bonferroni test correction.
Overall, 71 children (63 boys 89%; mean SD age, 8.38 2.46 years) diagnosed with ASD were enrolled (40 56.3% were randomized to treatment, and 31 (43.7%) were randomized to control). Children receiving the intervention showed significant improvements on the Vineland Adaptive Behaviors Scale socialization subscale compared with treatment as usual controls (mean SD treatment impact, 4.58 1.62; P = .005). Positive mean treatment effects were also found for the other 3 primary measures but not to a significance threshold of P = .0125.
The observed 4.58-point average gain on the Vineland Adaptive Behaviors Scale socialization subscale is comparable with gains observed with standard of care therapy. To our knowledge, this is the first randomized clinical trial to demonstrate efficacy of a wearable digital intervention to improve social behavior of children with ASD. The intervention reinforces facial engagement and emotion recognition, suggesting either or both could be a mechanism of action driving the observed improvement. This study underscores the potential of digital home therapy to augment the standard of care.
ClinicalTrials.gov identifier: NCT03569176.
The unmapped readspace of whole genome sequencing data tends to be large but is often ignored. We posit that it contains valuable signals of both human infection and contamination. Using unmapped and ...poorly aligned reads from whole genome sequences (WGS) of over 1000 families and nearly 5000 individuals, we present insights into common viral, bacterial, and computational contamination that plague whole genome sequencing studies. We present several notable results: (1) In addition to known contaminants such as Epstein-Barr virus and phiX, sequences from whole blood and lymphocyte cell lines contain many other contaminants, likely originating from storage, prep, and sequencing pipelines. (2) Sequencing plate and biological sample source of a sample strongly influence contamination profile. And, (3) Y-chromosome fragments not on the human reference genome commonly mismap to bacterial reference genomes. Both experiment-derived and computational contamination is prominent in next-generation sequencing data. Such contamination can compromise results from WGS as well as metagenomics studies, and standard protocols for identifying and removing contamination should be developed to ensure the fidelity of sequencing-based studies.