The Philadelphia Neurodevelopmental Cohort (PNC) is a large-scale study of child development that combines neuroimaging, diverse clinical and cognitive phenotypes, and genomics. Data from this rich resource is now publicly available through the Database of Genotypes and Phenotypes (dbGaP). Here we focus on the data from the PNC that is available through dbGaP and describe how users can access this data, which is evolving to be a significant resource for the broader neuroscience community for studies of normal and abnormal neurodevelopment.
• The PNC is a large-scale study of neurodevelopment.
• Data includes imaging, rich cognitive and clinical phenotyping, and genomics.
• Investigators can access data through dbGaP.
Abstract
Study Objectives
To identify genetic susceptibility variants in pediatric obstructive sleep apnea in European American and African American children.
Methods
A phenotyping algorithm using electronic medical records was developed to recruit cases with OSA and control subjects from the Center for Applied Genomics at Children’s Hospital of Philadelphia (CHOP). Genome-wide association studies (GWAS) were performed in pediatric OSA cases and control subjects with European American (EA) and African American (AA) ancestry followed by meta-analysis and sex stratification.
Results
The algorithm accrued 1486 subjects (46.3% European American, 53.7% African American). We identified genomic loci at 1p36.22 and 15q26.1 that associated with OSA risk in EA and AA, respectively. We also revealed a shared risk locus at 18p11.32 (rs114124196, p = 1.72 × 10⁻⁸) across EA and AA populations. Additionally, association at 1q43 (rs12754698) and 2p25.1 (rs72775219) was identified in the male-only analysis of EA children with OSA, while association at 8q21.11 (rs6472959), 11q24.3 (rs4370952) and 15q21.1 (rs149936782) was detected in the female-only analysis of EA children and association at 18p11.23 (rs9964029) was identified in the female-only analysis of African-American children. Moreover, the 18p11.32 locus was replicated in an EA cohort (rs114124196, p = 8.8 × 10⁻³).
Conclusions
We report the first GWAS for pediatric OSA in European Americans and African Americans. Our results provide novel insights into the genetic underpinnings of pediatric OSA.
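As an illustration of how per-ancestry GWAS results can be combined in a meta-analysis, the following minimal sketch pools effect estimates with a fixed-effect, inverse-variance-weighted model. The abstract does not state which meta-analysis model was used, so the fixed-effect choice, the function name `fixed_effect_meta`, and the example numbers are all assumptions made for illustration only.

```python
import math


def fixed_effect_meta(betas, standard_errors):
    """Pool per-cohort effect sizes with inverse-variance weights.

    Hypothetical helper: a fixed-effect model is assumed here and is not
    confirmed by the abstract. Returns (pooled_beta, pooled_se, z, p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("betas and standard_errors must be non-empty and equal in length")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Made-up effect sizes (log odds ratios) for one variant in two cohorts.
beta, se, z, p = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(f"pooled beta={beta:.3f}, se={se:.4f}, z={z:.2f}, p={p:.2e}")
```

With equal standard errors the pooled estimate is the simple average of the two cohort estimates, and the pooled standard error shrinks by a factor of the square root of two.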
Eosinophilic esophagitis (EoE) is an allergic disorder characterized by infiltration of the oesophagus with eosinophils. We had previously reported association of the TSLP/WDR36 locus with EoE. Here we report genome-wide significant associations at four additional loci: c11orf30 and STAT6, which have been previously associated with both atopic and autoimmune diseases, and two EoE-specific loci, ANKRD27, which regulates the trafficking of melanogenic enzymes to epidermal melanocytes, and CAPN14, which encodes a calpain whose expression is highly enriched in the oesophagus. The identification of five EoE loci not only expands our aetiological understanding of the disease but may also represent new therapeutic targets to treat the most debilitating aspect of EoE, oesophageal inflammation and remodelling.
Despite significant advances in knowledge of the genetic architecture of asthma, specific contributors to the variability in the burden between populations remain undiscovered.
To identify additional genetic susceptibility factors of asthma in European American and African American populations.
A phenotyping algorithm mining electronic medical records was developed and validated to recruit cases with asthma and control subjects from the Electronic Medical Records and Genomics network. Genome-wide association analyses were performed in pediatric and adult asthma cases and control subjects with European American and African American ancestry followed by metaanalysis. Nominally significant results were reanalyzed conditioning on allergy status.
Validation of the algorithm yielded an average positive predictive value of 95.8% for both cases and control subjects. The algorithm accrued 21,644 subjects (65.83% European American and 34.17% African American). We identified four novel population-specific associations with asthma after metaanalyses: loci 6p21.31, 9p21.2, and 10q21.3 in the European American population, and the PTGES gene in African Americans. TEK at 9p21.2, which encodes TIE2, has been shown to be involved in remodeling the airway wall in asthma, and the association remained significant after conditioning on allergy. PTGES, which encodes the prostaglandin E synthase, has also been linked to asthma, where deficient prostaglandin E2 synthesis has been associated with airway remodeling.
This study adds to understanding of the genetic architecture of asthma in European Americans and African Americans and reinforces the need to study populations of diverse ethnic backgrounds to identify shared and unique genetic predictors of asthma.
Background
An integrative multidisciplinary approach is required to elucidate the multiple factors that shape neurodevelopmental trajectories of mental disorders. The Philadelphia Neurodevelopmental Cohort (PNC), funded by the National Institute of Mental Health Grand Opportunity (GO) mechanism of the American Recovery and Reinvestment Act, was designed to characterize clinical and neurobehavioral phenotypes of genotyped youths. Data generated, which recently became available through the NIMH Database of Genotypes and Phenotypes (dbGaP), have garnered considerable interest. We provide an overview of PNC recruitment and clinical assessment methods to allow informed use and interpretation of the PNC resource by the scientific community. We also evaluate the structure of the assessment tools and their criterion validity.
Methods
Participants were recruited from a large pool of youths (n = 13,958) previously identified and genotyped at The Children's Hospital of Philadelphia. A comprehensive computerized tool for structured evaluation of psychopathology domains (GOASSESS) was constructed. We administered GOASSESS to all participants and used factor analysis to evaluate its structure.
Results
A total of 9,498 youths (aged 8–21; mean age = 14.2; European American = 55.8%; African American = 32.9%; Other = 11.4%) were enrolled. Factor analysis revealed a strong general psychopathology factor, and specific ‘anxious‐misery’, ‘fear’, and ‘behavior’ factors. The ‘behavior’ factor had a small negative correlation (−0.21) with overall accuracy of neurocognitive performance, particularly in tests of executive and complex reasoning. Being female had a high association with the ‘anxious‐misery’ and low association with the ‘behavior’ factors. The psychosis spectrum was also best characterized by a general factor and three specific factors: ideas about ‘special abilities/persecution,’ ‘unusual thoughts/perceptions’, and ‘negative/disorganized’ symptoms.
Conclusions
The PNC assessment mechanism yielded psychopathology data with strong factorial validity in a large diverse community cohort of genotyped youths. Factor scores should be useful for dimensional integration with other modalities (neuroimaging, genomics). Thus, PNC public domain resources can advance understanding of complex inter‐relationships among genes, cognition, brain, and behavior involved in neurodevelopment of common mental disorders.
Phenome-wide association studies (PheWAS) have been proposed as a possible aid in drug development through elucidating mechanisms of action, identifying alternative indications, or predicting adverse drug events (ADEs). Here, we select 25 single nucleotide polymorphisms (SNPs) linked through genome-wide association studies (GWAS) to 19 candidate drug targets for common disease indications. We interrogate these SNPs by PheWAS in four large cohorts with extensive health information (23andMe, UK Biobank, FINRISK, CHOP) for association with 1683 binary endpoints in up to 697,815 individuals and conduct meta-analyses for 145 mapped disease endpoints. Our analyses replicate 75% of known GWAS associations (P < 0.05) and identify nine study-wide significant novel associations (of 71 with FDR < 0.1). We describe associations that may predict ADEs, e.g., acne, high cholesterol, gout, and gallstones with rs738409 (p.I148M) in PNPLA3 and asthma with rs1990760 (p.T946A) in IFIH1. Our results demonstrate PheWAS as a powerful addition to the toolkit for drug discovery.
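To make the "FDR < 0.1" criterion concrete, the sketch below computes adjusted p-values (q-values) with the Benjamini-Hochberg procedure and flags associations below a chosen threshold. The abstract does not say which FDR procedure was applied, so Benjamini-Hochberg is an assumption, and the function name `benjamini_hochberg` and the p-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper: the abstract does not name its FDR method.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])

    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Made-up p-values for four SNP-endpoint pairs.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
example_qvalues = benjamini_hochberg(example_pvalues)
significant = [q < 0.1 for q in example_qvalues]
print(example_qvalues, significant)
```

A q-value threshold of 0.1 means that, on average, about one in ten of the flagged associations is expected to be a false discovery.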
Mental disorders present a global health concern, while the diagnosis of mental disorders can be challenging. The diagnosis is even harder for patients who have more than one type of mental disorder, especially for young toddlers who are not able to complete questionnaires or standardized rating scales for diagnosis. In the past decade, multiple genomic association signals have been reported for mental disorders, some of which present attractive drug targets. Concurrently, machine learning algorithms, especially deep learning algorithms, have been successful in the diagnosis and/or labeling of complex diseases, such as attention deficit hyperactivity disorder (ADHD) or cancer. In this study, we focused on eight common mental disorders, including ADHD, depression, anxiety, autism, intellectual disabilities, speech/language disorder, developmental delays, and oppositional defiant disorder in the ethnic minority of African Americans. Blood-derived whole genome sequencing data from 4179 individuals were generated, including 1384 patients with the diagnosis of at least one mental disorder. The burden of genomic variants in coding/non-coding regions was applied as feature vectors in the deep learning algorithm. Our model showed ~65% accuracy in differentiating patients from controls. Ability to label patients with multiple disorders was similarly successful, with a Hamming loss score less than 0.3, while exact diagnostic matches are around 10%. Genes in genomic regions with the highest weights showed enrichment of biological pathways involved in immune responses, antigen/nucleic acid binding, chemokine signaling pathway, and G-protein receptor activities.
Notably, variants in non-coding regions (e.g., ncRNA, intronic, and intergenic) performed as well as variants in coding regions; however, unlike coding region variants, variants in non-coding regions do not show genomic hotspots and have much narrower standard deviations, indicating they probably serve as alternative markers.
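The two multi-label metrics quoted above can diverge sharply, which the following minimal sketch illustrates. Hamming loss is the fraction of individual disorder labels predicted incorrectly, whereas an exact match requires all labels of a patient to be correct at once. The function names and the toy label matrices are hypothetical and are not taken from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual labels that differ between truth and prediction."""
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each pair of rows must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def exact_match_ratio(y_true, y_pred):
    """Fraction of patients whose full label vector is predicted correctly."""
    matches = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy data: 3 patients x 8 disorders (1 = diagnosed, 0 = not diagnosed).
truth = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]
predicted = [
    [1, 0, 0, 0, 0, 0, 0, 0],  # all 8 labels correct
    [1, 0, 0, 0, 0, 0, 0, 0],  # 1 label wrong
    [0, 0, 1, 0, 0, 0, 0, 0],  # 2 labels wrong
]

print(hamming_loss(truth, predicted))       # 3 wrong labels out of 24
print(exact_match_ratio(truth, predicted))  # 1 fully correct patient out of 3
```

Because most of the eight labels are zero for any given patient, a model can get most individual labels right (low Hamming loss) while rarely reproducing a patient's full diagnostic profile.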
Major depressive disorder (MDD) is a common psychiatric and behavioral disorder. To discover novel variants conferring risk to MDD, we conducted a whole-genome scan of copy number variation (CNV), including 1,693 MDD cases and 4,506 controls genotyped on the Perlegen 600K platform. The most significant locus was observed on 5q35.1, harboring the SLIT3 gene (P = 2 × 10⁻³). Extending the controls with 30,000 subjects typed on the Illumina 550K array, we found the CNV to remain exclusive to MDD cases (P = 3.2 × 10⁻⁹). Duplication was observed in 5 unrelated MDD cases encompassing 646 kb with highly similar breakpoints. SLIT3 is integral to repulsive axon guidance based on binding to Roundabout receptors. Duplication of 5q35.1 is a highly penetrant variation accounting for 0.7% of the subset of 647 cases harboring large CNVs, using a threshold of a minimum of 10 SNPs and 100 kb. This study leverages a large dataset of MDD cases and controls for the analysis of CNVs with matched platform and ethnicity. SLIT3 duplication is a novel association which explains a definitive proportion of the largely unknown etiology of MDD.
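The "large CNV" threshold described above (at least 10 SNPs and at least 100 kb) can be expressed as a simple filter over CNV calls. This is a minimal sketch only: the `CnvCall` record, its field names, the 1-based inclusive coordinate convention, and the example calls are assumptions for illustration and do not reflect the study's actual data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnvCall:
    """Hypothetical record for one CNV call (fields are illustrative)."""
    sample_id: str
    chrom: str
    start: int      # 1-based, inclusive (assumed convention)
    end: int        # 1-based, inclusive (assumed convention)
    num_snps: int   # number of array probes supporting the call

    @property
    def length_bp(self) -> int:
        return self.end - self.start + 1


def is_large_cnv(call, min_snps=10, min_length_bp=100_000):
    """True if the call meets both the probe-count and the length threshold."""
    return call.num_snps >= min_snps and call.length_bp >= min_length_bp


def filter_large_cnvs(calls, min_snps=10, min_length_bp=100_000):
    """Keep only the calls that pass the large-CNV threshold."""
    return [c for c in calls if is_large_cnv(c, min_snps, min_length_bp)]


# Made-up calls: only the first satisfies both criteria.
example_calls = [
    CnvCall("case_001", "chr5", 1, 646_000, 120),
    CnvCall("case_002", "chr2", 1, 200_000, 9),    # too few SNPs
    CnvCall("case_003", "chr7", 1, 99_999, 50),    # too short
]
print([c.sample_id for c in filter_large_cnvs(example_calls)])
```

Requiring both criteria together guards against short calls in probe-dense regions and long calls supported by only a handful of probes.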
Improved copy number variation (CNV) detection remains an area of heavy emphasis for algorithm development; however, both CNV curation and disease association approaches remain in their infancy. The current practice of focusing on candidate CNVs, where researchers study specific CNVs they believe to be pathological while discarding others, fails to consider the full spectrum of CNVs in a hypothesis-free GWAS. To address this, we present a next-generation approach to CNV association by natively supporting the popular VCF specification for sequencing-derived variants as well as SNP array calls using a PennCNV format. The code is fast and efficient, allowing for the analysis of large (>100,000 sample) cohorts without dividing up the data on a compute cluster. The scripts are condensed into a single tool to promote simplicity and best practices. CNV curation before and after association is rigorously supported and emphasized to yield reliable results of the highest quality. We benchmarked two large datasets, the UK Biobank (n > 450,000) and the CAG Biobank (n > 350,000), both of which are genotyped at >0.5 M probes, as our input files. ParseCNV has been actively supported and developed since 2008. ParseCNV2 presents a critical addition to formalizing CNV association for inclusion with SNP associations in GWAS Catalog. Clinical CNV prioritization, interactive quality control (QC), and adjustment for covariates are revolutionary new features of ParseCNV2 vs. ParseCNV. The software is freely available at: https://github.com/CAG-CNV/ParseCNV2 .
The quest to disentangle the aetiopathogenesis of Parkinson's disease has been heavily influenced by the genes associated with the disease. The alpha-synuclein-centric theory of protein aggregation with the adjunct of parkin-driven proteasome deregulation has, in recent years, been complemented by the discovery and increasing knowledge of the functions of DJ1, PINK1 and OMI/HTRA2, which are all associated with the mitochondria and have been implicated in cellular protection against oxidative damage. We critically review how these genes fit into and enhance our understanding of the role of mitochondrial dysfunction in Parkinson's disease, and consider how oxidative stress might be a potential unifying factor in the aetiopathogenesis of the disease.