E-resources
Peer-reviewed
Open access
Leung, Yuk Yee; Lee, Wan‐Ping; Kuzma, Amanda B; Gangadharan, Prabhakaran; Nicaretta, Heather Issen; Qu, Liming; Ren, Youli; Cantwell, Laura B; Valladares, Otto; Zhao, Yi; Iqbal, Taha; Schmidt, Michael A.; Mena, Pedro R.; Vardarajan, Badri N; Dalgard, Clifton L.; Kunkle, Brian W.; Bush, William S.; Martin, Eden R.; Naj, Adam C.; Haines, Jonathan L.; Pericak‐Vance, Margaret A.; Wang, Li‐San; Schellenberg, Gerald D.
Alzheimer's & Dementia, December 2023, Volume: 19, Issue: S12, Journal Article
Background
The Genome Center for Alzheimer's Disease (GCAD) coordinates the integration of all available Alzheimer's disease (AD) relevant whole genome sequencing (WGS) data, with the goal of identifying AD risk or protective genetic variants and, eventually, therapeutic targets. The WGS datasets are generated through collaboration between investigators from the Alzheimer's Disease Sequencing Project (ADSP) and GCAD. To minimize the data heterogeneity introduced by different sequencing protocols and assays, GCAD processes all samples using standardized pipelines and performs quality control (QC)/quality assurance (QA) checks.

Methods
Raw sequencing data (FASTQs or BAMs) were aligned to GRCh38/hg38 with BWA; variant calling and joint genotyping of single nucleotide variants (SNVs) and insertions and deletions (indels) were performed with GATK. Structural variants (SVs) were called per sample using the Smoove, Manta, and Strelka packages. Preliminary QA checks, including sex check, contamination, and genotype concordance, were performed, followed by QC per ADSP protocol to evaluate the quality of samples and variants. To make the massive joint-genotyped VCF files easier to access and use, a compact version storing only variant information and sample genotypes was released first.

Results
We dropped 275 samples (0.7%) with poor coverage (<20×) and flagged 219 samples (0.6%) of borderline quality. As a result, the dataset (ADSP Release 4, 2022) includes 36,361 genomes from 40 diverse cohorts with 4 major ancestries: 16,573 Non-Hispanic Whites, 11,358 Hispanics, 5,422 African Americans, and 2,802 Asians. Data are deeply sequenced (average genome coverage: 40×). All samples' CRAMs and gVCFs from GATK were deposited into the NIAGADS Data Sharing Service (DSS) (https://dss.niagads.org/) for public distribution. The joint-genotyped VCFs are undergoing a full QC/annotation process and will be made available. The joint-genotyped VCF contains >362M bi-allelic variants and >58M multi-allelic variants, with 95% of variants remaining after QC. SV calling is ongoing, and data will be ready prior to the conference.

Conclusion
The ADSP and GCAD generate high-quality SNV, indel, and SV calls. GCAD is currently preparing the next release of ∼60,000 more ancestrally diverse WGS samples, sequenced primarily through the ADSP Follow-Up Study, which we anticipate will be released in 2023 to greatly benefit the AD genetics community.
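
The coverage-based sample triage reported in the Results can be illustrated with a minimal Python sketch. It is not GCAD's pipeline: the 20× drop threshold comes from the abstract, but the abstract does not say how "borderline quality" was defined, so the 30× flagging cutoff, the use of mean coverage as the only criterion, and the names `Sample` and `triage` are all assumptions made for illustration.

```python
from dataclasses import dataclass

DROP_BELOW = 20.0  # from the abstract: samples with <20x coverage were dropped
FLAG_BELOW = 30.0  # hypothetical borderline cutoff; the abstract gives none


@dataclass
class Sample:
    sample_id: str
    mean_coverage: float  # mean genome coverage, in x


def triage(samples, drop_below=DROP_BELOW, flag_below=FLAG_BELOW):
    """Split samples into (kept, flagged, dropped) ID lists by mean coverage.

    Flagged samples stay in the dataset but are marked for caution;
    dropped samples are excluded.
    """
    kept, flagged, dropped = [], [], []
    for s in samples:
        if s.mean_coverage < drop_below:
            dropped.append(s.sample_id)
        elif s.mean_coverage < flag_below:
            flagged.append(s.sample_id)
        else:
            kept.append(s.sample_id)
    return kept, flagged, dropped


if __name__ == "__main__":
    demo = [Sample("S1", 41.0), Sample("S2", 25.0), Sample("S3", 12.0)]
    print(triage(demo))
```

In practice the QA checks named in the Methods (sex check, contamination, genotype concordance) would feed into the same decision; they are left out here to keep the sketch short.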