To assess the clinical performance of an expanded noninvasive prenatal screening (NIPS) test (“NIPS-Plus”) for detection of both aneuploidy and genome-wide microdeletion/microduplication syndromes (MMS).
A total of 94,085 women with a singleton pregnancy were prospectively enrolled in the study. The cell-free plasma DNA was directly sequenced without intermediate amplification and fetal abnormalities identified using an improved copy-number variation (CNV) calling algorithm.
A total of 1128 pregnancies (1.2%) were scored positive for clinically significant fetal chromosome abnormalities. This comprised 965 aneuploidies (1.026%) and 163 (0.174%) MMS. From follow-up tests, the positive predictive values (PPVs) for T21, T18, T13, rare trisomies, and sex chromosome aneuploidies were calculated as 95%, 82%, 46%, 29%, and 47%, respectively. For known MMS (n=32), PPVs were 93% (DiGeorge), 68% (22q11.22 microduplication), 75% (Prader–Willi/Angelman), and 50% (Cri du Chat). For the remaining genome-wide MMS (n=88), combined PPVs were 32% (CNVs ≥10Mb) and 19% (CNVs <10Mb).
NIPS-Plus yielded high PPVs for common aneuploidies and DiGeorge syndrome, and moderate PPVs for other MMS. Our results present compelling evidence that NIPS-Plus can be used as a first-tier pregnancy screening method to improve detection rates of clinically significant fetal chromosome abnormalities.
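The PPVs reported above follow the standard definition: confirmed true positives divided by all screen-positive calls that have a follow-up result. A minimal sketch of that arithmetic in Python is given here; the function names are illustrative, and the per-category counts in the usage example are hypothetical, since the abstract reports only percentages and totals.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP), over screen-positive cases with follow-up results."""
    confirmed = true_positives + false_positives
    if confirmed == 0:
        raise ValueError("no screen-positive cases with follow-up results")
    return true_positives / confirmed


def screen_positive_rate(positives: int, screened: int) -> float:
    """Fraction of screened pregnancies that were scored positive."""
    return positives / screened


# Hypothetical counts for illustration only (not taken from the study).
ppv = positive_predictive_value(true_positives=19, false_positives=1)
print(f"PPV = {ppv:.0%}")

# Totals reported in the abstract: 1128 positives among 94,085 pregnancies.
print(f"screen-positive rate = {screen_positive_rate(1128, 94085):.1%}")
```

Because PPV depends on how common a condition is among those screened, the same test can show a high PPV for a frequent abnormality and a lower one for a rare abnormality.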
Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data.
Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%).
We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.
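The idea of scoring candidate variants with a logistic regression model and then applying a user-adjustable cutoff can be sketched as follows. This is an illustration only and not the Atlas2 implementation: the feature names, coefficients, and candidate sites are hypothetical, whereas the actual models are trained on validated whole-exome capture data.

```python
import math

# Hypothetical coefficients, chosen only to make the example concrete.
INTERCEPT = -4.0
WEIGHTS = {"variant_read_fraction": 6.0, "mean_base_quality": 0.08}


def variant_probability(features: dict) -> float:
    """Logistic model: probability that a candidate site is a true variant."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def call_variants(candidates: list, cutoff: float = 0.5) -> list:
    """Keep the ids of candidates whose model probability reaches the cutoff."""
    return [c["id"] for c in candidates
            if variant_probability(c["features"]) >= cutoff]


# Hypothetical candidate sites.
candidates = [
    {"id": "siteA", "features": {"variant_read_fraction": 0.50, "mean_base_quality": 35}},
    {"id": "siteB", "features": {"variant_read_fraction": 0.05, "mean_base_quality": 15}},
]

print(call_variants(candidates))               # stricter default cutoff
print(call_variants(candidates, cutoff=0.05))  # looser cutoff keeps more sites
```

Lowering the cutoff trades specificity for sensitivity, which is the practical reason for leaving it adjustable.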
To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.
Since the first patient was reported in December 2019, the 2019 novel coronavirus disease (COVID-19) has become a global pandemic with more than 10 million confirmed cases and 500 thousand related deaths. Using deep learning methods to quickly identify COVID-19 and accurately segment the infected area can help control the outbreak and assist in treatment. Computed tomography (CT) is a fast and easy clinical method, which makes it suitable for assisting in the diagnosis and treatment of COVID-19. According to clinical manifestations, COVID-19 lung infection areas can be divided into three categories: ground-glass opacities, interstitial infiltrates, and consolidation. We proposed a multi-scale discriminative network (MSD-Net) for multi-class segmentation of COVID-19 lung infection on CT. In the MSD-Net, we proposed a pyramid convolution block (PCB), a channel attention block (CAB), and a residual refinement block (RRB). The PCB increases the receptive field by using different numbers and sizes of kernels, which strengthens the ability to segment infected areas of different sizes. The CAB was used to fuse the inputs of the two stages and focus features on the area to be segmented. The role of the RRB was to refine the feature maps. Experimental results showed that the dice similarity coefficients (DSC) of the three infection categories were 0.7422, 0.7384, and 0.8769, respectively. For sensitivity and specificity, the results for the three infection categories were (0.8593, 0.9742), (0.8268, 0.9869), and (0.8645, 0.9889), respectively. The experimental results demonstrated that the proposed network can effectively segment COVID-19 infection on CT images. It can be adopted for assisting in the diagnosis and treatment of COVID-19.
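The three metrics reported above (dice similarity coefficient, sensitivity, and specificity) can each be computed per class from a predicted label map and a reference label map. The sketch below uses standard definitions on flattened label lists; the label encoding (0 = background, 1 to 3 = the three infection categories) and the toy data are assumptions made for illustration, not the paper's evaluation code.

```python
def confusion_counts(pred, truth, label):
    """Count TP, FP, FN, TN for one class, treating all other labels as negative."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == label and t == label:
            tp += 1
        elif p == label:
            fp += 1
        elif t == label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth, label):
    """Return (dice, sensitivity, specificity) for one class."""
    tp, fp, fn, tn = confusion_counts(pred, truth, label)
    # Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when the class is absent from both maps.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return dice, sensitivity, specificity


# Toy flattened label maps (hypothetical data).
truth = [0, 1, 1, 2, 2, 2, 3, 0]
pred = [0, 1, 2, 2, 2, 0, 3, 0]

for label in (1, 2, 3):
    dice, sens, spec = segmentation_metrics(pred, truth, label)
    print(f"class {label}: DSC={dice:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

Specificity tends to be high in segmentation tasks because most pixels are true negatives, which is consistent with the specificities above all exceeding 0.97 while the DSC values are lower.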
Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of populations, implemented in a pipeline called SNPTools. Our pipeline contains several innovations that specifically address challenges caused by low-coverage population sequencing: (1) effective base depth (EBD), a nonparametric statistic that enables more accurate statistical modeling of sequencing data; (2) variance ratio scoring, a variance-based statistic that discovers polymorphic loci with high sensitivity and specificity; and (3) BAM-specific binomial mixture modeling (BBMM), a clustering algorithm that generates robust genotype likelihoods from heterogeneous sequencing data. Last, we develop an imputation engine that refines raw genotype likelihoods to produce high-quality phased genotypes/haplotypes. Designed for large population studies, SNPTools' input/output (I/O) and storage aware design leads to improved computing performance on large sequencing data sets. We apply SNPTools to the International 1000 Genomes Project (1000G) Phase 1 low-coverage data set and obtain genotyping accuracy comparable to that of SNP microarray.
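To make the notion of a genotype likelihood at roughly 5× coverage concrete, the sketch below evaluates a plain binomial model at a single site. It is a simplified stand-in and not the BBMM algorithm, which per the abstract is a clustering method fitted per BAM file; the fixed error rate and the read counts here are assumptions.

```python
from math import comb


def genotype_likelihoods(alt_reads: int, depth: int, error_rate: float = 0.01) -> dict:
    """Binomial likelihood of the observed alt-read count under each diploid genotype.

    Expected alt-read fraction: error_rate for hom-ref, 0.5 for het,
    and 1 - error_rate for hom-alt.
    """
    expected_alt_fraction = {
        "hom_ref": error_rate,
        "het": 0.5,
        "hom_alt": 1.0 - error_rate,
    }
    return {
        genotype: comb(depth, alt_reads)
        * p ** alt_reads
        * (1.0 - p) ** (depth - alt_reads)
        for genotype, p in expected_alt_fraction.items()
    }


def normalize(likelihoods: dict) -> dict:
    """Rescale likelihoods to sum to 1 (equivalent to a flat genotype prior)."""
    total = sum(likelihoods.values())
    return {g: v / total for g, v in likelihoods.items()}


# Hypothetical site: 2 alt reads out of 5, matching the ~5x coverage regime.
likelihoods = genotype_likelihoods(alt_reads=2, depth=5)
for genotype, value in normalize(likelihoods).items():
    print(f"{genotype}: {value:.4f}")
```

With so few reads, a single-sample model like this one is sensitive to the assumed error rate, which is one motivation for population-level refinement steps such as the imputation engine mentioned above.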
At the end of 2019, the novel coronavirus disease COVID‐19 broke out. Due to its high contagiousness, more than 74 million people have been infected worldwide. Automatic segmentation of the COVID‐19 lesion area in CT images is an effective auxiliary medical technology which can quantitatively diagnose and judge the severity of the disease. In this paper, a multi‐class COVID‐19 CT image segmentation network is proposed, which includes a pyramid attention module to extract multi‐scale contextual attention information, and a residual convolution module to improve the discriminative ability of the network. A wavelet edge loss function is also proposed to extract edge features of the lesion area to improve the segmentation accuracy. For the experiment, a dataset of 4369 CT slices is constructed, including three symptoms: ground glass opacities, interstitial infiltrates, and lung consolidation. The model achieves dice similarity coefficients of 0.7704, 0.7900, and 0.8241 for the three symptoms, respectively. The performance of the proposed network on the public dataset COVID‐SemiSeg is also evaluated. The results demonstrate that this model outperforms other state‐of‐the‐art methods and can be a powerful tool to assist in the diagnosis of positive infection cases, and promote the development of intelligent technology in the medical field.
Tourette syndrome (TS) is a childhood-onset neuropsychiatric disorder characterized by repetitive motor movements and vocal tics. The clinical manifestations of TS are complex and often overlap with other neuropsychiatric disorders. TS is highly heritable; however, the underlying genetic basis and molecular and neuronal mechanisms of TS remain largely unknown. We performed whole-exome sequencing of a hundred trios (probands and their parents) with detailed records of their clinical presentations and identified a risk gene, ASH1L, that was both de novo mutated and associated with TS based on a transmission disequilibrium test. As a replication, we performed follow-up targeted sequencing of ASH1L in an additional 524 unrelated TS samples and replicated the association (P value = 0.001). The point mutations in ASH1L cause defects in its enzymatic activity. Therefore, we established a transgenic mouse line and performed an array of anatomical, behavioral, and functional assays to investigate ASH1L function. The Ash1l mice manifested tic-like behaviors and compulsive behaviors that could be rescued by the tic-relieving drug haloperidol. We also found that Ash1l disruption leads to hyper-activation and elevated dopamine-releasing events in the dorsal striatum, all of which could explain the neural mechanisms for the behavioral abnormalities in mice. Taken together, our results provide compelling evidence that ASH1L is a TS risk gene.
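For readers unfamiliar with the transmission disequilibrium test mentioned above, its classic form compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below implements that classic McNemar-type statistic; the abstract does not say which TDT variant was used, and the counts in the example are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> float:
    """Classic TDT chi-square: (b - c)^2 / (b + c), with 1 degree of freedom."""
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    return (transmitted - untransmitted) ** 2 / informative


def chi_square_1df_p_value(statistic: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts from heterozygous parents of affected probands.
chi2 = tdt_statistic(transmitted=30, untransmitted=10)
p_value = chi_square_1df_p_value(chi2)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4g}")
```

Because only transmissions within families are compared, the test is robust to population stratification, which is one reason trio designs are used.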
Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primate in biomedical research, have the largest natural geographic distribution of any nonhuman primate, and have been the focus of much evolutionary and behavioral investigation. Consequently, rhesus macaques are one of the most thoroughly studied nonhuman primate species. However, little is known about genome-wide genetic variation in this species. A detailed understanding of extant genomic variation among rhesus macaques has implications for the use of this species as a model for studies of human health and disease, as well as for evolutionary population genomics. Whole-genome sequencing analysis of 133 rhesus macaques revealed more than 43.7 million single-nucleotide variants, including thousands predicted to alter protein sequences, transcript splicing, and transcription factor binding sites. Rhesus macaques exhibit 2.5-fold higher overall nucleotide diversity and slightly elevated putative functional variation compared with humans. This functional variation in macaques provides opportunities for analyses of coding and noncoding variation, and its cellular consequences. Despite modestly higher levels of nonsynonymous variation in the macaques, the estimated distribution of fitness effects and the ratio of nonsynonymous to synonymous variants suggest that purifying selection has had stronger effects in rhesus macaques than in humans. Demographic reconstructions indicate this species has experienced a consistently large but fluctuating population size. Overall, the results presented here provide new insights into the population genomics of nonhuman primates and expand genomic information directly relevant to primate models of human disease.
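Nucleotide diversity, the quantity compared between macaques and humans above, is commonly defined as the average number of differences per site between two randomly chosen chromosomes. A minimal sketch of that standard estimator for biallelic sites follows; the abstract does not state which estimator the authors used, and the allele counts and sequence length in the example are hypothetical.

```python
from math import comb


def nucleotide_diversity(alt_counts, n_chromosomes: int, sequence_length: int) -> float:
    """Average pairwise differences per site, assuming biallelic variant sites.

    alt_counts: alternate-allele count at each variant site.
    n_chromosomes: number of sampled chromosomes (2 per diploid individual).
    sequence_length: number of sites surveyed, including invariant ones.
    """
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes to compare")
    pairs = comb(n_chromosomes, 2)
    total = 0.0
    for k in alt_counts:
        # k * (n - k) of the possible chromosome pairs differ at this site.
        total += k * (n_chromosomes - k) / pairs
    return total / sequence_length


# Hypothetical example: 4 chromosomes, two variant sites in a 10-bp window.
pi = nucleotide_diversity(alt_counts=[1, 2], n_chromosomes=4, sequence_length=10)
print(f"pi = {pi:.4f}")
```

Dividing by the full surveyed length rather than by the number of variant sites is what makes the value comparable across species and genomic regions.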
Accurate identification of genetic variants from next-generation sequencing (NGS) data is essential for immediate large-scale genomic endeavors such as the 1000 Genomes Project, and is crucial for further genetic analysis based on the discoveries. The key challenge in single nucleotide polymorphism (SNP) discovery is to distinguish true individual variants (occurring at a low frequency) from sequencing errors (often occurring at frequencies orders of magnitude higher). Therefore, knowledge of the error probabilities of base calls is essential. We have developed Atlas-SNP2, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets. Subsequently, it estimates the posterior error probability for each substitution through a Bayesian formula that integrates prior knowledge of the overall sequencing error probability and the estimated SNP rate with the results from the logistic regression model for the given substitutions. The estimated posterior SNP probability can be used to distinguish true SNPs from sequencing errors. Validation results show that Atlas-SNP2 achieves a false-positive rate of lower than 10%, with an approximately 5% or lower false-negative rate.
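The Bayesian step described above combines two priors (the overall sequencing error probability and the estimated SNP rate) with evidence for a given substitution. The sketch below shows a generic two-hypothesis Bayes update of that kind. It is not the Atlas-SNP2 formula, which the abstract does not give; the likelihood inputs stand in for values that would be derived from the logistic regression model, and all numbers are hypothetical.

```python
def posterior_snp_probability(
    likelihood_given_snp: float,
    likelihood_given_error: float,
    prior_snp: float,
    prior_error: float,
) -> float:
    """Two-hypothesis Bayes rule: P(SNP | evidence).

    The likelihoods are placeholders for model-derived quantities; the priors
    are the assumed SNP rate and overall sequencing error probability.
    """
    snp_term = likelihood_given_snp * prior_snp
    error_term = likelihood_given_error * prior_error
    denominator = snp_term + error_term
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return snp_term / denominator


# Hypothetical inputs for one candidate substitution.
posterior = posterior_snp_probability(
    likelihood_given_snp=0.9,
    likelihood_given_error=0.01,
    prior_snp=0.001,
    prior_error=0.01,
)
print(f"posterior SNP probability = {posterior:.3f}")
print(f"posterior error probability = {1.0 - posterior:.3f}")
```

The example shows why the priors matter: with errors assumed ten times more common than SNPs, the evidence must favor the SNP hypothesis strongly before the posterior becomes convincing.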
Characterizing meiotic recombination rates across the genomes of nonhuman primates is important for understanding the genetics of primate populations, performing genetic analyses of phenotypic variation and reconstructing the evolution of human recombination. Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primates in biomedical research. We constructed a high-resolution genetic map of the rhesus genome based on whole genome sequence data from Indian-origin rhesus macaques. The genetic markers used were approximately 18 million SNPs, with marker density 6.93 per kb across the autosomes. We report that the genome-wide recombination rate in rhesus macaques is significantly lower than rates observed in apes or humans, while the distribution of recombination across the macaque genome is more uniform. These observations provide new comparative information regarding the evolution of recombination in primates.