Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically ...improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limits their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging "hybrid" assemblies that use long reads for scaffolding and short reads for accuracy.
We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods.
Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency than existing methods will help better economically utilize emerging long read sequencing technologies.
Celotno besedilo
Dostopno za:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
The JAX Diversity Outbred population is a new mouse resource derived from partially inbred Collaborative Cross strains and maintained by randomized outcrossing. As such, it segregates the same ...allelic variants as the Collaborative Cross but embeds these in a distinct population architecture in which each animal has a high degree of heterozygosity and carries a unique combination of alleles. Phenotypic diversity is striking and often divergent from phenotypes seen in the founder strains of the Collaborative Cross. Allele frequencies and recombination density in early generations of Diversity Outbred mice are consistent with expectations based on simulations of the mating design. We describe analytical methods for genetic mapping using this resource and demonstrate the power and high mapping resolution achieved with this population by mapping a serum cholesterol trait to a 2-Mb region on chromosome 3 containing only 11 genes. Analysis of the estimated allele effects in conjunction with complete genome sequence data of the founder strains reduced the pool of candidate polymorphisms to seven SNPs, five of which are located in an intergenic region upstream of the Foxo1 gene.
Genomes of the Mouse Collaborative Cross Srivastava, Anuj; Morgan, Andrew P; Najarian, Maya L ...
Genetics (Austin),
06/2017, Letnik:
206, Številka:
2
Journal Article
Recenzirano
Odprti dostop
The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic ...stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.
Here we provide a genome-wide, high-resolution map of the phylogenetic origin of the genome of most extant laboratory mouse inbred strains. Our analysis is based on the genotypes of wild-caught mice ...from three subspecies of Mus musculus. We show that classical laboratory strains are derived from a few fancy mice with limited haplotype diversity. Their genomes are overwhelmingly Mus musculus domesticus in origin, and the remainder is mostly of Japanese origin. We generated genome-wide haplotype maps based on identity by descent from fancy mice and show that classical inbred strains have limited and non-randomly distributed genetic diversity. In contrast, wild-derived laboratory strains represent a broad sampling of diversity within M. musculus. Intersubspecific introgression is pervasive in these strains, and contamination by laboratory stocks has played a role in this process. The subspecific origin, haplotype diversity and identity by descent maps can be visualized using the Mouse Phylogeny Viewer (see URLs).
Celotno besedilo
Dostopno za:
DOBA, IJS, IZUM, KILJ, NUK, PILJ, PNG, SAZU, UILJ, UKNU, UL, UM, UPUK
A data-driven reflectance model Matusik, Wojciech; Pfister, Hanspeter; Brand, Matt ...
ACM transactions on graphics,
01/07, Letnik:
22, Številka:
3
Journal Article
Recenzirano
Odprti dostop
We present a generative model for isotropic bidirectional reflectance distribution functions (BRDFs) based on acquired reflectance data. Instead of using analytical reflectance models, we represent ...each BRDF as a dense set of measurements. This allows us to interpolate and extrapolate in the space of acquired BRDFs to create new BRDFs. We treat each acquired BRDF as a single high-dimensional vector taken from a space of all possible BRDFs. We apply both linear (subspace) and non-linear (manifold) dimensionality reduction tools in an effort to discover a lower-dimensional representation that characterizes our measurements. We let users define perceptually meaningful parametrization directions to navigate in the reduced-dimension BRDF space. On the low-dimensional manifold, movement along these directions produces novel but valid BRDFs.
New systems genetics approaches are needed to rapidly identify host genes and genetic networks that regulate complex disease outcomes. Using genetically diverse animals from incipient lines of the ...Collaborative Cross mouse panel, we demonstrate a greatly expanded range of phenotypes relative to classical mouse models of SARS-CoV infection including lung pathology, weight loss and viral titer. Genetic mapping revealed several loci contributing to differential disease responses, including an 8.5Mb locus associated with vascular cuffing on chromosome 3 that contained 23 genes and 13 noncoding RNAs. Integrating phenotypic and genetic data narrowed this region to a single gene, Trim55, an E3 ubiquitin ligase with a role in muscle fiber maintenance. Lung pathology and transcriptomic data from mice genetically deficient in Trim55 were used to validate its role in SARS-CoV-induced vascular cuffing and inflammation. These data establish the Collaborative Cross platform as a powerful genetic resource for uncovering genetic contributions of complex traits in microbial disease severity, inflammation and virus replication in models of outbred populations.
Celotno besedilo
Dostopno za:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm ...analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human spermmotility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility. Summary Sentence A CASA-based support vector machine model of human sperm motility provides rapid, accurate, and quantitative analysis of all motile sperm in a population.
We enhance underexposed, low dynamic range videos by adaptively and independently varying the exposure at each photoreceptor in a post-process. This virtual exposure is a dynamic function of both the ...spatial neighborhood and temporal history at each pixel. Temporal integration enables us to expand the image's dynamic range while simultaneously reducing noise. Our non-linear exposure variation and denoising filters smoothly transition from temporal to spatial for moving scene elements. Our virtual exposure framework also supports temporally coherent per frame tone mapping. Our system outputs restored video sequences with significantly reduced noise, increased exposure time of dark pixels, intact motion, and improved details.
The Collaborative Cross (CC) is a panel of recombinant inbred lines derived from eight genetically diverse laboratory inbred strains. Recently, the genetic architecture of the CC population was ...reported based on the genotype of a single male per line, and other publications reported incompletely inbred CC mice that have been used to map a variety of traits. The three breeding sites, in the US, Israel, and Australia, are actively collaborating to accelerate the inbreeding process through marker-assisted inbreeding and to expedite community access of CC lines deemed to have reached defined thresholds of inbreeding. Plans are now being developed to provide access to this novel genetic reference population through distribution centers. Here we provide a description of the distribution efforts by the University of North Carolina Systems Genetics Core, Tel Aviv University, Israel and the University of Western Australia.
Motion capture is a prevalent technique for capturing and analyzing human articulations. A common problem encountered in motion capture is that some marker positions are often missing due to ...occlusions or ambiguities. Most methods for completing missing markers may quickly become ineffective and produce unsatisfactory results when a significant portion of the markers are missing for extended periods of time. We propose a data-driven, piecewise linear modeling approach to missing marker estimation that is especially beneficial in this scenario. We model motion sequences of a training set with a hierarchy of low-dimensional local linear models characterized by the principal components. For a new sequence with missing markers, we use a pre-trained classifier to identify the most appropriate local linear model for each frame and then recover the missing markers by finding the least squares solutions based on the available marker positions and the principal components of the associated model. Our experimental results demonstrate that our method is efficient in recovering the full-body motion and is robust to heterogeneous motion data.