Human genomics is undergoing a step change from being a predominantly research-driven activity to one driven through health care as many countries in Europe now have nascent precision medicine ...programmes. To maximize the value of the genomic data generated, these data will need to be shared between institutions and across countries. In recognition of this challenge, 21 European countries recently signed a declaration to transnationally share data on at least 1 million human genomes by 2022. In this Roadmap, we identify the challenges of data sharing across borders and demonstrate that European research infrastructures are well-positioned to support the rapid implementation of widespread genomic data access.
The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme ...(MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional experiments in Phase 2 of E. coli from MARC are already underway to identify ways to improve and enhance MinION performance.
Reactome http://www.reactome.org, an online curated resource for human pathway data, provides infrastructure for computation across the biologic reaction network. We use Reactome to infer equivalent ...reactions in multiple nonhuman species, and present data on the reliability of these inferred reactions for the distantly related eukaryote Saccharomyces cerevisiae. Finally, we describe the use of Reactome both as a learning resource and as a computational tool to aid in the interpretation of microarrays and similar large-scale datasets.
Recently attention has been turned to the problem of reconstructing complete ancestral sequences from large multiple alignments. Successful generation of these genome-wide reconstructions will ...facilitate a greater knowledge of the events that have driven evolution. We present a new evolutionary alignment modeler, called "Ortheus," for inferring the evolutionary history of a multiple alignment, in terms of both substitutions and, importantly, insertions and deletions. Based on a multiple sequence probabilistic transducer model of the type proposed by Holmes, Ortheus uses efficient stochastic graph-based dynamic programming methods. Unlike other methods, Ortheus does not rely on a single fixed alignment from which to work. Ortheus is also more scaleable than previous methods while being fast, stable, and open source. Large-scale simulations show that Ortheus performs close to optimally on a deep mammalian phylogeny. Simulations also indicate that significant proportions of errors due to insertions and deletions can be avoided by not assuming a fixed alignment. We additionally use a challenging hold-out cross-validation procedure to test the method; using the reconstructions to predict extant sequence bases, we demonstrate significant improvements over using closest extant neighbor sequences. Accompanying this paper, a new, public, and genome-wide set of Ortheus ancestor alignments provide an intriguing new resource for evolutionary studies in mammals. As a first piece of analysis, we attempt to recover "fossilized" ancestral pseudogenes. We confidently find 31 cases in which the ancestral sequence had a more complete sequence than any of the extant sequences.
The correlated data sets from the 1000 Genomes Project Consortium1 allow researchers to infer a large complement of variants, including indels and structural variants, in panels of people for whom ...only a small subset of SNPs have been analysed, using partial sequencing techniques such as genotyping arrays. Because genotyping arrays are cheap, the ability to infer variation allows researchers to focus on increasing sample sizes - a crucial next step in improving our understanding of the genetics of disease. The mechanisms by which these variants underpin disease risk remain largely unexplored, but the authors' data set provides the starting point for further mechanistic studies.
Celotno besedilo
Dostopno za:
DOBA, IJS, IZUM, KILJ, KISLJ, NUK, PILJ, PNG, SAZU, SBMB, SIK, UILJ, UKNU, UL, UM, UPUK
Genetic diseases have been historically segregated into rare Mendelian disorders and common complex conditions. Large-scale studies using genome sequencing are eroding this distinction and are ...gradually unmasking the underlying complexity of human traits. Here, we analysed data from the Genomics England 100,000 Genomes Project and from a cohort of 1313 individuals with albinism aiming to gain insights into the genetic architecture of this archetypal rare disorder. We investigated the contribution of protein-coding and regulatory variants both rare and common. We focused on TYR, the gene encoding tyrosinase, and found that a high-frequency promoter variant, TYR c.-301C>T rs4547091, modulates the penetrance of a prevalent, albinism-associated missense change, TYR c.1205G>A (p.Arg402Gln) rs1126809. We also found that homozygosity for a haplotype formed by three common, functionally-relevant variants, TYR c.-301C;575C>A;1205G>A, is associated with a high probability of receiving an albinism diagnosis (OR>82). This genotype is also associated with reduced visual acuity and with increased central retinal thickness in UK Biobank participants. Finally, we report how the combined analysis of rare and common variants can increase diagnostic yield and can help inform genetic counselling in families with albinism.
Background: Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user ...community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of
Escherichia
coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry.
Methods: We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads.
Results: Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional ("2D") N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of "pass" template reads from 1-dimensional ("1D") experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion ) decreased for 2D "pass" reads from 9.1% in R7.3 to 7.5% in R9.0 and for template "pass" reads from 26.7% in R7.3 to 14.5% in R9.0.
Conclusions: These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology.
Abbreviations: K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
Animal promoters initiate transcription either at precise positions (narrow promoters) or dispersed regions (broad promoters), a distinction referred to as promoter shape. Although highly conserved, ...the functional properties of promoters with different shapes and the genetic basis of their evolution remain unclear. Here we used natural genetic variation across a panel of 81 Drosophila lines to measure changes in transcriptional start site (TSS) usage, identifying thousands of genetic variants affecting transcript levels (strength) or the distribution of TSSs within a promoter (shape). Our results identify promoter shape as a molecular trait that can evolve independently of promoter strength. Broad promoters typically harbor shape-associated variants, with signatures of adaptive selection. Single-cell measurements demonstrate that variants modulating promoter shape often increase expression noise, whereas heteroallelic interactions with other promoter variants alleviate these effects. These results uncover new functional properties of natural promoters and suggest the minimization of expression noise as an important factor in promoter evolution.