Data Analysis WorkbeNch (DAWN) Basham, Mark; Filik, Jacob; Wharmby, Michael T. ...
Journal of synchrotron radiation, 20/May, Volume: 22, Issue: 3
Journal Article
Peer reviewed
Open access
Synchrotron light source facilities worldwide generate terabytes of data in numerous incompatible data formats from a wide range of experiment types. The Data Analysis WorkbeNch (DAWN) was developed to address the challenge of providing a single visualization and analysis platform for data from any synchrotron experiment (including single‐crystal and powder diffraction, tomography and spectroscopy), whilst also being sufficiently extensible for new specific use case analysis environments to be incorporated (e.g. ARPES, PEEM). In this work, the history and current state of DAWN are presented, with two case studies to demonstrate specific functionality. The first is an example of a data processing and reduction problem using the generic tools, whilst the second shows how these tools can be targeted to a specific scientific area.
Hundreds of inbred mouse strains and intercross populations have been used to characterize the function of genetic variants that contribute to disease. Thousands of disease-relevant traits have been characterized in mice and made publicly available. New strains and populations including consomics, the collaborative cross, expanded BXD, and inbred wild-derived strains add to existing complex disease mouse models, mapping populations, and sensitized backgrounds for engineered mutations. The genome sequences of inbred strains, along with dense genotypes from others, enable integrated analysis of trait-variant associations across populations, but these analyses are hampered by the sparsity of genotypes available. Moreover, the data are not readily interoperable with other resources. To address these limitations, we created a uniformly dense variant resource by harmonizing multiple data sets. Missing genotypes were imputed using the Viterbi algorithm with a data-driven technique that incorporates local phylogenetic information, an approach that is extendable to other model organisms. The result is a web- and programmatically accessible data service called GenomeMUSter, comprising single-nucleotide variants covering 657 strains at 106.8 million segregating sites. Interoperation with phenotype databases, analytic tools, and other resources enables a wealth of applications, including multitrait, multipopulation meta-analysis. We show this in cross-species comparisons of type 2 diabetes and substance use disorder meta-analyses, leveraging mouse data to characterize the likely role of human variant effects in disease. Other applications include refinement of mapped loci and prioritization of strain backgrounds for disease modeling to further unlock extant mouse diversity for genetic and genomic studies in health and disease.
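As a rough illustration of the Viterbi-based imputation mentioned in the abstract above, the following sketch fills missing biallelic calls along one strain's chromosome using a two-state hidden Markov model. It is a toy under stated assumptions, not the GenomeMUSter method: the abstract describes a data-driven technique that uses local phylogenetic information, which this sketch does not model. The function name `viterbi_impute` and the parameters `stay_prob` and `error_rate` are hypothetical.

```python
import math


def viterbi_impute(observed, stay_prob=0.9, error_rate=0.01):
    """Fill missing (None) 0/1 calls with the most likely hidden allele state.

    Assumptions (hypothetical, for illustration only): two hidden states,
    a fixed probability of staying in the same state between adjacent sites,
    and a fixed genotyping error rate.
    """
    if not observed:
        return []
    states = (0, 1)

    def emit(state, obs):
        # A missing call carries no information: log(1) = 0.
        if obs is None:
            return 0.0
        return math.log(1 - error_rate) if obs == state else math.log(error_rate)

    def trans(prev, cur):
        return math.log(stay_prob) if prev == cur else math.log(1 - stay_prob)

    # score[s] is the best log-probability of any state path ending in s.
    score = {s: math.log(0.5) + emit(s, observed[0]) for s in states}
    backpointers = []
    for obs in observed[1:]:
        new_score, pointer = {}, {}
        for cur in states:
            best_prev = max(states, key=lambda p: score[p] + trans(p, cur))
            new_score[cur] = score[best_prev] + trans(best_prev, cur) + emit(cur, obs)
            pointer[cur] = best_prev
        score = new_score
        backpointers.append(pointer)

    # Trace the best path back from the most likely final state.
    path = [max(states, key=lambda s: score[s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()

    # Keep observed calls as they are; only missing sites take the path state.
    return [s if o is None else o for o, s in zip(observed, path)]


calls = [1, 1, None, 1, 0, None, 0]
print(viterbi_impute(calls))
```

In this toy model a missing site simply inherits the state of its most probable neighbourhood; a realistic imputation would replace the fixed transition and emission values with quantities estimated from related strains.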
The automation of beam delivery, sample handling and data analysis, together with increasing photon flux, diminishing focal spot size and the appearance of fast‐readout detectors on synchrotron beamlines, have changed the way that many macromolecular crystallography experiments are planned and executed. Screening for the best diffracting crystal, or even the best diffracting part of a selected crystal, has been enabled by the development of microfocus beams, precise goniometers and fast‐readout detectors that all require rapid feedback from the initial processing of images in order to be effective. All of these advances require the coupling of data feedback to the experimental control system and depend on immediate online data‐analysis results during the experiment. To facilitate this, a Data Analysis WorkBench (DAWB) for the flexible creation of complex automated protocols has been developed. Here, example workflows designed and implemented using DAWB are presented for enhanced multi‐step crystal characterizations, experiments involving crystal reorientation with kappa goniometers, crystal‐burning experiments for empirically determining the radiation sensitivity of a crystal system and the application of mesh scans to find the best location of a crystal to obtain the highest diffraction quality. Beamline users interact with the prepared workflows through a specific brick within the beamline‐control GUI MXCuBE.
MVAR: A Mouse Variation Registry El Kassaby, Bahá; Castellanos, Francisco; Gerring, Matthew ...
Journal of molecular biology,
2024-Mar-06
Journal Article
Peer reviewed
Open access
• MVAR aggregates and annotates genome variation from large-scale sequencing of different mouse strains and expertly curated variants for phenotypic alleles.
• Variant annotation in MVAR includes variant type, molecular consequence, impact, and region.
• Data in MVAR are accessible in both human- and machine-readable formats.
• MVAR serves as both a stand-alone database of mouse genome variation and as a variant annotation service.
• MVAR is a platform for facilitating genotype-phenotype associations in the laboratory mouse.
• MVAR resource was implemented using a micro-services architecture, providing both interoperability and ease of software maintenance.
The Mouse Variation Registry (MVAR) resource is a scalable registry of mouse single nucleotide variants and small indels and variant annotation. The resource accepts data in standard Variant Call Format (VCF) and assesses the uniqueness of the submitted variants via a canonicalization process. Novel variants are assigned a unique, persistent MVAR identifier; variants that are equivalent to an existing variant in the resource are associated with the existing identifier. Annotations for variant type, molecular consequence, impact, and genomic region in the context of specific transcripts and protein sequences are generated using Ensembl’s Variant Effect Predictor (VEP) and Jannovar. Access to the data and annotations in MVAR is supported via an Application Programming Interface (API) and web application. Researchers can search the resource by gene symbol, genomic region, variant (expressed in Human Genome Variation Society syntax), refSNP identifiers, or MVAR identifiers. Tabular search results can be filtered by variant annotations (variant type, molecular consequence, impact, variant region) and viewed according to variant distribution across mouse strains. The registry currently comprises more than 99 million canonical single nucleotide variants for 581 strains of mice. MVAR is accessible from https://mvar.jax.org.
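The idea of canonicalizing a submitted variant and reusing an existing identifier when an equivalent variant is already registered can be sketched as follows. This is a minimal illustration, not MVAR's actual canonicalization process or API: the trimming rule, the `Registry` class, and the `VAR000001`-style identifier format are all hypothetical, and the sketch does not left-align indels against a reference genome as a full normalizer would.

```python
def canonicalize(chrom, pos, ref, alt):
    """Reduce a VCF-style variant to a minimal representation (assumed rule).

    Strips a leading "chr", upper-cases alleles, then trims bases shared by
    ref and alt from the right and from the left, keeping at least one base.
    """
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    ref, alt = ref.upper(), alt.upper()
    # Trim shared trailing bases.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Trim shared leading bases, shifting the position accordingly.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return (chrom, pos, ref, alt)


class Registry:
    """Toy in-memory registry mapping canonical variants to stable identifiers."""

    def __init__(self):
        self._ids = {}

    def register(self, chrom, pos, ref, alt):
        key = canonicalize(chrom, pos, ref, alt)
        if key not in self._ids:
            # Hypothetical identifier format, not the real MVAR identifier.
            self._ids[key] = f"VAR{len(self._ids) + 1:06d}"
        return self._ids[key]


registry = Registry()
first = registry.register("chr1", 100, "CAT", "CGT")
second = registry.register("1", 101, "a", "g")  # same variant, written differently
print(first, second)
```

Both submissions reduce to the same canonical key, so the second call returns the identifier assigned by the first rather than creating a new one.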
The Mouse Phenome Database (MPD; https://phenome.jax.org; RRID:SCR_003212), supported by the US National Institutes of Health, is a Biomedical Data Repository listed in the Trans-NIH Biomedical Informatics Coordinating Committee registry. As an increasingly FAIR-compliant and TRUST-worthy data repository, MPD accepts phenotype and genotype data from mouse experiments and curates, organizes, integrates, archives, and distributes those data using community standards. Data are accompanied by rich metadata, including widely used ontologies and detailed protocols. Data are from all over the world and represent genetic, behavioral, morphological, and physiological disease-related characteristics in mice at baseline or those exposed to drugs or other treatments. MPD houses data from over 6000 strains and populations, representing many reproducible strain types and heterogeneous populations such as the Diversity Outbred where each mouse is unique but can be genotyped throughout the genome. A suite of analysis tools is available to aggregate, visualize, and analyze these data within and across studies and populations in an increasingly traceable and reproducible manner. We have refined existing resources and developed new tools to continue to provide users with access to consistent, high-quality data that has translational relevance in a modernized infrastructure that enables interaction with a suite of bioinformatics analytic and data services.
The Mouse Phenome Database continues to serve as a curated repository and analysis suite for measured attributes of members of diverse mouse populations. The repository includes annotation to community standard ontologies and guidelines, a database of allelic states for 657 mouse strains, a collection of protocols, and analysis tools for flexible, interactive, user directed analyses that increasingly integrate data across traits and populations. The database has grown from its initial focus on a standard set of inbred strains to include heterogeneous mouse populations such as the Diversity Outbred and mapping crosses as well as Collaborative Cross, Hybrid Mouse Diversity Panel, and recombinant inbred strains. Most recently the system has expanded to include data from the International Mouse Phenotyping Consortium. Collectively these data are accessible by API and provided with an interactive tool suite that enables users’ persistent selection, storage, and operation on collections of measures. The tool suite allows basic analyses, advanced functions with dynamic visualization including multi-population meta-analysis, multivariate outlier detection, trait pattern matching, correlation analyses and other functions. The data resources and analysis suite provide users a flexible environment in which to explore the basis of phenotypic variation in health and disease across the lifespan.
Java technology, applied to science
Diamond Light's synchrotron works like a giant microscope, harnessing the power of electrons to produce bright light that scientists can use to study anything from fossils to jet engines to viruses and vaccines.
The static, non-modular algorithm
Today in our server there are around a hundred OSGi declarative services for things like loading files, getting interfaces to hardware, writing data to a fast distributed file system (we use GPFS and Lustre), talking to FPGA-based devices by description language, sending text messages to a port on a custom Linux device that controls a detector, and much more besides! ...depending on the experiment, the various bridging bundles and device libraries can easily outnumber the scanning algorithm itself.
This study attempts to reconcile competing positions in an important debate about the relationship between regime type and human development. We contend that this empirical relationship is contingent upon issues of conceptualization and measurement in democracy. First, the relationship is more likely to be perceived when democracy is measured in a nuanced fashion, taking account of gradations of democracy and autocracy. Second, some aspects of democracy, namely those associated with competitive elections, are more strongly associated with human development than others. Third, the components of electoral democracy interact in a reinforcing manner. Finally, the impact of democracy on human development is a distal relationship that depends upon a country's entire regime history. Our approach draws on several new datasets that interrogate change across a century, enhancing empirical leverage on this important question. To measure human development, we employ the Gapminder project, covering most sovereign countries from 1900 to 2012. To measure democracy, we draw on Varieties of Democracy data, which measure democracy in a highly differentiated fashion for most sovereign countries from 1900 to the present. An extensive set of analyses offers strong corroboration for the argument.