One of the foundations of synthetic biology is the project to develop libraries of standardized genetic parts that could be assembled quickly and cheaply into large systems. The limitations of the initial BioBrick standard have prompted the development of multiple new standards proposing different avenues to overcome these shortcomings. The lack of compatibility between standards, the compliance of parts with only some of the standards, or even the type of constructs that each standard supports have significantly increased the complexity of assembling constructs from standardized parts. Here, we describe computer tools to facilitate the rigorous description of part compositions in the context of a rapidly changing landscape of physical construction methods and standards. A context-free grammar has been developed to model the structure of constructs compliant with six popular assembly standards. Its implementation in GenoCAD makes it possible for users to quickly assemble, from a rich library of genetic parts, constructs compliant with any of the six existing standards.
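To illustrate the idea of modeling construct structure with a context-free grammar, the sketch below validates a part sequence against toy rewrite rules. This is a minimal illustration only, not the actual GenoCAD grammar: the category names, rules, and part identifiers are all invented for the example.

```python
# Toy context-free grammar for genetic constructs (illustrative only;
# not the actual GenoCAD grammar). Each key is a structural category,
# each rule is one possible sequence of sub-categories or terminal parts.
GRAMMAR = {
    "Cassette": [["Promoter", "RBS", "CDS", "Terminator"]],
    "Promoter": [["pPROM"]],
    "RBS": [["pRBS"]],
    "CDS": [["pCDS"]],
    "Terminator": [["pTERM"]],
}

def derives(symbol, parts):
    """Return True if `symbol` can derive exactly the part sequence `parts`."""
    if symbol not in GRAMMAR:  # terminal part
        return len(parts) == 1 and parts[0] == symbol
    for rule in GRAMMAR[symbol]:
        # try every way of splitting `parts` among the rule's symbols
        def match(i, pos):
            if i == len(rule):
                return pos == len(parts)
            for end in range(pos + 1, len(parts) + 1):
                if derives(rule[i], parts[pos:end]) and match(i + 1, end):
                    return True
            return False
        if match(0, 0):
            return True
    return False

print(derives("Cassette", ["pPROM", "pRBS", "pCDS", "pTERM"]))  # True
print(derives("Cassette", ["pCDS", "pPROM"]))                   # False
```

A grammar of this form lets a design tool reject ill-formed part orders before any physical assembly is attempted.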
Plant synthetic biology requires software tools to assist in the design of complex multi-genic expression plasmids. Here a vector design strategy to express genes in plants is formalized and implemented as a grammar in GenoCAD, a Computer-Aided Design software for synthetic biology. It includes a library of plant biological parts organized in structural categories and a set of rules describing how to assemble these parts into large constructs. Rules developed here are organized and divided into three main subsections according to the aim of the final construct: protein localization studies, promoter analysis, and protein-protein interaction experiments. The GenoCAD plant grammar guides the user through the design while allowing users to customize vectors according to their needs. Therefore, the plant grammar implemented in GenoCAD will help plant biologists take advantage of methods from synthetic biology to design expression vectors supporting their research projects.
The re-use of previously validated designs is critical to the evolution of synthetic biology from a research discipline to an engineering practice. Here we describe the Synthetic Biology Open Language (SBOL), a proposed data standard for exchanging designs within the synthetic biology community. SBOL represents synthetic biology designs in a community-driven, formalized format for exchange between software tools, research groups and commercial service providers. The SBOL Developers Group has implemented SBOL as an XML/RDF serialization and provides software libraries and specification documentation to help developers implement SBOL in their own software. We describe early successes, including a demonstration of the utility of SBOL for information exchange between several different software tools and repositories from both academic and industrial partners. As a community-driven standard, SBOL will be updated as synthetic biology evolves to provide specific capabilities for different aspects of the synthetic biology workflow.
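As a flavor of what an XML/RDF serialization of a design might look like, the sketch below emits a minimal RDF document for a single DNA component. The namespace URI, part identifier, and sequence are placeholders, and the element layout is a simplified stand-in rather than the actual SBOL schema.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an XML/RDF design serialization; element names
# and namespaces are illustrative, not the actual SBOL schema.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SBOL_NS = "http://example.org/sbol#"  # placeholder namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("sbol", SBOL_NS)

root = ET.Element(f"{{{RDF_NS}}}RDF")
comp = ET.SubElement(
    root, f"{{{SBOL_NS}}}DnaComponent",
    {f"{{{RDF_NS}}}about": "http://example.org/parts/BBa_J23100"})
ET.SubElement(comp, f"{{{SBOL_NS}}}displayId").text = "BBa_J23100"
ET.SubElement(comp, f"{{{SBOL_NS}}}dnaSequence").text = (
    "ttgacggctagctcagtcctaggtacagtgctagc")

print(ET.tostring(root, encoding="unicode"))
```

Because the payload is plain XML over shared namespaces, any tool in the exchange chain can parse it with a standard XML library without depending on the producer's internals.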
Visualization plays an important role in epidemic time series analysis and forecasting. Viewing time series data plotted on a graph can help researchers identify anomalies and unexpected trends that could be overlooked if the data were reviewed in tabular form; these details can influence a researcher's recommended course of action or choice of simulation models. However, there are challenges in reviewing data sets from multiple data sources: data can be aggregated in different ways (e.g., incidence vs. cumulative), measure different criteria (e.g., infection counts, hospitalizations, and deaths), or represent different geographical scales (e.g., nation, HHS Regions, or states), which can make a direct comparison between time series difficult. In the face of an emerging epidemic, the ability to visualize time series from various sources and organizations and to reconcile these datasets based on different criteria could be key in developing accurate forecasts and identifying effective interventions. Many tools have been developed for visualizing temporal data; however, none yet supports all the functionality needed for easy collaborative visualization and analysis of epidemic data.
In this paper, we present EpiViewer, a time series exploration dashboard where users can upload epidemiological time series data from a variety of sources and compare, organize, and track how data evolves as an epidemic progresses. EpiViewer provides an easy-to-use web interface for visualizing temporal datasets either as line charts or bar charts. The application provides enhanced features for visual analysis, such as hierarchical categorization, zooming, and filtering, to enable detailed inspection and comparison of multiple time series on a single canvas. Finally, EpiViewer provides several built-in statistical Epi-features to help users interpret the epidemiological curves.
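A couple of such curve statistics can be computed in a few lines. The functions below are an illustrative sketch only; the abstract does not enumerate which Epi-features EpiViewer implements, so these (peak value, time to peak, cumulative-to-incidence conversion) are assumed examples, and the weekly counts are made up.

```python
# Illustrative "Epi-feature" sketches; the specific statistics EpiViewer
# provides are not enumerated in the abstract, so these are assumptions.
def peak_value(curve):
    """Largest count observed over the epicurve."""
    return max(curve)

def time_to_peak(curve):
    """Index (time step) at which the epicurve peaks."""
    return curve.index(max(curve))

def cumulative_to_incidence(cum):
    """Convert a cumulative curve to per-step incidence."""
    return [cum[0]] + [b - a for a, b in zip(cum, cum[1:])]

weekly_cases = [3, 10, 42, 95, 60, 21, 8]  # hypothetical weekly counts
print(peak_value(weekly_cases))    # 95
print(time_to_peak(weekly_cases))  # 3
print(cumulative_to_incidence([3, 13, 55, 150, 210, 231, 239]))
```

The conversion helper is the kind of reconciliation step needed when one source reports incidence and another reports cumulative totals, as noted above.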
EpiViewer is a single page web application that provides a framework for exploring, comparing, and organizing temporal datasets. It offers a variety of features for convenient filtering and analysis of epicurves based on meta-attribute tagging. EpiViewer also provides a platform for sharing data between groups for better comparison and analysis. Our user study demonstrated that EpiViewer is easy to use and fills a particular niche in the toolspace for visualization and exploration of epidemiological data.
We present MacKenzie, an HPC-driven multi-cluster workflow system that was used repeatedly to configure and execute fine-grained US national-scale epidemic simulation models during the COVID-19 pandemic. MacKenzie supported federal and Virginia policymakers, in real-time, for a large number of “what-if” scenarios during the COVID-19 pandemic, and continues to be used to answer related questions as COVID-19 transitions to the endemic stage of the disease. MacKenzie is a novel HPC meta-scheduler that can execute US-scale simulation models and associated workflows that typically present significant big data challenges. The meta-scheduler optimizes the total execution time of simulations in the workflow, and helps improve overall human productivity.
As an exemplar of the kind of studies that can be conducted using MacKenzie, we present a modeling study to understand the impact of vaccine acceptance in controlling the spread of COVID-19 in the US. We use a 288-million-node synthetic social contact network (digital twin) spanning all 50 US states plus Washington DC, comprising 3,300 counties, with 12 billion daily interactions. The highly resolved agent-based model used for the epidemic simulations uses realistic information about disease progression, vaccine uptake, production schedules, acceptance trends, prevalence, and social distancing guidelines. Computational experiments show that, for the simulation workload discussed above, MacKenzie is able to scale up well to 10K CPU cores.
Our modeling results show that, when compared to faster and accelerating vaccinations, slower vaccination rates due to vaccine hesitancy cause averted infections to drop from 6.7M to 4.5M, and averted total deaths to drop from 39.4K to 28.2K across the US. This occurs despite the fact that the final vaccine coverage is the same in both scenarios. We also find that if vaccine acceptance could be increased by 10% in all states, averted infections could be increased from 4.5M to 4.7M (a 4.4% improvement) and total averted deaths could be increased from 28.2K to 29.9K (a 6% improvement) nationwide.
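The quoted relative improvements follow directly from the averted-outcome figures; the short check below reproduces the arithmetic.

```python
# Sanity-check the relative improvements quoted in the abstract.
def pct_gain(base, improved):
    """Percentage improvement of `improved` over `base`."""
    return 100 * (improved - base) / base

print(round(pct_gain(4.5e6, 4.7e6), 1))    # 4.4 -> "a 4.4% improvement"
print(round(pct_gain(28.2e3, 29.9e3), 1))  # 6.0 -> "a 6% improvement"
```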
•Presents MacKenzie, a novel multi-cluster HPC job scheduler.
•A novel workflow that reduces execution time and improves productivity.
•Detailed, fine-grained, data-driven epidemic models.
•Detailed analysis of the COVID-19 vaccine allocation problem.
•Real-world workflow that has been used repeatedly to brief Virginia and federal policymakers.
Scenario-based modeling frameworks have been widely used to support policy-making at state and federal levels in the United States during the COVID-19 response. While custom-built models can be used to support one-off studies, sustained updates to projections under changing pandemic conditions require a robust, integrated, and adaptive framework. In this paper, we describe one such framework, UVA-adaptive, that was built to support the CDC-aligned Scenario Modeling Hub (SMH) across multiple rounds, as well as weekly/biweekly projections to the Virginia Department of Health (VDH) and US Department of Defense during the COVID-19 response. Building upon an existing metapopulation framework, PatchSim, UVA-adaptive uses a calibration mechanism relying on adjustable effective transmissibility as a basis for scenario definition while also incorporating real-time datasets on case incidence, seroprevalence, variant characteristics, and vaccine uptake. Through the pandemic, our framework evolved by incorporating available data sources and was extended to capture complexities of multiple strains and heterogeneous immunity of the population. Here we present the version of the model that was used for the recent projections for SMH and VDH, describe the calibration and projection framework, and demonstrate that the calibrated transmissibility correlates with the evolution of the pathogen as well as associated societal dynamics.
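The core mechanic of a patch-based metapopulation model with an adjustable effective transmissibility can be sketched compactly. The code below is a toy two-patch discrete-time SIR, not PatchSim itself: the mobility matrix, `beta_eff`, `gamma`, and populations are all invented for illustration.

```python
import numpy as np

# Toy two-patch metapopulation SIR step (not PatchSim; all parameters
# are illustrative). `beta_eff` plays the role of the adjustable
# effective transmissibility used for calibration.
def step(S, I, R, beta_eff, gamma, mobility):
    N = S + I + R
    # infectious prevalence each patch is exposed to after mixing
    prev = mobility @ (I / N)
    new_inf = beta_eff * S * prev
    new_rec = gamma * I
    return S - new_inf, I + new_inf - new_rec, R + new_rec

S = np.array([9990.0, 10000.0])   # susceptibles per patch
I = np.array([10.0, 0.0])         # seed infections in patch 0 only
R = np.zeros(2)
mobility = np.array([[0.95, 0.05],
                     [0.05, 0.95]])  # row i: where patch-i residents mix

for _ in range(50):
    S, I, R = step(S, I, R, beta_eff=0.3, gamma=0.2, mobility=mobility)
print(R)  # recovered counts per patch after 50 steps
```

Calibration in such a framework amounts to adjusting `beta_eff` over time so the simulated incidence tracks observed data, while scenarios modify its future trajectory.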
•Supporting the Scenario Modeling Hub for COVID-19 response.
•Multi-strain mechanistic model with stratified immunity and effective transmissibility for COVID-19 scenario projections.
•Data-driven calibration of apparent transmissibility for real-time updates and flexibility in scenario implementation.
When an influenza pandemic emerges, temporary school closures and antiviral treatment may slow virus spread, reduce the overall disease burden, and provide time for vaccine development, distribution, and administration while keeping a larger portion of the general population infection free. The impact of such measures will depend on the transmissibility and severity of the virus and the timing and extent of their implementation. To provide robust assessments of layered pandemic intervention strategies, the Centers for Disease Control and Prevention (CDC) funded a network of academic groups to build a framework for the development and comparison of multiple pandemic influenza models. Research teams from Columbia University, Imperial College London/Princeton University, Northeastern University, the University of Texas at Austin/Yale University, and the University of Virginia independently modeled three prescribed sets of pandemic influenza scenarios developed collaboratively by the CDC and network members. Results provided by the groups were aggregated into a mean-based ensemble. The ensemble and most component models agreed on the ranking of the most and least effective intervention strategies by impact but not on the magnitude of those impacts. In the scenarios evaluated, vaccination alone, due to the time needed for development, approval, and deployment, would not be expected to substantially reduce the numbers of illnesses, hospitalizations, and deaths that would occur. Only strategies that included early implementation of school closure were found to substantially mitigate early spread and allow time for vaccines to be developed and administered, especially under a highly transmissible pandemic scenario.
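A mean-based ensemble of the kind described is simply a pointwise average of each group's projected time series. The sketch below shows the mechanics; the model names and weekly numbers are illustrative, not the study's actual outputs.

```python
# Mean-based ensemble sketch: pointwise average of model projections.
# Model names and values are illustrative, not the study's actual outputs.
projections = {
    "model_A": [120, 340, 610, 480, 220],
    "model_B": [100, 300, 700, 500, 180],
    "model_C": [140, 380, 640, 430, 230],
}

ensemble = [sum(vals) / len(vals) for vals in zip(*projections.values())]
print(ensemble)  # [120.0, 340.0, 650.0, 470.0, 210.0]
```

Note that such an ensemble preserves the qualitative shape the models agree on (e.g., ranking of scenarios) even when the individual magnitudes differ, consistent with the findings above.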
Objectives: This research studies the role of slums in the spread and control of infectious diseases in the National Capital Territory of India, Delhi, using detailed social contact networks of its residents.
Methods: We use an agent-based model to study the spread of influenza in Delhi through person-to-person contact. Two different networks are used: one in which slum and non-slum regions are treated the same, and the other in which 298 slum zones are identified. In the second network, slum-specific demographics and activities are assigned to the individuals whose homes are located inside these zones. The main effects of integrating slums are that the network has more home-related contacts due to larger family sizes and more outside contacts due to more daily activities outside home. Various vaccination and social distancing interventions are applied to control the spread of influenza.
Results: Simulation-based results show that when slum attributes are ignored, the effectiveness of vaccination can be overestimated by 30%–55%, in terms of reducing the peak number of infections and the size of the epidemic, and in delaying the time to peak infection. The slum population sustains greater infection rates under all intervention scenarios in the network that treats slums differently. Vaccination strategy performs better than social distancing strategies in slums.
Conclusions: Unique characteristics of slums play a significant role in the spread of infectious diseases. Modelling slums and estimating their impact on epidemics will help policy makers and regulators more accurately prioritise allocation of scarce medical resources and implement public health policies.
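The effect of household size on home-related contacts can be made concrete with a small calculation: in a fully mixed household of size n there are n(n-1)/2 contact pairs, so larger families increase contacts faster than linearly. The household sizes below are invented for illustration, not Delhi data.

```python
# Why larger household sizes inflate home contacts: a fully mixed
# household of size n contributes n*(n-1)/2 contact pairs.
# Household sizes are illustrative, not Delhi data.
def home_contacts(household_sizes):
    return sum(n * (n - 1) // 2 for n in household_sizes)

non_slum = [3] * 100  # 100 households of size 3 (300 people)
slum = [6] * 50       # 50 households of size 6 (same 300 people)
print(home_contacts(non_slum))  # 300
print(home_contacts(slum))      # 750
```

Holding population fixed, doubling household size here multiplies home contacts by 2.5, one reason ignoring slum demographics biases intervention estimates.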
This paper describes an integrated, data-driven operational pipeline based on national agent-based models to support federal and state-level pandemic planning and response. The pipeline consists of (i) an automatic semantic-aware scheduling method that coordinates jobs across two separate high performance computing systems; (ii) a data pipeline to collect, integrate and organize national and county-level disaggregated data for initialization and post-simulation analysis; (iii) a digital twin of national social contact networks made up of 288 million individuals and 12.6 billion time-varying interactions covering the US states and DC; (iv) an extension of a parallel agent-based simulation model to study epidemic dynamics and associated interventions. This pipeline can run 400 replicates of national runs in less than 33 h, and reduces the need for human intervention, resulting in faster turnaround times and higher reliability and accuracy of the results. Scientifically, the work has led to significant advances in real-time epidemic sciences.