Since the onset of the COVID-19 pandemic, many researchers and health advisory institutions have focused on predicting virus spread through epidemiological models. Such models rely on virus and disease characteristics, most of which are uncertain or even unknown for SARS-CoV-2. This study addresses the validity of various assumptions using an epidemiological simulation model. The contributions of this work are twofold. First, we show that multiple scenarios all lead to realistic numbers of deaths and ICU admissions, two observable and verifiable metrics. Second, we test the sensitivity of estimates for the number of infected and immune individuals, and show that these vary strongly between scenarios. Note that the amount of variation measured in this study is merely a lower bound: epidemiological modeling involves uncertainty in more parameters than the four considered here, and including those as well would lead to an even larger set of possible scenarios. As the levels of infection and immunity among the population are particularly important for policy makers, further research on virus and disease progression characteristics is essential. Until then, epidemiological modeling studies cannot give conclusive results and should come with a careful analysis of several scenarios for virus and disease characteristics.
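The core sensitivity argument can be illustrated with a minimal discrete-time SIR sketch (far simpler than the study's simulation model; all parameter values below are hypothetical). Two scenarios are tuned so that the observable death counts are comparable while the unobservable cumulative numbers infected differ strongly:

```python
def run_sir(beta, gamma, ifr, days=300, n=1_000_000, i0=100):
    """Euler-stepped SIR model; returns (total infected, deaths)."""
    s, i, r = n - i0, float(i0), 0.0
    for _ in range(days):
        new_inf = beta * s * i / n     # new infections this day
        new_rec = gamma * i            # new recoveries this day
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
    total_infected = n - s
    return total_infected, ifr * total_infected

# Scenario A: more transmissible virus (R0 = 3.0), lower fatality rate.
inf_a, deaths_a = run_sir(beta=0.30, gamma=0.10, ifr=0.005)
# Scenario B: less transmissible virus (R0 = 1.8), higher fatality rate,
# chosen so that the resulting death count is close to scenario A's.
inf_b, deaths_b = run_sir(beta=0.18, gamma=0.10, ifr=0.0065)
```

Both scenarios yield similar death counts, yet the implied levels of infection, and hence immunity, differ substantially, which is why matching observable metrics alone cannot pin down the unobservable ones.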
Abstract
Motivation
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease caused by aberrations in the genome. While several disease-causing variants have been identified, a major part of the heritability remains unexplained. ALS is believed to have a complex genetic basis in which non-additive combinations of variants contribute to disease, which cannot be picked up by the linear models employed in classical genotype–phenotype association studies. Deep learning, on the other hand, is highly promising for identifying such complex relations. We therefore developed a deep-learning-based approach for classifying ALS patients versus healthy individuals in the Dutch cohort of the Project MinE dataset. Based on the recent insight that regulatory regions harbor the majority of disease-associated variants, we employ a two-step approach: first, promoter regions likely associated with ALS are identified; second, individuals are classified based on their genotypes in the selected genomic regions. Both steps employ a deep convolutional neural network. The network architecture accounts for the structure of genome data by applying convolution only to those parts of the data where this makes sense from a genomics perspective.
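The region-wise convolution idea can be sketched as a toy forward pass (not the authors' architecture; the region boundaries, sizes, and weights below are invented for illustration). Variants are grouped per promoter region, and a 1-D convolution is applied within each region only, so filters never mix variants from unrelated genomic loci:

```python
import math
import random

def conv1d_valid(x, kernel):
    """Plain 'valid' 1-D cross-correlation of a list x with a kernel."""
    k = len(kernel)
    return [sum(xi * wi for xi, wi in zip(x[i:i + k], kernel))
            for i in range(len(x) - k + 1)]

def region_wise_forward(genotypes, regions, kernel, w_out):
    """genotypes: variant dosages (0/1/2) for one individual;
    regions: (start, end) index pairs, one per promoter region."""
    feats = []
    for start, end in regions:
        h = conv1d_valid(genotypes[start:end], kernel)
        feats.append(max(max(v, 0.0) for v in h))   # ReLU + max-pool per region
    z = sum(f * w for f, w in zip(feats, w_out))    # dense output layer
    return 1.0 / (1.0 + math.exp(-z))               # case probability

random.seed(0)
genotypes = [float(random.randint(0, 2)) for _ in range(40)]
regions = [(0, 10), (10, 25), (25, 40)]             # hypothetical promoters
kernel = [random.gauss(0, 1) for _ in range(4)]     # shared conv filter
w_out = [random.gauss(0, 1) for _ in range(3)]
p_case = region_wise_forward(genotypes, regions, kernel, w_out)
```

The design choice being illustrated is the restriction of weight sharing to within-region windows, which keeps the parameter count manageable at whole-genome scale.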
Results
Our approach identifies potentially ALS-associated promoter regions, and generally outperforms other classification methods. Test results support the hypothesis that non-additive combinations of variants contribute to ALS. Architectures and protocols developed are tailored toward processing population-scale, whole-genome data. We consider this a relevant first step toward deep learning assisted genotype–phenotype association in whole genome-sized data.
Availability and implementation
Our code will be available on GitHub, together with a synthetic dataset (https://github.com/byin-cwi/ALS-Deeplearning). The data used in this study are available to bona fide researchers upon request.
Supplementary information
Supplementary data are available at Bioinformatics online.
Linear programming (LP) is often used in diet optimization to find, from a set of available food commodities, the most affordable diet that meets the nutritional requirements of an individual or (sub)population. It is, however, not always possible to create a feasible diet, as certain nutritional requirements are difficult to meet. In that case, goal programming (GP) can be used to minimize deviations from the nutritional requirements in order to obtain a near-feasible diet. With GP, the cost of the diet is often overlooked or taken into account using the ε-constraint method. This method, however, cannot guarantee that all possible trade-offs between cost and nutritional deficiency are found without solving many uninformative LPs.
We present a method to find all trade-offs between any two linear objectives in a dietary LP context that is simple, avoids solving uninformative LPs, and needs no prior input from the decision maker (DM). The method is a bi-objective algorithm based on the NonInferior Set Estimation (NISE) method and finds all efficient trade-offs between the two objectives.
To show what type of insights can be gained from this approach, we present two analyses that investigate the relation between cost and nutritional adequacy. The first analysis considers a diet with a restriction on the exact energy intake, in which all nutrient intakes except energy are allowed to deviate from their prescription. This analysis is especially helpful in the case of a restrictive budget, or when a nutritionally adequate diet is either unaffordable or unattainable. The second analysis relaxes only the exact energy intake, keeping the other nutrients within their requirements, to investigate how the energy intake affects the cost of a diet. Here, we describe in which situations the so-called more-for-less paradox occurs, which can be induced by requiring an exact energy intake.
To the best of our knowledge, we are the first to address how to obtain all efficient trade-offs of two linear objectives in a dietary LP context and how this can be used for analyses.
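The NISE recursion can be sketched in a few lines. Given a routine that minimizes any weighted sum w1*f1 + w2*f2 over the feasible set, NISE starts from the two single-objective optima and probes the weight vector perpendicular to each pair of neighbouring efficient points, splitting the pair whenever a strictly better point is found in between. In this toy sketch (not the paper's implementation) the LP solve is replaced by a lookup over the extreme points of an invented feasible region, where f1 could be diet cost and f2 total nutritional deficiency:

```python
# Extreme points of a toy feasible region, as (f1, f2) objective values.
VERTICES = [(0.0, 10.0), (1.0, 6.0), (3.0, 3.0), (7.0, 1.0), (12.0, 0.0)]

def solve_weighted(w1, w2):
    """Stand-in for an LP solver: minimise w1*f1 + w2*f2 over the vertices."""
    return min(VERTICES, key=lambda v: w1 * v[0] + w2 * v[1])

def nise():
    a = solve_weighted(1.0, 0.0)           # best-f1 endpoint
    b = solve_weighted(0.0, 1.0)           # best-f2 endpoint
    efficient, stack = {a, b}, [(a, b)]
    while stack:
        p, q = stack.pop()
        w1, w2 = p[1] - q[1], q[0] - p[0]  # weights normal to segment p-q
        r = solve_weighted(w1, w2)
        # Split only if r strictly improves on the segment's weighted value.
        if w1 * r[0] + w2 * r[1] < w1 * p[0] + w2 * p[1] - 1e-9:
            efficient.add(r)
            stack += [(p, r), (r, q)]
    return sorted(efficient)

frontier = nise()   # all efficient extreme trade-offs, sorted by f1
```

Each weighted-sum solve either confirms a segment of the frontier or discovers a new efficient point, so no uninformative LPs are solved.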
In this article we provide a method to generate the trade-off between delivery time and fluence map matching quality for dynamically delivered fluence maps. At the heart of our method lies a mathematical programming model that, for a given delivery duration, optimizes leaf trajectories and dose rates such that the desired fluence map is reproduced as well as possible. We begin with the single fluence map case and then generalize the model and the solution technique to the delivery of sequential fluence maps. The resulting large-scale, non-convex optimization problem was solved using a heuristic approach. We test our method on a prostate case and a head-and-neck case, and present the resulting trade-off curves. Analysis of the leaf trajectories reveals that short-duration plans generally have larger leaf openings than plans with longer delivery times. Our method allows one to explore the continuum of possibilities between the coarse, large-segment plans characteristic of direct aperture approaches and the narrow-field plans produced by sliding-window approaches. Exposing this trade-off allows an informed choice between plan quality and delivery time. Further research is required to speed up the optimization process to make this method clinically implementable.
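A toy one-dimensional version of the time/quality trade-off can be made concrete with the classic sliding-window bound (this is an illustration, not the paper's optimization model; the profile values are arbitrary): for a single leaf pair travelling in one direction, the minimum beam-on time needed to reproduce a discrete fluence profile exactly is proportional to its first value plus the sum of its positive increments. Smoothing the profile lowers that bound but degrades the match:

```python
def min_beam_on(f):
    """Sliding-window lower bound on beam-on time for a profile f >= 0."""
    total = f[0]
    for prev, cur in zip(f, f[1:]):
        total += max(0.0, cur - prev)   # only positive increments cost time
    return total

def smooth(f):
    """Three-point moving average, endpoints kept fixed."""
    g = list(f)
    for i in range(1, len(f) - 1):
        g[i] = (f[i - 1] + f[i] + f[i + 1]) / 3.0
    return g

target = [0.0, 3.0, 1.0, 4.0, 2.0, 5.0, 0.0]   # desired fluence profile
coarse = smooth(target)                         # cheaper-to-deliver version

t_exact = min_beam_on(target)    # time needed for a perfect match
t_coarse = min_beam_on(coarse)   # shorter delivery, degraded match
mismatch = sum(abs(a - b) for a, b in zip(target, coarse))
```

Sweeping the amount of smoothing traces out a toy version of the trade-off curve between delivery time and fluence map matching quality.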
Increasing cancer incidence, staff shortages, and high burnout rates among radiation oncologists, medical physicists and radiation technicians are putting many departments under strain. Operations research (OR) tools could optimize radiotherapy processes; however, clinical implementation of OR tools in radiotherapy is scarce, since most investigated optimization methods lack robustness against patient-to-patient variation in the duration of tasks. By combining OR tools, a method was developed that optimizes the deployment of radiotherapy resources by generating robust pretreatment preparation schedules that balance the expected average patient preparation time against the risk of working overtime. The method was evaluated for various settings of a one-stop shop (OSS) outpatient clinic for palliative radiotherapy.
The OSS at our institute sees, scans and treats 3-5 patients within one day. The OSS pretreatment preparation workflow consists of a fixed sequence of tasks, which was manually optimized for radiation oncologist and CT availability. To find more optimal sequences, with a shorter expected preparation time and a lower overtime risk, a genetic algorithm was developed that regards these sequences as DNA strands. The genetic algorithm applied natural selection principles to produce new sequences. A decoder translated sequences into schedules, from which the two conflicting fitness parameters, expected preparation time and overtime risk, were obtained. For every generation, the fitness of a sequence was determined by its distance to the estimated Pareto front of the two objectives. Experiments were run in various OSS settings.
According to our approach, the expected preparation time of the current clinical schedule could be reduced by 37% without increasing the overtime risk. Additional experiments provided insights into the trade-offs between preparation time, overtime risk, working shift length, number of patients treated on a single day, and staff composition.
Our approach demonstrates that OR tools can optimize the use of radiotherapy resources through robust pretreatment workflow scheduling. The results strongly support further exploration of scheduling optimization for treatment preparation, also outside a one-stop shop or radiotherapy setting.
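The sequence-as-DNA idea can be sketched with a toy permutation GA (illustrative only, not the paper's code: the task durations, due times, and single-resource decoder are invented, and a single expected-tardiness objective stands in for the paper's two objectives and Pareto-distance fitness). A task sequence is the chromosome, a decoder turns it into a schedule, and selection, crossover and mutation evolve the population:

```python
import random

random.seed(1)
MEAN_DUR = [4, 2, 6, 3, 5, 1]        # hypothetical mean task durations
DUE = [6, 4, 20, 10, 18, 24]         # hypothetical per-task due times

def decode(seq, durations):
    """Schedule tasks in order on one resource; return total tardiness."""
    t, tardiness = 0.0, 0.0
    for task in seq:
        t += durations[task]
        tardiness += max(0.0, t - DUE[task])
    return tardiness

def fitness(seq, samples=50):
    """Expected tardiness under random task durations (lower is better)."""
    total = 0.0
    for _ in range(samples):
        durs = [random.uniform(0.5 * m, 1.5 * m) for m in MEAN_DUR]
        total += decode(seq, durs)
    return total / samples

def crossover(a, b):
    """One-point order crossover: keep a prefix of a, fill the rest from b."""
    head = a[:random.randrange(1, len(a))]
    return head + [t for t in b if t not in head]

def mutate(seq):
    """Swap two random positions in place."""
    i, j = random.sample(range(len(seq)), 2)
    seq[i], seq[j] = seq[j], seq[i]

pop = [random.sample(range(6), 6) for _ in range(20)]
for _ in range(30):                    # generations
    pop.sort(key=fitness)              # noisy ranking by expected tardiness
    survivors = pop[:10]               # elitist selection
    children = [crossover(random.choice(survivors), random.choice(survivors))
                for _ in range(10)]
    for c in children:
        if random.random() < 0.3:
            mutate(c)
    pop = survivors + children
best = min(pop, key=fitness)
```

Sampling the task durations inside the fitness evaluation is what makes the resulting sequence robust to patient-to-patient variation, the same principle the paper applies to its two objectives.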
The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
Abstract
Motivation
The microbes that live in an environment can be identified from their combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reducing the size of metagenomic datasets is to cluster reads into groups based on their overlaps. Clustered reads are valuable for downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes.
Results
We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity.
Conclusion
OGRE is the only method that can successfully cluster reads into species-specific clusters for large metagenomic datasets without running into computation time or memory issues.
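Conceptually, overlap-based read clustering amounts to building a graph of suffix-prefix overlaps and taking its connected components. The naive quadratic sketch below (with made-up reads; not OGRE itself, whose contribution is precisely avoiding this all-pairs comparison at scale) illustrates the idea using union-find:

```python
def overlaps(a, b, min_len):
    """True if a suffix of `a` of length >= min_len equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return True
    return False

def cluster_reads(reads, min_overlap=4):
    parent = list(range(len(reads)))          # union-find over read indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):    # all-pairs overlap check
            if overlaps(reads[i], reads[j], min_overlap) or \
               overlaps(reads[j], reads[i], min_overlap):
                parent[find(i)] = find(j)     # merge the two clusters

    clusters = {}
    for i in range(len(reads)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

reads = ["ACGTACGT", "TACGTTTT", "GGGGCCCC", "CCCCAAAA"]   # toy reads
groups = cluster_reads(reads)
```

Reads from the same species share overlaps and end up in the same component, which is the sense in which clusters are species-specific.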
Availability and implementation
Code is made available on GitHub (https://github.com/Marleen1/OGRE).
Supplementary information
Supplementary data are available at Bioinformatics online.
Purpose
Theoretical studies have shown that dose-painting-by-numbers (DPBN) could lead to large gains in tumor control probability (TCP) compared to conventional dose distributions. However, these gains may vary considerably among patients due to (a) variations in the overall radiosensitivity of the tumor, (b) variations in the 3D distribution of intra-tumor radiosensitivity in combination with patient anatomy, (c) uncertainties in the 3D radiosensitivity maps, (d) geometrical uncertainties, and (e) temporal changes in radiosensitivity. The goal of this study was to investigate how much of the theoretical gain of DPBN remains when accounting for these factors. DPBN was compared to both a homogeneous reference dose distribution and to nonselective dose escalation (NSDE), which uses the same dose constraints as DPBN but does not require 3D radiosensitivity maps.
Methods
A fully automated DPBN treatment planning strategy, robust to uncertainties in radiosensitivity and patient positioning, was developed and implemented in our in-house treatment planning system (TPS). The method optimizes the expected TCP based on 3D maps of intra-tumor radiosensitivity, while accounting for normal-tissue constraints, uncertainties in radiosensitivity, and setup uncertainties. Based on FDG-PET/CT scans of 12 non-small cell lung cancer (NSCLC) patients, data for 324 virtual patients were created synthetically, with large variations in the aforementioned parameters. DPBN was compared to both a uniform dose distribution of 60 Gy and NSDE. In total, 360 DPBN and 24 NSDE treatment plans were optimized.
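The expected-TCP objective can be illustrated with a toy Poisson TCP model over three voxels (all numbers are hypothetical, and the model is far simpler than the TPS described above): redistributing a fixed integral dose towards a radioresistant voxel raises TCP, and uncertainty in the radiosensitivity map is handled by averaging TCP over sampled maps:

```python
import math
import random

N0 = 1e7                             # clonogens per voxel (hypothetical)
ALPHAS = [0.35, 0.30, 0.22]          # nominal radiosensitivities (1/Gy)
TOTAL_DOSE = 180.0                   # fixed integral dose over 3 voxels

def tcp(doses, alphas):
    """Poisson TCP over independent voxels: prod_i exp(-N0 exp(-a_i d_i))."""
    return math.exp(sum(-N0 * math.exp(-a * d)
                        for a, d in zip(alphas, doses)))

def expected_tcp(doses, n_samples=2000, sigma=0.02, seed=0):
    """Average TCP over radiosensitivity maps sampled around the nominal map
    (the expected-TCP idea)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        sampled = [max(0.05, rng.gauss(a, sigma)) for a in ALPHAS]
        acc += tcp(doses, sampled)
    return acc / n_samples

uniform = [TOTAL_DOSE / 3] * 3       # 60 Gy everywhere
painted = [47.9, 55.9, 76.2]         # dose shifted to the resistant voxel
assert abs(sum(painted) - TOTAL_DOSE) < 1e-9

tcp_uniform = tcp(uniform, ALPHAS)   # limited by the resistant voxel
tcp_painted = tcp(painted, ALPHAS)   # higher, at the same integral dose
```

Optimizing `expected_tcp` rather than the nominal `tcp` is what makes a plan robust: it is rewarded for performing well across the plausible radiosensitivity maps, not just the single assumed one.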
Results
The average gain in TCP over all patients and radiosensitivity maps of DPBN was 0.54 ± 0.20 (range 0–0.97) compared to the 60 Gy uniform reference dose distribution, but only 0.03 ± 0.03 (range 0–0.22) compared to NSDE. The gains varied per patient depending on the radiosensitivity of the entire tumor and the 3D radiosensitivity maps. Uncertainty in radiosensitivity led to a considerable loss in TCP gain, which could be recovered almost completely by accounting for the uncertainty directly in the optimization.
Conclusions
Our results suggest that the gains of DPBN can be considerable compared to a 60 Gy uniform reference dose distribution, but small compared to NSDE for most patients. Using the robust DPBN treatment planning system developed in this work, the optimal DPBN treatment plan can be derived for any patient for whom 3D intra-tumor radiosensitivity maps are available, and can be used to select patients who might benefit from DPBN. NSDE could be an effective strategy to increase TCP without requiring biological information about the tumor.