After claiming nearly five hundred thousand lives globally, the COVID-19 pandemic is showing no signs of slowing down. While the UK, USA, Brazil and parts of Asia are bracing themselves for the second wave (or the extension of the first wave), it is imperative to identify the primary social, economic, environmental, demographic, ethnic, cultural and health factors contributing towards COVID-19 infection and mortality numbers to facilitate mitigation and control measures.
We process several open-access datasets on US states to create an integrated dataset of potential factors leading to the pandemic spread. We then apply several supervised machine learning approaches to reach a consensus as well as rank the key factors. We carry out regression analysis to pinpoint the key pre-lockdown factors that affect post-lockdown infection and mortality, informing future lockdown-related policy making.
Population density, testing numbers and airport traffic emerge as the most discriminatory factors, followed by higher age groups (above 40 and specifically 60+). Post-lockdown infection and death rates are highly influenced by their pre-lockdown counterparts, followed by population density and airport traffic. While the healthcare index seems uncorrelated with mortality rate, principal component analysis on the key features shows two groups: states (1) forming early epicenters and (2) experiencing a strong second wave or peaking late in rate of infection and death. Finally, a small case study on New York City shows that days-to-peak for infection of neighboring boroughs correlates better with inter-zone mobility than with inter-zone distance.
States forming the early hotspots are regions with high airport or road traffic resulting in human interaction. US states with high population density and testing tend to exhibit consistently high infection and death numbers. Mortality rate seems to be driven by individual physiology, preexisting conditions and age, rather than gender, healthcare facility or ethnic predisposition. Finally, policymaking on the timing of lockdowns should primarily consider the pre-lockdown infection numbers along with population density and airport traffic.
COVID-19, a global pandemic caused by the Severe Acute Respiratory Syndrome Coronavirus 2 virus, has claimed millions of lives worldwide. Amid soaring contagion due to newer strains of the virus, it is imperative to design dynamic, spatiotemporal models to contain the spread of infection during future outbreaks of the same or variants of the virus. The reliance of existing prediction and contact tracing approaches on prior knowledge of inter- or intra-zone mobility renders them impracticable. We present a spatiotemporal approach that applies network inference with sliding time windows solely to the dates and daily infection counts of zones within a geographical region to generate temporal networks capturing the influence of each zone on another. It helps analyze the spatial interaction among the hotspot or spreader zones and highly affected zones based on the flow of network contagion traffic. We apply the proposed approach to the daily infection counts of New York State as well as the states of the USA to show that it effectively measures the phase shifts in the pandemic timeline. It identifies the spreaders and affected zones at different time points and helps infer the trajectory of the pandemic spread across the country. A small set of zones periodically exhibits a very high outflow of contagion traffic over time, suggesting that they act as the key spreaders of infection. Moreover, the strong influence between the majority of non-neighbor regions suggests that the overall spread of infection is a result of the unavoidable long-distance trips by a large number of people as opposed to the shorter trips at a county level, thereby informing future mitigation measures and public policies.
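The sliding-window idea can be illustrated with a minimal sketch. It assumes a simple lagged Pearson correlation as the influence measure, which is a stand-in chosen for illustration and not the network inference method used in the paper; the function names, the `lag` parameter and the `threshold` cut-off are all hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def window_network(counts, start, width, lag=1, threshold=0.8):
    """Directed edges (u, v, weight) for one time window.

    Assumption: an edge u -> v is drawn when u's daily counts in the
    window correlate strongly with v's counts `lag` days later.
    """
    edges = []
    for u, su in counts.items():
        for v, sv in counts.items():
            if u == v:
                continue
            src = su[start:start + width]
            dst = sv[start + lag:start + lag + width]
            if len(src) < width or len(dst) < width:
                continue  # window runs past the end of the series
            w = pearson(src, dst)
            if w >= threshold:
                edges.append((u, v, w))
    return edges


def temporal_networks(counts, width, step=1, lag=1, threshold=0.8):
    """One edge list per window position, sliding by `step` days."""
    n_days = min(len(s) for s in counts.values())
    return [window_network(counts, t, width, lag, threshold)
            for t in range(0, n_days - width - lag + 1, step)]
```

Summing the weights of outgoing edges per zone in each window would give a rough analogue of the outflow of contagion traffic described above, under the same assumptions.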
Network inference is used to model transcriptional, signaling, and metabolic interactions among genes, proteins, and metabolites that identify biological pathways influencing disease pathogenesis. Advances in machine learning (ML)-based inference models exhibit the predictive capabilities of capturing latent patterns in genomic data. Such models are emerging as an alternative to the statistical models identifying causative factors driving complex diseases. We present CoVar, an ML-based framework that builds upon the properties of existing inference models, to find the central genes driving perturbed gene expression across biological states. Unlike differentially expressed genes (DEGs) that capture changes in individual gene expression across conditions, CoVar focuses on identifying variational genes that undergo changes in their expression network interaction profiles, providing insights into changes in the regulatory dynamics, such as in disease pathogenesis. Subsequently, it finds core genes from among the nearest neighbors of these variational genes, which are central to the variational activity and influence the coordinated regulatory processes underlying the observed changes in gene expression. Through the analysis of simulated as well as yeast expression data perturbed by the deletion of the mitochondrial genome, we show that CoVar captures the intrinsic variationality and modularity in the expression data, identifying key driver genes not found through existing differential analysis methodologies.
Inflammatory bowel diseases (IBD), namely Crohn’s disease (CD) and ulcerative colitis (UC), are chronic inflammatory conditions of the gastrointestinal tract. IBD patient conditions and treatments, such as with immunosuppressants, may result in a higher risk of viral and bacterial infection and more severe outcomes of infections. The effect of the clinical and demographic factors on the prognosis of COVID-19 among IBD patients is still a significant area of investigation. The lack of available data on a large set of COVID-19 infected IBD patients has hindered progress. To circumvent this lack of large patient data, we present a random sampling approach to generate clinical COVID-19 outcomes (outpatient management, hospitalized and recovered, and hospitalized and deceased) on 20,000 IBD patients modeled on reported summary statistics obtained from the Surveillance Epidemiology of Coronavirus Under Research Exclusion (SECURE-IBD), an international database to monitor and report on outcomes of COVID-19 occurring in IBD patients. We apply machine learning approaches to perform a comprehensive analysis of the primary and secondary covariates to predict COVID-19 outcome in IBD patients. Our analysis reveals that age, medication usage and the number of comorbidities are the primary covariates, while IBD severity, smoking history, gender and IBD subtype (CD or UC) are key secondary features. In particular, elderly male patients with ulcerative colitis and several preexisting conditions who smoke comprise a highly vulnerable IBD population. Moreover, treatment with 5-ASAs (sulfasalazine/mesalamine) shows a high association with COVID-19/IBD mortality. Supervised machine learning that considers age, number of comorbidities and medication usage can predict COVID-19/IBD outcomes with approximately 70% accuracy. We explore the challenge of drawing demographic inferences from existing COVID-19/IBD data.
Overall, there are fewer IBD case reports from US states with poor health ranking, which hinders these analyses. Generation of patient characteristics based on known summary statistics allows for increased power to detect IBD factors leading to variable COVID-19 outcomes. There is under-reporting of COVID-19 in IBD patients from US states with poor health ranking, underscoring the perils of using the repository to derive demographic information.
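As a rough illustration of generating patient records from summary statistics, the sketch below draws each attribute from a table of reported proportions. It assumes attributes are sampled independently of one another, which may well differ from the sampling scheme actually used in the study; `sample_patients` and the structure of `marginals` are hypothetical, and no real SECURE-IBD figures are included.

```python
import random


def sample_patients(marginals, n, seed=0):
    """Draw n synthetic patient records from per-attribute proportions.

    `marginals` maps an attribute name to {level: proportion}.
    Assumption: attributes are sampled independently of each other.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    patients = []
    for _ in range(n):
        patient = {}
        for name, table in marginals.items():
            levels = list(table)
            weights = [table[level] for level in levels]
            patient[name] = rng.choices(levels, weights=weights, k=1)[0]
        patients.append(patient)
    return patients
```

A call such as `sample_patients({"age_group": {"<60": 0.8, "60+": 0.2}}, 20000)` would produce 20,000 records; the proportions in that call are placeholders, not reported values.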
Complex networks capture the structure, dynamics, and relationships among entities in real-world networked systems, encompassing domains like communications, society, chemistry, biology, ecology, politics, etc. Analysis of complex networks lends insight into the critical nodes, key pathways, and potential points of failure that may impact the connectivity and operational integrity of the underlying system. In this work, we investigate the topological properties or indicators, such as shortest path length, modularity, efficiency, graph density, diameter, assortativity, and clustering coefficient, that determine the vulnerability to (or robustness against) diverse attack scenarios. Specifically, we examine how node- and link-based network growth or depletion based on specific attack criteria affect their robustness gauged in terms of the largest connected component (LCC) size and diameter. We employ partial least squares discriminant analysis to quantify the individual contribution of the indicators on LCC preservation while accounting for the collinearity stemming from the possible correlation between indicators. Our analysis of 14 complex network datasets and 5 attack models invariably reveals high modularity and disassortativity to be prime indicators of vulnerability, corroborating prior works that report disassortative modular networks to be particularly susceptible to targeted attacks. We conclude with a discussion as well as an illustrative example of the application of this work in fending off strategic attacks on critical infrastructures through models that adaptively and distributively achieve network robustness.
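A minimal sketch of measuring robustness through LCC size is given below. It assumes one particular attack criterion (repeatedly removing the node with the highest current degree), which is only one plausible instance of the attack models mentioned above; the function names are hypothetical.

```python
from collections import deque


def largest_component_size(adj):
    """Size of the largest connected component, found by BFS."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def degree_attack(edges, n_removals):
    """LCC size before the attack and after each node removal.

    Assumed attack criterion: remove the node with the highest
    current degree at every step.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    sizes = [largest_component_size(adj)]
    for _ in range(n_removals):
        if not adj:
            break
        target = max(adj, key=lambda node: len(adj[node]))
        for neighbor in adj.pop(target):
            if neighbor in adj:  # guards against self-loops
                adj[neighbor].discard(target)
        sizes.append(largest_component_size(adj))
    return sizes
```

On a star graph, a single removal of the hub collapses the LCC from the whole network to one node, which is the kind of fragility the LCC trace is meant to expose.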
There is a concerted effort to develop vaccines to combat the public health crisis caused by COVID-19. Health experts are raising questions concerning the clinical trials, price and distribution of the vaccines among the public. Policymakers must design regulations to allocate vaccines equitably based on epidemiological as well as economic considerations. Allocation strategies must factor in the extent and duration of immunity the vaccines yield against future infections. We present a time-varying linear optimization-based approach, which incorporates epidemiological factors, such as population density, susceptible count and infected ratio as well as transportation costs, to disseminate vaccines among zones. Our approach also employs an update rule on the epidemiological statistics spawning from the SEIRD model to learn the extent of immunity provided by the vaccines. Our experiments are performed on the map of New York State with realistic mobility using spatial and ordinary differential equation-based SEIRD models. We show that the proposed approach allocates vaccines optimally while preventing zones from resource starvation. It adapts its vaccine recommendation to meet the prespecified epidemiological and economic policy requirements of zones, and is highly amenable to accommodating multiple immunity ratios, vaccine types and the time-span of acquired vaccine immunity. Moreover, we demonstrate that one can learn the latent immunity ratios of the vaccines to make more informed recommendations with time. Our ordinary differential equation-based analysis on the real demographic and infection data of New York State shows that a small fraction of zones tends to exhibit a high resource demand due to their vulnerability to the pandemic spread. Our vaccine allocation strategy incorporates infection ratio as one of the criteria and can also be applied to devise a drug allocation strategy.
The results suggest that incorporating the vulnerability score of zones to the epidemic spread into vaccine allocation may enhance recommendations and aid policy making.
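For readers unfamiliar with the SEIRD model referred to above, the following is a minimal sketch of its standard textbook form, integrated with a forward Euler step. The parameter names (`beta`, `sigma`, `gamma`, `mu`) and the choice of integrator are assumptions made for illustration; the vaccination term, the update rule and the spatial variant described in the abstract are not modeled here.

```python
def seird_step(state, beta, sigma, gamma, mu, dt=1.0):
    """One forward-Euler step of a standard SEIRD model.

    state = (S, E, I, R, D); beta = transmission rate,
    sigma = incubation rate, gamma = recovery rate, mu = death rate.
    """
    s, e, i, r, d = state
    living = s + e + i + r
    new_exposed = beta * s * i / living * dt
    new_infectious = sigma * e * dt
    new_recovered = gamma * i * dt
    new_deaths = mu * i * dt
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered - new_deaths,
        r + new_recovered,
        d + new_deaths,
    )


def simulate(state, days, **params):
    """Trajectory of states, starting with the initial state."""
    trajectory = [state]
    for _ in range(days):
        state = seird_step(state, **params)
        trajectory.append(state)
    return trajectory
```

Because every flow leaves one compartment and enters another, the total across the five compartments stays constant, which is a convenient sanity check for any extension of the sketch.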
Link prediction algorithms in complex networks, such as social networks, biological networks, drug-drug interactions, communication networks, and so on, assign scores to predict potential links between two nodes. Link prediction (LP) enables researchers to learn unknown, new as well as future interactions among the entities being modeled in the complex networks. In addition to measures like degree distribution, clustering coefficient, centrality, etc., another metric to characterize structural properties is network assortativity, which measures the tendency of nodes to connect with similar nodes. In this paper, we explore metrics that effectively predict the links based on the assortativity profiles of the complex networks. To this end, we first propose an approach that generates networks of varying assortativity levels and utilize three sets of link prediction models combining the similarity of neighborhoods and preferential attachment. We carry out experiments to study the LP accuracy (measured in terms of area under the precision-recall curve) of the link predictors individually and in combination with other baseline measures. Our analysis shows that link prediction models that explore a large neighborhood around nodes of interest, such as CH2-L2 and CH2-L3, perform consistently for assortative as well as disassortative networks. While common neighbor-based local measures are effective for assortative networks, our proposed combination of common neighbors with node degree is a good choice for the LP metric in disassortative networks. We discuss how this analysis helps achieve the best-parameterized combination of link prediction models and its significance in the context of link prediction from incomplete social and biological network data.
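The two baseline ingredients named above, neighborhood similarity and preferential attachment, can be sketched as follows. These are the standard common-neighbors and degree-product scores, shown for orientation only; they are not the CH2-L2/CH2-L3 models or the specific combination proposed in the paper, and the helper names are hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def preferential_attachment(adj, u, v):
    """Product of the degrees of u and v."""
    return len(adj[u]) * len(adj[v])


def rank_candidates(adj, score):
    """Unconnected node pairs, sorted from highest to lowest score."""
    nodes = sorted(adj)
    pairs = [(u, v) for i, u in enumerate(nodes)
             for v in nodes[i + 1:] if v not in adj[u]]
    return sorted(pairs, key=lambda pair: score(adj, *pair), reverse=True)
```

Any scoring function with the same `(adj, u, v)` signature, including a weighted mix of the two above, can be passed to `rank_candidates` for comparison.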
Analysis of the topology of transcriptional regulatory networks (TRNs) is an effective way to study the regulatory interactions between the transcription factors (TFs) and the target genes. TRNs are characterized by the abundance of motifs such as feed forward loops (FFLs), which contribute to their structural and functional properties. In this paper, we focus on the role of motifs (specifically, FFLs) in signal propagation in TRNs and the organization of the TRN topology with FFLs as building blocks. To this end, we classify nodes participating in FFLs (termed motif central nodes) into three distinct roles (namely, roles A, B and C), and contrast them with TRN nodes having high connectivity on the basis of their potential for information dissemination, using metrics such as network efficiency, path enumeration, epidemic models and standard graph centrality measures. We also present the notion of a three-tier architecture and how it can help study the structural properties of TRNs based on the connectivity and clustering tendency of motif central nodes. Finally, we motivate the potential implication of the structural properties of motif centrality in the design of efficient protocols of information routing in communication networks as well as their functional properties in global regulation and stress response to study specific disease conditions and identification of drug targets.
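To make the FFL motif concrete, the sketch below enumerates triples (a, b, c) in a directed graph where a regulates b, b regulates c, and a also regulates c directly. Mapping the first, second and third positions to roles A, B and C is an assumption made here for illustration; the abstract does not spell out how the roles are defined, and the function names are hypothetical.

```python
from collections import Counter


def find_ffls(edges):
    """All feed forward loops (a, b, c): a -> b, b -> c and a -> c."""
    successors = {}
    for u, v in edges:
        if u != v:  # ignore self-regulation
            successors.setdefault(u, set()).add(v)
    ffls = []
    for a, a_targets in successors.items():
        for b in a_targets:
            for c in successors.get(b, ()):
                if c != a and c in a_targets:
                    ffls.append((a, b, c))
    return sorted(ffls)


def role_counts(ffls):
    """How often each node appears in each position of an FFL.

    Assumption: positions 0, 1, 2 correspond to roles A, B, C.
    """
    counts = {"A": Counter(), "B": Counter(), "C": Counter()}
    for a, b, c in ffls:
        counts["A"][a] += 1
        counts["B"][b] += 1
        counts["C"][c] += 1
    return counts
```

Nodes with high counts in any position are candidates for the motif central nodes discussed above, under the stated role assumption.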
This article critically reviews the contemporary global production network (GPN) analyses from the perspective of Marxian political economy. The GPN analyses focus on rents created at various nodes of the production network and ignore the fact that returns from interventions at specific stages in the value chain are not independent of the entire process of surplus creation and realization. Rents from innovation depend on the movement of the average capital in the particular industry and the way the political economy of institutions allows certain ‘scarcities’ to remain protected while others are drawn into the realm of competition. The article also argues that the GPN analyses hardly explain the dynamics of inclusion and exclusion of firms within such networks. It is argued that these dynamics are primarily governed by the relative position of individual capital and its technical composition with reference to the capital that assumes average levels of technology in that industry at a particular point of time.
COVID-19 is a global health crisis that has caused ripples in every aspect of human life. Amid widespread vaccine testing, manufacture and distribution efforts, nations still rely on human mobility restrictions to mitigate infection and death tolls. New waves of infection in many nations, uncertainty about the efficacy of existing vaccines, and emerging strains of the virus call for intelligent mobility policies that utilize contact pattern and epidemiological data to check contagion. Our earlier work leveraged network science principles to design social distancing optimization approaches that show promise in slowing infection spread; however, they prove to be computationally prohibitive and require complete knowledge of the social network. In this work, we present scalable and distributed versions of the optimization approaches based on Markov Chain Monte Carlo Gibbs sampling and grid-based spatial parallelization that tackle both of these challenges. We perform extensive simulation experiments to show the ability of the proposed strategies to meet necessary network science measures and yield performance comparable to the optimal counterpart, while exhibiting significant speed-up. We study the scalability of the proposed strategies as well as their performance in realistic scenarios when a fraction of the population temporarily flouts the location recommendations.