Literature-based gene ontology (GO) annotation is a process where expert curators use uniform expressions to describe gene functions reported in research papers, creating computable representations of information about biological systems. Manual assurance of consistency between GO annotations and the associated evidence texts identified by expert curators is reliable but time-consuming, and is infeasible in the context of rapidly growing biological literature. A key challenge is maintaining consistency of existing GO annotations as new studies are published and the GO vocabulary is updated. In this work, we introduce a formalisation of biological database annotation inconsistencies, identifying four distinct types of inconsistency. We propose a novel and efficient method using state-of-the-art text mining models to automatically distinguish between consistent GO annotation and the different types of inconsistent GO annotation. We evaluate this method using a synthetic dataset generated by directed manipulation of instances in an existing corpus, BC4GO. We provide a detailed error analysis demonstrating that the method achieves high precision on more confident predictions. Two models built using our method for distinct annotation consistency identification tasks achieved high precision and were robust to updates in the GO vocabulary. Our approach demonstrates clear value for human-in-the-loop curation scenarios.
Infectious diseases spread through inherently spatial processes. Road and air traffic data have been used to model these processes at national and global scales. At metropolitan scales, however, mobility patterns are fundamentally different and less directly observable. Estimating the spatial distribution of infection has public health utility, but few studies have investigated this at an urban scale. In this study we address the question of whether the use of urban-scale mobility data can improve the prediction of spatial patterns of influenza infection. We compare the use of different sources of urban-scale mobility data, and investigate the impact of other factors relevant to modelling mobility, including mixing within and between regions, and the influence of hub and spoke commuting patterns.
We used journey-to-work (JTW) data from the Australian 2011 Census, and GPS journey data from the Sygic GPS Navigation & Maps mobile app, to characterise population mixing patterns in a spatially-explicit SEIR (susceptible, exposed, infectious, recovered) meta-population model.
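The abstract does not give the model equations. As a rough illustration only, the sketch below shows one common way a mobility matrix can drive mixing in a deterministic SEIR meta-population model. The function name `simulate_seir`, the parameter values, and the interpretation of `mixing[i, j]` as the fraction of contacts that residents of region i make in region j are all assumptions made for this example, not details taken from the study.

```python
import numpy as np

def simulate_seir(mixing, population, beta=0.5, sigma=0.5, gamma=0.25,
                  seed_region=0, seed_size=10.0, days=200, dt=0.1):
    """Toy SEIR meta-population model, integrated with forward Euler steps.

    mixing[i, j] is the assumed fraction of contacts that residents of
    region i make in region j (each row sums to 1). Returns the cumulative
    number of infections among the residents of each region.
    """
    mixing = np.asarray(mixing, dtype=float)
    N = np.asarray(population, dtype=float)
    S = N.copy()
    E = np.zeros_like(N)
    I = np.zeros_like(N)
    S[seed_region] -= seed_size
    I[seed_region] += seed_size
    cumulative = np.zeros_like(N)
    # Effective number of people present in each location.
    present = mixing.T @ N
    for _ in range(int(round(days / dt))):
        # Infectious prevalence among the people present in each location.
        prevalence = (mixing.T @ I) / present
        # Force of infection on residents, weighted by where they mix.
        force = beta * (mixing @ prevalence)
        new_exposed = force * S * dt
        new_infectious = sigma * E * dt
        new_recovered = gamma * I * dt
        S -= new_exposed
        E += new_exposed - new_infectious
        I += new_infectious - new_recovered
        cumulative += new_exposed
    return cumulative
```

With an identity mixing matrix the regions are isolated and infection stays in the seeded region; off-diagonal entries, such as those that might be derived from commuting data, let infection spread between regions.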
Using the JTW data to train the model leads to an increase in the proportion of infections that arise in central Melbourne, which is indicative of the city's spoke-and-hub road and public transport networks, and of the commuting patterns reflected in these data. Using the GPS data increased the infections in central Melbourne to a lesser extent than the JTW data, and produced a greater heterogeneity in the middle and outer regions. Despite the limitations of both mobility data sets, the model reproduced some of the characteristics observed in the spatial distribution of reported influenza cases.
Urban mobility data sets can be used to support models that capture spatial heterogeneity in the transmission of infectious diseases at a metropolitan scale. These data should be adjusted to account for relevant urban features, such as highly-connected hubs where the resident population is likely to experience a much lower force of infection than the transient population. In contrast to national and international scales, the relationship between mobility and infection at an urban level is much less apparent, and requires a richer characterisation of population mobility and contact.
Early case detection is critical to preventing onward transmission of COVID-19 by enabling prompt isolation of index infections, and identification and quarantining of contacts. Timeliness and completeness of ascertainment depend on the surveillance strategy employed. This paper presents modelling used to inform workplace testing strategies for the Australian government in early 2021. We use rapid prototype modelling to quickly investigate the effectiveness of testing strategies to aid decision making. Models are developed with a focus on providing relevant results to policy makers, and these models are continually updated and improved as new questions are posed. Developed to support the implementation of testing strategies in high risk workplace settings in Australia, our modelling explores the effects of test frequency and sensitivity on outbreak detection. We start with an exponential growth model, which demonstrates how outbreak detection changes depending on growth rate, test frequency and sensitivity. From the exponential model, we learn that low sensitivity tests can produce high probabilities of detection when testing occurs frequently. We then develop a more complex agent-based model, which was used to test the robustness of the results from the exponential model, and extend it to include intermittent workplace scheduling. These models help our fundamental understanding of disease detectability through routine surveillance in workplaces and evaluate the impact of testing strategies and workplace characteristics on the effectiveness of surveillance. This analysis highlights the risks of particular work patterns while also identifying key testing strategies to best improve outbreak detection in high risk workplaces.
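The abstract does not state the form of the exponential growth model. The sketch below is a simplified illustration under assumed forms: infections grow as exp(r t) from a single case at day 0, every infection is tested at each testing round, and each test detects an infection independently with probability equal to the test sensitivity. The function name `detection_probability` and all parameter values are hypothetical.

```python
import math

def detection_probability(growth_rate, test_interval, sensitivity,
                          horizon_days, coverage=1.0):
    """Probability that at least one infection tests positive by horizon_days.

    Assumes exp(growth_rate * t) infections at day t, testing rounds every
    test_interval days, and independent tests; `coverage` is the assumed
    fraction of infections tested in each round.
    """
    p_all_missed = 1.0
    n_rounds = int(horizon_days // test_interval)
    for k in range(1, n_rounds + 1):
        day = k * test_interval
        infected = math.exp(growth_rate * day)  # expected number infected
        # Every infection must be missed for the round to detect nothing.
        p_all_missed *= (1.0 - coverage * sensitivity) ** infected
    return 1.0 - p_all_missed

# Illustrative comparison over one week with a growth rate of 0.1 per day.
daily_low = detection_probability(0.1, test_interval=1, sensitivity=0.6,
                                  horizon_days=7)
weekly_high = detection_probability(0.1, test_interval=7, sensitivity=0.95,
                                    horizon_days=7)
```

Under these toy assumptions, daily testing with the less sensitive test yields a higher probability of detection within the week than a single weekly round with the more sensitive test, which is the qualitative pattern described above.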
Estimating community level scabies prevalence is crucial for targeting interventions to areas of greatest need. The World Health Organisation recommends sampling at the unit of households or schools, but there is presently no standardised approach to scabies prevalence assessment. Consequently, a wide range of sampling sizes and methods have been used. As both prevalence and drivers of transmission vary across populations, there is a need to understand how sampling strategies for estimating scabies prevalence interact with local epidemiology to affect the accuracy of prevalence estimates.
We used a simulation-based approach to compare the efficacy of different scabies sampling strategies. First, we generated synthetic populations broadly representative of remote Australian Indigenous communities and assigned a scabies status to individuals to achieve a specified prevalence using different assumptions about scabies epidemiology. Second, we calculated an observed prevalence for different sampling methods and sizes.
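The two steps above can be sketched in a few lines. This is a minimal illustration under assumed settings, not the study's simulation: households are either affected or unaffected (a crude household-specific component), and the population size, probabilities, sample sizes and function names are all hypothetical.

```python
import random
import statistics

def make_population(n_households, household_size, p_household, p_within, rng):
    """Return (household_id, infected) pairs with household clustering."""
    people = []
    for household in range(n_households):
        # Only members of an "affected" household can be assigned scabies.
        affected = rng.random() < p_household
        for _ in range(household_size):
            infected = affected and rng.random() < p_within
            people.append((household, infected))
    return people

def srs_estimate(people, n, rng):
    """Observed prevalence from a simple random sample of n individuals."""
    sample = rng.sample(people, n)
    return sum(infected for _, infected in sample) / n

def household_estimate(people, n_households_sampled, rng):
    """Observed prevalence when whole households are sampled."""
    household_ids = sorted({h for h, _ in people})
    chosen = set(rng.sample(household_ids, n_households_sampled))
    sample = [infected for h, infected in people if h in chosen]
    return sum(sample) / len(sample)

def compare(replicates=500, seed=1):
    """Spread of the two estimators over repeated samples of 100 people."""
    rng = random.Random(seed)
    people = make_population(200, 5, 0.25, 0.8, rng)
    true_prev = sum(infected for _, infected in people) / len(people)
    srs = [srs_estimate(people, 100, rng) for _ in range(replicates)]
    hh = [household_estimate(people, 20, rng) for _ in range(replicates)]
    return true_prev, statistics.stdev(srs), statistics.stdev(hh)
```

Calling `compare()` returns the population prevalence and the standard deviation of each estimator. With strong household clustering, the household-based estimates are expected to vary more between samples than the simple random sample estimates of the same size.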
The distribution of prevalence in subpopulation groups can vary substantially when the underlying scabies assignment method changes. Across all of the scabies assignment methods combined, the simple random sampling method produces the narrowest 95% confidence interval for all sample sizes. The household sampling method introduces higher variance compared to simple random sampling when the assignment of scabies includes a household-specific component. The school sampling method overestimates community prevalence when the assignment of scabies includes an age-specific component.
Our results indicate that there are interactions between transmission assumptions and surveillance strategies, emphasizing the need for understanding scabies transmission dynamics. We suggest using the simple random sampling method for estimating scabies prevalence. Our approach can be adapted to various populations and diseases.
Respiratory syncytial virus (RSV) infects almost all children by the age of 2 years, with the risk of hospitalisation highest in the first 6 months of life. Development and licensure of a vaccine to prevent severe RSV illness in infants is a public health priority. A recent phase 3 clinical trial estimated the efficacy of maternal vaccination at 39% over the first 90 days of life. Households play a key role in RSV transmission; however, few estimates of population-level RSV vaccine impact account for household structure.
We simulated RSV transmission within a stochastic, individual-based model framework, using an existing demographic model, structured by age and household and parameterised with Australian data, as an exemplar of a high-income country. We modelled vaccination by immunising pregnant women and explicitly linked the immune status of each mother-infant pair. We quantified the impact on children for a range of vaccine properties and uptake levels.
We found that a maternal immunisation strategy would have the most substantial impact in infants younger than 3 months, reducing RSV infection incidence in this age group by 16.6% at 70% vaccination coverage. In children aged 3-6 months, RSV infection was reduced by 5.3%. Over the first 6 months of life, the incidence rate for infants born to unvaccinated mothers was 1.26 times that of infants born to vaccinated mothers. The impact in older age groups was more modest, with evidence of infections being delayed to the second year of life.
Our findings show that while individual benefit from maternal RSV vaccination could be substantial, population-level reductions may be more modest. Vaccination impact was sensitive to the extent that vaccination prevented infection, highlighting the need for more vaccine trial data.
This perspective is part of an international effort to improve epidemiological models with the goal of reducing the unintended consequences of infectious disease interventions. The scenarios in which models are applied often involve difficult trade-offs that are well recognised in public health ethics. Unless these trade-offs are explicitly accounted for, models risk overlooking contested ethical choices and values, leading to an increased risk of unintended consequences. We argue that such risks could be reduced if modellers were more aware of ethical frameworks and had the capacity to explicitly account for the relevant values in their models. We propose that public health ethics can provide a conceptual foundation for developing this capacity. After reviewing relevant concepts in public health and clinical ethics, we discuss examples from the COVID-19 pandemic to illustrate the current separation between public health ethics and infectious disease modelling. We conclude by describing practical steps to build the capacity for ethically aware modelling. Developing this capacity constitutes a critical step towards ethical practice in computational modelling of public health interventions, which will require collaboration with experts on public health ethics, decision support, behavioural interventions, and social determinants of health, as well as direct consultation with communities and policy makers.
Tuberculosis (TB) control efforts are hampered by an imperfect understanding of TB epidemiology. The true age distribution of disease is unknown because a large proportion of individuals with active TB remain undetected. Understanding of transmission is limited by the asymptomatic nature of latent infection and the pathogen's capacity for late reactivation. A better understanding of TB epidemiology is critically needed to ensure effective use of existing and future control tools.
We use an agent-based model to simulate TB epidemiology in the five highest TB burden countries (India, Indonesia, China, the Philippines and Pakistan), providing unique insights into patterns of transmission and disease. Our model replicates demographically realistic populations, explicitly capturing social contacts between individuals based on local estimates of age-specific contact in household, school and workplace settings. Time-varying programmatic parameters are incorporated to account for the local history of TB control.
We estimate that the 15-19-year-old age group is involved in more than 20% of transmission events in India, Indonesia, the Philippines and Pakistan, despite representing only 5% of the local TB incidence. According to our model, childhood TB represents around one fifth of the incident TB cases in these four countries. In China, three quarters of incident TB were estimated to occur in the ≥ 45-year-old population. The calibrated per-contact transmission risk was found to be similar in each of the five countries despite their very different TB burdens.
Adolescents and young adults are a major driver of TB in high-incidence settings. Relying only on the observed distribution of disease to understand the age profile of transmission is potentially misleading.
During the COVID-19 pandemic, evidence has accumulated that movement restrictions enacted to combat virus spread produce disparate consequences along socioeconomic lines. We investigate the hypothesis that people engaged in financially secure employment are better able to adhere to mobility restrictions, due to occupational factors that link the capacity for flexible work arrangements to income security. We use high-resolution spatial data on household internet traffic as a surrogate for adaptation to home-based work, together with the geographical clustering of occupation types, to investigate the relationship between occupational factors and increased internet traffic during work hours under lockdown in two Australian cities. By testing our hypothesis based on the observed trends, and exploring demographic factors associated with divergences from our hypothesis, we are left with a picture of unequal impact dominated by two major influences: the types of occupations in which people are engaged, and the composition of households and families. During lockdown, increased internet traffic was correlated with income security and, when school activity was conducted remotely, with the proportion of families with children. Our findings suggest that response planning and provision of social and economic support for residents within lockdown areas should explicitly account for income security and household structure. Overall, the results we present contribute to the emerging picture of the impacts of COVID-19 on human behaviour, and will help policy makers to understand the balance between public health and social impact in making decisions about mitigation policies.
During the early stages of the COVID-19 pandemic, there was considerable uncertainty surrounding epidemiological and clinical aspects of SARS-CoV-2. Governments around the world, starting from varying levels of pandemic preparedness, needed to make decisions about how to respond to SARS-CoV-2 with only limited information about transmission rates, disease severity and the likely effectiveness of public health interventions. In the face of such uncertainties, formal approaches to quantifying the value of information can help decision makers to prioritise research efforts.
In this study we use Value of Information (VoI) analysis to quantify the likely benefit associated with reducing three key uncertainties present in the early stages of the COVID-19 pandemic: the basic reproduction number ($R_0$), case severity (CS), and the relative infectiousness of children compared to adults (CI). The specific decision problem we consider is the optimal level of investment in intensive care unit (ICU) beds. Our analysis incorporates mathematical models of disease transmission and clinical pathways in order to estimate ICU demand and disease outcomes across a range of scenarios.
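To make the idea concrete, the sketch below computes the expected value of learning one or more parameters before choosing a number of ICU beds. It is a toy calculation, not the study's model: the discrete beliefs, the `icu_demand` and `cost` functions, the candidate bed numbers, and the assumption that the three parameters are independent are all hypothetical choices made for illustration.

```python
import itertools
import math

# Hypothetical, independent discrete beliefs: parameter value -> probability.
beliefs = {
    "R0": {2.0: 0.5, 3.0: 0.5},
    "CS": {0.01: 0.5, 0.03: 0.5},
    "CI": {0.5: 0.5, 1.0: 0.5},
}
actions = [0, 100, 200, 300]  # candidate numbers of ICU beds (assumed)

def icu_demand(R0, CS, CI):
    """Toy peak ICU demand; not a transmission model."""
    return 4000 * CS * (R0 - 1) * (0.9 + 0.1 * CI)

def cost(beds, R0, CS, CI, bed_cost=1.0, unmet_cost=5.0):
    """Cost of buying beds plus a penalty for each unit of unmet demand."""
    unmet = max(0.0, icu_demand(R0, CS, CI) - beds)
    return bed_cost * beds + unmet_cost * unmet

def best_expected_cost(known=()):
    """Expected cost if the parameters in `known` are learned before acting."""
    known = tuple(known)
    unknown = [name for name in beliefs if name not in known]
    total = 0.0
    for known_combo in itertools.product(*(beliefs[n].items() for n in known)):
        fixed = {n: v for n, (v, _) in zip(known, known_combo)}
        p_known = math.prod(p for _, p in known_combo)
        expected_costs = []
        for beds in actions:
            expected = 0.0
            for combo in itertools.product(*(beliefs[n].items() for n in unknown)):
                params = dict(fixed)
                params.update({n: v for n, (v, _) in zip(unknown, combo)})
                expected += math.prod(p for _, p in combo) * cost(beds, **params)
            expected_costs.append(expected)
        # Choose the best action for this realisation of the known parameters.
        total += p_known * min(expected_costs)
    return total

def value_of_information(known):
    """Expected cost reduction from resolving uncertainty in `known`."""
    return best_expected_cost(()) - best_expected_cost(known)
```

For example, `value_of_information(["CS"])` gives the value of resolving case severity alone, and `value_of_information(["R0", "CS", "CI"])` gives the value of resolving all three uncertainties. The latter is an upper bound on the value of resolving any subset.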
We found that VoI analysis enabled us to estimate the relative benefit of resolving different uncertainties about epidemiological and clinical aspects of SARS-CoV-2. Given the initial beliefs of an expert, obtaining more information about case severity had the highest parameter value of information, followed by the basic reproduction number $R_0$. Resolving uncertainty about the relative infectiousness of children did not affect the decision about the number of ICU beds to be purchased for any COVID-19 outbreak scenarios defined by these three parameters.
For the scenarios where the value of information was high enough to justify monitoring, if CS and $R_0$ are known, management actions will not change when we learn about child infectiousness. VoI is an important tool for understanding the importance of each disease factor during outbreak preparedness and can help to prioritise the allocation of resources for relevant information.
Remote Australian Aboriginal and Torres Strait Islander communities have potential to be severely impacted by COVID-19, with multiple factors predisposing to increased transmission and disease severity. Our modelling aims to inform optimal public health responses. An individual-based simulation model represented SARS-CoV-2 transmission in communities ranging from 100 to 3500 people, comprised of large, interconnected households. A range of strategies for case finding, quarantining of contacts, testing, and lockdown were examined, following the silent introduction of a case. Multiple secondary infections are likely present by the time the first case is identified. Quarantine of close contacts, defined by extended household membership, can reduce peak infection prevalence from 60-70% to around 10%, but subsequent waves may occur when community mixing resumes. Exit testing significantly reduces ongoing transmission. Concurrent lockdown of non-quarantined households for 14 days is highly effective for epidemic control and reduces overall testing requirements; peak prevalence of the initial outbreak can be constrained to less than 5%, and the final community attack rate to less than 10% in modelled scenarios. Lockdown also mitigates the effect of a delay in the initial response. Compliance with lockdown must be at least 80-90%, however, or epidemic control will be lost. A SARS-CoV-2 outbreak will spread rapidly in remote communities. Prompt case detection with quarantining of extended-household contacts and a 14-day lockdown for all other residents, combined with exit testing for all, is the most effective strategy for rapid containment. Compliance is crucial, underscoring the need for community supported, culturally sensitive responses.