Emerging infectious diseases (EIDs), including the recent COVID-19 pandemic, have caused global public health crises in recent decades. Without existing protective immunity, an EID may spread rapidly and cause mass casualties in a very short time. It is therefore imperative to identify cases at risk of disease progression so that medical resources can be allocated optimally when medical facilities are overwhelmed by a flood of patients. This study addresses this challenge from the perspective of preventive medicine by exploiting machine learning technologies. The study was based on 83,227 hospital admissions with influenza-like illness, and we analysed the effects of 19 comorbidities, along with age and gender, on the risk of severe illness or mortality. The experimental results revealed that the decision rules derived from the machine learning based prediction models can provide valuable guidelines for healthcare policy makers to develop an effective vaccination strategy. Furthermore, when healthcare facilities are overwhelmed by patients with an EID, as frequently occurred during the recent COVID-19 pandemic, frontline physicians can incorporate the proposed prediction models to triage patients with minor symptoms without laboratory tests, which may become scarce during an EID disaster. In conclusion, our study demonstrates an effective approach to exploiting machine learning technologies to cope with the challenges faced during the outbreak of an EID.
Correct quantification of transcript expression is essential for understanding the functional elements active in different physiological conditions. For organisms without a reference transcriptome, de novo transcriptome assembly must be carried out prior to quantification. However, the large number of erroneous contigs produced by assemblers might result in unreliable estimates. This study therefore investigates how assembly quality affects the performance of quantification based on de novo transcriptome assembly. We examined over-extended and incomplete contigs and demonstrated that assembly completeness has a strong impact on the estimation of contig abundance. We then investigated the behavior of the quantifiers with respect to sequence ambiguity, which might be originally present in the transcriptome or accidentally produced by assemblers. The results suggested that the quantifiers often over-estimate the expression of family-collapse contigs and under-estimate the expression of duplicated contigs. For organisms without a reference transcriptome, it remains challenging to detect inaccurate estimates for family-collapse contigs. In contrast, we observed that under-estimation of duplicated contigs can be flagged by analyzing the read proportion of estimated abundance (RPEA) of contigs in the connected components inferred by the quantifiers. In addition, we suggest that quantification at the connected-component level is more accurate than sequence-level quantification. The analyses conducted in this study provide valuable insights for the future development of transcriptome assembly and quantification.
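The idea of reporting abundance at the connected-component level can be sketched as follows. This is only an illustrative sketch, not the study's pipeline: it assumes that two contigs belong to the same component when they share at least one ambiguously mapped read, and the function name, input layout, and contig identifiers are all hypothetical.

```python
from collections import defaultdict


def component_abundance(abundance, shared_read_pairs):
    """Sum contig-level abundance estimates within connected components.

    abundance: dict mapping contig id -> estimated read count (hypothetical input).
    shared_read_pairs: iterable of (contig_a, contig_b) pairs assumed to share
        at least one ambiguously mapped read; both ids must appear in `abundance`.
    """
    parent = {contig: contig for contig in abundance}

    def find(contig):
        # Union-find lookup with path halving.
        while parent[contig] != contig:
            parent[contig] = parent[parent[contig]]
            contig = parent[contig]
        return contig

    for a, b in shared_read_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    totals = defaultdict(float)
    members = defaultdict(list)
    for contig, value in abundance.items():
        root = find(contig)
        totals[root] += value
        members[root].append(contig)

    # Key each component by its sorted member tuple so the output is stable.
    return {tuple(sorted(members[root])): totals[root] for root in totals}


# Hypothetical example: c1 and c2 share reads, c3 stands alone.
example = component_abundance(
    {"c1": 120.0, "c2": 30.0, "c3": 50.0},
    [("c1", "c2")],
)
```

Summing within a component sidesteps the question of how ambiguous reads are split among near-identical contigs, which is one intuitive reason a component-level total could be more stable than the per-contig values.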
Hypnotics have been reported to be associated with dementia. However, the relationship between insomnia, hypnotics and dementia is still controversial. We sought to examine the risk of dementia in patients with long-term insomnia and the contribution of hypnotics.
Data was collected from Taiwan's Longitudinal Health Insurance Database. The study cohort comprised all patients aged 50 years or older with a first diagnosis of insomnia from 2002 to 2007. The comparison cohort consisted of randomly selected patients matched by age and gender. Each patient was individually tracked for 3 years from their insomnia index date to identify whether the patient had a first diagnosis of dementia. Cox regression was used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs).
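The random selection of an age- and gender-matched comparison cohort can be sketched as below. This is a minimal illustration rather than the authors' procedure: the record layout (`id`, `age`, `gender`), the exact-age matching, the sampling without replacement, and the default ratio of five controls per case are all assumptions (the ratio is merely consistent with the reported cohort sizes of 5693 and 28,465).

```python
import random
from collections import defaultdict


def match_controls(cases, pool, ratio=5, seed=0):
    """Randomly pick `ratio` controls per case with the same age and gender.

    cases, pool: lists of dicts with hypothetical keys "id", "age", "gender".
    Controls are drawn without replacement; a case is skipped when its
    stratum has fewer than `ratio` candidates left.
    """
    rng = random.Random(seed)

    # Group candidate controls into (age, gender) strata and shuffle each one.
    strata = defaultdict(list)
    for person in pool:
        strata[(person["age"], person["gender"])].append(person)
    for candidates in strata.values():
        rng.shuffle(candidates)

    matched = {}
    for case in cases:
        candidates = strata[(case["age"], case["gender"])]
        if len(candidates) < ratio:
            continue
        matched[case["id"]] = [candidates.pop() for _ in range(ratio)]
    return matched


# Hypothetical records for illustration only.
cases = [{"id": "case1", "age": 62, "gender": "F"}]
pool = [{"id": f"ctrl{i}", "age": 62, "gender": "F"} for i in range(8)]
pool.append({"id": "ctrl_m", "age": 62, "gender": "M"})
matched = match_controls(cases, pool)
```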
We identified 5693 subjects with long-term insomnia and 28,465 individuals without. After adjusting for hypertension, diabetes mellitus, hyperlipidemia, and stroke, those with long-term insomnia had significantly higher risks of dementia (HR, 2.34; 95% CI, 1.92-2.85). Patients with long-term insomnia and aged 50 to 65 years had a higher increased risk of dementia (HR, 5.22; 95% CI, 2.62-10.41) than those older than 65 years (HR, 2.33; 95% CI, 1.90-2.88). The use of hypnotics with a longer half-life and at a higher prescribed dose predicted a greater increased risk of dementia.
Patients with long-term use of hypnotics have more than a 2-fold increased risk of dementia, especially those aged 50 to 65 years. In addition, the dosage and half-lives of the hypnotics used should be considered, because greater exposure to these medications leads to a higher risk of developing dementia.
The potential relationship between anaesthesia, surgery and onset of dementia remains elusive.
To determine whether the risk of dementia increases after surgery with anaesthesia, and to evaluate possible associations among age, mode of anaesthesia, type of surgery and risk of dementia.
The study cohort comprised patients aged 50 years and older who were anaesthetised for the first time since 1995 between 1 January 2004 and 31 December 2007, and a control group of randomly selected patients matched for age and gender. Patients were followed until 31 December 2010 to identify the emergence of dementia.
Relative to the control group, patients who underwent anaesthesia and surgery exhibited an increased risk of dementia (hazard ratio = 1.99) and a reduced mean interval to dementia diagnosis. The risk of dementia increased in patients who received intravenous or intramuscular anaesthesia, regional anaesthesia and general anaesthesia.
The results of our nationwide, population-based study suggest that patients who undergo anaesthesia and surgery may be at increased risk of dementia.
Genome-wide association studies (GWAS) provide a powerful means to identify associations between genetic variants and phenotypes. However, GWAS techniques for detecting epistasis, the interactions between genetic variants associated with phenotypes, are still limited. We believe that developing an efficient and effective GWAS method to detect epistasis will be a key for discovering sophisticated pathogenesis, which is especially important for complex diseases such as Alzheimer's disease (AD).
In this regard, this study presents GenEpi, a computational package that uncovers epistasis associated with phenotypes using the proposed machine learning approach. GenEpi identifies both within-gene and cross-gene epistasis through a two-stage modeling workflow. In both stages, GenEpi adopts two-element combinatorial encoding when producing features and constructs the prediction models by L1-regularized regression with stability selection. On simulated data, GenEpi outperformed other widely used methods in detecting the ground-truth epistasis. For real data, this study uses AD as an example to demonstrate the capability of GenEpi in finding disease-related variants and variant interactions that show both biological meaning and predictive power.
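One plausible reading of "two-element combinatorial encoding" is sketched below: every pair of variants is expanded into binary indicator features, one per joint genotype combination. This is an assumption made for illustration and is not taken from the GenEpi source code; the genotype coding (0/1/2 minor-allele counts), the feature naming, and the variant identifiers are hypothetical.

```python
from itertools import combinations, product

GENOTYPES = (0, 1, 2)  # assumed coding: number of minor alleles


def encode_pairs(sample):
    """Map {variant: genotype} to binary features for every variant pair.

    Each pair of variants yields 3 x 3 = 9 indicator features; exactly one
    of them is 1, namely the joint genotype observed in this sample.
    """
    features = {}
    for snp_a, snp_b in combinations(sorted(sample), 2):
        for g_a, g_b in product(GENOTYPES, repeat=2):
            name = f"{snp_a}={g_a}*{snp_b}={g_b}"
            features[name] = int(sample[snp_a] == g_a and sample[snp_b] == g_b)
    return features


# Hypothetical sample with three variants.
feats = encode_pairs({"rs1": 0, "rs2": 2, "rs3": 1})
```

Features of this kind could then be passed to a sparse (L1-regularized) model so that only a few interaction terms keep non-zero weights; the model-fitting and stability-selection steps are not shown here.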
The results on simulation data and AD demonstrated that GenEpi has the ability to detect the epistasis associated with phenotypes effectively and efficiently. The released package can be generalized to largely facilitate the studies of many complex diseases in the near future.
In recent decades, the global incidence of dengue has increased. Affected countries have responded with more effective surveillance strategies to detect outbreaks early, monitor trends, and implement prevention and control measures. We applied newly developed machine learning approaches to identify laboratory-confirmed dengue cases among 4,894 emergency department patients with dengue-like illness (DLI) who received laboratory tests. Among them, 60.11% (2,942 cases) were confirmed to have dengue. Using just four input variables (age, body temperature, white blood cell count (WBC), and platelet count), not only the state-of-the-art deep neural network (DNN) prediction models but also the conventional decision tree (DT) and logistic regression (LR) models achieved areas under the receiver operating characteristic (ROC) curve (AUCs) ranging from 83.75% to 85.87% (DT, DNN, and LR: 84.60% ± 0.03%, 85.87% ± 0.54%, and 83.75% ± 0.17%, respectively). Subgroup analyses found that all the models were very sensitive, particularly in the pre-epidemic period. Pre-peak sensitivities (<35 weeks) were 92.6%, 92.9%, and 93.1% for DT, DNN, and LR, respectively. Adjusted odds ratios estimated with LR for low WBC (≤ 3.2 ×10³/μL), fever (≥ 38°C), low platelet count (< 100 ×10³/μL), and elderly age (≥ 65 years) were 5.17 (95% confidence interval [CI]: 3.96-6.76), 3.17 (95% CI: 2.74-3.66), 3.10 (95% CI: 2.44-3.94), and 1.77 (95% CI: 1.50-2.10), respectively. Our prediction models can readily be used in resource-poor countries where viral/serologic tests are inconvenient, can be applied for real-time syndromic surveillance to monitor trends of dengue cases, and can even be integrated with mosquito/environment surveillance for early warning and immediate prevention/control measures. In other words, a local community hospital or clinic with an instrument for complete blood counts (including platelets) can provide sentinel screening during outbreaks.
In conclusion, the machine learning approach can facilitate medical and public health efforts to minimize the health threat of dengue epidemics. However, laboratory confirmation remains the primary goal of surveillance and outbreak investigation.
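The relationship between the four cut-offs quoted above and the reported odds ratios can be illustrated with a short sketch. It assumes a standard Wald interval, in which the odds ratio is exp(β) and the 95% CI is exp(β ± 1.96·SE). The function names are hypothetical, and the standard error below is back-calculated from the reported interval purely for illustration; it is not a fitted value from the study.

```python
import math


def risk_flags(age, temp_c, wbc, platelets):
    """Binarize the four inputs using the cut-offs quoted in the abstract.

    wbc and platelets are expressed in units of 10^3 per microlitre.
    """
    return {
        "low_wbc": wbc <= 3.2,
        "fever": temp_c >= 38.0,
        "low_platelets": platelets < 100,
        "elderly": age >= 65,
    }


def odds_ratio_ci(beta, se, z=1.96):
    """Turn a logistic-regression coefficient and its standard error into
    an odds ratio with a Wald confidence interval (assumed method)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical patient.
flags = risk_flags(age=70, temp_c=38.5, wbc=2.9, platelets=150)

# Back-calculate beta and SE from the reported low-WBC odds ratio
# (5.17, 95% CI 3.96-6.76), then recover the interval from them.
beta = math.log(5.17)
se = (math.log(6.76) - math.log(3.96)) / (2 * 1.96)
or_est, ci_low, ci_high = odds_ratio_ci(beta, se)
```

Under these assumptions the recovered interval lies close to the reported one, which suggests the published CI is roughly symmetric on the log-odds scale.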
Microbial communities reside abundantly in the human body, and dysbiosis has been reported to correlate with many diseases, including various cancers. Most studies focus on the gut microbiome, while the bacteria that participate in tumor microenvironments on site remain unclear. Previous studies have acquired bacterial expression profiles from RNA-seq, whole-genome sequencing, and whole-exome sequencing data in The Cancer Genome Atlas (TCGA). However, small-RNA sequencing data have rarely been used. Using TCGA miRNA sequencing data, we evaluated bacterial abundance in 32 types of cancer. To uncover the bacteria involved in cancer, we applied an analytical process that aligns reads not mapped to the human genome to bacterial references, and we developed the BIC database for the transcriptional landscape of bacteria in cancer. BIC provides cancer-associated bacterial information, including the relative abundance of bacteria, bacterial diversity, associations with clinical relevance, the co-expression network of bacteria and human genes, and their associated biological functions. These results can complement previously published databases. Users can easily download the result plots and tables, or download the bacterial abundance matrix for further analyses. In summary, BIC can provide information on cancer microenvironments related to microbial communities. BIC is available at: http://bic.jhlab.tw/.
Background: Interstitial lung disease (ILD) is the primary cause of mortality in systemic sclerosis (SSc), an autoimmune disease characterized by tissue fibrosis. SSc-related ILD (SSc-ILD) occurs more frequently in females aged 30–55 years, whereas idiopathic pulmonary fibrosis (IPF) is more prevalent in males aged 60–75 years. SSc-ILD occurs earlier than IPF and progresses rapidly. FCN1, FABP4, and SPP1 macrophages are involved in the pathogenesis of lung fibrosis; SPP1 macrophages demonstrate upregulated expression in both SSc-ILD and IPF. We aimed to identify the differences between SSc-ILD and IPF using single-cell analysis, to clarify their distinct pathogeneses, and to propose directions for prevention and treatment.
Methods: We performed single-cell RNA sequencing analysis on the NCBI Gene Expression Omnibus (GEO) datasets GSE159354 and GSE212109, and analyzed lung tissue samples from healthy controls and from patients with IPF and SSc-ILD. The primary measures were the filtered genes integrated with batch correction and the annotated cell types for distinguishing patients with SSc-ILD from healthy controls. We proposed an SSc-ILD pathogenesis using cell–cell interaction inference, and predicted transcription factors (TFs) regulating target genes using SCENIC. Drug target prediction for the TF genes was performed using DrugBank Online.
Results: A subset of macrophages activates the MAPK signaling pathway under oxidative stress. Owing to the lack of inhibitory feedback from ANNEXIN and the autoimmune characteristics of SSc, this leads to an earlier onset of lung fibrosis than in IPF. During initial lung injury, fibroblasts begin to activate the IL6 pathway under the influence of SPP1 alveolar macrophages, but IL6 appears unrelated to other inflammatory and immune cells. This may explain why tocilizumab (an anti-IL6-receptor antibody) only preserves lung function in patients with early SSc-ILD. Finally, we identified BCLAF1 and NFE2L2 as influencers of MAPK activation in macrophages. Metformin downregulates NFE2L2 and could serve as a repurposed drug candidate.
Conclusions: SPP1 alveolar macrophages play a role in the profibrotic activity of IPF and SSc-ILD. However, SSc-ILD is influenced by autoimmunity and oxidative stress, leading to the continuous activation of MAPK in macrophages. This may result in an earlier onset of lung fibrosis than in IPF. Such differences could serve as potential research directions for early prevention and treatment.
Non-coding RNAs have attracted growing interest in recent years, and increasing attention is being paid to their functions. Long non-coding RNAs (lncRNAs) are defined as non-protein-coding transcripts longer than 200 nucleotides. lncRNAs are known to be associated with cancers and to be dysregulated in them, but most of their functions remain to be studied. A mechanism of RNA regulation, known as competing endogenous RNAs (ceRNAs), has been proposed to explain the complex relationships among mRNAs and lncRNAs, which compete for binding to shared microRNAs (miRNAs).
We proposed an analysis framework to construct the association networks among lncRNAs, mRNAs, and miRNAs based on their expression patterns and to decipher their network modules.
We collected a large-scale gene expression dataset of 1,061 samples from breast invasive carcinoma (BRCA) patients, each consisting of the expression profiles of 4,359 lncRNAs, 16,517 mRNAs, and 534 miRNAs, and applied the proposed analysis approach to interrogate them. We uncovered the underlying ceRNA modules and the key modulatory lncRNAs for different subtypes of breast cancer.
We proposed a modulatory analysis to infer the ceRNA effects among mRNAs and lncRNAs and performed functional analysis to reveal the plausible mechanisms of lncRNA modulation in the four breast cancer subtypes. Our results might provide new directions for breast cancer therapeutics and the proposed method could be readily applied to other diseases.