Colorectal cancer (CRC) has been associated with changes in volatile metabolic profiles in several human biological matrices. This enables its non-invasive detection, but the origin of these volatile organic compounds (VOCs) and their relation to the gut microbiome are not yet fully understood. This systematic review provides an overview of the current understanding of this topic. A systematic search using PubMed, Embase, Medline, Cochrane Library, and the Web of Science according to PRISMA guidelines resulted in seventy-one included studies. In addition, a systematic search was conducted that identified five systematic reviews from which CRC-associated gut microbiota data were extracted. The included studies analyzed VOCs in feces, urine, breath, blood, tissue, and saliva. Eight studies performed microbiota analysis in addition to VOC analysis. The most frequently reported dysregulations over all matrices included short-chain fatty acids, amino acids, proteolytic fermentation products, and products related to the tricarboxylic acid cycle and Warburg metabolism. Many of these dysregulations could be related to the shifts in CRC-associated microbiota, and thus the gut microbiota presumably contributes to the metabolic VOC fingerprint in CRC. Future research involving VOC analysis should include simultaneous gut microbiota analysis.
•Prospective exploration of environmental determinants of headache.
•Headache appears to be a transient condition in the population.
•Air pollution and urban temperature contribute to the reporting of headache.
•The largest effect was observed for NO2, PM10, and heat island effect.
Headache is one of the most prevalent and disabling health conditions globally. We prospectively explored the urban exposome in relation to weekly occurrence of headache episodes using data from the Dutch population-based Occupational and Environmental Health Cohort Study (AMIGO).
Participants (N = 7,339) completed baseline and follow-up questionnaires in 2011 and 2015, reporting headache frequency. Information on the urban exposome covered 80 exposures across 10 domains, such as air pollution, electromagnetic fields, and lifestyle and socio-demographic characteristics. We first identified all relevant exposures using the Boruta algorithm and then, for each exposure separately, we estimated the average treatment effect (ATE) and related standard error (SE) by training causal forests adjusted for age, depression diagnosis, painkiller use, general health indicator, sleep disturbance index and weekly occurrence of headache episodes at baseline.
Occurrence of weekly headache was 12.5% at baseline and 11.1% at follow-up. Boruta selected five air pollutants (NO2, NOx, PM10, silicon in PM10, iron in PM2.5) and one urban temperature measure (heat island effect) as factors contributing to the occurrence of weekly headache episodes at follow-up. The estimated causal effect of each exposure on weekly headache indicated positive associations. NO2 showed the largest effect (ATE = 0.007 per interquartile range (IQR) increase; SE = 0.004), followed by PM10 (ATE = 0.006 per IQR increase; SE = 0.004), heat island effect (ATE = 0.006 per one-degree Celsius increase; SE = 0.007), NOx (ATE = 0.004 per IQR increase; SE = 0.004), iron in PM2.5 (ATE = 0.003 per IQR increase; SE = 0.004), and silicon in PM10 (ATE = 0.003 per IQR increase; SE = 0.004).
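One rough way to read an ATE together with its SE is to form an interval of the estimate plus or minus a multiple of the SE. The sketch below does this for the values reported above. The normal approximation and the 1.96 multiplier are assumptions made here for illustration; the abstract does not report intervals or say how they would be constructed.

```python
# Reported estimates from the abstract: exposure -> (ATE, SE).
estimates = {
    "NO2": (0.007, 0.004),
    "PM10": (0.006, 0.004),
    "heat island effect": (0.006, 0.007),
    "NOx": (0.004, 0.004),
    "iron in PM2.5": (0.003, 0.004),
    "silicon in PM10": (0.003, 0.004),
}

# Assumption (not from the source): two-sided 95% normal approximation.
Z_95 = 1.96


def normal_interval(ate, se, z=Z_95):
    """Return (lower, upper) for ate +/- z * se."""
    return ate - z * se, ate + z * se


intervals = {
    name: normal_interval(ate, se) for name, (ate, se) in estimates.items()
}

for name, (lower, upper) in intervals.items():
    print(f"{name}: {lower:+.4f} to {upper:+.4f}")
```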
Our results suggested that exposure to air pollution and heat island effects contributed to the reporting of weekly headache episodes in the study population.
Early detection of colorectal cancer (CRC) by screening programs is crucial because survival rates worsen at advanced stages. However, the currently used screening method, the fecal immunochemical test (FIT), suffers from a high number of false-positives and is insensitive for detecting advanced adenomas (AAs), resulting in false-negatives for these premalignant lesions. Therefore, more accurate, noninvasive screening tools are needed. In this study, the utility of analyzing volatile organic compounds (VOCs) in exhaled breath in a FIT-positive population to detect the presence of colorectal neoplasia was studied.
In this multicenter prospective study, breath samples were collected from 382 FIT-positive patients with subsequent colonoscopy participating in the national Dutch bowel screening program (n = 84 negative controls, n = 130 non-AAs, n = 138 AAs, and n = 30 CRCs). Precolonoscopy exhaled VOCs were analyzed using thermal desorption-gas chromatography-mass spectrometry, and the data were preprocessed and analyzed using machine learning techniques.
Using 10 discriminatory VOCs, AAs could be distinguished from negative controls with a sensitivity and specificity of 79% and 70%, respectively. Based on this biomarker profile, CRC and AA combined could be discriminated from controls with a sensitivity and specificity of 77% and 70%, respectively, and CRC alone could be discriminated from controls with a sensitivity and specificity of 80% and 70%, respectively. Moreover, the feasibility of discriminating non-AAs from controls and AAs was shown.
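For readers less familiar with the two metrics, the sketch below shows how sensitivity and specificity follow from confusion-matrix counts. The counts are hypothetical and chosen only so the arithmetic is easy to follow; they are not the study's data, and the abstract does not report the underlying confusion matrix.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of cases with the condition that the test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of cases without the condition that the test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (NOT taken from the study).
counts = {"true_pos": 79, "false_neg": 21, "true_neg": 70, "false_pos": 30}

sens = sensitivity(counts["true_pos"], counts["false_neg"])
spec = specificity(counts["true_neg"], counts["false_pos"])

print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```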
VOCs in exhaled breath can detect the presence of AAs and CRC in a CRC screening population and may improve CRC screening in the future.
Disease detection and monitoring using volatile organic compounds (VOCs) is becoming increasingly popular. For a variety of (gastrointestinal) diseases the microbiome should be considered. As its output is to a large extent volatile, faecal volatilomics carries great potential. One technical limitation is that current faecal headspace analysis requires specialized instrumentation which is costly and typically does not work in harmony with thermal desorption units often utilized in e.g. exhaled breath studies. This lack of harmonization hinders uptake of such analyses by the volatilomics community. Therefore, this study optimized and compared two recently harmonized faecal headspace sampling platforms: HiSorb probes and the Microchamber. Statistical design of experiment was applied to find optimal sampling conditions by maximizing reproducibility, the number of VOCs detected, and between-subject variation. To foster general applicability those factors were defined using semi-targeted as well as untargeted metabolic profiles. HiSorb probes were found to result in a faster sampling procedure, higher number of detected VOCs, and higher stability. The headspace collection using the Microchamber resulted in a lower number of detected VOCs, longer sampling times and decreased stability despite a smaller number of interfering VOCs and no background signals. Based on the observed profiles, recommendations are provided on pre-processing and study design when using either platform. Both can be used to perform faecal headspace collection, but altogether HiSorb is recommended.
Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease characterized by progressive inflammation and fibrosis of the bile ducts. PSC is a complex disease of largely unknown aetiology that is strongly associated with inflammatory bowel disease (IBD). Diagnosis, especially at an early stage, is difficult and to date there is no diagnostic biomarker. The present study aimed to assess the diagnostic potential of volatile organic compounds (VOCs) in exhaled breath to detect (early) PSC in an IBD population.
Breath samples were obtained from 16 patients with PSC alone, 47 with PSC and IBD, and 53 with IBD alone during outpatient clinic visits. Breath sampling was performed using the ReCIVA breath sampler and subsequently analysed by gas chromatography mass spectrometry. Random forest modelling was performed to find discriminatory VOCs and create a predictive model that was tested using an independent test set.
The final model to discriminate patients with PSC, with or without IBD, from patients with IBD alone included twenty VOCs and achieved a sensitivity, specificity, and area under the receiver-operating curve on the test set of 77%, 83%, and 0.84 respectively. Three VOCs (isoprene, 2-octanone and undecane) together correlated significantly with the Amsterdam-Oxford score for PSC disease prognosis. A sensitivity analysis showed stable results across early-stage PSC, including in those with normal alkaline phosphatase levels, as well as further progressed PSC.
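The area under the receiver-operating curve reported above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The scores are hypothetical and the function name `roc_auc` is introduced here for illustration; neither comes from the study.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores for illustration only (NOT study data).
psc_scores = [0.9, 0.8, 0.6, 0.4]  # cases with PSC
ibd_scores = [0.7, 0.5, 0.3, 0.2]  # cases with IBD alone

auc = roc_auc(psc_scores, ibd_scores)
print(f"AUC = {auc}")
```

With these made-up scores, 13 of the 16 pairs are ranked correctly, so the AUC is 13/16.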
The present study demonstrates that exhaled breath can distinguish PSC cases from IBD and has potential as a non-invasive clinical breath test for (early) PSC.
Primary sclerosing cholangitis is a complex chronic liver disease, which ultimately results in cirrhosis, liver failure, and death. Detection, especially in early disease stages, can be challenging, and therefore therapy typically starts when there is already some irreversible damage. The current study shows that metabolites in exhaled breath, so called volatile organic compounds, hold promise to non-invasively detect primary sclerosing cholangitis, including at early disease stages.
•Volatile organic compounds in exhaled breath can detect primary sclerosing cholangitis in an inflammatory bowel disease population.
•Volatile organic compounds in exhaled breath relate to Amsterdam-Oxford score.
•Sensitivity analysis showed stable sensitivity for detection of early-stage PSC.
Up to 5% of inflammatory bowel disease (IBD) patients may at some point develop primary sclerosing cholangitis (PSC). PSC is a rare liver disease that ultimately results in liver damage, cirrhosis and liver failure. It typically remains subclinical until irreversible damage has been inflicted. Hence, it is crucial to screen IBD patients for PSC, but its early detection is challenging, and the disease's etiology is not well understood. The current study aimed at the early detection of PSC in an IBD population using volatile organic compounds in fecal headspace and exhaled breath. To this aim, fecal material and exhaled breath were collected from 73 patients (n = 16 PSC/IBD; n = 8 PSC; n = 49 IBD), and their volatile profiles were analyzed using gas chromatography-mass spectrometry. Using the most discriminatory features, PSC detection resulted in areas under the ROC curve (AUCs) of 0.83 and 0.84 based on fecal headspace and exhaled breath, respectively. Upon data fusion, the predictive performance increased to AUC 0.92. The observed features in the fecal headspace relate to detrimental microbial dysbiosis and exogenous exposure. Future research should aim for the early detection of PSC in a prospective study design.
Current technological developments have allowed for a significant increase in the volume and availability of data. Consequently, this has opened enormous opportunities for the machine learning and data science field, translating into the development of new algorithms in a wide range of applications in medical, biomedical, daily-life, and national security areas. Ensemble techniques are among the pillars of the machine learning field, and they can be defined as approaches in which multiple, complex, independent/uncorrelated, predictive models are subsequently combined by either averaging or voting to yield a higher model performance. Random forest (RF), a popular ensemble method, has been successfully applied in various domains due to its ability to build predictive models with high certainty and little necessity of model optimization. RF provides both a predictive model and an estimation of the variable importance. However, the estimation of the variable importance is based on thousands of trees, and therefore, it does not specify which variable is important for which sample group.
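The combining step described above (voting for class labels, averaging for numeric outputs) can be sketched in a few lines. This is not a random forest: the per-model predictions are hypothetical and written by hand, and the function names are introduced here for illustration. The sketch only shows how models that err on different samples can be combined into a better joint prediction.

```python
from collections import Counter
from statistics import mean


def majority_vote(model_predictions):
    """Combine per-model label lists into one label per sample."""
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*model_predictions)
    ]


def average_outputs(model_outputs):
    """Combine per-model numeric outputs into one mean per sample."""
    return [mean(sample_values) for sample_values in zip(*model_outputs)]


# Hypothetical labels and predictions for illustration only.
true_labels = ["A", "A", "B", "B", "A"]
model_predictions = [
    ["A", "A", "B", "A", "A"],  # wrong on the fourth sample
    ["A", "B", "B", "B", "A"],  # wrong on the second sample
    ["B", "A", "B", "B", "A"],  # wrong on the first sample
]

ensemble_labels = majority_vote(model_predictions)
ensemble_scores = average_outputs([[0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])

print(ensemble_labels)
```

Each hand-written model makes one mistake, but because the mistakes fall on different samples, the vote recovers every true label. This is why the definition stresses independent or uncorrelated models.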
The present study demonstrates an approach based on the pseudo-sample principle that allows for construction of bi-plots (i.e. spin plots) associated with RF models. The pseudo-sample principle for RF is explained and demonstrated by using two simulated datasets and three different types of real data, drawn from political science, food chemistry, and the human microbiome. The pseudo-sample bi-plots, associated with RF and its unsupervised version, allow for a versatile visualization of multivariate models, the variable importance, and the relations among the variables.
•Pseudo-samples enable visualization of the variable importance in random forest (RF).
•Interpretation of variable importance in RF and unsupervised random forest (URF).
•Possibility of obtaining a so-called bi-plot for RF and URF.
•Relations between variables are obtained using principal coordinates analysis.
Data fusion has gained much attention in the field of life sciences because analysis of biological samples may require the use of data coming from multiple complementary sources to express the samples fully. Data fusion rests on the idea that different data platforms detect different biological entities. Therefore, if these different biological compounds are then combined, they can provide comprehensive profiling and understanding of the research question at hand. Data fusion can be performed in three different traditional ways: low-level, mid-level, and high-level data fusion. However, the increasing complexity and amount of generated data require the development of more sophisticated fusion approaches. In that regard, the current study presents an advanced data fusion approach (i.e. proximities stacking) based on random forest proximities coupled with the pseudo-sample principle. Four different data platforms of 130 samples each (faecal microbiome, blood, blood headspace, and exhaled breath samples of patients who have Crohn's disease) were used to demonstrate the classification performance of this new approach. More specifically, 104 samples were used to train and validate the models, whereas the remaining 26 samples were used to validate the models externally. Mid-level, high-level, as well as individual platform classification predictions, were made and compared against the proximities stacking approach. The performance of each approach was assessed by calculating the sensitivity and specificity of each model for the external test set, and visualized by performing principal component analysis on the proximity matrices of the training samples and subsequently projecting the test samples onto that space. The implementation of pseudo-samples allowed for the identification of the most important variables per platform, finding relations among variables of the different data platforms, and the examination of how variables behave in the samples.
The proximities stacking approach outperforms both mid-level and high-level fusion approaches, as well as all individual platform predictions. Concurrently, it tackles significant bottlenecks of the traditional ways of fusion and of another advanced fusion method discussed in the paper. Finally, it contradicts the general belief that the more data, the merrier the result; considerations therefore have to be taken into account before any data fusion analysis is conducted.
•Random forest proximities of various data platforms can be fused via a weighted sum to increase prediction accuracy in complex biological data.
•The problem of variable interpretation and examination when working with proximities or kernels is tackled by implementing the pseudo-sample principle.
•Random forest proximities fusion can outperform the traditional ways of fusion as well as demonstrate the contribution of every platform in the outcome.
•The pseudo-sample principle allows for identification of relations among variables from different data platforms.
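The weighted-sum fusion of proximity matrices mentioned in the highlights can be sketched as follows. The matrices, the weights, and the function name `fuse_proximities` are hypothetical, and normalizing the weights to sum to one is an assumption made here so the fused values stay on the same 0 to 1 scale. The abstract does not say how the weights are chosen, so this should not be read as the authors' implementation.

```python
def fuse_proximities(matrices, weights):
    """Return the weighted sum of equally sized square proximity matrices.

    Assumption: weights are rescaled to sum to one before fusing.
    """
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per matrix")
    total = sum(weights)
    size = len(matrices[0])
    fused = [[0.0] * size for _ in range(size)]
    for matrix, weight in zip(matrices, weights):
        for i in range(size):
            for j in range(size):
                fused[i][j] += (weight / total) * matrix[i][j]
    return fused


# Hypothetical 3-sample proximity matrices for two platforms.
breath = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
microbiome = [
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]

# Hypothetical weights: the first platform counts three times as much.
fused = fuse_proximities([breath, microbiome], weights=[3.0, 1.0])

for row in fused:
    print([round(value, 3) for value in row])
```

Because each platform keeps its own weight in the sum, the weights also give a direct reading of how much every platform contributes to the fused similarity.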
Real-world data from electronic health records (EHRs) represent a wealth of information for studying the benefits and risks of medical treatment. However, they are limited in scope and should be complemented by information from the patient perspective.
The aim of this study is to develop an innovative research infrastructure that combines information from EHRs with patient experiences reported in questionnaires to monitor the risks and benefits of medical treatment.
We focused on the treatment of overactive bladder (OAB) in general practice as a use case. To develop the Benefit, Risk, and Impact of Medication Monitor (BRIMM) infrastructure, we first performed a requirement analysis. BRIMM's starting point is routinely recorded general practice EHR data that are sent to the Dutch Nivel Primary Care Database weekly. Patients with OAB were flagged weekly on the basis of diagnoses and prescriptions. They were invited subsequently for participation by their general practitioner (GP), via a trusted third party. Patients received a series of questionnaires on disease status, pharmacological and nonpharmacological treatments, adverse drug reactions, drug adherence, and quality of life. The questionnaires and a dedicated feedback portal were developed in collaboration with a patient association for pelvic-related diseases, Bekkenbodem4All. Participating patients and GPs received feedback. An expert meeting was organized to assess the strengths, weaknesses, opportunities, and threats of the new research infrastructure.
The BRIMM infrastructure was developed and implemented. In the Nivel Primary Care Database, 2933 patients with OAB from 27 general practices were flagged. GPs selected 1636 (55.78%) patients who were eligible for the study, of whom 295 (18.0% of eligible patients) completed the first questionnaire. A total of 288 (97.6%) patients consented to the linkage of their questionnaire data with their EHR data. According to experts, the strengths of the infrastructure were the linkage of patient-reported outcomes with EHR data, comparison of pharmacological and nonpharmacological treatments, flexibility of the infrastructure, and low registration burden for GPs. Methodological limitations, such as susceptibility to bias, patient selection, and low participation rates among GPs and patients, were seen as weaknesses and threats. Opportunities included usefulness for policy makers and health professionals, conditional approval of medication, data linkage to other data sources, and feedback to patients.
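The percentages in this paragraph use a different denominator at each step of the recruitment funnel. The short check below recomputes them from the reported counts; the variable names and the helper `pct` are introduced here for illustration.

```python
# Counts reported in the abstract.
flagged = 2933    # patients with OAB flagged in the database
eligible = 1636   # selected by GPs as eligible
completed = 295   # completed the first questionnaire
consented = 288   # consented to linkage with EHR data


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


eligible_pct = pct(eligible, flagged)      # share of flagged patients
completed_pct = pct(completed, eligible)   # share of eligible patients
consented_pct = pct(consented, completed)  # share of questionnaire completers

print(f"eligible:  {eligible_pct:.2f}% of flagged")
print(f"completed: {completed_pct:.1f}% of eligible")
print(f"consented: {consented_pct:.1f}% of completers")
```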
The BRIMM research infrastructure has the potential to assess the benefits and safety of (medical) treatment in real-life situations using a unique combination of EHRs and patient-reported outcomes. As patient involvement is an important aspect of the treatment process, generating knowledge from clinical and patient perspectives is valuable for health care providers, patients, and policy makers. The developed methodology can easily be applied to other treatments and health problems.