•Exposure to PM2.5 was associated with higher risk of lung cancer.
•Elevated risks persisted even at levels lower than the EU limit value of 25 µg/m3.
•No association between NO2, BC or O3 and lung cancer incidence was observed.
Ambient air pollution has been associated with lung cancer, but the shape of the exposure-response function - especially at low exposure levels - is not well described. The aim of this study was to address the relationship between long-term low-level air pollution exposure and lung cancer incidence.
The “Effects of Low-level Air Pollution: a Study in Europe” (ELAPSE) collaboration pools seven cohorts from across Europe. We developed hybrid models combining air pollution monitoring, land use data, satellite observations, and dispersion model estimates for nitrogen dioxide (NO2), fine particulate matter (PM2.5), black carbon (BC), and ozone (O3) to assign exposure to cohort participants’ residential addresses in 100 m by 100 m grids. We applied stratified Cox proportional hazards models, adjusting for potential confounders (age, sex, calendar year, marital status, smoking, body mass index, employment status, and neighborhood-level socio-economic status). We fitted linear models, linear models in subsets, Shape-Constrained Health Impact Functions (SCHIF), and natural cubic spline models to assess the shape of the association between air pollution and lung cancer at concentrations below existing standards and guidelines.
The analyses included 307,550 cohort participants. During a mean follow-up of 18.1 years, 3956 incident lung cancer cases occurred. Median (Q1, Q3) annual (2010) exposure levels of NO2, PM2.5, BC and O3 (warm season) were 24.2 µg/m3 (19.5, 29.7), 15.4 µg/m3 (12.8, 17.3), 1.6 × 10^−5 m^−1 (1.3, 1.8), and 86.6 µg/m3 (78.5, 92.9), respectively. We observed a higher risk for lung cancer with higher exposure to PM2.5 (HR: 1.13, 95% CI: 1.05, 1.23 per 5 µg/m3). This association was robust to adjustment for other pollutants. The SCHIF, spline and subset analyses suggested a linear or supra-linear association with no evidence of a threshold. In subset analyses, risk estimates were clearly elevated for the subset of subjects with exposure below the EU limit value of 25 µg/m3. We did not observe associations between NO2, BC or O3 and lung cancer incidence.
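The hazard ratio above is expressed per 5 µg/m3 of PM2.5. A minimal sketch of how such an estimate can be re-expressed for another increment is shown below. It assumes a log-linear exposure-response relation, which is a simplification given that the shape analyses suggested a linear or supra-linear association; the function name `rescale_hazard_ratio` is hypothetical.

```python
import math


def rescale_hazard_ratio(hr, from_increment, to_increment):
    """Rescale a hazard ratio to a different exposure increment.

    Assumes the log-hazard is linear in exposure (an assumption, not a
    result reported in the abstract).
    """
    beta = math.log(hr) / from_increment  # log-hazard per 1 unit of exposure
    return math.exp(beta * to_increment)


# Reported estimate: HR 1.13 (95% CI: 1.05, 1.23) per 5 µg/m3 PM2.5,
# re-expressed here per 10 µg/m3.
for label, value in [("HR", 1.13), ("CI lower", 1.05), ("CI upper", 1.23)]:
    print(label, round(rescale_hazard_ratio(value, 5.0, 10.0), 3))
```

Rescaling the point estimate and both confidence limits with the same function keeps the interval consistent with the estimate on the log scale.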
Long-term ambient PM2.5 exposure is associated with lung cancer incidence even at concentrations below current EU limit values and possibly WHO Air Quality Guidelines.
Blood circulating proteins are confounded readouts of the biological processes that occur in different tissues and organs. Many proteins have been linked to complex disorders and are also under substantial genetic control. Here, we investigate the associations between over 1000 blood circulating proteins and body mass index (BMI) in three studies including over 4600 participants. We show that BMI is associated with widespread changes in the plasma proteome. We observe 152 replicated protein associations with BMI. 24 proteins also associate with a genome-wide polygenic score (GPS) for BMI. These proteins are involved in lipid metabolism and inflammatory pathways impacting clinically relevant pathways of adiposity. Mendelian randomization suggests a bi-directional causal relationship of BMI with LEPR/LEP, IGFBP1, and WFIKKN2, a protein-to-BMI relationship for AGER, DPT, and CTSA, and a BMI-to-protein relationship for another 21 proteins. Combined with animal model and tissue-specific gene expression data, our findings suggest potential therapeutic targets further elucidating the role of these proteins in obesity associated pathologies.
Can mitigating only particle mass, as the existing air quality measures do, ultimately lead to reduction in ultrafine particles (UFP)? The aim of this study was to provide a broader urban perspective on the relationship between UFP, measured in terms of particle number concentration (PNC) and PM2.5 (mass concentration of particles with aerodynamic diameter < 2.5 μm) and factors that influence their concentrations. Hourly average PNC and PM2.5 were acquired from 10 cities located in North America, Europe, Asia, and Australia over a 12-month period. A pairwise comparison of the mean difference and the Kolmogorov-Smirnov test with the application of bootstrapping were performed for each city. Diurnal and seasonal trends were obtained using a generalized additive model (GAM). The particle number to mass concentration ratios and the Pearson's correlation coefficient were calculated to elucidate the nature of the relationship between these two metrics.
Results show that the annual mean concentrations ranged from 8.0 × 10^3 to 19.5 × 10^3 particles·cm−3 and from 7.0 to 65.8 μg·m−3 for PNC and PM2.5, respectively, with the data distributions generally skewed to the right, and with a wider spread for PNC. PNC showed a more distinct diurnal trend compared with PM2.5, attributed to the high contributions of UFP from vehicular emissions to PNC. The variation in both PNC and PM2.5 due to seasonality is linked to the cities' geographical location and features. Clustering the cities based on annual median concentrations of both PNC and PM2.5 demonstrated that a high PNC level does not lead to a high PM2.5, and vice versa. The particle number-to-mass ratio (in units of 10^9 particles·μg−1) ranged from 0.14 to 2.2, >1 for roadside sites and <1 for urban background sites with lower values for more polluted cities. The Pearson's r ranged from 0.09 to 0.64 for the log-transformed data, indicating generally poor linear correlation between PNC and PM2.5. Therefore, PNC and PM2.5 measurements are not representative of each other; and regulating PM2.5 does little to reduce PNC. This highlights the need to establish regulatory approaches and control measures to address the impacts of elevated UFP concentrations, especially in urban areas, considering their potential health risks.
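The two summary metrics used above can be sketched as follows. This is an illustration under stated assumptions, not the study's processing pipeline: the function names and the hourly values are hypothetical, base-10 logarithms are assumed for the transformation, and the ratio is computed from a single pair of concentrations rather than from any particular averaging scheme.

```python
import math


def number_to_mass_ratio(pnc_per_cm3, pm25_ug_per_m3):
    """Particle number-to-mass ratio in units of 1e9 particles per µg."""
    particles_per_m3 = pnc_per_cm3 * 1e6  # 1 m3 = 1e6 cm3
    return particles_per_m3 / pm25_ug_per_m3 / 1e9


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_pearson_r(pnc, pm25):
    """Pearson's r computed on log10-transformed concentrations."""
    return pearson_r([math.log10(v) for v in pnc],
                     [math.log10(v) for v in pm25])


# Hypothetical hourly averages (PNC in particles/cm3, PM2.5 in µg/m3).
pnc = [6500.0, 12000.0, 21000.0, 9000.0, 15000.0]
pm25 = [8.0, 11.0, 9.5, 14.0, 10.0]

print(number_to_mass_ratio(sum(pnc) / len(pnc), sum(pm25) / len(pm25)))
print(log_pearson_r(pnc, pm25))
```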
Shock index (SI) and modified shock index (mSI) are useful instruments for early risk stratification in acute myocardial infarction (AMI) patients. They are strong predictors for short-term mortality. Nevertheless, the association between SI or mSI and long-term mortality in AMI patients has not yet been sufficiently examined.
For this study, a total of 10,174 patients with AMI were included. All cases were prospectively recorded by the population-based Augsburg Myocardial Infarction Registry from 2000 until 2017. The endpoint was all-cause mortality with a median observational time of 6.5 years (IQR: 3.5-7.4). Using ROC analysis and calculating the Youden index, the sample was dichotomized into a low and a high SI and mSI group, respectively. Moreover, multivariable adjusted Cox regression models were calculated. All analyses were performed for the total sample as well as for STEMI and NSTEMI cases separately.
Optimal cut-off values were 0.580 for SI and 0.852 for mSI (total sample). AUC values were 0.6382 (95% CI: 0.6223-0.6549) for SI and 0.6552 (95% CI: 0.6397-0.6713) for mSI. Fully adjusted Cox regression models revealed significantly higher long-term mortality for patients with high SI and high mSI compared to patients with low indices (high SI HR: 1.42 [1.32-1.52]; high mSI HR: 1.46 [1.36-1.57]). Furthermore, the predictive ability was slightly better for mSI compared to SI and more reliable in NSTEMI cases compared to STEMI cases (for SI and mSI).
High SI and mSI are useful tools for early risk stratification including long-term outcome, especially in NSTEMI cases, which can help physicians to make decisions on therapy. NSTEMI patients with high SI and mSI might especially benefit from immediate invasive therapy.
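The dichotomization step can be illustrated with a short sketch. The index definitions used here (heart rate divided by systolic blood pressure for SI, heart rate divided by mean arterial pressure for mSI) are the commonly used ones and are not spelled out in the abstract; the function names and the toy data are hypothetical, and the cut-off search is a plain enumeration rather than the registry's actual analysis code.

```python
def shock_index(heart_rate, systolic_bp):
    """SI: heart rate (bpm) divided by systolic blood pressure (mmHg)."""
    return heart_rate / systolic_bp


def modified_shock_index(heart_rate, systolic_bp, diastolic_bp):
    """mSI: heart rate divided by mean arterial pressure (SBP + 2*DBP) / 3."""
    mean_arterial_pressure = (systolic_bp + 2.0 * diastolic_bp) / 3.0
    return heart_rate / mean_arterial_pressure


def youden_cutoff(scores, died):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient counts as 'high' when score >= cutoff. Both outcome classes
    must be present in `died`.
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
        true_neg = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical toy data: index values and death indicators (1 = died).
toy_scores = [0.40, 0.50, 0.55, 0.60, 0.70, 0.90]
toy_died = [0, 0, 0, 1, 1, 1]
print(youden_cutoff(toy_scores, toy_died))
```

The enumeration is quadratic in the number of distinct scores, which is fine for a sketch; a sorted single pass would be preferable for a registry-sized sample.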
Key messages
Shock index and modified shock index are predictors of long-term mortality after acute myocardial infarction.
Both indices predict long-term mortality not only for STEMI cases, but even more so for NSTEMI cases.
Statistical analysis of microbial genomic data within epidemiological cohort studies holds the promise to assess the influence of environmental exposures on both the host and the host-associated microbiome. However, the observational character of prospective cohort data and the intricate characteristics of microbiome data make it challenging to discover causal associations between environment and microbiome. Here, we introduce a causal inference framework based on the Rubin Causal Model that can help scientists to investigate such environment-host microbiome relationships, to capitalize on existing, possibly powerful, test statistics, and test plausible sharp null hypotheses. Using data from the German KORA cohort study, we illustrate our framework by designing two hypothetical randomized experiments with interventions of (i) air pollution reduction and (ii) smoking prevention. We study the effects of these interventions on the human gut microbiome by testing shifts in microbial diversity, changes in individual microbial abundances, and microbial network wiring between groups of matched subjects via randomization-based inference. In the smoking prevention scenario, we identify a small interconnected group of taxa worth further scrutiny, including Christensenellaceae and Ruminococcaceae genera, that have been previously associated with blood metabolite changes. These findings demonstrate that our framework may uncover potentially causal links between environmental exposure and the gut microbiome from observational data. We anticipate the present statistical framework to be a good starting point for further discoveries on the role of the gut microbiome in environmental health.
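Randomization-based inference under a sharp null hypothesis can be sketched in a few lines. The example below assumes one-to-one matched pairs and uses the mean within-pair difference as the test statistic; both choices, the function name, and the diversity values are illustrative assumptions and not the design or statistics used in the study.

```python
from itertools import product


def paired_randomization_pvalue(treated, control):
    """Exact two-sided randomization p-value for matched pairs.

    Under the sharp null of no effect for any subject, swapping the labels
    within a pair only flips the sign of that pair's difference, so all
    2**n sign patterns are equally likely in a paired randomized experiment.
    """
    diffs = [t - c for t, c in zip(treated, control)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        statistic = abs(sum(s * d for s, d in zip(signs, diffs)) / n)
        if statistic >= observed - 1e-12:  # tolerance for float ties
            as_extreme += 1
        total += 1
    return as_extreme / total


# Hypothetical diversity values for four matched pairs.
exposed = [2.0, 3.0, 4.0, 5.0]
unexposed = [1.0, 2.0, 3.0, 4.0]
print(paired_randomization_pvalue(exposed, unexposed))
```

Exact enumeration grows as 2**n, so for realistic numbers of pairs the sign patterns would be sampled at random instead of enumerated.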
Few studies have investigated effects of air pollution on the incidence of cerebrovascular events.
We assessed the association between long-term exposure to multiple air pollutants and the incidence of stroke in European cohorts.
Data from 11 cohorts were collected, and occurrence of a first stroke was evaluated. Individual air pollution exposures were predicted from land-use regression models developed within the European Study of Cohorts for Air Pollution Effects (ESCAPE). The exposures were: PM2.5 [particulate matter (PM) ≤ 2.5 μm in diameter], coarse PM (PM between 2.5 and 10 μm), PM10 (PM ≤ 10 μm), PM2.5 absorbance, nitrogen oxides, and two traffic indicators. Cohort-specific analyses were conducted using Cox proportional hazards models. Random-effects meta-analysis was used for pooled effect estimation.
A total of 99,446 study participants were included, 3,086 of whom developed stroke. A 5-μg/m3 increase in annual PM2.5 exposure was associated with 19% increased risk of incident stroke [hazard ratio (HR) = 1.19, 95% CI: 0.88, 1.62]. Similar findings were obtained for PM10. The results were robust to adjustment for an extensive list of cardiovascular risk factors and noise coexposure. The association with PM2.5 was apparent among those ≥ 60 years of age (HR = 1.40, 95% CI: 1.05, 1.87), among never-smokers (HR = 1.74, 95% CI: 1.06, 2.88), and among participants with PM2.5 exposure < 25 μg/m3 (HR = 1.33, 95% CI: 1.01, 1.77).
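Pooling cohort-specific estimates with a random-effects model can be sketched as below. The abstract does not name the estimator; the DerSimonian-Laird method is assumed here because it is a common choice, and the cohort-level hazard ratios are hypothetical placeholders rather than study results.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log hazard ratio from its 95% CI limits."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def dersimonian_laird(estimates, std_errors):
    """Pool estimates (e.g. log HRs); return (pooled, pooled_se, tau2)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Hypothetical cohort-specific HRs with 95% CIs: (HR, lower, upper).
cohorts = [(1.30, 0.90, 1.88), (1.05, 0.70, 1.58), (1.25, 0.80, 1.95)]
log_hrs = [math.log(hr) for hr, _, _ in cohorts]
ses = [se_from_ci(lo, hi) for _, lo, hi in cohorts]
pooled, pooled_se, tau2 = dersimonian_laird(log_hrs, ses)
print(math.exp(pooled),
      math.exp(pooled - 1.96 * pooled_se),
      math.exp(pooled + 1.96 * pooled_se))
```

Pooling is done on the log scale and the result is exponentiated at the end, so the pooled confidence interval is symmetric in log hazard ratio rather than in hazard ratio.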
We found suggestive evidence of an association between fine particles and incidence of cerebrovascular events in Europe, even at lower concentrations than set by the current air quality limit value.
Analyzing epidemiological data with simplified mathematical models of disease development provides a link between the time‐course of incidence and the underlying biological processes. Here we point out that considerable modeling flexibility is gained if the model is solved by simulation only. To this aim, a model of atherosclerosis is proposed: a Markov Chain with continuous state space which represents the coronary artery intimal surface area involved with atherosclerotic lesions of increasing severity. Myocardial infarction rates are assumed to be proportional to the area of most severe lesions. The model can be fitted simultaneously to infarction incidence rates observed in the KORA registry, and to the age‐dependent prevalence and extent of atherosclerotic lesions in the PDAY study. Moreover, the simulation approach allows for non‐linear transition rates and for considering randomness and inter‐individual heterogeneity at the same time. Interestingly, the fit revealed significant age dependence of parameters in females around the age of menopause, qualitatively reproducing the known vascular effects of female sex hormones. For males, the incidence curve flattens for higher ages. According to the model, frailty explains this flattening only partially, and saturation of the disease process also plays an important role. This study shows the feasibility of simulating subclinical and epidemiological data with the same mathematical model. The approach is very general and may be extended to investigate the effects of risk factors or interventions. Moreover, it offers an interface to integrate quantitative individual health data as assessed, for example, by imaging.
Background
Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.
Methods
We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.
Results
Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.
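A minimal sketch of KNN imputation on observations with variable pre-selection is given below. It is an illustration under assumptions, not the evaluated implementation: the pre-selection rule (the `n_vars` variables with the highest absolute Pearson correlation to the target), the distance (root mean squared difference over shared observed variables), and all names are hypothetical, and the data are assumed to be on comparable scales (for example log-transformed and standardized).

```python
import math


def _pearson(x, y):
    """Pearson correlation; returns 0.0 when it cannot be computed."""
    n = len(x)
    if n < 3:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def knn_impute(rows, k=3, n_vars=2):
    """Impute None entries; rows are samples, columns are metabolites."""
    n_cols = len(rows[0])
    imputed = [list(row) for row in rows]  # the input is left unchanged
    for j in range(n_cols):
        # Variable pre-selection: rank other columns by |r| with column j,
        # using pairwise complete observations only.
        strength = []
        for v in range(n_cols):
            if v == j:
                continue
            pairs = [(r[j], r[v]) for r in rows
                     if r[j] is not None and r[v] is not None]
            r_jv = _pearson([p[0] for p in pairs], [p[1] for p in pairs])
            strength.append((abs(r_jv), v))
        selected = [v for _, v in sorted(strength, reverse=True)[:n_vars]]
        donors = [r for r in rows if r[j] is not None]
        for i, row in enumerate(rows):
            if row[j] is not None:
                continue
            distances = []
            for donor in donors:
                shared = [v for v in selected
                          if row[v] is not None and donor[v] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[v] - donor[v]) ** 2 for v in shared) / len(shared))
                distances.append((dist, donor[j]))
            nearest = sorted(distances)[:k]
            if nearest:  # otherwise the value stays missing
                imputed[i][j] = sum(val for _, val in nearest) / len(nearest)
    return imputed


# Hypothetical toy matrix: the last sample lacks the first metabolite.
data = [[1.0, 1.1, 5.0], [2.0, 2.1, 1.0], [3.0, 3.1, 4.0],
        [4.0, 4.1, 2.0], [None, 2.05, 3.0]]
print(knn_impute(data, k=1, n_vars=1)[4])
```

Neighbors are always taken from the originally observed values, so imputed entries never feed into later imputations.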
Conclusion
Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
Multiple risk factors contribute jointly to the development and progression of cardiometabolic diseases. Therefore, joint longitudinal trajectories of multiple risk factors might represent different degrees of cardiometabolic risk.
We analyzed population-based data comprising three examinations (Exam 1: 1999-2001, Exam 2: 2006-2008, Exam 3: 2013-2014) of 976 male and 1004 female participants of the KORA cohort (Southern Germany). Participants were followed up for cardiometabolic diseases, including cardiovascular mortality, myocardial infarction and stroke, or a diagnosis of type 2 diabetes, until 2016. Longitudinal multivariate k-means clustering identified sex-specific trajectory clusters based on nine cardiometabolic risk factors (age, systolic and diastolic blood pressure, body-mass-index, waist circumference, Hemoglobin-A1c, total cholesterol, high- and low-density lipoprotein cholesterol). Associations between clusters and cardiometabolic events were assessed by logistic regression models.
We identified three trajectory clusters for men and women, respectively. Trajectory clusters reflected a distinct distribution of cardiometabolic risk burden and were associated with prevalent cardiometabolic disease at Exam 3 (men: odds ratio (OR)ClusterII = 2.0, 95% confidence interval: (0.9-4.5); ORClusterIII = 10.5 (4.8-22.9); women: ORClusterII = 1.7 (0.6-4.7); ORClusterIII = 5.8 (2.6-12.9)). Trajectory clusters were furthermore associated with incident cardiometabolic cases after Exam 3 (men: ORClusterII = 3.5 (1.1-15.6); ORClusterIII = 7.5 (2.4-32.7); women: ORClusterII = 5.0 (1.1-34.1); ORClusterIII = 8.0 (2.2-51.7)). Associations remained significant after adjusting for a single time point cardiovascular risk score (Framingham).
On a population-based level, distinct longitudinal risk profiles over a 14-year time period are differentially associated with cardiometabolic events. Our results suggest that longitudinal data may provide additional information beyond single time-point measures. Their inclusion in cardiometabolic risk assessment might improve early identification of individuals at risk.