Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, choosing the right classifier requires understanding the main factors influencing their performance. The present study first investigated the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and then compared its performance with other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), on the classification results obtained by the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the comparison of machine learning classifiers, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) at a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
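The difference between the equal and proportional stratified designs can be illustrated with a minimal Python sketch. It is not the study's implementation (which used Google Earth Engine); the function names `allocate` and `stratified_sample`, the class names, and the sample counts are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def allocate(class_sizes, total, scheme):
    """Return the number of training samples to draw from each class (stratum).

    class_sizes: dict mapping class label -> number of candidate pixels.
    total: overall training-sample budget.
    scheme: "equal" (same count per class) or "proportional" (by class share).
    """
    if scheme == "equal":
        per_class = total // len(class_sizes)
        # A class cannot contribute more samples than it has pixels.
        return {c: min(per_class, n) for c, n in class_sizes.items()}
    if scheme == "proportional":
        population = sum(class_sizes.values())
        # Keep at least one sample so that rare classes are not dropped entirely.
        return {
            c: min(n, max(1, round(total * n / population)))
            for c, n in class_sizes.items()
        }
    raise ValueError(f"unknown scheme: {scheme!r}")


def stratified_sample(labels, total, scheme, seed=0):
    """Draw random sample indices per class according to the chosen scheme."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for index, label in enumerate(labels):
        strata[label].append(index)
    sizes = allocate({c: len(ix) for c, ix in strata.items()}, total, scheme)
    return {c: sorted(rng.sample(strata[c], sizes[c])) for c in strata}


# Hypothetical, strongly imbalanced label map.
labels = ["forest"] * 900 + ["urban"] * 90 + ["water"] * 10
equal = stratified_sample(labels, 100, "equal")
proportional = stratified_sample(labels, 100, "proportional")
```

With these made-up class sizes, the proportional design leaves the rarest class with a single training sample, while the equal design gives it as many samples as it has pixels. This is consistent with the reported tendency of SRS(Prop) to favor major classes and of SRS(Eq) to support minority classes.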
Due to its comparatively high spatial resolution and its daily repeat frequency, the tropospheric nitrogen dioxide (NO2) product provided by the TROPOspheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5 Precursor platform has attracted significant attention for its potential for urban-scale monitoring of air quality. However, the exploitation of such data in, for example, operational assimilation in local-scale dispersion models is often complicated by substantial data gaps due to cloud cover or other retrieval limitations. These challenges are particularly prominent in high-latitude regions, where significant cloud cover and high solar zenith angles are often prevalent. Using Norway as a representative case for a high-latitude region, we evaluate the spatiotemporal patterns in the availability of valid data from the operational TROPOMI tropospheric NO2 product over five urban areas (Oslo, Bergen, Trondheim, Stavanger, and Kristiansand) and a 2.5-year period from July 2018 through November 2020. Our results indicate that even for relatively clean environments such as small Norwegian cities, distinct spatial patterns of tropospheric NO2 are visible in long-term average datasets from TROPOMI. However, the availability of valid data on a daily level is limited by both cloud cover and solar zenith angle (during the winter months), causing the fraction of valid retrievals in each study site to vary from 20% to 50% on average. A temporal analysis shows that for our study sites and the selected period, the fraction of valid pixels in each domain follows a clear seasonal cycle, reaching a maximum of 50% to 75% in the summer months and dropping to 0% to 20% in winter. The seasonal cycle in data availability is the inverse of that of NO2 pollution in Norway, which typically peaks in the winter months.
However, outside of the mid-winter period we find that the TROPOMI NO2 product provides sufficient data availability for detailed mapping and monitoring of NO2 pollution in the major urban areas in Norway, and we see potential for the use of the data in local-scale data assimilation and emission inversion applications.
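The "fraction of valid pixels" statistic discussed above can be sketched in a few lines of Python. This is a hedged illustration, not the study's code: the threshold `QA_THRESHOLD = 0.75` is an assumption (the abstract does not state how valid retrievals were defined), and the function names and the toy input values are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Assumption: a pixel counts as valid when its quality flag exceeds 0.75.
# The abstract does not state the criterion that was actually used.
QA_THRESHOLD = 0.75


def valid_fraction(qa_values, threshold=QA_THRESHOLD):
    """Fraction of pixels in one daily domain that pass the quality filter.

    Missing retrievals are represented as None and count as invalid.
    """
    if not qa_values:
        return 0.0
    valid = sum(1 for q in qa_values if q is not None and q > threshold)
    return valid / len(qa_values)


def monthly_valid_fraction(daily):
    """Average the daily valid fraction per calendar month.

    daily: dict mapping datetime.date -> list of quality values for the domain.
    """
    per_month = defaultdict(list)
    for day, qa_values in daily.items():
        per_month[day.month].append(valid_fraction(qa_values))
    return {m: sum(v) / len(v) for m, v in sorted(per_month.items())}


# Toy input: two summer days and one winter day over a 4-pixel domain.
daily = {
    date(2019, 7, 1): [0.9, 0.8, 0.2, 0.95],
    date(2019, 7, 2): [0.9, 0.9, 0.9, 0.9],
    date(2019, 12, 1): [0.1, None, 0.3, 0.8],
}
seasonal_cycle = monthly_valid_fraction(daily)
```

Averaging daily fractions per month, rather than pooling all pixels, gives each overpass equal weight. Either convention could be defended, and the abstract does not say which one was used.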
Satellite observations from instruments such as the TROPOspheric Monitoring Instrument (TROPOMI) show significant potential for monitoring the spatiotemporal variability of NO2; however, they typically provide vertically integrated measurements over the tropospheric column. In this study, we introduce a machine learning approach entitled ‘S-MESH’ (Satellite and ML-based Estimation of Surface air quality at High resolution) that allows for estimating daily surface NO2 concentrations over Europe at 1 km spatial resolution based on an eXtreme Gradient Boosting (XGBoost) model using primarily observation-based datasets over the period 2019–2021. Spatiotemporal datasets used by the model include TROPOMI NO2 tropospheric vertical column density, night light radiance from the Visible Infrared Imaging Radiometer Suite (VIIRS), Normalized Difference Vegetation Index from the Moderate Resolution Imaging Spectroradiometer (MODIS), observations from air quality monitoring stations in the European Environment Agency database, and modeled meteorological parameters such as planetary boundary layer height, wind velocity, and temperature. The overall model evaluation shows a mean absolute error of 7.77 μg/m3, a median bias of 0.6 μg/m3 and a Spearman rank correlation of 0.66. The model performance is found to be influenced by NO2 concentration levels, with the most reliable predictions at concentration levels of 10–40 μg/m3 with a bias of <40%. The spatial and temporal error analyses indicate the spatial robustness of the model across the study area, with better prediction accuracy during the winter months and the associated higher NO2 concentrations. Despite the complexity and the continental scale of the study area, the XGBoost-based model shows fast execution potential in providing daily estimates of surface NO2 concentrations over Europe.
The Shapley Additive exPlanations (SHAP) value analysis highlights TROPOMI NO2 tropospheric column density as the main source of information in deriving surface NO2 concentrations, indicating its significant potential for such studies. The SHAP values also indicate the importance of anthropogenic emission proxy inputs, such as VIIRS night lights, in complementing TROPOMI NO2 values for deriving higher-resolution and more detailed spatial patterns of NO2 variations.
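The three evaluation statistics quoted above (mean absolute error, median bias, and Spearman rank correlation) can be computed as in the following sketch, which uses only the Python standard library. The sign convention for bias (predicted minus observed), the function names, and the sample values are assumptions made for illustration; the abstract does not specify how the authors computed them.

```python
from statistics import mean, median


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def evaluate(observed, predicted):
    """Return MAE, median bias, and Spearman rank correlation."""
    # Assumption: bias is defined as predicted minus observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    return {
        "mae": mean([abs(e) for e in errors]),
        "median_bias": median(errors),
        # Spearman correlation is the Pearson correlation of the ranks.
        "spearman": pearson(average_ranks(observed), average_ranks(predicted)),
    }


# Made-up station observations and model predictions in μg/m3.
scores = evaluate([10, 20, 30, 40], [12, 18, 33, 45])
```

Because the Spearman coefficient depends only on ranks, it rewards a model that orders clean and polluted conditions correctly even when the absolute values are biased, which is why it is usefully reported alongside the error and bias.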
•Surface NO2 concentrations at 1 km resolution are estimated over Europe using XGBoost.
•The potential contribution of satellite-based input features is observed.
•SHAP values indicate the highest importance for Sentinel-5P TROPOMI observations.
•Finer spatial patterns can be derived using VIIRS night lights.
•XGBoost is a good candidate for continental-scale study areas.
These datasets consist of estimated daily surface NO2 concentrations over Europe at ~1 km spatial resolution in tiff file format, generated using the S-MESH model, and are part of the research article https://doi.org/10.1016/j.rse.2024.114321. Files are zipped into 3 folders, each corresponding to a year, and can be unzipped from the command line using "tar -xvzf filename.tar.gz". Each file represents surface NO2 during the Sentinel-5P satellite overpass time and is named based on the date of measurement. Each tiff file is a single-band image with an extent of 25°W to 42.5°E and 29.9°N to 74.28°N in the EPSG:4326 - WGS 84 projection. The surface NO2 concentrations are estimated using Sentinel-5P TROPOMI tropospheric column density and an XGBoost machine learning model. The overall median absolute error of the model predictions across Europe is 4.43 μg/m3.
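Given the stated extent, a longitude/latitude pair can be mapped to a pixel position in one of these single-band images. The sketch below assumes a north-up raster on a regular EPSG:4326 grid with its origin at the upper-left corner. The function name `lonlat_to_rowcol` is hypothetical, and the raster dimensions in the usage lines are placeholders, not the dimensions of the actual files, which should be read from the tiff metadata.

```python
# Extent stated in the dataset description (degrees, EPSG:4326).
WEST, EAST = -25.0, 42.5
SOUTH, NORTH = 29.9, 74.28


def lonlat_to_rowcol(lon, lat, n_rows, n_cols):
    """Map a coordinate to the (row, col) of the pixel that contains it.

    Assumes a north-up image: row 0 is the northern edge, col 0 the western edge.
    n_rows and n_cols must be taken from the actual file.
    """
    if not (WEST <= lon < EAST and SOUTH < lat <= NORTH):
        raise ValueError("coordinate lies outside the dataset extent")
    col = int((lon - WEST) / (EAST - WEST) * n_cols)
    row = int((NORTH - lat) / (NORTH - SOUTH) * n_rows)
    return row, col


# Placeholder grid of 0.01 degree pixels; the real files may differ.
N_ROWS, N_COLS = 4438, 6750
row, col = lonlat_to_rowcol(10.755, 59.915, N_ROWS, N_COLS)
```

In practice a geospatial library would read the affine transform directly from the file, which avoids hard-coding the extent. The manual calculation is shown only to make the geometry of the grid explicit.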
Summary
Data: NO2 concentrations over Europe at ~1km spatial resolution
Time Period: 2019-2021
Methodology: Using Sentinel-5P TROPOMI NO2 and XGBoost
More information in the article https://authors.elsevier.com/sd/article/S0034-4257(24)00339-0 or https://doi.org/10.1016/j.rse.2024.114321