E-resources
Full text
Peer-reviewed
  • Forecasting PM2.5 concentra...
    Pozo-Luyo, César Alejandro; Cruz-Duarte, Jorge M.; Amaya, Ivan; Ortiz-Bayliss, José Carlos

    Atmospheric pollution research, November 2023, Volume: 14, Issue: 11
    Journal Article

    The Monterrey Metropolitan Area (MMA) is one of the most densely populated and polluted regions in Latin America. Hence, providing early warnings to the population when pollutant concentrations reach high levels is critical. Such warnings allow people at higher health risk to make informed decisions about when to go out, mitigating future health complications. Forecasting models make it possible to issue these warnings ahead of time. In this work, we implement a set of short-term shallow machine learning models that can serve as a baseline for future forecasting analyses of PM2.5 concentration levels in the MMA. The proposed approach starts with multiple imputation by chained equations for missing values, followed by the incorporation of time metadata and target winsorization. We then tune the hyperparameters of the machine learning models with random search under k-fold cross-validation. We devise these models for a single-step, single-station analysis on an hourly multivariate air quality dataset (77203 rows and 16 columns, spanning January 1, 2015 00:00:00 to April 17, 2022 23:00:00) and compare them using standard regression metrics. The best-performing forecasting model was an Extra Trees Regressor with a Root Mean Squared Error of 0.013, a Mean Absolute Error of 0.006 (equivalent to a Mean Absolute Percentage Error of 0.294% and a Symmetric Mean Absolute Percentage Error of 0.078%), and a Maximum Error of 0.187 μg/m³.

    • We study various shallow ML algorithms to forecast PM2.5 levels in the MMA.
    • We provide a working framework in Python to forecast PM2.5 levels in the MMA.
    • We observed that shallow ML models compete with DL algorithms in forecasting tasks.
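
    The steps named in the abstract (chained-equation imputation, time metadata, target winsorization, random search with k-fold cross-validation, and an Extra Trees Regressor scored with standard regression metrics) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' framework: the data are synthetic, the column names (`pm25`, `temperature`, `wind_speed`), the winsorization limits, and the search grid are hypothetical, and scikit-learn's `IterativeImputer` is used as a single-imputation stand-in inspired by chained equations rather than a full multiple-imputation procedure.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import RandomizedSearchCV, KFold
from sklearn.metrics import mean_squared_error, mean_absolute_error, max_error


def make_synthetic_hourly(n_hours=600, seed=0):
    """Hypothetical stand-in for one station: hourly PM2.5 plus two covariates."""
    rng = np.random.default_rng(seed)
    hours = np.arange(n_hours)
    index = pd.Timestamp("2015-01-01 00:00:00") + pd.to_timedelta(hours, unit="h")
    df = pd.DataFrame(
        {
            "pm25": 20 + 8 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, n_hours),
            "temperature": 25 + 5 * np.sin(2 * np.pi * hours / 24 + 1) + rng.normal(0, 1, n_hours),
            "wind_speed": np.abs(rng.normal(3, 1, n_hours)),
        },
        index=index,
    )
    # Blank out roughly 5% of the cells to mimic sensor gaps.
    return df.mask(rng.random(df.shape) < 0.05)


def preprocess(df, target="pm25", limits=(0.01, 0.99)):
    """Impute, add time metadata, winsorize the target, build a one-step-ahead label."""
    # Chained-equation style imputation (single imputation; an assumption of this sketch).
    imputer = IterativeImputer(max_iter=10, random_state=0)
    filled = pd.DataFrame(imputer.fit_transform(df), columns=df.columns, index=df.index)
    # Time metadata derived from the hourly timestamp.
    filled["hour"] = filled.index.hour
    filled["day_of_week"] = filled.index.dayofweek
    filled["month"] = filled.index.month
    # Winsorize the target by clipping to assumed 1st/99th percentiles.
    # For brevity this uses the whole series; a careful study would fit on training data only.
    low, high = filled[target].quantile(list(limits))
    filled[target] = filled[target].clip(lower=low, upper=high)
    # Single-step forecast: the label is the next hour's PM2.5 value.
    filled["target_next"] = filled[target].shift(-1)
    filled = filled.dropna()
    return filled.drop(columns="target_next"), filled["target_next"]


def tune_extra_trees(X, y, n_iter=5, n_splits=3, seed=0):
    """Random search over a small, hypothetical grid with k-fold cross-validation."""
    search = RandomizedSearchCV(
        ExtraTreesRegressor(random_state=seed),
        param_distributions={
            "n_estimators": [50, 100, 200],
            "max_depth": [None, 5, 10, 20],
            "min_samples_leaf": [1, 2, 5],
        },
        n_iter=n_iter,
        cv=KFold(n_splits=n_splits, shuffle=False),
        scoring="neg_root_mean_squared_error",
        random_state=seed,
    )
    return search.fit(X, y)


X, y = preprocess(make_synthetic_hourly())
split = int(0.8 * len(X))  # chronological hold-out: last 20% of hours
search = tune_extra_trees(X.iloc[:split], y.iloc[:split])
pred = search.predict(X.iloc[split:])

rmse = float(np.sqrt(mean_squared_error(y.iloc[split:], pred)))
mae = float(mean_absolute_error(y.iloc[split:], pred))
max_err = float(max_error(y.iloc[split:], pred))
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  MaxError={max_err:.3f}")
```

    The numbers printed by this sketch come from synthetic data and are not comparable to the metrics reported in the abstract; the point is only the order of the steps and how the hold-out metrics are computed.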