Peer-reviewed, Open access
  • Leveraging Automated Machin...
    Manduchi, Elisabetta; Moore, Jason H.

    International journal of public health, 04/2021, Volume: 66
    Journal Article

    Here we describe an application to infectious disease epidemiology leveraging data from ClinEpiDB, a resource aimed at advancing global public health by facilitating the exploration and analysis of epidemiological studies [12]. In the traditional regression-based analyses on this dataset reported in [14], the independent variables were age bin (< 5, 5–14, 15+), gender, a history of travel in the two weeks preceding the survey visit, a history of malaria in the past year, antimalarial use in the two weeks preceding the visit, reported use of repellent, and whether the visit occurred during the rainy season. For each of these three types, we ran TPOT 50 times with different random splits of the input data into training (75%) and hold-out testing (25%) portions. ...to mitigate the effect of the high imbalance between the number of cases and controls, in each run we randomly undersampled the controls to equal the number of cases prior to the random split. Embedding AutoML tools within epidemiology platforms like ClinEpiDB would empower users to directly perform sophisticated analyses, accelerating the benefits derived from these public health resources.
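
    The resampling scheme described in the abstract (undersample controls to match cases, then split 75%/25%, repeated 50 times) can be illustrated with a minimal sketch. This is not the authors' code: the function name `undersample_and_split`, the toy labels, and the use of the run index as the random seed are assumptions made for illustration, and the split here is a plain shuffle because the abstract does not say whether the split was stratified. The TPOT fitting step is omitted; each training/testing index pair would be handed to the AutoML tool.

```python
import random

def undersample_and_split(labels, test_fraction=0.25, seed=0):
    """Balance controls to cases, then split indices into train/test.

    labels: sequence of 0 (control) / 1 (case). Assumes controls
    outnumber cases; random.sample raises ValueError otherwise.
    """
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    # Randomly keep as many controls as there are cases.
    kept_controls = rng.sample(controls, len(cases))
    balanced = cases + kept_controls
    # Shuffle, then hold out the first test_fraction for testing.
    rng.shuffle(balanced)
    n_test = round(len(balanced) * test_fraction)
    return balanced[n_test:], balanced[:n_test]

# Hypothetical toy data: 20 cases and 180 controls.
labels = [1] * 20 + [0] * 180

# 50 runs, each with its own undersampling and random split.
splits = [undersample_and_split(labels, seed=run) for run in range(50)]
```

    Because the undersampling is redone in every run, each run sees a different subset of controls, so results aggregated over the 50 runs depend less on any single draw.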