  • A chess rating system for evolutionary algorithms : a new method for the comparison and ranking of evolutionary algorithms
    Veček, Niki ; Mernik, Marjan, 1964- ; Črepinšek, Matej
    Null hypothesis significance testing (NHST) is of utmost importance for comparing evolutionary algorithms, as the performance of one algorithm over another can be scientifically proven. However, ... NHST is often misused, improperly applied and misinterpreted. In order to avoid the pitfalls of NHST usage, this paper proposes a new method, a Chess Rating System for Evolutionary Algorithms (CRS4EAs), for the comparison and ranking of evolutionary algorithms. A computational experiment in CRS4EAs is conducted in the form of a tournament in which the evolutionary algorithms are treated as chess players and a comparison between the solutions of two algorithms on the objective function is treated as one game outcome. The rating system used in CRS4EAs was inspired by the Glicko-2 rating system, which is based on the Bradley-Terry model for dynamic pairwise comparisons, and in which each algorithm is represented by a rating, a rating deviation, a rating/confidence interval, and a rating volatility. CRS4EAs was empirically compared to NHST within a computational experiment conducted on 16 evolutionary algorithms and a benchmark suite of 20 numerical minimisation problems. The analysis of the results shows that CRS4EAs is comparable with NHST but may also have many additional benefits. The computations in CRS4EAs are less complicated and less sensitive than those in statistical significance tests, the method is less sensitive to outliers, reliable ratings can be obtained over a small number of runs, and the conservativity/liberality of CRS4EAs is easier to control.
    Source: Information sciences. - ISSN 0020-0255 (Vol. 277, Sep. 2014, pp. 656-679)
    Type of material - article, component part
    Publish date - 2014
    Language - English
    COBISS.SI-ID - 17670422
    DOI
