Peer-reviewed, Open access
  • Quantile cross-spectral den...
    López-Oriona, Ángel; Vilar, José A.

    Expert systems with applications, 12/2021, Volume: 185
    Journal Article

    Clustering of multivariate time series is a central problem in data mining with applications in many fields. Frequently, the clustering target is to identify groups of series generated by the same multivariate stochastic process. Most approaches to this problem either include a prior dimensionality-reduction step, which may result in a loss of information, or rely on dissimilarity measures based on correlations and cross-correlations that ignore the serial dependence structure. We propose a novel approach to measure dissimilarity between multivariate time series aimed at jointly capturing both cross dependence and serial dependence. Specifically, each series is characterized by a set of matrices of estimated quantile cross-spectral densities, where each matrix corresponds to a pair of quantile levels. The dissimilarity between every pair of series is then evaluated by comparing their estimated quantile cross-spectral densities, and the pairwise dissimilarity matrix is taken as the starting point for a partitioning around medoids algorithm. Since the quantile-based cross-spectra capture dependence in quantiles of the joint distribution, the proposed metric has a high capability to discriminate between high-level dependence structures. An extensive simulation study shows that our clustering procedure outperforms a wide range of alternative methods and is robust to the noise distribution, besides being computationally efficient. A real data application involving bivariate financial time series illustrates the usefulness of the proposed approach. The procedure is also applied to cluster nonstationary series from the UEA multivariate time series classification archive.
• A new measure based on quantiles to perform clustering of multivariate time series.
• The measure examines simultaneously both cross-dependence and serial dependence.
• The proposed measure takes advantage of the nice properties of the quantiles.
• Accurate, robust and efficient clustering performance in the frequency domain.
• The methodology is successfully applied to cluster time series of S&P 500.
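
The pipeline described in the abstract (quantile cross-spectral features, a pairwise dissimilarity matrix, then clustering around medoids) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration: the function names `qcd_features`, `dissimilarity_matrix` and `k_medoids` are hypothetical, the spectra are raw (unsmoothed) cross-periodograms of quantile-indicator series, the dissimilarity is a plain Euclidean distance between real and imaginary parts, all series are assumed to share the same length and dimension, and a simple alternating k-medoids update stands in for the full partitioning around medoids swap procedure.

```python
import numpy as np


def qcd_features(x, taus=(0.1, 0.5, 0.9)):
    """Raw quantile cross-periodogram features of one series x of shape (T, d).

    Illustrative only: no smoothing across frequencies is applied.
    """
    x = np.asarray(x, dtype=float)
    T, d = x.shape
    taus_arr = np.asarray(taus, dtype=float)
    # Empirical quantiles of each component, shape (n_tau, d).
    q = np.quantile(x, taus_arr, axis=0)
    # Indicator series I{x_t <= q(tau)} centred by tau, shape (n_tau, T, d).
    ind = (x[None, :, :] <= q[:, None, :]).astype(float)
    ind -= taus_arr[:, None, None]
    # DFT along time; drop frequency zero, shape (n_tau, n_freq, d).
    dft = np.fft.rfft(ind, axis=1)[:, 1:, :]
    # Cross-periodogram for every pair of quantile levels and components,
    # shape (n_tau, n_tau, n_freq, d, d).
    per = np.einsum("afj,bfk->abfjk", dft, np.conj(dft)) / (2 * np.pi * T)
    # Stack real and imaginary parts into one real-valued vector.
    return np.concatenate([per.real.ravel(), per.imag.ravel()])


def dissimilarity_matrix(series_list, taus=(0.1, 0.5, 0.9)):
    """Pairwise Euclidean distances between the feature vectors."""
    feats = np.stack([qcd_features(s, taus) for s in series_list])
    diff = feats[:, None, :] - feats[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix.

    A simplified stand-in for PAM: assign points to the nearest medoid,
    then move each medoid to the member minimising within-cluster distance.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:  # keep the old medoid if a cluster is empty
                within = dist[np.ix_(members, members)].sum(axis=0)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return labels, medoids
```

Under these assumptions, a collection of equal-length series stored as `(T, d)` arrays would be clustered with `labels, medoids = k_medoids(dissimilarity_matrix(series_list), k)`. Working from a precomputed dissimilarity matrix keeps the clustering step independent of how the spectral features are estimated, so a smoothed estimator could replace `qcd_features` without touching the rest.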