(UM)
  • Discovering significant biclusters in gene expression data
    Rizman Žalik, Krista
    Clustering is the discovery of homogeneous groups of data according to a given similarity measure. It is an important problem that arises in diverse applications, including the analysis of ... gene expression and drug interaction data. Clustering suffers from the curse of dimensionality: it is not meaningful to look for clusters in high-dimensional spaces, because the average density of points anywhere in such a space is likely to be low. For high-dimensional data such as gene expression data obtained from microarray experiments, standard clustering methods therefore give limited results. This limitation is imposed by the existence of a number of experimental conditions or gene samples in which the expression levels of the same genes are uncorrelated. A similar limitation exists when condition clustering is performed. Recently, biclustering, an unsupervised discovery approach that clusters the row and column dimensions of the data matrix simultaneously, has been shown to be very effective. The goal of biclustering is to find submatrices of genes and conditions (or samples) in which the genes have nearly the same expression levels for nearly all conditions. Some clustering methods have been adopted or proposed. However, some concerns remain, such as the robustness of mining methods to noise and input parameters. In this paper we tackle the problem of effectively biclustering gene expression data by proposing an algorithm that uses a density-based approach to identify biclusters. Our experimental results show that the algorithm is effective.
    Source: WSEAS transactions on information science and applications. - ISSN 1790-0832 (Vol. 2, iss. 9, Sep. 2005, pp. 1454-1461)
    Type of material - article, component part ; adult, serious
    Publish date - 2006
    Language - English
    COBISS.SI-ID - 14906120