Speaker: Dr. Xiaowen Liu, University of Waterloo, Canada
Inviter: Dr. Dongbo Bu, Center for Advanced Computing Research, ICT
Time: 10:00 am-11:30 am, Feb 16, 2009 (Monday)
Place: 446, Institute of Computing Technology, Chinese Academy of Sciences
Abstract: Bi-clustering is an important approach in microarray data analysis. Its use in the analysis of gene expression data rests on two observations: (1) similar genes may exhibit similar behaviors only under a subset of conditions, not all conditions, and (2) genes may participate in more than one function, resulting in one regulation pattern in one context and a different pattern in another. Using bi-clustering algorithms, one can obtain sets of genes that are co-regulated under subsets of conditions. We consider two variations of the bi-clustering problem: the Consensus Submatrix Problem and the Bottleneck Submatrix Problem. We show that both problems are NP-hard and give randomized approximation algorithms for special cases of the two problems. We also develop a polynomial time algorithm to find an optimal bi-cluster with the maximum similarity score. To our knowledge, this is the first formulation of the bi-clustering problem that admits a polynomial time algorithm for optimal solutions. We then extend the algorithm to handle several other cases. Experiments on simulated data and real data show that the new algorithms outperform most of the existing methods in many cases. Our new algorithms have the following advantages: (1) they require no discretization procedure, (2) they perform well on overlapping bi-clusters, and (3) they work well on additive bi-clusters.