Tree-Cut Algorithms for Generating Cluster Ensembles

From: Uncovering Many Views of Biological Networks Using Ensembles of Near-Optimal Partitions

Geet Duggal*, Saket Navlakha*, Michelle Girvan, Carl Kingsford
University of Maryland College Park

* Equally contributed to this work.

Abstract

Densely interacting regions of biological networks often correspond to functional modules such as protein complexes. Most algorithms proposed to uncover modules, however, produce one clustering that only reveals a single view of how the cell is organized. We describe two new methods to find ensembles of provably near-optimal modularity partitions that lie within a heuristically constrained search space. We also show how to count the number of solutions in this space that lie within a bounded modularity range. We apply our algorithms to a protein interaction network for S.cerevisiae and show how fine-grained differences between near-optimal partitions can be used to define robust communities. We also propose a technique to find structurally diverse near-optimal solutions and show that these different partitions are enriched for different biological functions. Our results indicate that near-optimal solutions represent alternative and complementary views of the network's structure.


Overview of the Modu-Cut algorithm.

In Proceedings of the 1st MultiClust KDD Workshop on Discovering, Summarizing, and Using Multiple Clusterings, 2010.
Download the paper.

Code

Download zip file. See README for instructions. Email questions to: geet@umd.edu and saket@cs.umd.edu.


Last modified: May 27, 2010