Distributed graph clustering by load balancing

He Sun, Luca Zanetti

Research output: Chapter in Book/Report/Conference proceedingConference Contribution (Conference Proceeding)

4 Citations (Scopus)
226 Downloads (Pure)

Abstract

Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of algorithmic design methods for graph clustering. However, most of these methods are based on complicated spectral techniques or convex optimisation, and cannot be applied directly for clustering many networks that occur in practice, whose information is often collected on different sites. Designing a simple and distributed clustering algorithm is of great interest, and has wide applications for processing big datasets. In this paper we present a simple and distributed algorithm for graph clustering: for a wide class of graphs that are characterised by a strong cluster-structure, our algorithm finishes in a poly-logarithmic number of rounds, and recovers a partition of the graph close to an optimal partition. The main component of our algorithm is an application of the random matching model of load balancing, which is a fundamental protocol in distributed computing and has been extensively studied in the past 20 years. Hence, our result highlights an intrinsic and interesting connection between graph clustering and load balancing.

At a technical level, we present a purely algebraic result characterising the early behaviours of load balancing processes for graphs exhibiting a cluster-structure. We believe that this result can be further applied to analyse other gossip processes, such as rumour spreading and averaging processes.
Original languageEnglish
Title of host publicationSPAA '17 Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures
PublisherAssociation for Computing Machinery (ACM)
Pages163-171
Number of pages9
ISBN (Electronic)9781450345934
DOIs
Publication statusPublished - 26 Jul 2017
Event29th ACM Symposium on Parallelism in Algorithms and Architectures - Washington D.C., United States
Duration: 24 Jul 201726 Jul 2017

Conference

Conference29th ACM Symposium on Parallelism in Algorithms and Architectures
Abbreviated titleSPAA'17
CountryUnited States
CityWashington D.C.
Period24/07/1726/07/17

Fingerprint Dive into the research topics of 'Distributed graph clustering by load balancing'. Together they form a unique fingerprint.

Cite this