PAPER DIGEST
Most Influential WWW 2015 Paper · 2026-03 edition

The K-clique Densest Subgraph Problem

Charalampos Tsourakakis

Venue
ACM Web Conference (WWW) 2015
Recognition
Most Influential WWW 2015 Paper (Rank No. 5)
Edition
2026-03
Impact factor
5
Certificate ID
ac589934159181bf

Abstract

Numerous graph mining applications rely on detecting subgraphs which are large near-cliques. Since formulations that are geared towards finding large near-cliques are hard and frequently inapproximable due to connections with the Maximum Clique problem, the poly-time solvable densest subgraph problem which maximizes the average degree over all possible subgraphs "lies at the core of large scale data mining" [10]. However, frequently the densest subgraph problem fails in detecting large near-cliques in networks. In this work, we introduce the k-clique densest subgraph problem, k ≥ 2. This generalizes the well studied densest subgraph problem which is obtained as a special case for k=2. For k=3 we obtain a novel formulation which we refer to as the triangle densest subgraph problem: given a graph G(V,E), find a subset of vertices S* such that τ(S*)=max limits<sub>S ⊆ V</sub> t(S)/|S|, where t(S) is the number of triangles induced by the set S. On the theory side, we prove that for any k constant, there exist an exact polynomial time algorithm for the k-clique densest subgraph problem}. Furthermore, we propose an efficient 1/k-approximation algorithm which generalizes the greedy peeling algorithm of Asahiro and Charikar [8,18] for k=2. Finally, we show how to implement efficiently this peeling framework on MapReduce for any k ≥ 3, generalizing the work of Bahmani, Kumar and Vassilvitskii for the case k=2 [10]. On the empirical side, our two main findings are that (i) the triangle densest subgraph is consistently closer to being a large near-clique compared to the densest subgraph and (ii) the peeling approximation algorithms for both k=2 and k=3 achieve on real-world networks approximation ratios closer to 1 rather than the pessimistic 1/k guarantee. An interesting consequence of our work is that triangle counting, a well-studied computational problem in the context of social network analysis can be used to detect large near-cliques. Finally, we evaluate our proposed method on a popular graph mining application.

Download PDF certificate