Community Detection and Mining in Social Media (Class) - SergiuTripon/msc-thesis-na-epsrc GitHub Wiki


Year: 2010
Authors: Lei Tang, Huan Liu
Files: literature/community-detection-class/


1. Community Structure

A community (also called a group, cluster, cohesive subgroup, or module) is a set of nodes among which interactions are relatively frequent.

  • Community: people in a group interact with each other more frequently than with those outside the group;
  • friends of a friend are likely to be friends as well (transitivity);
  • this tendency is measured by the clustering coefficient, i.e. the density of connections among one's friends;
  • Applications: recommendation-based networks, network compression, visualization of a huge network;
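
The clustering coefficient mentioned above can be illustrated with a minimal sketch. It assumes an undirected network stored as a dictionary of adjacency sets; the function name `local_clustering`, the toy friendship data, and the convention of returning 0 for nodes with fewer than two friends are illustrative assumptions, not part of the original notes.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Density of connections among the neighbours (friends) of `node`."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        # Assumed convention: with fewer than two friends there are no
        # pairs to check, so the coefficient is reported as 0.
        return 0.0
    # Count the pairs of friends that are themselves connected.
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return links / (k * (k - 1) / 2)


# Hypothetical toy friendship network (undirected adjacency sets).
adj = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann"},
}

for person in sorted(adj):
    print(person, round(local_clustering(adj, person), 3))
```

In this toy network only one of the three pairs of ann's friends is connected, so her coefficient is 1/3, while both of bob's friends know each other, giving 1.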

2. Community Detection

Community detection involves discovering groups in a network where individuals' group memberships are not explicitly given.

Roughly, community detection methods can be divided into four categories (not mutually exclusive):

  • Node-Centric Community - each node in a group satisfies certain properties;
    • cliques, k-cliques, k-clubs;
  • Group-Centric Community - consider the connections within a group as a whole; the group has to satisfy certain properties without zooming into the node level;
    • quasi-cliques;
  • Network-Centric Community - partition the whole network into several disjoint sets;
    • clustering based on vertex similarity;
    • latent space models, block models, spectral clustering, modularity maximization;
  • Hierarchy-Centric Community - construct a hierarchical structure of communities;
    • divisive clustering;
    • agglomerative clustering;
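
As a rough illustration of the definition-based categories above, the sketch below checks whether a candidate group is a clique (every pair of members is connected) or a quasi-clique. It assumes one common definition of a quasi-clique, namely a group whose internal edge density is at least a threshold `gamma`; the helper names, the threshold value, and the toy edge list are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of adjacency sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_density(adj, group):
    """Edges inside `group` divided by the maximum possible number of edges."""
    members = set(group)
    n = len(members)
    if n < 2:
        return 1.0  # assumed convention for trivial groups
    inside = sum(1 for u, v in combinations(members, 2) if v in adj[u])
    return inside / (n * (n - 1) / 2)


def is_clique(adj, group):
    """Node-centric view: every member is adjacent to every other member."""
    return edge_density(adj, group) == 1.0


def is_quasi_clique(adj, group, gamma):
    """Group-centric view: the group as a whole is dense enough."""
    return edge_density(adj, group) >= gamma


# Hypothetical toy network.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
adj = build_adjacency(edges)

print(is_clique(adj, [1, 2, 3]))                # all three pairs connected
print(is_clique(adj, [1, 2, 3, 4]))             # edge 1-4 is missing
print(is_quasi_clique(adj, [1, 2, 3, 4], 0.8))  # 5 of 6 possible edges
```

The group-centric check never inspects individual members, only the aggregate density, which is the distinction the notes draw between the two categories.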

3. Community Detection Evaluation

  • For groups with clear definitions
    • e.g. cliques, k-cliques, k-clubs, quasi-cliques;
    • verify whether the extracted communities satisfy the definition;
  • For networks with ground truth information
    • normalized mutual information;
    • accuracy of pairwise community memberships;
  • For networks with semantics
    • networks come with semantic or attribute information of nodes or connections;
    • human subjects can verify whether the extracted communities are coherent;
    • evaluation is qualitative;
    • it is also intuitive and helps understand a community;
  • For networks without ground truth or semantic information
    • this is the most common situation;
    • an option is to resort to cross-validation;
      • extract communities from a (training) network;
      • evaluate the quality of the community structure on a network constructed from a different date or based on a related type of interaction;
    • quantitative evaluation functions;
      • modularity;
      • block model approximation error;
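
Two of the measures listed above can be sketched in a few lines. The first function computes modularity for an undirected, unweighted network as the sum over communities of (fraction of edges inside the community) minus (fraction of edge endpoints in the community) squared. The second computes normalized mutual information between two labelings, normalized here by the geometric mean of the two entropies, which is one of several normalizations in use and is an assumption of this sketch. The function names and the toy network are illustrative.

```python
import math
from collections import Counter


def modularity(edges, membership):
    """Modularity of a partition of an undirected, unweighted network."""
    m = len(edges)
    intra = Counter()   # edges with both endpoints in the same community
    degree = Counter()  # total degree of each community
    for u, v in edges:
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def entropy(counts, n):
    """Shannon entropy (natural log) of a labeling given its label counts."""
    return -sum(c / n * math.log(c / n) for c in counts.values())


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        c / n * math.log(n * c / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a, h_b = entropy(count_a, n), entropy(count_b, n)
    if h_a == 0 or h_b == 0:
        # Assumed convention when a labeling has a single community.
        return 1.0 if h_a == h_b else 0.0
    return mutual / math.sqrt(h_a * h_b)


# Hypothetical toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
found = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
truth = [0, 0, 0, 1, 1, 1]

print(round(modularity(edges, found), 4))  # 5/14, about 0.3571
print(round(nmi(truth, [found[i] for i in range(6)]), 4))
```

Modularity needs only the network and the extracted partition, so it applies when no ground truth exists; NMI requires a reference labeling and ignores how the community labels are named.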
