Community Detection and Mining in Social Media (Class)
Year: 2010
Authors: Lei Tang, Huan Liu
Files: literature/community-detection-class/
1. Community Structure
A community (also called a group, cluster, cohesive subgroup, or module) is a set of nodes among which interactions are (relatively) frequent;
- Community: people in a group interact with each other more frequently than with those outside the group;
- friends of a friend are likely to be friends as well;
- measured by the clustering coefficient, i.e. the density of connections among one's friends;
- Applications: network-based recommendation, network compression, visualization of a huge network;
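The two measures mentioned above can be sketched in a few lines. This is a minimal illustration, assuming an undirected, unweighted graph stored as a dictionary of neighbour sets; the function names and the toy graph are hypothetical and not taken from the class material.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbours that are themselves connected."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two friends: no pair to check
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def density(adj):
    """Number of edges divided by the number of possible edges."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nb) for nb in adj.values()) / 2  # each edge is stored twice
    return 2.0 * m / (n * (n - 1))


# Hypothetical toy graph: a triangle a-b-c plus a node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(local_clustering(adj, "a"))  # both friends of a know each other
print(local_clustering(adj, "c"))  # only one of three friend pairs is linked
print(density(adj))
```

Node a sits inside the triangle, so all of its friends are friends with each other, while c bridges to d and therefore has a lower coefficient.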
2. Community Detection
Community detection involves discovering groups in a network where individuals' group memberships are not explicitly given.
Roughly, community detection methods can be divided into four categories (not mutually exclusive):
- Node-Centric Community - each node in a group satisfies certain properties;
- cliques, k-cliques, k-clubs;
- Group-Centric Community - considers the connections within a group as a whole; the group has to satisfy certain properties, without zooming into the node level;
- quasi-cliques;
- Network-Centric Community - partition the whole network into several disjoint sets;
- clustering based on vertex similarity;
- latent space models, block models, spectral clustering, modularity maximization;
- Hierarchy-Centric Community - construct a hierarchical structure of communities;
- divisive clustering;
- agglomerative clustering;
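To make the node-centric and group-centric criteria concrete, the sketch below checks whether a given node set is a clique, a k-club, or a quasi-clique. It assumes the usual definitions: a clique has every pair of members connected, a k-club is a node set whose induced subgraph has diameter at most k, and a quasi-clique has at least a fraction gamma of all possible internal edges. The function names, the `gamma` parameter, and the toy graph are hypothetical, and maximality of the group is not checked.

```python
from collections import deque
from itertools import combinations


def is_clique(adj, group):
    """Node-centric: every pair of members is directly connected."""
    return all(v in adj[u] for u, v in combinations(group, 2))


def is_k_club(adj, group, k):
    """Node-centric: the subgraph induced by `group` has diameter <= k."""
    members = set(group)
    for source in members:
        # Breadth-first search restricted to the members of the group.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in members and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(members) or max(dist.values()) > k:
            return False  # some member is unreachable or too far away
    return True


def is_quasi_clique(adj, group, gamma):
    """Group-centric: internal edge density is at least `gamma`."""
    n = len(group)
    if n < 2:
        return True
    edges = sum(1 for u, v in combinations(group, 2) if v in adj[u])
    return edges >= gamma * n * (n - 1) / 2


# Hypothetical toy graph: a triangle a-b-c plus a node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
everyone = ["a", "b", "c", "d"]

print(is_clique(adj, ["a", "b", "c"]), is_clique(adj, everyone))
print(is_k_club(adj, everyone, 2), is_k_club(adj, everyone, 1))
print(is_quasi_clique(adj, everyone, 0.6), is_quasi_clique(adj, everyone, 0.7))
```

The clique and k-club tests impose a requirement on the members pairwise, whereas the quasi-clique test looks only at a single aggregate number for the whole group, which is the distinction the first two categories draw.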
3. Community Detection Evaluation
- For groups with clear definitions
- e.g. cliques, k-cliques, k-clubs, quasi-cliques;
- verify whether the extracted communities satisfy the definition;
- For networks with ground truth information
- normalized mutual information;
- accuracy of pairwise community memberships;
- For networks with semantics
- networks come with semantic or attribute information of nodes or connections;
- human subjects can verify whether the extracted communities are coherent;
- evaluation is qualitative;
- it is also intuitive and helps understand a community;
- For networks without ground truth or semantic information
- this is the most common situation;
- an option is to resort to cross-validation;
- extract communities from a (training) network;
- evaluate the quality of the community structure on a network constructed from a different date or based on a related type of interaction;
- quantitative evaluation functions;
- modularity;
- block model approximation error;
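Two of the quantitative measures named above, normalized mutual information (for networks with ground truth) and modularity (for networks without it), can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: NMI is normalized here by the geometric mean of the two entropies (other normalizations exist), modularity is computed for an undirected, unweighted graph stored as a dictionary of neighbour sets, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(labels_a, labels_b):
    """Mutual information of two labelings divided by sqrt(H(A) * H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling has a single community.
        return 1.0 if h_a == h_b else 0.0
    return mutual / sqrt(h_a * h_b)


def modularity(adj, community):
    """Sum over communities of (internal edge share - expected share)."""
    two_m = sum(len(nb) for nb in adj.values())  # twice the edge count
    if two_m == 0:
        return 0.0
    internal = Counter()  # internal edge endpoints, i.e. each edge twice
    degree = Counter()    # total degree per community
    for u, neighbours in adj.items():
        degree[community[u]] += len(neighbours)
        for v in neighbours:
            if community[v] == community[u]:
                internal[community[u]] += 1
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


# Hypothetical toy graph: two triangles (1,2,3) and (4,5,6) joined by edge 3-4.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
two_groups = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
one_group = {node: "x" for node in adj}

print(modularity(adj, two_groups))  # positive: denser inside than expected
print(modularity(adj, one_group))   # a single community scores zero

truth = [0, 0, 0, 1, 1, 1]
found = ["x", "x", "x", "y", "y", "y"]
print(nmi(truth, found))            # same partition under different labels
```

NMI depends only on how nodes are grouped, not on the label names, so a detected partition can be compared with ground truth without first matching community identifiers.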