On Accuracy of Community Structure Discovery Algorithms - SergiuTripon/msc-thesis-na-epsrc GitHub Wiki
Home ▸ Literature Survey ▸ On Accuracy of Community Structure Discovery Algorithms
Year: 2011
Authors: Günce Keziban Orman, Vincent Labatut, Hocine Cherifi
File: orman_2011.pdf
Contents
- 1. Community Detection Algorithms
- 2. Method
- 3. Results
Community Detection Algorithms
Link-Centrality-Based Algorithms
- Radetal (Radicchi et al.)
Modularity Optimization Algorithms
- FastGreedy (FG) (Newman et al.)
- Louvain (LV) (Blondel et al.)
- Spinglass (SG) (Reichardt and Bornholdt)
Spectral Algorithms
- Leading Eigenvector (LEV) (Newman)
- Commfind (CF) (Donetti and Muñoz)
Random-Walk-Based Algorithms
- Walktrap (WT) (Pons and Latapy)
- MarkovCluster (MCL)
Information-Based Algorithms
- Infomod (IND) (Rosvall and Bergstorm)
- Infomap (INP) (Rosvall and Bergstorm)
Other Algorithms
- Label Propagation (LP) (Raghavan et al.)
Method
Artificial Network Generation
- generated benchmark graph using the LFR benchmark;
- nodes degrees and community sizes are both power-law distributed;
Performance Assessment
- as artificial networks are used, their communities are known a priori;
- uses NMI (Normalized mutual information) to compare algorithms;
- NMI ranges from 0 to 1;
- If the estimated communities correspond perfectly to the actual ones, the measure takes the value 1;
- It is zero when the estimated communities are independent from the actual ones;
- for each combination of parameters, a sample containing 25 networks was produced;
- each of the eleven community detection algorithms presented are tested on all generated samples;
- their partitioning performance is assessed using NMI;
- to compare the algorithms, we average the measured NMI values over the 25 networks of each sample;
- to assess their influence, we calculate Pearson’s correlation between each parameter and the NMI values;
Results
- In all cases Infomap outperforms all the other algorithms under investigation. It succeeds in identifying the communities even for high mixing coefficient values.
- Walktrap, MarkovCluster, SpinGlass and Louvain also have an excellent performance level, although not as good.
- Leading Eigenvector and Commfind are clearly outclassed.
- Label Propagation exhibits very different results for sample networks generated with the same parameters values.
- The network size and the average degree are two other parameters which influence algorithms performance.
- When the network size increases, some algorithms (Infomap, Infomod, Louvain) perform better, some others perform worse (Commfind, SpinGlass, LeadingEigenvector, Louvain, Radetal) and for the rest of the algorithms (Walktrap, FastGreedy, MarkovCluster), the performances does not change.
- For all algorithms, the higher the degree, the better the performance.