Topic oriented community detection of rating based social networks


Year: 2015
Authors: Ali Reihanian, Behrouz Minaei-Bidgoli, Hosein Alizadeh
File: reihanian_2015.pdf


The paper aims to show that taking topics into account yields more meaningful communities in social networking sites where users express their opinions of objects (such as movies) through ratings.

1. Topic-oriented community detection in a social network

1.1 Preprocessing and annotating topic labels

  • people communicate with each other through social objects;
  • these objects often imply the topics which people are interested in;
  • social objects fall into two situations:
    • social objects that are attached to multiple members
    • social objects that are attached to a single member

In the first situation, a social object creates the edges between members. A movie rating network is an example: an edge is built between two members when they rate the same movie, so each movie (social object) is attached to multiple members.
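The movie rating example can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function name `co_rating_edges`, the `(user, movie)` input format and the sample data are hypothetical. Counting shared movies as an edge weight is an assumption borrowed from the later description of connection strength.

```python
from collections import Counter
from itertools import combinations

def co_rating_edges(ratings):
    """Return {(user_a, user_b): number of movies both users rated}."""
    # Group users by the movie (social object) they rated.
    raters = {}
    for user, movie in ratings:
        raters.setdefault(movie, set()).add(user)
    # Every pair of users who rated the same movie gets an edge.
    edges = Counter()
    for users in raters.values():
        for a, b in combinations(sorted(users), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical ratings: (user, movie); the rating value itself is not needed here.
ratings = [
    ("ann", "Alien"), ("bob", "Alien"),
    ("ann", "Heat"), ("bob", "Heat"), ("cat", "Heat"),
]
edges = co_rating_edges(ratings)
```

Here `ann` and `bob` share two movies, so their edge is stronger than the edges that involve `cat`.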

In the second situation, each social object is attached to only one member, so social objects are treated as attributes of the members. A paper citation network is an example: papers (members) cite each other, and each paper carries text content (its title), which is a social object and can be treated as an attribute of that paper.

  • topics of each social object in a data set are retrieved;
  • each social object is labelled by its corresponding topic;

1.2 Clustering social objects

  • social objects in a network are partitioned into different clusters;
  • each cluster represents a unique topic which is shared by its members;
  • since the data sets used in the paper contain social objects with labelled topics, the authors partition these social objects into clusters manually;

1.3 Creating topical clusters

  • partition the members of the network into different topical clusters;
  • in the first step, each social object has been annotated with a topic label;
  • in this step, members are partitioned into different topical clusters considering the topic labels of the social objects they are involved in;
  • this step therefore yields clusters in which every member shares the same topic of interest;
  • therefore the total number of topical clusters is equal to the number of topics of interest in the network;
  • a user can be a member of several topical clusters, since it is common for a user to be interested in several topics;
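The partitioning described in the list above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the name `topical_clusters`, the `topic_of` mapping and the sample data are hypothetical, and each social object is assumed to carry exactly one topic label.

```python
def topical_clusters(ratings, topic_of):
    """Map each topic to the set of members who rated an object with that topic."""
    clusters = {}
    for user, obj in ratings:
        # A member joins the cluster of every topic they have rated,
        # so one member can appear in several topical clusters.
        clusters.setdefault(topic_of[obj], set()).add(user)
    return clusters

# Hypothetical topic labels from the annotation step (one label per object).
topic_of = {"Alien": "sci-fi", "Heat": "crime", "Solaris": "sci-fi"}
ratings = [("ann", "Alien"), ("bob", "Heat"), ("ann", "Heat"), ("cat", "Solaris")]

clusters = topical_clusters(ratings, topic_of)
```

In this sample `ann` belongs to both the `sci-fi` and the `crime` cluster, and the number of clusters equals the number of distinct topics.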

1.4 Applying a community detection algorithm to the topical clusters

  • this step aims to find communities in each of the topical clusters which were created in the previous step;
  • members in each topical cluster are connected to each other with different strengths;
  • based on the number of ratings on the same social objects, some members may have stronger connections, while some others may have weak or no connections;

Many community detection algorithms can be used for this step, such as the Girvan-Newman (GN) algorithm. Newman proposes an important algorithm that partitions a network graph of nodes and links into subgraphs, and introduces the concept of modularity.

Because Newman's algorithm is very time-consuming, Blondel et al. (2008) propose a faster modified version, known as the "Louvain method". It is a modularity maximization algorithm that iteratively optimizes modularity locally and then aggregates the nodes of each community (Wang et al., 2014). The paper applies the Louvain method to find topical communities.
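To make modularity and local optimization concrete, the sketch below computes the standard weighted modularity Q = Σ_c [ in_c / m − (tot_c / 2m)² ] and runs a simplified local-moving pass. It is not the full Louvain method: the aggregation phase is omitted, and modularity is recomputed from scratch for every trial move instead of using the incremental gain formula. The function names and the sample graph are hypothetical.

```python
def modularity(edges, community):
    """edges: {(u, v): weight}, undirected, no self-loops; community: {node: label}."""
    m = sum(edges.values())  # total edge weight
    inside = {}   # weight of edges with both ends in the community
    degree = {}   # summed weighted degree of the community's nodes
    for (u, v), w in edges.items():
        degree[community[u]] = degree.get(community[u], 0) + w
        degree[community[v]] = degree.get(community[v], 0) + w
        if community[u] == community[v]:
            inside[community[u]] = inside.get(community[u], 0) + w
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())

def local_moving(edges):
    """Greedily move nodes to a neighbour's community while modularity improves."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {n: n for n in nodes}  # start with one community per node
    improved = True
    while improved:
        improved = False
        for n in nodes:
            best_label, best_q = community[n], modularity(edges, community)
            for label in sorted({community[x] for x in neighbours[n]}):
                trial = dict(community)
                trial[n] = label
                q = modularity(edges, trial)
                if q > best_q + 1e-12:  # accept strict improvements only
                    best_label, best_q = label, q
            if best_label != community[n]:
                community[n] = best_label
                improved = True
    return community

# Hypothetical graph: two triangles joined by the single edge c-d.
edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1,
         ("d", "e"): 1, ("d", "f"): 1, ("e", "f"): 1, ("c", "d"): 1}
community = local_moving(edges)
q = modularity(edges, community)
```

On this small graph the pass groups each triangle into its own community. In practice a tested library implementation of the Louvain method would be preferable to this sketch.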

2. Experiment and analysis

2.1 Real life data sets and performance metric

  • the experiments use five publicly available data sets: Movielens 100k, Book-Crossing, CIAO, MovieTweetings and Movielens Latest;
  • the performance metric was introduced by Zhao et al. (2012) and considers both topic and linkage structure;

2.2 Experiments

  • purity reaches its maximum value in each of the five data sets;
  • the reason is that the topical clusters created in each data set contain only members who are interested in the same topic;
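The page does not reproduce the formula from Zhao et al. (2012), so the sketch below assumes the common clustering definition of purity: the fraction of members whose topic matches the dominant topic of their community. The function name, the single topic label per member and the sample data are hypothetical.

```python
from collections import Counter

def purity(communities, topic_of_member):
    """Fraction of members that match their community's dominant topic.

    Assumes every community is non-empty and every member has one topic label.
    """
    total = sum(len(c) for c in communities)
    dominant = sum(
        Counter(topic_of_member[m] for m in c).most_common(1)[0][1]
        for c in communities
    )
    return dominant / total

# Hypothetical topic of interest per member.
topic_of_member = {"ann": "sci-fi", "bob": "sci-fi", "cat": "crime", "dan": "crime"}

# Communities found inside topical clusters share one topic each.
topical = [["ann", "bob"], ["cat", "dan"]]
# Communities that mix topics, as can happen without content analysis.
mixed = [["ann", "cat"], ["bob", "dan"]]

topical_purity = purity(topical, topic_of_member)
mixed_purity = purity(mixed, topic_of_member)
```

Under this assumed definition, communities detected inside a single topical cluster always score 1.0, which matches the explanation given in the list above.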

2.3 Comparison

To show the advantage of detecting communities with topic consideration, the paper compares the results of topic-oriented community detection with those of classical community detection, in which no content analysis is performed.

In the classical approach, a community detection algorithm is applied to a network in which each edge weight represents the number of communications between the two nodes it connects. No content analysis is done.

Modularity and purity have higher values in the topic-oriented framework, since the basic network is partitioned into topical clusters and each identified community includes members who share the same topic of interest.
