Influence maximization in temporal social networks (IMT) aims to find the most influential set of nodes in a temporal network so that their information spreads as widely as possible. To solve the IMT problem, we propose an influence maximization algorithm based on an improved K-shell method, namely improved K-shell in temporal social networks (KT). The algorithm takes into account the global and local structures of temporal social networks. First, to obtain the kernel value

In the era of mobile Internet, online social networks have become an important channel for information dissemination and have greatly impacted people's lives. Among the many studies on social networks, influence maximization (IM) is one of the most popular research directions; it was first proposed by Domingos et al. [

To solve the IM problem, most researchers focus on the topology of traditional social networks and abstract the social network into a static graph, ignoring the temporal nature of the network. However, in real social networks, such as telephone and mail transmission networks, nodes are not always connected. Instead, they are connected at specific times, and the connections between nodes form a time series, which means that the networks are temporal. Therefore, in these temporal social networks, it is not enough to determine the closeness of users by their connections with each other. It is also necessary to record the specific interaction times between users to accurately portray their relationships. In this way, we can identify nodes that play a critical role in influence propagation and effectively address the IM problem.

As shown in _{T} with the same probability of influence propagation between nodes as in _{T}, assuming that node 1 also activates node 2, and the weights on the edge between nodes 1 and 2 indicate that these two nodes are only connected at times 10 and 11, which means that node 2 is activated at time 10 at the earliest, i.e., node 2 is inactive before time 10. However, node 2 and its neighbor nodes 4 and 5 are connected only before time 9, so node 2 cannot activate its neighbors. Thus, the influence of node 1 is 1 in _{T}.

As mentioned above, due to the temporal nature of social networks, the topology between nodes changes dynamically over time, which makes traditional influence propagation models and influence maximization algorithms unsuitable for temporal social networks. In the past few years, some studies [

Therefore, this paper uses the temporal graph to model the temporal social network from a global perspective and proposes an improved K-shell method for effectively solving the IMT problem, namely the KT algorithm. Using the static and dynamic attributes of nodes, the algorithm distinguishes the influence of nodes from the global and local structure of the temporal social network. First, in the global scope of the network, an improved K-shell method is proposed, which stratifies the network based on the temporal characteristic of nodes and obtains the kernel value

We define the problem of IMT and provide an idea to solve it by considering the global and local structure of the network.

Based on the above idea, we propose the KT algorithm to solve the IMT problem. The algorithm stratifies the network based on an improved K-shell method and then selects the optimal seed according to the comprehensive degree of nodes.

We further propose a more effective KTIM algorithm by optimizing the seed selection strategy to solve the IMT problem.

Through experiments on four real-world temporal datasets, we demonstrate the efficiency and effectiveness of our proposed algorithms.

In the rest of this paper, Section 2 introduces related work. Section 3 formulates the IMT problem and proposes a greedy-based algorithm to solve it. Section 4 describes the KT and KTIM algorithms. Section 5 provides experimental results on real-world datasets, and Section 6 concludes the paper.

In this section, we briefly review the related work on solutions to the IM problem, including influence maximization algorithms in traditional and temporal social networks.

In traditional social networks, influence maximization algorithms mainly include greedy and heuristic algorithms. Kempe et al. [

To overcome the inefficiency of the greedy algorithm, researchers have proposed a large number of heuristic algorithms. The issues [

In recent years, many researchers have extended their research on the IM problem of social networks. Wang et al. [

In temporal social networks, the aim of influence maximization is to find the most influential seed set under the temporal relationship so that information from this set spreads as widely as possible. Therefore, it is important to focus not only on topological features but also on temporal features [

Recently, Wu et al. [

In addition, Zhang et al. [

In conclusion, although the influence spread achieved by heuristic algorithms is unstable across different social networks and propagation models, their running time can be several orders of magnitude lower than that of the greedy algorithm. This is because the greedy algorithm finds the most influential nodes using time-consuming Monte Carlo simulations under a particular influence propagation model. In contrast, heuristic algorithms are not constrained by the influence propagation model and select the most influential nodes based on the characteristics of nodes and edges. Although the influence spread of the greedy algorithm is excellent and stable, its time complexity is high, which makes it unsuitable for temporal social networks.

In a temporal social network, nodes are connected only at a specific time. As shown in

In traditional social networks, degree estimation is usually used to calculate the influence propagation probability between nodes. However, it is not appropriate in temporal social networks, as mentioned in

In traditional social networks, it is not necessary to consider the start time of a node being activated for the IM problem. In contrast, the start time of the node being successfully activated in the temporal social network needs to be considered. Therefore, in this paper, we adopt the ICT model [

Let the initial active start time of the seed node be

When node

Regardless of whether node

If node

The influence diffusion process proceeds from newly activated nodes to their inactive neighbors and continues until no new nodes are activated in the network.
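The diffusion rule can be illustrated with a minimal sketch. This is a simplified temporal independent cascade, not the exact ICT model adopted in this paper. It assumes that contacts are directed triples `(u, v, t)`, that every attempt succeeds with a uniform probability `p`, that seeds become active at a hypothetical `start_time` of 0, and that a node adopts the time of the earliest usable contact from whichever neighbor activates it first. The function name `simulate_temporal_ic` and all parameter names are illustrative.

```python
import random
from collections import defaultdict


def simulate_temporal_ic(contacts, seeds, p=0.1, start_time=0, rng=None):
    """Run one simplified temporal independent cascade.

    contacts: iterable of directed contacts (u, v, t); node labels must be sortable.
    seeds: initially active nodes, all activated at start_time (an assumption).
    Returns a dict mapping each active node to its activation time.
    """
    rng = rng or random.Random()
    times = defaultdict(list)   # (u, v) -> contact times
    out = defaultdict(set)      # u -> out-neighbors
    for u, v, t in contacts:
        times[(u, v)].append(t)
        out[u].add(v)
    for ts in times.values():
        ts.sort()

    active = {s: start_time for s in seeds}
    frontier = list(seeds)
    while frontier:
        new_frontier = []
        for u in frontier:
            for v in sorted(out[u]):
                if v in active:
                    continue
                # Only contacts at or after u's own activation can carry influence.
                usable = [t for t in times[(u, v)] if t >= active[u]]
                if usable and rng.random() < p:
                    active[v] = usable[0]   # earliest usable contact time
                    new_frontier.append(v)
        frontier = new_frontier
    return active
```

With the contacts `[(1, 2, 10), (1, 2, 11), (2, 4, 3), (2, 5, 5)]`, seed `[1]`, and `p=1.0`, the sketch activates only node 2 (at time 10), because node 2 has no contact with nodes 4 and 5 after it becomes active. This matches the introductory example in which the influence of node 1 is 1.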

According to the above description, this section introduces the concepts of node influence and marginal gain, and then defines the IMT problem.

^{*}

For the IMT problem, the traditional solution follows the idea of the greedy-based algorithm, which divides the problem into two sub-processes. One calculates the marginal gain of a single node joining the seed set, approximated by Monte Carlo simulation. The other selects the

Let

Let

Although the Greedy algorithm can solve the IMT problem, it is time-consuming. In this section, we propose KT and KTIM algorithms to solve it.
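For reference, the greedy baseline that motivates this section can be sketched as plain hill climbing on marginal gain. This is a generic illustration, not Algorithm 1 itself and without the CELF optimization. The callback `estimate_spread` is a hypothetical stand-in for a Monte Carlo estimate of the influence of a seed set; here it is replaced by a deterministic toy coverage function so the example is reproducible.

```python
def greedy_seeds(nodes, k, estimate_spread):
    """Pick k seeds, each time adding the node with the largest marginal gain.

    estimate_spread(seed_list) -> float is assumed to estimate the influence
    of a seed set (in practice, an average over Monte Carlo simulations).
    """
    seeds = []
    current = 0.0
    for _ in range(k):
        best_node, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = estimate_spread(seeds + [v]) - current  # marginal gain of v
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:   # fewer than k candidate nodes
            break
        seeds.append(best_node)
        current += best_gain
    return seeds


# Toy stand-in for the spread estimate: number of distinct nodes covered.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5, 6}}


def coverage(seed_list):
    covered = set()
    for s in seed_list:
        covered |= reach[s]
    return float(len(covered))
```

Each round evaluates `estimate_spread` once per remaining candidate, so selecting k seeds from n nodes costs on the order of k·n spread estimates. When every estimate is itself an average over many simulations, this is the source of the long running time noted above.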

As shown in _{T}, we first calculate the temporal characteristic of nodes, denoted as

The traditional K-shell algorithm is used to stratify the network based on the degree of nodes in the global scope. Firstly, it continuously removes nodes of degree 1 and their connected edges from the network

It should be noted that the layer of nodes with a larger

As shown in _{T}. In _{T}, the out-degree of node

As mentioned above, to make the K-shell method applicable to temporal social networks, it is necessary to improve the K-shell method to consider the temporal characteristic of nodes to stratify the network. The temporal characteristic of a node is defined as follows:

Furthermore, we propose an improved K-shell method that exploits the temporal characteristic of nodes in temporal social networks. The main idea is as follows. First, it repeatedly removes nodes whose temporal characteristic value is 1, together with their edges, until no such nodes remain. All nodes deleted at this stage are classified as the 1-shell layer and assigned the value 1. Then, it repeatedly deletes nodes with a temporal characteristic value of 2, together with their edges, classifies the deleted nodes as the 2-shell layer, and assigns them the value 2. The above process is repeated until all nodes in the network are classified, and
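The peeling procedure can be sketched as follows. The sketch is only an illustration under stated assumptions: the per-node score is supplied through a hypothetical `score` callback, and when none is given it falls back to the current degree, which reduces the procedure to the traditional K-shell method. The temporal characteristic defined in this paper is not reproduced here. Nodes whose score is at most the current layer index are peeled, following the usual K-shell convention.

```python
def shell_decomposition(adj, score=None):
    """Assign a kernel value to every node by iterative peeling.

    adj: dict mapping node -> set of neighbors (undirected).
    score(node, remaining_adj) -> int is a hypothetical hook for the
    temporal characteristic; the default (current degree) is a stand-in.
    """
    if score is None:
        score = lambda v, g: len(g[v])
    g = {v: set(nbrs) for v, nbrs in adj.items()}   # working copy
    shell = {}
    k = 1
    while g:
        while True:
            peel = [v for v in g if score(v, g) <= k]
            if not peel:
                break
            for v in peel:
                for w in g.pop(v):          # remove v and its edges
                    if w in g:
                        g[w].discard(v)
                shell[v] = k                # v belongs to the k-shell layer
        k += 1
    return shell
```

For a triangle `a`, `b`, `c` with a pendant node `d` attached to `a`, the default score places `d` in the 1-shell and the triangle in the 2-shell.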

As shown in

In a temporal social network, after layering the network through the improved K-shell method in Section 4.1, each node is assigned a

Therefore, to address the above issues, we further consider other characteristics of nodes within the local scope of each kernel layer. We propose the comprehensive degree to represent the dynamic properties of nodes and to weigh their direct and indirect effects in the propagation process. The comprehensive degree rests on the intuition that if a node has more neighbors, and its neighbors also have more neighbors, then that node may have more influence. The comprehensive degree of a node is defined as follows:

As shown in

Based on the above analysis, the KT algorithm is proposed in this paper. The basic idea is that in a temporal social network, the network is first layered according to the temporal characteristic of nodes. Then starting from the core layer of the network, each layer selects the node with the largest comprehensive degree to join the seed set. Let

The algorithm first initializes the candidate seed set

Let the number of nodes be

The KT algorithm distinguishes the importance of nodes with the same

In a temporal social network, the KT algorithm selects the node with the highest comprehensive degree as the seed node for each network layer. However, each kernel layer typically contains multiple nodes. When the KT algorithm selects only one seed node per kernel layer, other potentially more influential nodes in that layer are missed, which may reduce the influence range of the seed set.
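The per-layer selection rule and its limitation can be made concrete with a small sketch. It assumes that the kernel value and the comprehensive degree of every node have already been computed and are passed in as dictionaries. The names `kt_select`, `shell`, and `comp_degree` are illustrative, and cycling back to the core layer when more seeds are needed than there are layers is an assumption made here, not a detail taken from the algorithm.

```python
def kt_select(shell, comp_degree, k):
    """Pick k seeds, one per layer, starting from the core layer.

    shell: dict node -> kernel value (assumed precomputed).
    comp_degree: dict node -> comprehensive degree (assumed precomputed).
    """
    layers = {}
    for v, ks in shell.items():
        layers.setdefault(ks, []).append(v)
    for members in layers.values():
        # Best candidate of each layer first.
        members.sort(key=lambda v: comp_degree[v], reverse=True)
    order = sorted(layers, reverse=True)    # core layer first
    seeds = []
    k = min(k, len(shell))
    while len(seeds) < k:
        # Assumption: cycle through the layers again if seeds are still needed.
        for ks in order:
            if layers[ks] and len(seeds) < k:
                seeds.append(layers[ks].pop(0))
    return seeds
```

With `shell = {"a": 2, "b": 2, "c": 2, "d": 1}` and `comp_degree = {"a": 5, "b": 3, "c": 4, "d": 1}`, requesting two seeds returns `a` and `d`. Node `c` is skipped even though its comprehensive degree is far higher than that of `d`, which is exactly the loss described above.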

As shown in

Therefore, to address the shortcomings of the seed selection strategy of the KT algorithm, we further optimize the seed selection strategy by combining the static attribute

1) For the layered network

2) The top

Let

In Algorithm 3, line 1 initializes the candidate seed set

Let the number of nodes in the network

We conduct experiments on our proposed algorithms and a series of other compared algorithms on four real-world temporal social network datasets. These experiments aim to evaluate the effectiveness and efficiency of our proposed KT and KTIM algorithms for the IMT problem.

Name | Nodes | Temporal edges | Static edges | Time span (days) |
---|---|---|---|---|
Email-Eu-core | 986 | 332334 | 24929 | 803 |
CollegeMsg | 1899 | 59835 | 20296 | 193 |
Math overflow | 21688 | 107581 | 90489 | 2350 |
Ask Ubuntu | 75555 | 356822 | 178210 | 2418 |

Among them, dataset 1 [

◆ Greedy: Greedy is based on Algorithm 1 in this paper and utilizes the CELF algorithm in [

◆ IMIT: IMIT [

◆ TIM: TIM [

◆ CCA: CCA [

◆ DegreeDiscount: DegreeDiscount [

For the KTIM and KT algorithms, the network is layered by the improved K-shell method, and the layering result is fixed. This stage can therefore be processed offline, and the running time of these two algorithms in the experiments does not include the layering time. For the IMIT and TIM algorithms, the 100 Monte Carlo simulations used to calculate the marginal gain are time-consuming, so this step is also processed offline, and we exclude the running time of the Monte Carlo simulations when measuring the running time of these two algorithms. Otherwise, the running time of IMIT and TIM would be much larger than the values reported with offline pre-processing.

In this section, we measure the running time of each algorithm for selecting 50 seed nodes on datasets of different sizes. The Greedy algorithm is far slower than the other algorithms (its runs on datasets 3 and 4 had not finished after more than sixty hours), so Greedy results are not reported for datasets 3 and 4. The final results are shown in

Algorithm | Email-Eu-core (s) | CollegeMsg (s) | Math overflow (s) | Ask Ubuntu (s) |
---|---|---|---|---|
Greedy | 72720 | 382222 | | |
IMIT | 1.23 | 1.59 | 9.82 | 31.64 |
TIM | 0.49 | 0.50 | 1.96 | 1.69 |
DegreeDiscount | 0.03 | 0.24 | 0.26 | 0.14 |
CCA | 0.64 | 2.19 | 3.17 | 1.28 |
KT | 0.01 | 0.01 | 0.09 | 0.04 |
KTIM | 0.11 | 0.071 | 0.36 | 0.24 |

From

In this section, we test the effectiveness of the related algorithms. For each algorithm, once the seeds are selected, we run 1000 Monte Carlo simulations and take the average as the result. The final results are shown in

From

In short, according to

Across the experimental results, Greedy achieves the largest influence but has the longest running time, so it is not practical. Although DegreeDiscount and CCA perform well among traditional influence maximization algorithms, they are not suited to the IMT problem because they ignore the temporal characteristic of the network. In contrast, the KT and KTIM algorithms proposed in this paper, as well as the remaining baseline algorithms, account for the temporal nature of the network and are more suitable for the IMT problem. Considering both efficiency and effectiveness, the KTIM algorithm achieves the best overall results.

In terms of efficiency, as shown in

In terms of effectiveness, the KT algorithm proposed in this paper is slightly inferior to the other temporal social network algorithms. However, the influence spread of the optimized KTIM algorithm is close to that of the IMIT and TIM algorithms, which indicates that the optimized seed selection strategy we propose is effective. Taking the Ask Ubuntu network in

In conclusion, the above analysis shows that the KTIM algorithm strikes a balance between effectiveness and efficiency. Considering both running time and influence range, KTIM achieves the best overall results and can efficiently solve the IMT problem.

In this work, we study the problem of influence maximization in temporal social networks. Although the greedy algorithm can solve this problem, it is time-consuming. To improve efficiency, we consider both the global and local structures of temporal social networks. In the global scope, we propose an improved K-shell method to stratify the network according to the temporal characteristic of nodes and calculate the kernel value

This work is supported by the

The authors declare they have no conflicts of interest to report regarding the present study.