Predicting the Popularity of Information on Social Platforms without Underlying Network Structure

Wu, Leilei; Yi, Lingling; Ren, Xiao-Long; Lü, Linyuan

doi:10.3390/e25060916


by Leilei Wu ¹,²,³,†, Lingling Yi ⁴,†, Xiao-Long Ren ¹,* and Linyuan Lü ¹,⁵,*

¹ Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, China

² Department of Physics, University of Fribourg, CH-1700 Fribourg, Switzerland

³ Alibaba Business School, Hangzhou Normal University, Hangzhou 311121, China

⁴ Tencent Technology (Shenzhen) Co., Ltd., Shenzhen 518000, China

⁵ Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 611731, China

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Entropy 2023, 25(6), 916; https://doi.org/10.3390/e25060916

Submission received: 17 April 2023 / Revised: 25 May 2023 / Accepted: 6 June 2023 / Published: 9 June 2023

(This article belongs to the Special Issue Complexity, Entropy and the Physics of Information)


Abstract

The ability to predict the size of information cascades in online social networks is crucial for various applications, including decision-making and viral marketing. However, traditional methods either rely on complicated time-varying features that are challenging to extract from multilingual and cross-platform content, or on network structures and properties that are often difficult to obtain. To address these issues, we conducted empirical research using data from two well-known social networking platforms, WeChat and Weibo. Our findings suggest that the information-cascading process is best described as an activate–decay dynamic process. Building on these insights, we developed an activate–decay (AD)-based algorithm that can accurately predict the long-term popularity of online content based solely on its early repost amount. We tested our algorithm using data from WeChat and Weibo, demonstrating that we could fit the evolution trend of content propagation and predict the longer-term dynamics of message forwarding from earlier data. We also discovered a close correlation between the peak forwarding amount of information and the total amount of dissemination. Finding the peak of the amount of information dissemination can significantly improve the prediction accuracy of our model. Our method also outperformed existing baseline methods for predicting the popularity of information.

Keywords:

information cascade; cascade prediction; popularity prediction; information diffusion; online social network

1. Introduction

With the booming development of communication technologies and mobile services, online social networks enable billions of users worldwide to create and share information freely. Reading and reposting online content has become a significant way for individuals to communicate and express opinions [1,2]. As a result, the dissemination of information plays a fundamental role in our daily life and is of great economic value and practical significance [3,4]. The capacity to collect, clean, and analyze large-scale data has transformed the field of social-network analysis and allows scientists to conduct large-scale studies with greater convenience and efficacy [5,6,7,8]. The study of information spreading in social networks has become one of the core topics in computational social science [3,9,10] and network science [11,12], and it attracts increasing attention from fields such as sociology, physics, and computer science.

Among these topics, predicting the popularity of information on social platforms is a crucial issue that has attracted wide attention from both academic and industrial researchers in recent years [13,14,15,16,17,18,19,20]. By “popularity”, we usually mean the final number of views, collections, forwards, or shares of a piece of information in a network [13], depending on the setting of each study.

First, let us briefly review the research progress on the popularity prediction of information. As one of the most classic studies, Szabo and Huberman [13] analyzed the popularity of content submitted to Digg and YouTube, where popularity means the number of votes on Digg or the number of views on YouTube, respectively. A strong linear correlation was discovered between the logarithmically transformed popularity of content in early and later time periods. The authors proposed a log-linear model-based linear regression (LR) method to predict popularity. See more details of this method in Section 2.4.

Inspired by the above approach, the linear regression with degree model (LR-D) [21] was proposed to predict the popularity of the information in a greater variety of data sets by considering the cumulative degree of the users who reshare content. Furthermore, Bao et al. [22] found a close relationship between the popularity of the information and the structural diversity of the social network. Specifically, there exists a strong negative/positive near-linear correlation between the final popularity and its link density/diffusion depth over time. Thus, the final popularity of information can be computed by linear regression with the structural characteristics (LR-S) model.

From another viewpoint, a user who has forwarded a message may, with some probability, trigger another user to forward it. By considering the underlying arrival process of information, together with the aging and reinforcement effects in the spreading process, Gao et al. [23] proposed a model, named Exponential reinforcement and Time Mapping process (PETM), which combines the reinforced Poisson process model with a power-law relaxation. Based on the theory of self-exciting point processes, Zhao et al. [21] developed the Self-Exciting Model of Information Cascades (SEISMIC) to predict the future sharing volumes of given posts on Twitter. SEISMIC only requires the timestamps of reposts and the number of followers of the users.

Empirical analysis readily shows that a handful of vital users [24] dominate the spreading of information on social networks. Taking this phenomenon into account, Gao et al. [25] proposed a mixture process to predict the popularity of information.

Besides the above algorithms, an enormous amount of research has recently been conducted on predicting the popularity of information on social networks [26,27,28,29,30]. These advances shed light on applications ranging from communication, decision-making, cooperation, viral marketing, and advertising to promoting user-generated content such as blogs and scientific papers, and they help in understanding the evolution of information cascades online.

However, these methods either rely heavily on complicated features that are time-varying and cannot be easily extracted from multilingual and cross-platform content, or on the underlying network structures or properties that are often difficult to obtain. In this article, we analyzed several empirical data sets and found that the information-cascading process is best characterized as an activate–decay dynamic process. Based on our findings, we propose an activate–decay (AD)-based algorithm for predicting the long-term popularity of online content solely based on their early repost amount, without requiring knowledge of the social-network structure or content properties. The results show that our method uses the forwarding amount of information in WeChat within the first two hours to forecast its popularity for seven days with remarkable accuracy. Furthermore, we identified a close correlation between the peak of the amount of information dissemination and the total amount of dissemination. As long as the peak of the amount of information dissemination can be found, the prediction accuracy will be significantly improved. Our method also outperformed existing baseline methods for predicting the popularity of information.

Following the above brief introduction to the problem we are investigating, the rest of this paper is structured as follows. First, we conduct empirical analyses of two data sets of information-forwarding processes across the Weibo and WeChat platforms. Our analysis describes the rise and fall of information as an activate–decay dynamic process, which provides insight into attempts to model and predict information transmission. Second, we propose a model based on the (Bi)Hill equation from biochemistry, which has limited parameters and can predict the popularity of information without requiring knowledge of the underlying structure of social networks or content features. Finally, we perform experiments to demonstrate the effectiveness of our proposed method.

2. Materials and Methods

In this section, we begin by presenting an empirical analysis of two prominent social-network platforms: WeChat and Weibo. We then use the observed spreading patterns of information on these platforms to develop a dynamic process that describes the rise and fall of information over time. Using this proposed dynamic process, we can predict the popularity of information.

2.1. Empirical Data Analysis

To begin with, let us provide a brief introduction to the datasets utilized.

The WeChat dataset was created in a collaboration project with Tencent’s WeChat department. It comprises over 90,000 news articles (political, economic, legal, military, scientific and technological, cultural and educational, sports, and social news, among others) together with their forwarding records between individuals on the WeChat social platform from 2 June to 8 June 2016. The forwarding records were collected from sharing in timelines, group chats, and individual forwarding. The data include the message $id$ and the time t at which a message is forwarded. The forwarding records of all messages in this dataset were anonymized.

The Weibo dataset, obtained from a competition hosted by Wolong Big Data on DataCastle (https://challenge.datacastle.cn, accessed on 1 May 2023), consists of roughly 30,000 microblogs, with over 17,840,000 forwarding records. Weibo is commonly referred to as the “Twitter of China”. The messages in the Weibo dataset are mainly short paragraphs with at most 140 Chinese characters, with or without pictures. The dataset includes the content of microblogs, the users who published or forwarded the microblogs, the publish and forward time t, and the following relationship between users. In this research, we only use the ids and publish/forward times of the microblogs.

To better analyze the collective forwarding pattern of different messages, we standardize the timestamps of all forwarding records in the two data sets and denote the time at which a message was released as $t = 0$. Figure 1 shows how the average amount of information forwarded on WeChat and Weibo changes over time. The top row of the figure depicts the average forwarding amount per time unit, where the time unit is (a) 1 min for WeChat and (b) 10 s for Weibo; the x-axis is logarithmic so that the entire process is visible. On average, the forwarding amount per unit time of a WeChat message takes less than 30 min (1800 s) after release to reach its peak, while a Weibo message takes only 200 s. After the peak, the forwarding volume of all messages gradually decreases over time. Figure 1 indicates that the entire process can be divided into two stages, namely active and decay. The active stage reaches the peak very quickly, while the decay stage lasts a very long time. To show the rate of change in the forwarding number before and after the maximum forwarding volume per unit time, the lower row of Figure 1 is plotted in log-log coordinates. The shapes of the curves indicate that the change in the dissemination rate roughly follows a power law. News takes only a short time to reach the average peak, and the rate of information dissemination differs slightly between social platforms. Notably, Weibo shows faster transmission than WeChat. More analysis is given in Section 2.2.2.
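The averaging described above can be sketched in a few lines of Python. This is only an illustration: the record layout (message id, forwarding timestamp in seconds), the `release_times` mapping, and the function name are assumptions, not the format of the WeChat or Weibo data.

```python
from collections import defaultdict

def average_forwarding_curve(records, release_times, bin_seconds):
    """Return q(t): mean forwards per message in each time bin after release.

    records       -- iterable of (message_id, forward_timestamp_seconds)
    release_times -- dict message_id -> release timestamp (defines t = 0)
    bin_seconds   -- width of one time unit (e.g. 60 for 1 min, 10 for 10 s)
    """
    counts = defaultdict(int)
    for msg_id, ts in records:
        offset = ts - release_times[msg_id]
        if offset < 0:
            continue  # ignore records stamped before the release time
        counts[int(offset // bin_seconds)] += 1
    n_messages = len(release_times)
    last_bin = max(counts) if counts else -1
    return [counts[b] / n_messages for b in range(last_bin + 1)]

# Toy data: two messages, forwards shifted so that each release is t = 0.
records = [("a", 100), ("a", 130), ("a", 170), ("b", 1000), ("b", 1065)]
release = {"a": 100, "b": 1000}
q = average_forwarding_curve(records, release, bin_seconds=60)
```

Changing `bin_seconds` corresponds to changing the time unit (granularity) of the curve.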

In this research, our goal is to predict the final number of forwards of a given message. Building on the empirical analysis above, we formulate a mathematical model that captures the rise and fall of the dissemination process depicted in Figure 1. The model predicts the future shares of a piece of information from its sharing history, indicates whether the sharing cascade has already passed its initial stage of rapid expansion, and identifies the messages most likely to be shared extensively in the future. After cleaning and filtering the records, each data set was divided, according to the real release time of the messages, into a training set with 75% of the messages and a test set with 25%.
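One plausible reading of a split "according to release time" is a chronological one, in which the earliest 75% of messages form the training set. The sketch below assumes that reading; the function name and the `release_times` mapping are hypothetical.

```python
def chronological_split(release_times, train_fraction=0.75):
    """Split message ids into (train, test) by release time, earliest first."""
    ordered = sorted(release_times, key=release_times.get)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

release = {"m1": 50, "m2": 10, "m3": 40, "m4": 20}
train_ids, test_ids = chronological_split(release)
```

A chronological split keeps every test message later than the training messages, so the evaluation does not use information from the future.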

2.2. The Activation-Decay Model

2.2.1. The Hill Equation and BiHill Equation

The Hill equation, introduced by A.V. Hill in 1910 [31], is a biochemical equation that has been widely used to analyze nonlinear quantitative drug-receptor relationships [32]. The Hill equation and its variant, the BiHill equation, can also be used to describe nonlinear transmission mathematically [33]. The Hill equation can be expressed as follows [34]:

$$\theta = \frac{1}{1 + \left(\frac{K_{A}}{|L|}\right)^{n}},$$

(1)

where $\theta$ is the fraction of occupied sites at which the ligand can bind to the active site of the receptor protein, $|L|$ is the free (unbound) ligand concentration, and n is the Hill coefficient, which describes the cooperativity and measures the sensitivity (i.e., the steepness) of the response curve. Generally speaking, n determines the cooperativity of ligand binding in the following way:

$n > 1$, positively cooperative binding: once one ligand molecule is bound to the enzyme, the affinity of the enzyme for other ligand molecules increases.

$n < 1$, negatively cooperative binding: once one ligand molecule is bound to the enzyme, its affinity for other ligand molecules decreases.

$n = 1$, noncooperative (completely independent) binding: the affinity of the enzyme for a ligand molecule does not depend on whether another ligand molecule is already bound to it.

We apply the Hill function to the process of information propagation by taking it as a function of time t:

$$\mathrm{Hill}(t) = \frac{p}{1 + \left(\frac{k}{t}\right)^{h}},$$

(2)

where p, k, and h are three parameters with $p > 0$ and $k > 0$. When $h > 0$, the activation effect dominates and the curve rises; when $h < 0$, the inhibition effect dominates and the curve decays.
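A small numerical check makes the role of the sign of h in Equation (2) concrete: a positive exponent gives a curve that rises towards p, and a negative exponent gives a curve that decays from p. The parameter values below are arbitrary illustrations, not fitted values.

```python
def hill(t, p, k, h):
    """Equation (2): Hill(t) = p / (1 + (k / t)**h), defined for t > 0."""
    return p / (1.0 + (k / t) ** h)

times = [1, 10, 100]
rising = [hill(t, p=1.0, k=10.0, h=2.0) for t in times]     # h > 0
decaying = [hill(t, p=1.0, k=10.0, h=-2.0) for t in times]  # h < 0
```

In both cases the curve passes through p / 2 at t = k, which is why k is called the half-maximal value.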

The Biphasic Hill equation, abbreviated as the BiHill equation, describes a system in which activation and inhibition are present at the same time. The BiHill equation is expressed as follows:

$$\mathrm{BiHill}(t) = \frac{P_{m}}{\left[1 + \left(\frac{K_{a}}{t}\right)^{H_{a}}\right] \left[1 + \left(\frac{K_{i}}{t}\right)^{H_{i}}\right]},$$

(3)

where $P_{m} > 0$, $K_{a} > 0$, $K_{i} > 0$, $H_{a} > 0$, and $H_{i} > 0$ are the maximum value, the half-maximal activating value, the half-maximal inhibitory value, the activation Hill coefficient, and the inhibitory Hill coefficient of $\mathrm{BiHill}(t)$, respectively. See the upper row of Figure 1. When this function is applied to information dissemination, the activation and inhibition mechanisms of dissemination correspond to the two factors of the formula.

2.2.2. The Activation-Decay Model

According to the empirical analysis, the average forwarding amount of messages changes over time. In the beginning, the average forwarding amount per unit time increases quickly. After reaching the peak, i.e., the maximal amount, it decays slowly until it is close to 0. We define an index $r(t)$ that measures how far the information dissemination is from its peak value:

$$r(t) = \frac{q_{max} - q(t)}{q(t)},$$

(4)

where $q_{max} = \max[q(t)]$. Empirically, $r(t)$ follows a power law on each side of the peak:

$$r(t) = K t^{H},$$

(5)

where K and H are two parameters. Solving Equation (4) for $q(t)$ gives $q(t) = q_{max} / (1 + r(t))$, and substituting Equation (5) yields

$$q(t) = \frac{q_{max}}{1 + K t^{H}}.$$

(6)

This is a form of the Hill equation. $r(t)$ is a quantitative index: the smaller its value, the closer the amount of propagation per unit granularity is to the peak value, with $r(t) = 0$ at the peak. We have verified on the real social-network data that $r(t)$ is a piecewise function whose log-log plot has the shape of a “V”. When $H < 0$, $r(t)$ is the decaying branch of the “V” in log-log coordinates, and when $H > 0$, it is the rising branch. See Figure 1.
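Because $r(t) = K t^{H}$ is a straight line in log-log coordinates, the two branches of the “V” can be fitted separately by ordinary least squares on $\ln r$ against $\ln t$. The sketch below is one way to do this under stated assumptions: time bins are numbered from 1, bins with $q(t) = 0$ or $r(t) = 0$ are skipped, and each branch contains at least two usable points. The function names are hypothetical.

```python
import math

def fit_power_law(ts, rs):
    """Least-squares fit of ln r = ln K + H ln t; returns (K, H)."""
    xs = [math.log(t) for t in ts]
    ys = [math.log(r) for r in rs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    H = sxy / sxx
    K = math.exp(my - H * mx)
    return K, H

def fit_branches(q):
    """Fit r(t) = (q_max - q(t)) / q(t) before and after the peak.

    q[i] is the average forwarding amount in time bin t = i + 1.
    Returns ((K_before, H_before), (K_after, H_after)).
    """
    q_max = max(q)
    peak = q.index(q_max)
    # Keep only bins where ln r(t) is defined.
    pts = [(i + 1, (q_max - v) / v) for i, v in enumerate(q) if 0 < v < q_max]
    before = [(t, r) for t, r in pts if t - 1 < peak]
    after = [(t, r) for t, r in pts if t - 1 > peak]
    return fit_power_law(*zip(*before)), fit_power_law(*zip(*after))
```

On data that follow the activate-decay pattern, the branch before the peak gives a negative H and the branch after the peak gives a positive H.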

In the process of disseminating information to a broader audience, there are often two opposing forces at play: activation and decay. Activation refers to factors that contribute to the spread or promotion of information, while decay refers to factors that inhibit or slow the spread of information. These two forces interact with each other in a dynamic and game-like manner, influencing the ultimate outcome of the information dissemination process. This interaction between activation and decay factors can be complex and multifaceted, as various factors may contribute to the spread or inhibition of information.

We consider the process of information dissemination to be the interaction of activation and decay factors, with a game between them. Before the propagation per unit granularity reaches its peak, activation plays the leading role. After the peak, the decay factor begins to dominate. Hence $q(t)$ first rises and then decays over time. Therefore, we define

$$F = \frac{1}{1 + K t^{H}}.$$

(7)

When $H < 0$, F is the activation factor, and when $H > 0$, F is the decay factor.

Based on the analysis above, and regarding the random fluctuations as an additive noise term, we construct a prediction function named the AD function, $q(t) = \alpha \cdot q_{max} \cdot \text{Activation factor} \cdot \text{Decay factor} + \text{Error function}$, i.e.,

$$q(t) = \alpha \, q_{max} \cdot \frac{1}{1 + K_{a} t^{H_{a}}} \cdot \frac{1}{1 + K_{d} t^{H_{d}}} + e^{\beta},$$

(8)

where $\alpha$ and $\beta$ are tuning parameters obtained by training on historical data. Equivalently, it can be written as

$$q(t) = \alpha \, \mathrm{BiHill}(t) + e^{\beta}.$$

(9)

Therefore, we can directly use the BiHill equation in OriginLab to fit the parameters $K_{a}$, $K_{d}$, $H_{a}$, and $H_{d}$.
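To illustrate how the product of an activation factor ($H_{a} < 0$) and a decay factor ($H_{d} > 0$) in Equation (8) produces a rise followed by a slow decay, the sketch below evaluates the AD function on a grid of time bins. The parameter values are arbitrary illustrations chosen for this example, not values fitted to the WeChat or Weibo data, and the function name is hypothetical.

```python
import math

def ad_rate(t, q_max, K_a, H_a, K_d, H_d, alpha, beta):
    """Equation (8): alpha * q_max * activation * decay + exp(beta), for t > 0."""
    activation = 1.0 / (1.0 + K_a * t ** H_a)  # H_a < 0: rises towards 1
    decay = 1.0 / (1.0 + K_d * t ** H_d)       # H_d > 0: falls towards 0
    return alpha * q_max * activation * decay + math.exp(beta)

# Illustrative (not fitted) parameters.
params = dict(q_max=100.0, K_a=50.0, H_a=-2.0, K_d=1e-4, H_d=2.0,
              alpha=1.0, beta=-5.0)
times = list(range(1, 201))
curve = [ad_rate(t, **params) for t in times]
t_peak = times[curve.index(max(curve))]  # time bin with the largest rate
```

With these values the curve climbs quickly over the first few dozen bins and then decays over a much longer span, which is the qualitative shape described for Figure 1.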

Moving from the average propagation of all selected messages to the forwarding of each individual message, the prediction function is

$$Q(t) = \alpha \, Q_{max} \cdot \frac{1}{1 + K_{a} t^{H_{a}}} \cdot \frac{1}{1 + K_{d} t^{H_{d}}} + e^{\beta},$$

(10)

where $Q_{max} = \max[Q(t)] \big|_{0}^{T_{known}}$ is the maximum forwarding amount per unit time observed in the known period. The total propagation amount of each message in T days is then

$$Q_{id}^{T} = \sum_{t=0}^{T_{known}} Q(t)_{id} + \sum_{t=T_{known}+1}^{T} Q(t)_{id},$$

(11)

where the first sum runs over the observed forwarding amounts and the second over the values predicted by Equation (10). Except for $Q_{max}$, all parameters can be obtained by training on historical data, i.e., we only need to know the peak value of information dissemination to predict the total dissemination. In fact, the amount of social-network information dissemination reaches its peak within a short time, within 30 min on WeChat and within 5 min on Weibo; see Figure 1.
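The two sums of Equation (11) translate directly into code: the observed counts are added up as they are, and the remaining bins are filled in by the model. In this sketch the bins are numbered from 1 rather than 0 so that $t^{H}$ with negative H stays defined; that indexing, the function name, and the `rate` callable are assumptions made for illustration.

```python
def predict_total(observed, T, rate):
    """Equation (11): observed counts up to T_known plus model counts up to T.

    observed -- per-bin forward counts for bins 1..T_known
    T        -- index of the last bin of the prediction horizon
    rate     -- callable t -> predicted forwards in bin t (e.g. Equation (10))
    """
    t_known = len(observed)
    predicted_tail = sum(rate(t) for t in range(t_known + 1, T + 1))
    return sum(observed) + predicted_tail

# Toy usage with a constant predicted rate of 2 forwards per bin.
total = predict_total([3, 5, 4], T=6, rate=lambda t: 2.0)
```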

2.2.3. The Algorithm for Popularity Prediction Based on Activation-Decay Model

Assume that we have the propagation data of N messages within the known period $T_{known}$ and want to predict the total information propagation up to time T ($T > T_{known}$). Let $id$ denote a message, $Q(t)_{id}$ the amount of forwarding of message $id$ at time t, and $q(t)$ the average amount over the N messages. The algorithm proceeds as follows:

Step 1

Obtain the model parameters $K_{a}, H_{a}, K_{d}, H_{d}$ from the historical data sets, as shown in Figure 2 ①–③:

(1): Taking the generation time of each message as time zero, obtain the forwarding amount in every unit of time (the unit granularity is adjustable). Process the forwarding amounts of the N messages in the period T into the data sequence t, $id$, $Q(t)_{id}$.
(2): Calculate the average amount of these N messages in the period T, $q(t) = \frac{\sum Q(t)_{id}}{N}$, which yields the data sequence t, $q(t)$.
(3): Estimate the parameters $K_{a}, H_{a}, K_{d}, H_{d}$ from Equations (4) and (5), or obtain them directly by fitting the BiHill equation in Equation (9); see Figure 1.

Step 2

Obtain the best parameters $\alpha$ and $\beta$ using the training set and test set, as shown in Figure 2 ④.

(1): The training set data are divided into two parts at the known maximum time $T_{known}$ (which can be chosen freely): the $0$–$T_{known}$ part is the known information set, and the $T_{known}$–$T$ part is the information set to be predicted. For example, if 10 min of propagation data are known, the data within 0–10 min form the known set and the rest form the set to be predicted.
(2): Find $Q_{max} = \max[Q(t)] \big|_{0}^{T_{known}}$ and calculate the total propagation amount of each message from Equation (11). Compare the calculated propagation amount of each message with the actual amount and compute the mean absolute percentage error $MAPE$. The values of $\alpha$ and $\beta$ that minimize $MAPE$ are the optimal parameters.

Step 3

Put the related parameters ($\alpha, \beta, K_{a}, H_{a}, K_{d}, H_{d}$) into the AD algorithm to predict the propagation amount of the information to be predicted, as shown in Figure 2 ⑤–⑦.
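Step 2 above does not say how the minimum of MAPE over $\alpha$ and $\beta$ is searched for. A plain grid search is one simple option, sketched below as an assumption rather than as the authors' procedure. The `shape` callable stands for the product of the activation and decay factors with $K_{a}, H_{a}, K_{d}, H_{d}$ already fixed in Step 1, and all names are hypothetical.

```python
import math
from itertools import product

def mape(predicted, real):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs(p - r) / r for p, r in zip(predicted, real)) / len(real)

def tune_alpha_beta(messages, T, shape, alphas, betas):
    """Grid search for the (alpha, beta) pair that minimises MAPE.

    messages -- list of (observed_counts, real_total) pairs; observed_counts
                holds the per-bin forwards for bins 1..T_known
    T        -- index of the last bin of the prediction horizon
    shape    -- callable t -> activation(t) * decay(t)
    Returns (best_mape, best_alpha, best_beta).
    """
    real = [total for _, total in messages]
    best = None
    for alpha, beta in product(alphas, betas):
        preds = []
        for observed, _ in messages:
            q_max = max(observed)        # peak seen within the known period
            t_known = len(observed)
            tail = sum(alpha * q_max * shape(t) + math.exp(beta)
                       for t in range(t_known + 1, T + 1))
            preds.append(sum(observed) + tail)
        score = mape(preds, real)
        if best is None or score < best[0]:
            best = (score, alpha, beta)
    return best
```

The returned pair can then be used in Step 3 together with the shape parameters from Step 1. A finer grid or a continuous optimizer could replace the grid search without changing the rest of the procedure.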

2.3. Evaluation Metrics for the Prediction Algorithm

In this subsection, we briefly introduce the evaluation metrics used for the prediction algorithms.

2.3.1. APE and MAPE

APE (Absolute Percent Error) measures the relative error between the predicted value and the real value on the experimental dataset. APE is defined as:

$$APE = \frac{|Q_{id}^{predicted} - Q_{id}^{real}|}{Q_{id}^{real}} \times 100\%.$$

(12)

The lower the value of APE, the better the accuracy of the prediction model.

MAPE (Mean Absolute Percent Error) is the average value of APE over all messages; it measures the average relative error between the predicted values and the real values on the test set. MAPE is defined as:

$$MAPE = \frac{1}{N} \sum_{1}^{N} \frac{|Q_{id}^{predicted} - Q_{id}^{real}|}{Q_{id}^{real}} \times 100\%.$$

(13)

Likewise, the lower the value of MAPE, the better the accuracy of the prediction model.

2.3.2. TIC

The TIC (Theil inequality coefficient) is an indicator of the prediction ability of a model. In general, the smaller its value, the better the prediction ability. The TIC is defined as:

$$TIC = \frac{\sqrt{\frac{1}{N} \sum_{1}^{N} \left(Q_{id}^{predicted} - Q_{id}^{real}\right)^{2}}}{\sqrt{\frac{1}{N} \sum_{1}^{N} \left(Q_{id}^{predicted}\right)^{2}} + \sqrt{\frac{1}{N} \sum_{1}^{N} \left(Q_{id}^{real}\right)^{2}}}.$$

(14)

The value of this coefficient therefore ranges from 0 to 1. The closer it is to 0, the smaller the root mean square error relative to the magnitudes of the predicted and real values, i.e., the closer the predicted values are to the actual values and the better the model fits.
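The three metrics are easy to compute directly. The sketch below returns APE and MAPE as fractions (multiply by 100 for percent) and uses the standard Theil form of the TIC, whose numerator is the root mean square of the prediction error. The function names are chosen here for illustration.

```python
import math

def ape(predicted, real):
    """Equation (12) as a fraction: |predicted - real| / real."""
    return abs(predicted - real) / real

def mape(predicted, real):
    """Equation (13) as a fraction: mean APE over N messages."""
    return sum(ape(p, r) for p, r in zip(predicted, real)) / len(real)

def tic(predicted, real):
    """Theil inequality coefficient: RMS error over the sum of the RMS values."""
    n = len(real)
    rms_err = math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, real)) / n)
    rms_pred = math.sqrt(sum(p ** 2 for p in predicted) / n)
    rms_real = math.sqrt(sum(r ** 2 for r in real) / n)
    return rms_err / (rms_pred + rms_real)
```

A perfect prediction gives a TIC of 0, and a prediction of all zeros for nonzero real values gives a TIC of 1, which matches the stated range of the coefficient.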

2.4. Baseline Algorithm

As discussed in the introduction, there are currently numerous ways to predict popularity, which fall into three main categories: predictions based on early popularity [13], on influence factors [35,36], and on cascade propagation [22,37,38]. To validate the accuracy of our prediction method, we chose a typical popularity prediction algorithm [13] as the baseline method. Its authors applied a logarithmic transformation to the popularity of online content submitted to two content-sharing portals, YouTube and Digg. They found a strong correlation between the popularity at early and later times and used this relationship to predict the future popularity of messages:

$$\ln N_{s}(t_{2}) = \ln N_{s}(t_{1}) + \sum_{\tau = t_{1}}^{t_{2}} \eta(\tau),$$

(15)

where $N_{s}(t)$ is the popularity of message s at time t, $t_{1}$ and $t_{2}$ are two arbitrary points in time with $t_{2} > t_{1}$, and $\eta(\tau)$ denotes independent values drawn from a fixed probability distribution.
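One simple way to turn Equation (15) into a predictor is to replace the summed growth terms by their average over the training messages, which gives a constant offset on the logarithmic scale. The sketch below follows that simplification; it is an assumption made for illustration and not necessarily the exact estimator of [13]. The function names are hypothetical.

```python
import math

def fit_log_offset(early, final):
    """Mean of ln N(t2) - ln N(t1) over the training messages."""
    diffs = [math.log(f) - math.log(e) for e, f in zip(early, final)]
    return sum(diffs) / len(diffs)

def predict_final(early_count, offset):
    """Predict N(t2) from N(t1) by applying the fitted offset on the log scale."""
    return early_count * math.exp(offset)

# Toy training data in which every message triples between t1 and t2.
offset = fit_log_offset([10, 20, 40], [30, 60, 120])
prediction = predict_final(5, offset)
```

On the log scale this predictor is a straight line of slope 1, which reflects the linear correlation between early and later log-popularity that the baseline relies on.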

3. Experimental Results

This section presents the performance of the prediction model. We use three error indicators, APE, MAPE, and TIC, to evaluate both the AD algorithm and the baseline algorithm on the WeChat and Weibo data.

3.1. Prediction of the Popularity of Information

In Figure 3, we compare the performance of the AD algorithm and the baseline algorithm (called the BS algorithm) on the WeChat (N = 31,247 messages) and Weibo (N = 25,467 messages) social networks. We can draw the following conclusions: (1) For the AD algorithm, within a certain range the accuracy increases as the granularity becomes larger, but it does not keep improving as the granularity increases further. The figure shows that the best value on the WeChat data is obtained at a granularity of 5 min and the best value on the Weibo data at 120 s. (2) As the known information time series ($T_{known}$) grows, both algorithms perform better. On the WeChat data, the AD algorithm outperforms the baseline algorithm (BS) in both the MAPE and TIC indexes. On the Weibo data, AD performs better than BS at every granularity in the MAPE index. In the TIC index, the AD algorithm does not perform better than the BS algorithm at granularities of 30 s or 60 s, but it begins to show its advantage at 120 s. (3) At every granularity, the accuracy improves as the known propagation time increases. The likely reason is that the peak of some information appears only after a long time. If the known time is short, the true peak has not yet appeared when the statistics are calculated, which reduces the accuracy.

In Figure 4, we compare the predictive performance between the AD algorithm and the baseline algorithm. The AD algorithm has a wider range of high prediction accuracy. Intuitively, the red area represents the smallest error (less than 0.2). Compared with the BS algorithm, the AD algorithm can predict the future forwarding amount more accurately (the known forwarding amount ranges from about 1 to 10,000), while the BS algorithm can only reach this standard in the known forwarding amount range (about 50–3000). Whether or not the information is popular in the future, the AD algorithm can give more accurate predictions. This means that the AD algorithm is more flexible and robust, and its prediction performance is less affected by the known information.

We run the AD and BS algorithms on the test set and compute the APE as a function of time. We plot the quantiles of the distribution of the APE of the AD algorithm in Figure 5. The AD method demonstrates a clear improvement over the baseline. Take the upper figure (WeChat data) as an example: the APE of both algorithms stabilizes only after 30 min. After observing the cascade for 20 min, the 90th, 70th, and 50th percentiles of the APE of the AD algorithm are less than 75.6%, 54.2%, and 37.8%, respectively. This means that after 20 min, the error is less than 37.8% for 50% of the messages and less than 71% for 90% of the messages. After 30 min, the error becomes stable: the APE for 90%, 70%, and 50% of the messages drops to 73.8%, 53%, and 36.8%, respectively. In addition, the position of the shaded regions in the figure indicates that the AD algorithm has greater prediction accuracy than the BS algorithm.

We give a more comprehensive presentation of the errors in Figure 6, which plots the MAPE, the TIC, and the distribution of the APE as functions of the known information-forwarding time. The greater the blue coverage area, the better the algorithm’s prediction performance. Again, the AD algorithm gives much more accurate predictions than the baseline algorithm in every respect.

3.2. Determine the Peak $Q_{peak}$

In our AD algorithm, there is a very significant variable, $Q_{max}$. During the implementation, we found that if $Q_{max}$ is the peak value $Q_{peak}$ in the process of information forwarding ($Q_{max} = Q_{peak}$), the prediction accuracy of the AD algorithm is greatly improved, as shown in Figure 7. $Q_{peak}$ is the maximum value of the time series of information-forwarding volume over the whole life cycle. It differs from $Q_{max}$, which is the maximum value of the time series of information-forwarding volume in the known period $T_{known}$. We use the amount of information forwarded in $T_{known}$ to predict the total amount of information forwarded in the life cycle (7 days in this paper). The experimental results show that whether $Q_{peak}$ occurs within the known time $T_{known}$ directly affects the prediction accuracy.

Peak Time $t_{peak}$

By peak time $t_{peak}$, we refer to the time, measured from the start of the popularity evolution, at which the popularity per unit time reaches its highest value $Q_{peak}$. That means we obtain $Q_{max} = Q_{peak}$ if $t_{peak} < T_{known}$. The longer the known time $T_{known}$, the greater the probability that the real peak $Q_{peak}$ has appeared and the more accurate the prediction result. In Figure 7, MAPE_realpeak denotes the case in which $Q_{peak}$ emerged within the known time, $t_{peak} < T_{known} = 120$ min, i.e., $Q_{max} = Q_{peak}$, which we term the real peak (the red dots in Figure 7). MAPE_fakepeak denotes the case $t_{peak} > T_{known} = 120$ min, i.e., $Q_{peak}$ did not emerge within the known 120 min and $Q_{max} < Q_{peak}$. In this case we use the maximum value $Q_{max}$ for prediction, and the prediction accuracy is evidently lower than when $Q_{max} = Q_{peak}$ (the blue dots in Figure 7). The actual forecast result combines these two cases, as represented by the green dots in Figure 7. As a result, the most crucial issue to examine in our future work is how to determine or forecast $Q_{peak}$. Given the first 120 min of message spread data, using the real peak $Q_{peak}$ to predict the final counts yields a MAPE of 0.27, while the fake peak yields 0.35.
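The distinction between the real-peak and fake-peak cases can be checked mechanically once the full forwarding series of a message is available, as it is for historical data. The sketch below is an illustration with hypothetical names; it numbers time bins from 1 and treats the peak as observed when its bin lies inside the known period.

```python
def peak_status(series, t_known):
    """Compare the maximum in the first t_known bins with the lifetime peak.

    series  -- per-bin forwarding amounts over the whole life cycle
    t_known -- number of bins in the known period
    """
    q_max = max(series[:t_known])       # maximum within the known period
    q_peak = max(series)                # maximum over the whole life cycle
    t_peak = series.index(q_peak) + 1   # bins are numbered from 1
    return {"Q_max": q_max, "Q_peak": q_peak, "t_peak": t_peak,
            "real_peak": t_peak <= t_known}

series = [1, 5, 9, 4, 2]
short_window = peak_status(series, t_known=2)  # peak not yet observed
long_window = peak_status(series, t_known=3)   # peak observed
```

Grouping test messages by the `real_peak` flag and computing the MAPE for each group separately corresponds to the real-peak and fake-peak comparison described above.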

To more intuitively assess the impact of $Q_{peak}$ on the prediction outcomes, we partition the dataset into two portions for prediction, $t_{peak} < T_{known}$ and $t_{peak} > T_{known}$ ($T_{known}$ = 40 min; for WeChat Official Accounts, it takes less than 30 min on average for a message to reach its peak forwarding amount per unit time, see Figure 1). In the left panel of Figure 8, the peak $Q_{peak}$ has been reached, i.e., $Q_{max} = Q_{peak}$ ($t_{peak} < T_{known}$), and the colored points with $APE < 0.4$ account for 70.7% of the total. Their final forwarding amounts range from $10^{3}$ to $10^{5}$ (Y-axis). In the right panel, by contrast, the peak $Q_{peak}$ has not been attained, i.e., $Q_{max} < Q_{peak}$ ($t_{peak} > T_{known}$); the colored points with $APE < 0.4$ account for 65.1% of the total, and the final forwarding amounts range only from $10^{3}$ to $10^{4}$ (Y-axis). This demonstrates that $Q_{peak}$ has a considerable influence on the range of the final forwarding amount. Determining the peak $Q_{peak}$ may not only broaden the forecast range of information popularity but also considerably enhance its predictability.
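
The share of points with $APE < 0.4$ quoted for each panel can be computed as in the following minimal sketch. The function name and the toy APE values are hypothetical, and the 0.4 threshold is simply the one used in the text above.

```python
from typing import Sequence


def share_below(apes: Sequence[float], threshold: float = 0.4) -> float:
    """Fraction of messages whose APE is strictly below the threshold."""
    if not apes:
        raise ValueError("at least one APE value is required")
    return sum(1 for a in apes if a < threshold) / len(apes)


if __name__ == "__main__":
    # Illustrative APE values for one panel, not taken from the datasets.
    print(share_below([0.1, 0.39, 0.4, 0.9]))
```

Applying the same function separately to the $t_{peak} < T_{known}$ and $t_{peak} > T_{known}$ groups gives the two percentages that the panels compare.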

4. Conclusions

The spread of information, ideas, innovation, influence, behaviors, and styles within social networks is ubiquitous [7,8]. Predicting the popularity of information on social platforms has recently become a hot research topic [28,30]. Nonetheless, most current methods either depend heavily on intricate time-dependent features that are arduous to extract from multilingual and cross-platform content, or rely on network structures or properties that are frequently difficult to acquire. In this paper, we analyzed several empirical data sets and found that the information-cascading process is best characterized as an activate–decay dynamic process. We then introduced the activate–decay-based (AD) algorithm, which predicts the long-term forwarding amount of information without requiring knowledge of social-network structure or content features. Instead, the AD algorithm uses only limited information, i.e., the amount of information forwarded within specific time intervals (e.g., 30 min for WeChat and 3 min for Weibo), to accurately predict the total forwarding amount over several days.

The AD algorithm is a straightforward and practical approach for predicting information popularity, and it outperforms the baseline algorithm in accuracy. However, a challenge remains in determining the actual maximum forwarding amount within a given time interval. To address this challenge, we assume that the maximum propagation amount per unit of time observed in past data, denoted as $Q_{max}^{real}$, represents the peak value. Nonetheless, we find that identifying the genuine peak forwarding value can further improve the accuracy of our prediction results, as illustrated in Figure 7. Therefore, we plan to focus on this issue in future research.

Author Contributions

Methodology, L.W. and X.-L.R.; formal analysis, L.W., L.Y., X.-L.R. and L.L.; resources, L.Y. and L.L.; writing—original draft preparation, L.W. and X.-L.R.; writing—review and editing, L.W., L.Y., X.-L.R. and L.L.; visualization, L.W.; supervision, X.-L.R. and L.L.; funding acquisition, L.L. and X.-L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the STI 2030–Major Projects (2022ZD0211400), the China Postdoctoral Science Foundation (2022M710620), the Sichuan Science and Technology Program (2023NSFSC1353), the Project of Huzhou Science and Technology Bureau (2021YZ12), and the UESTCYDRI research start-up (U032200117). This work has been partially supported by the New Cornerstone Science Foundation through XPLORER PRIZE.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data from Weibo in this study will be available at the following GitHub repository: https://github.com/renxiaolong/InformationPopularityPrediction (accessed on 5 June 2023) after this paper is accepted. The dataset of WeChat was generated during a collaboration project with Tencent’s WeChat department. All the WeChat data are kept within the company.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Brady, W.J.; McLoughlin, K.; Doan, T.N.; Crockett, M.J. How social learning amplifies moral outrage expression in online social networks. Sci. Adv. 2021, 7, eabe5641.
2. Zhao, J.; Wu, J.; Xu, K. Weak ties: Subtle role of information diffusion in online social networks. Phys. Rev. E 2010, 82, 016105.
3. Lazer, D.; Pentland, A.; Adamic, L.; Aral, S.; Barabási, A.L.; Brewer, D.; Christakis, N.; Contractor, N.; Fowler, J.; Gutmann, M.; et al. Life in the network: The coming age of computational social science. Science 2009, 323, 721–723.
4. Freelon, D.; Marwick, A.; Kreiss, D. False equivalencies: Online activism from left to right. Science 2020, 369, 1197–1201.
5. Wasserman, S. Social Network Analysis: Methods and Applications; Cambridge University Press: Cambridge, UK, 1994.
6. Aggarwal, C.C. An introduction to social network data analytics. In Social Network Data Analytics; Springer: New York, NY, USA, 2011; pp. 1–15.
7. Pastor-Satorras, R.; Castellano, C.; Van Mieghem, P.; Vespignani, A. Epidemic processes in complex networks. Rev. Mod. Phys. 2015, 87, 925–979.
8. Brockmann, D.; Helbing, D. The hidden geometry of complex, network-driven contagion phenomena. Science 2013, 342, 1337–1342.
9. Giles, J. Making the links. Nature 2012, 488, 448–450.
10. Conte, R.; Gilbert, N.; Bonelli, G.; Cioffi-Revilla, C.; Deffuant, G.; Kertesz, J.; Loreto, V.; Moat, S.; Nadal, J.P.; Sanchez, A.; et al. Manifesto of computational social science. Eur. Phys. J. Spec. Top. 2012, 214, 325–346.
11. Barabási, A.L. Network Science; Cambridge University Press: Cambridge, UK, 2016.
12. Newman, M. Networks: An Introduction; Oxford University Press: Oxford, UK, 2010.
13. Szabo, G.; Huberman, B.A. Predicting the popularity of online content. Commun. ACM 2010, 53, 80–88.
14. Cheng, J.; Adamic, L.; Dow, P.A.; Kleinberg, J.M.; Leskovec, J. Can cascades be predicted? In Proceedings of the 23rd International Conference on World Wide Web, Seoul, Republic of Korea, 7–11 April 2014; pp. 925–936.
15. Liao, D.; Xu, J.; Li, G.; Huang, W.; Liu, W.; Li, J. Popularity prediction on online articles with deep fusion of temporal process and content features. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 200–207.
16. Chen, X.; Zhou, F.; Zhang, K.; Trajcevski, G.; Zhong, T.; Zhang, F. Information diffusion prediction via recurrent cascades convolution. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 8–11 April 2019; pp. 770–781.
17. Zhou, F.; Xu, X.; Trajcevski, G.; Zhang, K. A survey of information cascade analysis: Models, predictions, and recent advances. ACM Comput. Surv. (CSUR) 2021, 54, 1–36.
18. Yu, L.; Liu, C.; Zhang, Z.K. Multi-linear interactive matrix factorization. Knowl.-Based Syst. 2015, 85, 307–315.
19. Yu, L.; Huang, J.; Zhou, G.; Liu, C.; Zhang, Z.K. TIIREC: A tensor approach for tag-driven item recommendation with sparse user generated content. Inf. Sci. 2017, 411, 122–135.
20. Prasse, B.; Mieghem, P.V. Predicting network dynamics without requiring the knowledge of the interaction graph. Proc. Natl. Acad. Sci. USA 2022, 119, e2205517119.
21. Zhao, Q.; Erdogdu, M.A.; He, H.Y.; Rajaraman, A.; Leskovec, J. SEISMIC: A self-exciting point process model for predicting tweet popularity. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 1513–1522.
22. Bao, P.; Shen, H.W.; Huang, J.; Cheng, X.Q. Popularity prediction in microblogging network: A case study on sina weibo. In Proceedings of the 22nd International Conference on World Wide Web (WWW’13 Companion), Rio de Janeiro, Brazil, 13–17 May 2013; ACM: New York, NY, USA, 2013; pp. 177–178.
23. Gao, S.; Ma, J.; Chen, Z. Modeling and predicting retweeting dynamics on microblogging platforms. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, Shanghai, China, 2–6 February 2015; pp. 107–116.
24. Lü, L.; Chen, D.; Ren, X.L.; Zhang, Q.M.; Zhang, Y.C.; Zhou, T. Vital nodes identification in complex networks. Phys. Rep. 2016, 650, 1–63.
25. Gao, J.; Shen, H.; Liu, S.; Cheng, X. Modeling and predicting retweeting dynamics via a mixture process. In Proceedings of the 25th International Conference Companion on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2016; pp. 33–34.
26. Yu, H.; Hu, Y.; Shi, P. A prediction method of peak time popularity based on Twitter hashtags. IEEE Access 2020, 8, 61453–61461.
27. Wu, B.; Cheng, W.H.; Liu, P.; Liu, B.; Zeng, Z.; Luo, J. Smp challenge: An overview of social media prediction challenge 2019. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2667–2671.
28. Zhang, X.; Aravamudan, A.; Anagnostopoulos, G.C. Anytime Information Cascade Popularity Prediction via Self-Exciting Processes. In Proceedings of the International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 26028–26047.
29. Wang, J.; Jiang, W.; Li, K.; Wang, G.; Li, K. Incremental group-level popularity prediction in online social networks. ACM Trans. Internet Technol. (TOIT) 2021, 22, 1–26.
30. Chen, T.; Guo, J.; Wu, W. Graph representation learning for popularity prediction problem: A survey. arXiv 2022, arXiv:2203.07632.
31. Hill, A.V. The possible effects of the aggregation of the molecules of hæmoglobin on its dissociation curves. J. Physiol. 1910, 40, i–vii.
32. Goutelle, S.; Maurin, M.; Rougier, F.; Barbaut, X.; Bourguignon, L.; Ducher, M.; Maire, P. The Hill equation: A review of its capabilities in pharmacological modelling. Fundam. Clin. Pharmacol. 2008, 22, 633–648.
33. Frank, S.A. Input-output relations in biological systems: Measurement, information and the Hill equation. Biol. Direct 2013, 8, 1–25.
34. Nelson, D.; Lehninger, A.; Cox, M. Lehninger Principles of Biochemistry; W. H. Freeman: New York, NY, USA, 2008.
35. He, X.; Gao, M.; Kan, M.Y.; Liu, Y.; Sugiyama, K. Predicting the popularity of web 2.0 items based on user comments. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, Madrid, Spain, 11–15 July 2014; pp. 233–242.
36. Bandari, R.; Asur, S.; Huberman, B. The pulse of news in social media: Forecasting popularity. In Proceedings of the International AAAI Conference on Web and Social Media, Dublin, Ireland, 4–7 June 2012; Volume 6, pp. 26–33.
37. Kupavskii, A.; Ostroumova, L.; Umnov, A.; Usachev, S.; Serdyukov, P.; Gusev, G.; Kustarev, A. Prediction of retweet cascade size over time. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, Maui, HI, USA, 29 October–2 November 2012; pp. 2335–2338.
38. Li, H.; Ma, X.; Wang, F.; Liu, J.; Xu, K. On popularity prediction of videos shared in online social networks. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, San Francisco, CA, USA, 27 October–1 November 2013; pp. 169–178.

Figure 1. The average forwarding amounts of information on WeChat and Weibo display similar statistical trends over time. The upper row depicts the relationship between the average forwarding amount and time, with the horizontal axis in units of (a) 1 min for WeChat and (b) 10 s for Weibo. The lower row shows the trend of the average forwarding amount over time after its peak value. The forwarding amount takes some time to reach its average peak, and the rate of dissemination differs markedly between platforms: information spreads faster on Weibo than on WeChat. On average, a WeChat message takes less than 30 min (1800 s) from generation to reach its peak forwarding amount per unit time, whereas a Weibo message takes only 200 s.

Figure 2. The flow chart of the proposed AD algorithm.

Figure 3. Predicting the final forwarding amount of messages after seven days based on the information known within the period $T_{known}$. The upper row shows the results on the WeChat dataset, and the lower row shows the results on the Weibo dataset. The X-axis represents the known propagation time. The Y-axis shows how the prediction accuracy varies with the known propagation time. The granularity of the extracted data affects the accuracy of the AD algorithm's predictions. In the upper part (WeChat) of the figure, the prediction result reaches a relatively optimal level when the unit time is 10 min, while in the lower part (Weibo) this occurs when the unit time is 120 s. These results indicate that the proposed AD algorithm outperforms the baseline (BS) algorithm.

Figure 4. APE distribution on utilizing the initial 120-min data to predict the number of messages forwarded in the next 7 days. The X-axis represents the number of messages forwarded in the first 120 min, and the Y-axis represents the total number of messages forwarded in 7 days. The colored bars indicate the size of the APE. The upper part of the figure represents the experimental WeChat data results. The lower part of the figure represents the experimental Weibo data results.

Figure 5. Absolute Percentage Error (APE) distribution of the algorithms in the test set. We show the median and the middle 50th, 70th, and 90th percentiles of the distribution of APE across the forward messages. The upper part of the figure represents the experimental WeChat data results. The lower part of the figure represents the experimental Weibo data results.

Figure 6. The APE distribution and the MAPE and TIC indices as functions of the known period $T_{known}$ when predicting the final forwarding amount after seven days. The X-axis is the known propagation time, and the Y-axis is the ratio of the APE for predicting the final forwarding amount of messages. Compared with the BS method of predicting the popularity of information, the AD method performs better on all of these measures. The upper part of the figure represents the experimental WeChat data results. The lower part of the figure represents the experimental Weibo data results.

Figure 7. MAPE of the messages as a function of the known information in the AD algorithm on the WeChat dataset. The X-axis is the known propagation time, and the Y-axis is the MAPE for predicting the final forwarding amount of messages. The red line represents the messages that have reached their peak $Q_{peak}$ by $T_{known}$, while the blue line represents the messages that have not reached their peak $Q_{peak}$ by $T_{known}$. The inset shows the ratio of real and fake peaks in information propagation over the first known 120 min. The AD algorithm predicts more accurately when the $Q_{peak}$ of the message is known.

Figure 8. APE distribution of the messages in the AD algorithm on the WeChat dataset when the peak forwarding amount $Q_{peak}$ is known (left panels) and not known (right panels). The X-axis represents the number of messages forwarded in the known time $T_{known}$, and the Y-axis represents the total number of messages forwarded in 7 days.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wu, L.; Yi, L.; Ren, X.-L.; Lü, L. Predicting the Popularity of Information on Social Platforms without Underlying Network Structure. Entropy 2023, 25, 916. https://doi.org/10.3390/e25060916

