Then, we conduct LDA on traces of tweets D for every user. Table 2 lists 5 topics obtained by LDA as an example, together with the top 5 associated words (translated) in each topic.

Table 1. Dataset Statistics.
Users             211,000
Retweets          7,223,036
Original tweets   39,779,870
Relations         1,612,
doi:10.1371/journal.pone.0158855.t

Table 2. Examples of topics and associated words extracted by LDA.
Topic #   Associated words
1         company management marketing brand market
2         user technology intelligence APPs Android
3         children parents education teacher cultivation
4         designs photo works photography creativity style
5         match sports the world cup seasons NBA
doi:10.1371/journal.pone.0158855.t002

PLOS ONE | DOI:10.1371/journal.pone.0158855 July 14, Discover Influential Leaders

Experiment results

In this section, we compare against related algorithms on the above dataset using Spark. The related algorithms studied include:

• PageRank, which measures influence by taking the link structure of the network into account. The experiment setting in the comparison is as follows: the return probability is 0.15 and the transition probability is $p_{ij} = 1/k_i^{out}$, representing the probability that i goes to j, where $k_i^{out}$ denotes the outbound degree of i.
• LeaderRank, which introduces a ground node connected to all of the nodes and sets the transition probability as $p_{ij} = 1/k_i^{out}$.
• TwitterRank, which first studied topic-related ranking. In this comparison, we use the same topics extracted by LDA as in TD-Rank.

The first experiment examines the topic property of the ranking result. We introduce the entropy of the topic distribution, calculated by:

$$E_i = -\sum_{t=1}^{T} p^{o}_{i,t} \log p^{o}_{i,t}$$

where $p^{o}_{i,t}$ is the weight of topic t in the topic distribution of user i's original tweets and T is the number of topics. Notice that we use the topic distribution of the original tweets here because original tweets are posted by a user to express his or her own interests. Then, we obtain the average entropy of users grouped by position in the ranking list: 1-10, 11-20, 21-30, 31-40 and 41-50.
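The entropy measure and the grouping by rank position can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' Spark implementation: the function names and the toy topic distributions are hypothetical, and the natural logarithm is assumed because the text does not state the base.

```python
import math

def topic_entropy(topic_probs):
    """Entropy E_i = -sum_t p_t * log(p_t) of one user's topic distribution.

    Assumption: natural logarithm; zero-probability topics contribute nothing.
    """
    return -sum(p * math.log(p) for p in topic_probs if p > 0)

def average_entropy_by_rank_group(ranked_distributions, group_size=10):
    """Average entropy over consecutive groups of a ranking list.

    ranked_distributions holds one topic distribution per user, ordered by
    rank, so the groups correspond to positions 1-10, 11-20, and so on.
    """
    averages = []
    for start in range(0, len(ranked_distributions), group_size):
        group = ranked_distributions[start:start + group_size]
        averages.append(sum(topic_entropy(d) for d in group) / len(group))
    return averages

# Hypothetical toy data: ten topic-focused users ranked ahead of ten users
# whose tweets are spread evenly over four topics.
focused = [0.97, 0.01, 0.01, 0.01]
spread = [0.25, 0.25, 0.25, 0.25]
print(average_entropy_by_rank_group([focused] * 10 + [spread] * 10))
```

A ranking that places topic-focused users first yields a lower average entropy in its leading groups, which is the property the comparison looks for.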
The comparison results are shown in Fig 3. For ease of visualization, we first compare PageRank, LeaderRank and the 20 topic-related TD-Rank results in Fig 3(a). For TwitterRank, we select the top 10 users in every topic as an example and compare the results with TD-Rank in Fig 3(b). The users ranked by TD-Rank and TwitterRank have far lower entropy than those ranked by PageRank and LeaderRank, indicating that our proposed algorithm, like TwitterRank, finds topic-related influential leaders. Moreover, in Fig 3(b) the users ranked by TwitterRank have higher entropy than those ranked by TD-Rank, indicating that users in our ranking list are more closely related to the same topic.

Another issue concerning the ranking results is robustness. Many spammers in social networks attempt to gain reputation for advertising purposes [24]. To investigate this issue, we create v edges that link v fake followers to every user and observe the positional changes in the ranking. Specifically, we simulate the situation where a user creates v fake spammers and compare the positional changes in both ranking results. The whole process is as follows. Suppose the user is i; we randomly select v users, denoted u1, u2, . . ., uv. Then the following directed links are created to disturb the algorithm: < u1, i >, < u2, i >, . . ., < uv, i >. The results are reported in Fig 4. The horizontal axis of Fig 4 shows the original rank of a particular user, and the vertical axis is the manipulated.
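The perturbation procedure described above can be illustrated with a small Python sketch. It is a simplified stand-in under stated assumptions, not the experiment code: it uses a plain PageRank with return probability 0.15 and transition probability 1/k_out (the baseline setting given earlier) instead of TD-Rank, it treats a link < u, i > as an edge from follower u to user i, it spreads the score of users without outgoing links uniformly, and all function names and the toy graph are hypothetical.

```python
import random

def pagerank(nodes, edges, return_prob=0.15, iterations=100):
    """Power-iteration PageRank with transition probability 1/k_out.

    Assumption: users with no outgoing links spread their score uniformly.
    """
    out_links = {node: set() for node in nodes}
    for src, dst in edges:
        if src != dst:
            out_links[src].add(dst)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(score[u] for u in nodes if not out_links[u])
        base = return_prob / n + (1 - return_prob) * dangling / n
        new_score = {node: base for node in nodes}
        for u in nodes:
            if out_links[u]:
                share = (1 - return_prob) * score[u] / len(out_links[u])
                for w in out_links[u]:
                    new_score[w] += share
        score = new_score
    return score

def rank_position(score, node):
    """1-based position of node when users are sorted by descending score."""
    ordered = sorted(score, key=lambda x: (-score[x], x))
    return ordered.index(node) + 1

def add_fake_followers(nodes, edges, target, v, rng):
    """Return a copy of edges with links <u_1, target>, ..., <u_v, target> added."""
    candidates = [node for node in nodes if node != target]
    followers = rng.sample(candidates, v)
    return list(edges) + [(u, target) for u in followers]

# Hypothetical toy network: ten users following each other in a ring.
nodes = list(range(10))
edges = [(u, (u + 1) % 10) for u in nodes]
target = 5

original = pagerank(nodes, edges)
manipulated_edges = add_fake_followers(nodes, edges, target, v=4, rng=random.Random(0))
manipulated = pagerank(nodes, manipulated_edges)

print("original rank:", rank_position(original, target))
print("manipulated rank:", rank_position(manipulated, target))
```

Repeating this for every user and plotting the original position against the manipulated position would give a plot in the style of Fig 4; a ranking that is robust to fake followers keeps the two positions close.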