[1] Huang Linhao, Guo Kun. Interaction number prediction of micro-blog based on parallel decision tree[J]. Journal of Fujian University of Technology, 2017, 15(03): 294-300. [doi:10.3969/j.issn.1672-4348.2017.03.019]

Interaction number prediction of micro-blog based on parallel decision tree

Journal of Fujian University of Technology [ISSN:2097-3853/CN:35-1351/Z]

Volume:
Vol. 15
Issue:
2017, No. 03
Pages:
294-300
Publication date:
2017-06-25

Article Info

Title:
Interaction number prediction of micro-blog based on parallel decision tree
Author(s):
Huang Linhao (黄林昊), Guo Kun (郭昆)
Electronic Information and Computer Department, Fujian Radio and TV University
Keywords:
micro-blog; interaction number; parallel; decision tree; forecast
CLC number:
TP 311.5
DOI:
10.3969/j.issn.1672-4348.2017.03.019
Document code:
A
Abstract:
With the rapid development of social networks, micro-blogs have become a major social media platform. To predict the future interaction number of a micro-blog text, and thereby control micro-blog distribution effectively, a method based on a parallel decision tree is proposed for predicting the level to which a micro-blog's interaction number belongs. First, user features and text features are extracted from the micro-blogs each user has previously posted. Then, a parallel decision tree classification algorithm builds a classification model from the training data. Finally, the resulting model classifies new micro-blog texts by the level of their interaction number. Experiments against comparison algorithms show that the proposed method achieves high classification accuracy and good scalability, and can effectively predict the level to which a micro-blog belongs.
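The three steps in the abstract (feature processing, parallel tree construction, level prediction) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the level boundaries in `LEVEL_BOUNDS`, the two toy features (follower count and text length), the entropy-based split criterion, and the choice to parallelize the split search across features with a thread pool. The abstract does not say how the authors parallelize the algorithm.

```python
from bisect import bisect_right
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from math import log2

# Hypothetical level boundaries: level 0 is [0, 10), level 1 is [10, 100), and so on.
LEVEL_BOUNDS = [10, 100, 1000]

def interaction_level(count):
    """Map a raw interaction count to the index of the level it falls in."""
    return bisect_right(LEVEL_BOUNDS, count)

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_split_for_feature(task):
    """Return (gain, feature, threshold) for the best binary split on one feature."""
    rows, labels, f = task
    base = entropy(labels)
    best = (0.0, f, None)
    for t in sorted({r[f] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[f] <= t]
        right = [y for r, y in zip(rows, labels) if r[f] > t]
        if not left or not right:
            continue
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if base - weighted > best[0]:
            best = (base - weighted, f, t)
    return best

def build_tree(rows, labels, pool):
    """Grow a tree; each node's split search runs one task per feature on the pool."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node: leaf holding the level
    tasks = [(rows, labels, f) for f in range(len(rows[0]))]
    gain, f, t = max(pool.map(best_split_for_feature, tasks), key=lambda b: b[0])
    if t is None:
        return Counter(labels).most_common(1)[0][0]  # no useful split: majority leaf
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], pool),
            build_tree([r for r, _ in right], [y for _, y in right], pool))

def predict(tree, row):
    """Walk from the root to a leaf and return the predicted level."""
    while isinstance(tree, tuple):
        f, t, low_branch, high_branch = tree
        tree = low_branch if row[f] <= t else high_branch
    return tree

# Toy data (invented): each row is [follower_count, text_length].
rows = [[50, 20], [80, 140], [5000, 30], [7000, 120], [200000, 60], [300000, 100]]
counts = [3, 7, 40, 85, 900, 2500]
levels = [interaction_level(c) for c in counts]

with ThreadPoolExecutor(max_workers=2) as pool:
    tree = build_tree(rows, levels, pool)

new_level = predict(tree, [60, 50])  # predicted level for an unseen micro-blog
```

Threads in CPython do not speed up CPU-bound work, so the pool only shows where the per-feature split search could be distributed. A cluster framework would replace `pool.map` with its own distributed map over features or data partitions.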


Last Update: 2017-06-25