Information Research

A Deep Neural Network for Book Title Identification in Microblog

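  • Zhu Nana,
  • Jing Dong,
  • Xue Han
  • 1. Harbin University Library, Harbin 150001;
    2. School of Mechatronics Engineering, Harbin Institute of Technology, Harbin 150001;
    3. Harbin Engineering University Library, Harbin 150001
Zhu Nana (ORCID: 0000-0002-6511-1081), librarian; Xue Han (ORCID: 0000-0001-6878-0486), librarian, PhD.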

Received date: 2016-01-12

Revised date: 2016-02-06

Online published: 2016-02-20

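Funding

This paper is one of the research outcomes of the National Social Science Fund of China project "Research on Collaborative Emergency Response Mechanisms for Public Emergencies on Social Media" (Grant No. 14CXW045) and the Ministry of Education Humanities and Social Sciences research project "Real-Time Analysis and Trend Prediction of the Propagation Paths of Public Emergencies on Microblogs" (Grant No. 13YJC860013).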


Cite this article

Zhu Nana, Jing Dong, Xue Han. A Deep Neural Network for Book Title Identification in Microblog[J]. Library and Information Service, 2016, 60(4): 102-106, 141. DOI: 10.13266/j.issn.0252-3116.2016.04.014

Abstract

[Purpose/significance] Microblogging, an emerging social media platform, has attracted wide attention from internet users. Microblog data contain large amounts of user profile information, user behavior, and user-generated content. Automatically identifying book titles in microblog posts helps analyze users' reading interests, collect users' opinions of books, and mine book-related knowledge. [Method/process] Based on the characteristics of microblog data, this paper proposes a representation learning method built on a deep neural network. It uses continuous vector representations of the context surrounding each candidate book title to identify book titles in microblog content automatically. [Result/conclusion] Experimental results show that the proposed method significantly outperforms traditional supervised machine learning methods based on feature engineering, reaching a precision of 91.92%.
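The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea (a continuous vector built from the context around a candidate book title), not the authors' implementation. Several pieces are assumptions made for illustration: candidates are taken to be spans wrapped in the Chinese title marks 《》, tokens are single characters, the embeddings are fixed random vectors standing in for learned ones, and the names `DIM`, `WINDOW`, `embed`, `find_candidates`, and `context_vector` are hypothetical.

```python
import random
import re
from typing import Dict, List, Tuple

DIM = 8      # hypothetical embedding size
WINDOW = 5   # hypothetical number of context characters kept on each side

_rng = random.Random(0)
_table: Dict[str, List[float]] = {}


def embed(token: str) -> List[float]:
    """Return a fixed random vector per token (a stand-in for learned embeddings)."""
    if token not in _table:
        _table[token] = [_rng.uniform(-1.0, 1.0) for _ in range(DIM)]
    return _table[token]


def find_candidates(text: str) -> List[Tuple[int, int, str]]:
    """Assumption: a candidate is any span wrapped in title marks.

    Returns (start, end, title), where start/end cover the marks themselves.
    """
    return [(m.start(), m.end(), m.group(1))
            for m in re.finditer(r"《([^《》]+)》", text)]


def context_vector(text: str, start: int, end: int) -> List[float]:
    """Average the embeddings of the characters around the candidate span."""
    left = text[max(0, start - WINDOW):start]
    right = text[end:end + WINDOW]
    tokens = [ch for ch in left + right if not ch.isspace()]
    if not tokens:
        return [0.0] * DIM
    vectors = [embed(t) for t in tokens]
    # Column-wise mean over all context token vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


post = "刚读完《百年孤独》,强烈推荐给大家"
features = [(title, context_vector(post, s, e))
            for s, e, title in find_candidates(post)]
```

Title marks also enclose films, songs, and TV series, which is why a candidate still needs to be classified from its context. In a sketch like this, each context vector would be passed to a downstream classifier that decides whether the candidate is a book title.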

