An Automatic Classification System of Mass Online Academic Literatures

  • Wang Xiaoyue,
  • Bai Rujiang,
  • Wang Xiaodi,
  • Zhu Na
  • Institute of Scientific & Technical Information, Shandong University of Technology, Zibo 255049

Received date: 2013-06-18

Revised date: 2013-07-30

Online published: 2013-08-20

Abstract

With the development of the Internet, the volume of online academic literature has grown exponentially, making it difficult for researchers to make effective use of it. A method for automatically acquiring, processing and classifying this literature is therefore needed. Building on earlier experimental research, this paper designs and implements an automatic classification system for massive online academic literature. The system has a modular design consisting of four modules: automatic fetching, term-document matrix processing, ontology integration and semantics-driven classification. The system is shown to accomplish the acquisition, processing and classification of online academic literature automatically.
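
As an illustration of the term-document matrix that the second module operates on, the following is a minimal sketch, not the system described in the paper. The function name `build_term_document_matrix`, the whitespace tokenization and the two sample documents are assumptions made for this example only.

```python
from collections import Counter


def build_term_document_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[i] in documents[j]."""
    # Assumption: naive tokenization by lowercasing and splitting on whitespace.
    counts = [Counter(doc.lower().split()) for doc in documents]
    # The vocabulary is the sorted set of all terms seen in any document.
    vocabulary = sorted(set().union(*counts))
    # One row per term, one column per document; absent terms count as 0.
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


# Hypothetical sample documents, used only to exercise the function.
docs = [
    "ontology based text classification",
    "text classification with mapreduce",
]
vocab, tdm = build_term_document_matrix(docs)
for term, row in zip(vocab, tdm):
    print(term, row)
```

Each row records how often a term occurs in each document. A corpus-scale system would presumably build this matrix in a distributed way and then map terms to ontology concepts before classification, which this sketch does not attempt.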

Cite this article

Wang Xiaoyue, Bai Rujiang, Wang Xiaodi, Zhu Na. An Automatic Classification System of Mass Online Academic Literatures[J]. Library and Information Service, 2013, 57(16): 117-122. DOI: 10.7536/j.issn.0252-3116.2013.16.022

