Word Clusters Discovery Based on Semantic Network of Concepts

  • Du Huiping
  • Information Management Department of Shanghai Normal University, Shanghai 200234

Received date: 2016-06-15

Revised date: 2016-10-14

Published online: 2016-11-05

Abstract

[Purpose/significance] This article proposes a new method for recognizing word clusters, which can be used to construct semantic tools and to expand queries. The method can reduce the cognitive burden on experts and make generating and updating semantic tools more efficient. [Method/process] The Island algorithm from social network analysis is used to discover word clusters in a semantic network of concepts, which is generated through word co-occurrence analysis and semantic similarity computation. Taking the finance field as an example, the article compares the proposed method with the hierarchical clustering method and the "same morpheme" method. [Result/conclusion] The Island algorithm performs better than hierarchical clustering algorithms at word cluster recognition. The Island algorithm and the "same morpheme" method each have their own advantages, so they can be combined to complement each other.
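
The pipeline summarized in the abstract (build a weighted concept network, then extract islands from it) can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes edge weights are plain document co-occurrence counts (the semantic similarity component is left out), it uses a simplified recursive threshold cut to find line islands within assumed size bounds, and the function names and toy finance terms are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_weights(docs):
    """Count how many documents each unordered pair of terms shares."""
    weights = Counter()
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            weights[(a, b)] += 1
    return weights


def components(nodes, edges):
    """Connected components of `nodes` using only the given edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def line_islands(weights, min_size=2, max_size=4):
    """Return the largest line islands with min_size <= size <= max_size.

    A component that is too large is split by dropping its weakest edges,
    so every edge kept inside a piece is stronger than any edge leaving it.
    """
    islands = []

    def split(nodes, edges):
        if len(nodes) < min_size:
            return
        if len(nodes) <= max_size:
            islands.append(sorted(nodes))
            return
        if not edges:
            return
        lowest = min(edges.values())
        kept = {e: w for e, w in edges.items() if w > lowest}
        for comp in components(nodes, kept):
            split(comp, {e: w for e, w in kept.items() if e[0] in comp})

    all_nodes = {n for edge in weights for n in edge}
    for comp in components(all_nodes, weights):
        split(comp, {e: w for e, w in weights.items() if e[0] in comp})
    return islands


# Hypothetical toy corpus: each inner list is the keyword set of one document.
docs = [
    ["stock", "share", "equity"],
    ["stock", "share", "equity"],
    ["stock", "share"],
    ["loan", "credit", "mortgage"],
    ["loan", "credit", "mortgage"],
    ["equity", "loan"],
]
print(line_islands(cooccurrence_weights(docs), min_size=2, max_size=4))
```

On this toy input, the single weak link between "equity" and "loan" is cut first, leaving two three-term clusters. Unlike hierarchical clustering, terms that never reach the minimum island size are simply left unassigned.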

Cite this article

Du Huiping. Word Clusters Discovery Based on Semantic Network of Concepts[J]. Library and Information Service, 2016, 60(21): 122-127. DOI: 10.13266/j.issn.0252-3116.2016.21.016

