Received date: 2014-07-24
Revised date: 2014-09-04
Online published: 2014-10-05
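Funding
This paper is one of the research outcomes of the National Natural Science Foundation of China project "Context Analysis of Statistical Machine Translation for Patent Documents" (Grant No. 61303152) and the China-Japan international cooperation project "Cooperative Research on Practical Japanese-Chinese Bidirectional Machine Translation for Scientific and Technical Literature" (Grant No. 2014DFA11350).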
Research on Patent Topics Extraction Based on Semantic Role Labeling
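Meng Lingen, Li Ying, He Yanqing, Qu Peng, Wang Huilin. Research on Patent Topics Extraction Based on Semantic Role Labeling[J]. Library and Information Service, 2014, 58(19): 19-24. DOI: 10.13266/j.issn.0252-3116.2014.19.003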
Automatic topic extraction is crucial for mining information from patent literature. Existing patent text analysis platforms rely on either manual annotation or templates to find topics. This paper introduces semantic role labeling (SRL) information to help extract patent topics automatically. To improve the effectiveness of SRL on patent literature, it first applies automatic sentence simplification, then labels semantic roles in the simplified sentences, and finally combines the semantic information and frequently used words with a semantic framework to extract patent topics. The experimental results show that the method can extract useful knowledge from patents, demonstrating the value of this study.