[Purpose/significance] Automatically identifying and extracting research design fingerprints from scientific papers can give researchers substantial methodological support for project design, for evaluating the validity of research methods, for diagnosing problems in the research process, and for identifying and evaluating research results. [Method/process] Based on a conceptual model of research design fingerprints in scientific papers, this paper proposes a hybrid method that combines multiple rules with machine learning, uses it to design and implement a fingerprint identification algorithm model, and verifies the feasibility and validity of the method on sample data from the field of data mining. [Result/conclusion] The results show that, except for research data and research trends, the recognition accuracy for the research design fingerprint elements is close to 80%. The coverage rate is likewise close to 80% for all elements except research tools and research data.
Qian Li, Zhang Xiaolin, Wang Qian. Building and Implement on Automatic Identification Method of Research Design Fingerprint of Scientific Papers[J]. Library and Information Service, 2018, 62(2): 135-143.
DOI: 10.13266/j.issn.0252-3116.2018.02.018