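Knowledge Organization

Research on Constructing an ADR Signal Extraction Model Integrating Statistical Learning and Semantic Filtering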

  • Wei Wei,
  • Zheng Du
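  • 1. Institute of Big Data, Zhongnan University of Economics and Law, Wuhan 430074;
    2. School of Information Management, Wuhan University, Wuhan 430072
Wei Wei (ORCID: 0000-0003-3580-8360), lecturer, PhD, E-mail: 503175355@qq.com; Zheng Du, PhD candidate.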

Received: 2017-09-07

  Revised: 2017-12-05

  Published online: 2018-03-05

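Funding

This work is one of the research outputs of the National Natural Science Foundation of China project "Research on Intelligent Consulting Services Based on Text and Web Semantic Analysis" (Grant No. 71673209).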

The Study of Adverse Drug Reaction Signal Extraction Framework Based on the Integrated Statistical Learning and Semantic Filter

  • Wei Wei ,
  • Zheng Du
  • 1. Big Data Institute, Zhongnan University of Economics and Law, Wuhan 430074;
    2. The Center for the Studies of Information Resources, Wuhan University, Wuhan 430072

Received date: 2017-09-07

  Revised date: 2017-12-05

  Online published: 2018-03-05

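Abstract

[Purpose/significance] Social media offers a new channel for collecting healthcare data. Applying natural language processing to extract patient-reported ADR (Adverse Drug Reaction) signals from social media has great potential to improve clinical and scientific knowledge in ADR monitoring. Extracting such signals from social media nevertheless remains a major challenge. To address this, the paper develops a research model that uses advanced natural language processing techniques to extract ADR signals from health-related social media. [Method/process] The model first identifies medical entities in noisy social media text by matching against multiple dictionary sources. It then extracts adverse drug events with a statistical learning method built on the shortest dependency path kernel. Next, it uses semantic knowledge from drug safety databases to filter out drug treatment and indication information as well as negated adverse drug events. Finally, it classifies the report source to remove hearsay and other noise. [Result/conclusion] The model is validated on data collected from a diabetes forum, and the results show that each component contributes to its overall performance.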
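The shortest dependency path kernel named in the method can be illustrated with a minimal sketch. It assumes the commonly cited Bunescu and Mooney formulation: each position on the path between the two entities carries a set of features, paths of different lengths score zero, and otherwise the kernel is the product of the number of shared features at each position. The feature sets and example paths below are hand-written and hypothetical; the abstract does not state which features or parser the authors used.

```python
from typing import List, Set

# A path is a sequence of feature sets, one per position on the shortest
# dependency path between a drug mention and an event mention.
Path = List[Set[str]]


def sdp_kernel(x: Path, y: Path) -> int:
    """Shortest dependency path kernel (assumed Bunescu-Mooney form).

    Returns 0 when the paths differ in length; otherwise the product,
    over positions, of the number of features the two paths share.
    """
    if len(x) != len(y):
        return 0
    score = 1
    for feats_x, feats_y in zip(x, y):
        score *= len(feats_x & feats_y)
    return score


# Hypothetical paths for "metformin caused nausea" and
# "insulin caused dizziness"; arrows mark dependency direction.
path_a: Path = [
    {"metformin", "NN", "DRUG"},
    {"->"},
    {"caused", "VBD", "Verb"},
    {"<-"},
    {"nausea", "NN", "EVENT"},
]
path_b: Path = [
    {"insulin", "NN", "DRUG"},
    {"->"},
    {"caused", "VBD", "Verb"},
    {"<-"},
    {"dizziness", "NN", "EVENT"},
]

similarity = sdp_kernel(path_a, path_b)  # 2 * 1 * 3 * 1 * 2
```

In a full system this function would typically serve as a custom kernel for a support vector machine trained on annotated drug-event pairs; that wiring is omitted here.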

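Cite this article

Wei Wei, Zheng Du. Research on Constructing an ADR Signal Extraction Model Integrating Statistical Learning and Semantic Filtering[J]. Library and Information Service (图书情报工作), 2018, 62(5): 115-124. DOI: 10.13266/j.issn.0252-3116.2018.05.013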

Abstract

[Purpose/significance] The emergence of social media provides a new way to collect healthcare data. Extracting patient-reported adverse drug reaction (ADR) signals from social media with natural language processing techniques has great potential to improve clinical and scientific knowledge of ADR monitoring. However, extracting ADR signals from patients' reports on social media is still a major challenge. This paper puts forward an ADR signal extraction framework based on advanced natural language processing techniques. [Method/process] The framework has four steps. First, it recognizes medical entities in noisy social media text by matching against multiple dictionary sources. Second, it applies statistical learning based on the shortest dependency path kernel to extract adverse drug events. Third, it uses the semantic knowledge of drug safety databases to filter out information on drug treatments and indications as well as negated adverse drug events. Finally, it classifies the source of each report to remove hearsay and other noise. [Result/conclusion] Data collected from a diabetes forum are used to validate the model, and the results show that each part of the model contributes to its overall performance.
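The entity recognition and semantic filtering steps can be sketched roughly as follows. Everything here is a hypothetical stand-in: the lexicons, the indication table, the negation cue list, and the three-token negation window are toy assumptions, and pairing every co-occurring drug and event replaces the kernel-based relation extraction step purely for illustration.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy resources; a real system would load drug and
# symptom vocabularies and a drug safety database instead.
DRUG_LEXICON: Set[str] = {"metformin", "insulin"}
EVENT_LEXICONS: List[Set[str]] = [
    {"nausea", "dizziness"},
    {"high blood sugar", "weight gain"},
]
KNOWN_INDICATIONS: Dict[str, Set[str]] = {"metformin": {"high blood sugar"}}
NEGATION_CUES: Set[str] = {"no", "not", "never", "without"}


def find_terms(text: str, lexicon: Set[str]) -> List[Tuple[int, str]]:
    """Return (offset, term) for every whole-word lexicon hit in text."""
    lowered = text.lower()
    hits = []
    for term in lexicon:
        for m in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append((m.start(), term))
    return sorted(hits)


def extract_candidates(sentence: str) -> List[Tuple[str, str]]:
    """Pair every drug with every event found in the same sentence."""
    drugs = find_terms(sentence, DRUG_LEXICON)
    events = find_terms(sentence, set().union(*EVENT_LEXICONS))
    return [(d, e) for _, d in drugs for _, e in events]


def is_negated(sentence: str, event: str, window: int = 3) -> bool:
    """True if a negation cue occurs within `window` tokens before event."""
    lowered = sentence.lower()
    m = re.search(r"\b" + re.escape(event) + r"\b", lowered)
    if m is None:
        return False
    preceding = re.findall(r"[a-z']+", lowered[: m.start()])[-window:]
    return any(tok in NEGATION_CUES for tok in preceding)


def semantic_filter(sentence: str,
                    pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Drop pairs that are known indications or negated events."""
    kept = []
    for drug, event in pairs:
        if event in KNOWN_INDICATIONS.get(drug, set()):
            continue  # the drug treats this condition; not an ADR
        if is_negated(sentence, event):
            continue  # the patient reports the event did not occur
        kept.append((drug, event))
    return kept


s1 = "I take metformin for high blood sugar and it gave me nausea."
s2 = "Insulin caused no dizziness for me."
signals_1 = semantic_filter(s1, extract_candidates(s1))
signals_2 = semantic_filter(s2, extract_candidates(s2))
```

With these toy inputs, the first sentence keeps only the metformin and nausea pair (the indication pair is filtered), and the second sentence yields nothing because the event is negated. The final report source classification step is not shown.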
