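Expert Viewpoint

Review and Prospect of Overseas Research on Data Science

  • Wang Yuefen ,
  • Xie Qingnan ,
  • Song Xiaokang
  • 1. School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094;
    2. School of Intellectual Property, Nanjing University of Science & Technology, Nanjing 210094
Wang Yuefen (ORCID: 0000-0002-7143-7766), professor, doctoral supervisor, E-mail: yuefen163@mail.163.com; Xie Qingnan, doctoral candidate; Song Xiaokang, master's student.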

Received date: 2016-06-16

  Revised date: 2016-07-02

  Online published: 2016-07-20

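Funding

This paper is one of the research outcomes of the National Natural Science Foundation of China project "Research on the Growth of Scientific Literature Dissemination Networks in New Research Fields and Its Influence on Dissemination Effects" (Grant No. 71373124) and of the Nanjing University of Science and Technology research fund (supported by the Fundamental Research Funds for the Central Universities) project "Research on an Innovative Knowledge Service System Based on Deep Integration in the Big Data Era and Its Operating Mechanism" (Grant No. 30916011330).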


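Abstract

[Purpose/significance] With the development of big data, data science has become a new research field. Clarifying the development and current state of data science research abroad can provide a reference for future research in China. [Method/process] Literature on data science published abroad was retrieved from the Web of Science Core Collection, and a bibliometric analysis was carried out on external features such as authors and research areas and on content features such as keywords and topics. On this basis, the articles were read and classified one by one, and a comprehensive analysis was conducted from two aspects: the definition of data science and its application directions. [Result/conclusion] Finally, drawing on the problems facing existing research and the methods proposed to solve them, the paper offers an outlook on the future of data science.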

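Cite this article

Wang Yuefen, Xie Qingnan, Song Xiaokang. Review and Prospect of Overseas Research on Data Science[J]. Library and Information Service (图书情报工作), 2016, 60(14): 5-14. DOI: 10.13266/j.issn.0252-3116.2016.14.001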


