Derivation of Similar Web Text and Data Provenance

  • Ni Jing,
  • Meng Xianxue
  • 1. Economic Management School, Beijing Institute of Petrochemical Technology, Beijing 102617;
    2. Agricultural Institute of Information, Chinese Academy of Agricultural Sciences, Beijing 100081

Received date: 2016-03-22

Revised date: 2016-06-18

Online published: 2016-07-05


[Purpose/significance] To address the lack of provenance metadata in existing web pages, we propose an automatic annotation method. [Method/process] Using a clustering algorithm, automatic semantic annotation, and linked data technology, combined with the PROV-POL data provenance model, the method detects derivation relationships among web page text entities by building text-level and attribute-level data provenance structures. [Result/conclusion] Tests show that using semantic web technology and the PROV model to obtain the data provenance of web page text is feasible. The recall of the clustering algorithm we applied needs improvement. The method has promising practical value for web provenance.
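
As a rough illustration of the idea in the abstract, the sketch below groups similar page texts and records a derivation link between them. It is not the authors' implementation: the abstract does not specify the clustering algorithm, so a simple pairwise Jaccard similarity over word shingles stands in for it, and the function names, the 0.5 threshold, the example.org URLs, and the rule "the later page is derived from the earlier one" are all assumptions made for the example. The only borrowed term is `prov:wasDerivedFrom`, the derivation property defined in the W3C PROV ontology.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def derivation_triples(pages, threshold=0.5):
    """Emit (newer, 'prov:wasDerivedFrom', older) for similar page pairs.

    pages: list of (url, iso_date, text) tuples.
    Assumption: among two similar pages, the later one derives from the
    earlier one. Pairs with identical dates are skipped as undecidable.
    """
    sets = {url: shingles(text) for url, _, text in pages}
    triples = []
    for (u1, d1, _), (u2, d2, _) in combinations(pages, 2):
        if d1 == d2:
            continue
        if jaccard(sets[u1], sets[u2]) >= threshold:
            # ISO 8601 dates compare correctly as plain strings.
            older, newer = (u1, u2) if d1 < d2 else (u2, u1)
            triples.append((newer, "prov:wasDerivedFrom", older))
    return triples


pages = [
    ("http://example.org/a", "2016-01-01",
     "the quick brown fox jumps over the lazy dog"),
    ("http://example.org/b", "2016-02-01",
     "the quick brown fox jumps over the lazy cat"),
    ("http://example.org/c", "2016-03-01",
     "completely unrelated sentence about agriculture data"),
]
result = derivation_triples(pages)
for triple in result:
    print(triple)
```

A pairwise comparison like this is quadratic in the number of pages, which is one reason a real system would cluster first. The threshold also trades precision against recall, the quantity the abstract reports as needing improvement.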

Cite this article

Ni Jing, Meng Xianxue. Derivation of Similar Web Text and Data Provenance[J]. Library and Information Service, 2016, 60(13): 134-140,148. DOI: 10.13266/j.issn.0252-3116.2016.13.017


