




Evaluating Chinese Answers' Quality in the Community QA System: A Case Study of Zhihu

  • Wang Wei,
  • Ji Yuqiang,
  • Wang Hongwei,
  • Zheng Lijuan
  • 1. College of Business Administration, Huaqiao University, Quanzhou 362021;
    2. School of Economics and Management, Tongji University, Shanghai 200092;
    3. School of Business, Liaocheng University, Liaocheng 252000

Received: 2017-07-02

Revised: 2017-09-09

Published online: 2017-11-20




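Wang Wei, Ji Yuqiang, Wang Hongwei, Zheng Lijuan. Evaluating Chinese Answers' Quality in the Community QA System: A Case Study of Zhihu [J]. Library and Information Service, 2017, 61(22): 36-44. DOI: 10.13266/j.issn.0252-3116.2017.22.005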


[Purpose/significance] Online Q&A communities have become a major channel for accessing high-quality knowledge. Studying answer quality in Chinese Q&A communities is therefore meaningful for promoting the dissemination of knowledge.

[Method/process] This paper focuses on Zhihu, the largest Chinese Q&A community. Using data mining and machine learning, we built three classification models (logistic regression, support vector machine, and random forest) and trained them in three progressive levels to predict answer quality. The underlying feature set comprises structured features, text features, and social features.

[Result/conclusion] The experimental results show that the performance of all three classification models improves significantly as the feature system is progressively enriched. The random forest model consistently outperforms the other models at the same feature level. Moreover, across the feature combinations analyzed, random forest models that include social features consistently outperform those without them, which reflects the value of social attributes in evaluating answer quality. We conclude that it is reasonable to evaluate answer quality from both the answer itself and the author's social attributes, and that the feature system we built reflects answer quality comprehensively.
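The three-level comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, a synthetic dataset in place of Zhihu data, a binary quality label, and cumulative levels (structured, then structured plus text, then all three groups). The column groupings, hyperparameters, and the names `FEATURE_GROUPS`, `LEVELS`, `make_models`, and `evaluate_levels` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for answer data: 300 "answers", 9 numeric features,
# and a binary quality label (an assumption; the real data is not available here).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=6, n_redundant=0, random_state=0
)

# Hypothetical assignment of columns to the three feature groups.
FEATURE_GROUPS = {
    "structured": [0, 1, 2],
    "text": [3, 4, 5],
    "social": [6, 7, 8],
}

# Assumed progressive levels: each level adds one more feature group.
LEVELS = {
    "level 1": ["structured"],
    "level 2": ["structured", "text"],
    "level 3": ["structured", "text", "social"],
}


def make_models():
    """Return fresh instances of the three classifier families named in the abstract."""
    return {
        "logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }


def evaluate_levels(X, y, cv=5):
    """Cross-validate every model on every feature level; return mean accuracies."""
    results = {}
    for level_name, groups in LEVELS.items():
        # Collect the column indices of all groups included at this level.
        cols = [c for g in groups for c in FEATURE_GROUPS[g]]
        for model_name, model in make_models().items():
            scores = cross_val_score(model, X[:, cols], y, cv=cv, scoring="accuracy")
            results[(level_name, model_name)] = scores.mean()
    return results


results = evaluate_levels(X, y)
for (level, model), acc in results.items():
    print(f"{level:8s} {model:20s} mean accuracy = {acc:.3f}")
```

With real data, the columns in `FEATURE_GROUPS` would be replaced by the actual structured, text, and social features, and the per-level accuracies could then be compared across models in the way the abstract describes. The numbers produced on the synthetic data carry no meaning about Zhihu.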


