delphi1987's personal blog http://blog.sciencenet.cn/u/delphi1987

How should publication keywords be selected for domain knowledge analysis? A comparison of three methods

6674 views | 2016-1-30 16:57 | Personal category: Paper exchange | System category: Paper exchange

Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods

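This paper was submitted to the Journal of Informetrics at the end of 2014 and appeared in JOI 2016, Issue 1. It builds on "Research on keyword selection methods in domain knowledge analysis" (《领域知识分析中的关键词选择方法研究》), published in the Journal of the China Society for Scientific and Technical Information (《情报学报》), and adds the TF-IDF method, so that three methods are compared: TF, TF-IDF, and TF-KAI. (The first draft also included an LDA method, but its results were close to those of TF, so it was removed during revision.) For the overall line of thinking, see "A framework of ideas for optimizing co-word analysis" (共词分析方法优化的一个思路体系).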


Highlights
Selecting keywords as analysis object is important but lesser-noticed in bibliometrics.

Keywords should be investigated based on their status both inside and outside the domain.

The Keyword Activity Index is utilized to identify topical emphasis of a domain.

The keywords selected by TF, TF-IDF, and TF-KAI are evaluated both qualitatively and quantitatively.

TF-KAI performs better in selecting publication keywords for domain analysis.


Abstract

Publication keywords have been widely utilized to reveal the knowledge structure of research domains. An important but under-addressed problem is the decision of which keywords should be retained as analysis objects after a great number of keywords are gathered from domain publications. In this paper, we discuss the problems with the traditional term frequency (TF) method and introduce two alternative methods: TF-inverse document frequency (TF-IDF) and TF-Keyword Activity Index (TF-KAI). These two methods take into account keyword discrimination by considering their frequency both in and out of the domain. To test their performance, the keywords they select in China's Digital Library domain are evaluated both qualitatively and quantitatively. The evaluation results show that the TF-KAI method performs the best: it can retain keywords that match expert selection much better and reveal the research specialization of the domain with more details.
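
To make the contrast between the three scores concrete, here is a minimal sketch in Python. It rests on assumptions that are not taken from the paper: the IDF is the plain log(N / df) form, the Keyword Activity Index is taken to be the keyword's share inside the domain divided by its share in the whole collection, and TF-KAI is taken to be TF multiplied by that index. The function names, keywords, and counts are all invented for illustration and are not the study's data.

```python
import math

def score_keywords(domain_tf, collection_tf, domain_docs, collection_docs):
    """Return {keyword: (tf, tf_idf, tf_kai)} under the assumed formulas.

    domain_tf:       papers in the domain carrying each keyword
    collection_tf:   papers in the whole collection (domain included) carrying it
    domain_docs:     number of papers in the domain
    collection_docs: number of papers in the whole collection
    """
    scores = {}
    for kw, tf in domain_tf.items():
        all_tf = collection_tf[kw]
        # Assumed IDF: rarer across the whole collection means a higher weight.
        idf = math.log(collection_docs / all_tf)
        # Assumed activity index: share in the domain over share in the collection.
        kai = (tf / domain_docs) / (all_tf / collection_docs)
        scores[kw] = (tf, tf * idf, tf * kai)
    return scores

def top_k(scores, index, k):
    """Rank keywords by one score (0 = TF, 1 = TF-IDF, 2 = TF-KAI), keep the top k."""
    ranked = sorted(scores, key=lambda kw: scores[kw][index], reverse=True)
    return ranked[:k]

# Invented toy counts, for illustration only.
domain_tf = {"digital library": 900, "information service": 300,
             "metadata": 120, "ontology": 40}
collection_tf = {"digital library": 1000, "information service": 3000,
                 "metadata": 150, "ontology": 400}

scores = score_keywords(domain_tf, collection_tf,
                        domain_docs=2000, collection_docs=100000)

for name, index in (("TF", 0), ("TF-IDF", 1), ("TF-KAI", 2)):
    print(name, top_k(scores, index, 2))
```

With these invented counts, plain TF keeps "information service" in its top two because it is frequent everywhere, while the assumed TF-KAI score promotes "metadata", which is less frequent but concentrated inside the domain. This is the kind of in-domain versus out-of-domain discrimination the abstract describes.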


Paper link: Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods




https://wap.sciencenet.cn/blog-821540-953484.html
