Details

基于内容相似度的相关性评分算法对比分析研究    

Comparative analysis of correlation scoring algorithms based on content similarity

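Document type: Journal article

Chinese title: 基于内容相似度的相关性评分算法对比分析研究

English title: Comparative analysis of correlation scoring algorithms based on content similarity

Authors: 鲍治国[1]; 王海安[1]; 胡士伟[1]; 马西锋[1]

First author: 鲍治国

Affiliation: [1] School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450046

First affiliation: School of Computer and Information Engineering, Henan University of Economics and Law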

Year: 2022

Volume: 36

Issue: 19

Pages: 52-55

Journal name (Chinese): 电子测试

Journal name (English): Electronic Test

Language: Chinese

Chinese keywords: 文本相似度; BM25算法; TF-IDF算法; 语义化分析

English keywords: text similarity; BM25 algorithm; TF-IDF algorithm; semantic analysis

Abstract: Intelligent recommendation is usually implemented in one of two ways: user-based collaborative filtering, or recommendation based on content similarity. Collaborative filtering generally requires a relatively large user base, so this paper focuses on recommendation based on content similarity. Such a system typically has to score the relevance between each document and a corresponding title or morpheme, and it uses those scores to give each user personalized recommendations and thus a better experience. The TF-IDF and BM25 algorithms are used for this relevance scoring. The paper discusses the principles, advantages and disadvantages, and improvement schemes of the two algorithms, with emphasis on the differences and connections between TF-IDF and BM25.
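The two scoring methods named in the abstract can be contrasted with a minimal sketch. This is an illustration only, not code from the paper: the function names are hypothetical, documents are assumed to be non-empty lists of tokens, both scorers share one smoothed IDF so that only the term-frequency treatment differs, and `k1=1.5`, `b=0.75` are commonly used defaults assumed here.

```python
import math
from collections import Counter


def idf(term, docs):
    """Smoothed inverse document frequency; the +1 keeps it non-negative."""
    n = sum(1 for d in docs if term in d)  # documents containing the term
    return math.log(1 + (len(docs) - n + 0.5) / (n + 0.5))


def tfidf_score(query, doc, docs):
    """TF-IDF: length-normalised term frequency times IDF, summed over query terms."""
    tf = Counter(doc)
    return sum((tf[t] / len(doc)) * idf(t, docs) for t in query)


def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """BM25: term frequency saturates (k1) and is scaled by relative length (b)."""
    tf = Counter(doc)
    avgdl = sum(len(d) for d in docs) / len(docs)  # average document length
    norm = k1 * (1 - b + b * len(doc) / avgdl)
    return sum(idf(t, docs) * tf[t] * (k1 + 1) / (tf[t] + norm) for t in query)


if __name__ == "__main__":
    # Toy corpus, made up for illustration only.
    corpus = [
        ["bm25", "ranking", "ranking"],
        ["tfidf", "weighting"],
        ["cooking", "recipes"],
    ]
    query = ["ranking"]
    for doc in corpus:
        print(doc, tfidf_score(query, doc, corpus), bm25_score(query, doc, corpus))
```

In this sketch the TF-IDF contribution of a term grows linearly with its relative frequency in the document, while the BM25 contribution is bounded above by `idf * (k1 + 1)` however often the term repeats; `b` controls how strongly longer-than-average documents are penalised.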

