Details

结合扩充词典与自监督学习的网络评论情感分类    

Sentiment Classification of Network Reviews Combining Extended Dictionary and Self-supervised Learning

Document type: Journal article

Chinese title: 结合扩充词典与自监督学习的网络评论情感分类

English title: Sentiment Classification of Network Reviews Combining Extended Dictionary and Self-supervised Learning

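Authors: 景丽 (Jing Li) [1]; 李曼曼 (Li Manman) [1]; 何婷婷 (He Tingting) [1]

First author: 景丽 (Jing Li)

Affiliation: [1] School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450046

First affiliation: School of Computer and Information Engineering, Henan University of Economics and Law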

Year: 2020

Volume: 47

Issue: S02

Pages: 78-82

Chinese journal name: 计算机科学

English journal name: Computer Science

Indexed in: CSTPCD; PKU Core Journals (北大核心2017); CSCD (CSCD_E2019_2020)

Funding: National Natural Science Foundation of China (61806073, 31700858, 61802110).

Language: Chinese

Chinese keywords: 网络评论; 情感分类; 词向量; 情感词典; 机器学习

English keywords: Internet reviews; Sentiment classification; Word vectors; Sentiment dictionary; Machine learning

Abstract: In the rapidly developing Internet era, sentiment analysis of online reviews plays an important role in analyzing public opinion and monitoring e-commerce. Existing classification methods fall mainly into sentiment dictionary methods and machine learning methods. Sentiment dictionary methods depend heavily on the sentiment words in the dictionary: the more complete the dictionary and the more pronounced the sentiment tendency of a review, the better the classification result, but they perform poorly on reviews whose sentiment tendency is hard to distinguish. Machine learning methods are supervised, and their classification performance relies on a large pre-annotated corpus; annotation is currently done manually, and the workload is very heavy. This paper combines the characteristics of the two methods to build a sentiment classification model for online reviews. First, the sentiment dictionary is expanded using online reviews from the relevant domain, and the sentiment value of each review is calculated with the extended dictionary. According to a preset sentiment threshold, reviews with a pronounced sentiment tendency, which the dictionary classifies with higher accuracy, are selected as the definite set, and the remaining reviews, which are not easily distinguished, form the uncertain set. The classification of the definite set is determined directly by the sentiment value. Second, a classifier is trained through self-supervised learning on the definite set produced by the sentiment dictionary method, so the training data require no manual annotation. Finally, the trained classifier reclassifies the uncertain set, and an improved algorithm is used to refine the classification result for the uncertain set. Experiments show that, compared with the sentiment dictionary method and the machine learning method, the proposed model achieves better sentiment classification results on two datasets, hotel reviews and Jingdong reviews.
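
The pipeline outlined in the abstract (dictionary scoring, a threshold split into a definite and an uncertain set, training on the definite set without manual labels, then reclassifying the uncertain set) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy lexicon, the threshold value, the English tokens, and all function names are hypothetical. A bag-of-words naive Bayes classifier stands in for whatever classifier and features the paper actually uses, and the "improved algorithm" applied to the uncertain set is not specified in the abstract, so it is omitted here.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (word -> polarity weight); not the paper's dictionary.
LEXICON = {"great": 2.0, "clean": 1.0, "friendly": 1.0,
           "dirty": -2.0, "rude": -1.0, "noisy": -1.0}
THRESHOLD = 2.0  # assumed value; the abstract only says the threshold is preset

# Hypothetical tokenized reviews, used only to exercise the sketch.
REVIEWS = [
    ["great", "room", "friendly", "staff"],
    ["clean", "room", "great", "breakfast"],
    ["dirty", "room", "rude", "staff"],
    ["noisy", "street", "dirty", "bathroom"],
    ["breakfast", "friendly"],
    ["street", "bathroom", "noisy"],
]


def sentiment_value(tokens):
    """Sum the lexicon weights of the tokens in one review."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)


def split_by_threshold(reviews):
    """Return (definite, uncertain): definite reviews carry a pseudo-label."""
    definite, uncertain = [], []
    for tokens in reviews:
        score = sentiment_value(tokens)
        if abs(score) >= THRESHOLD:
            definite.append((tokens, 1 if score > 0 else 0))
        else:
            uncertain.append(tokens)
    return definite, uncertain


def train_naive_bayes(labelled):
    """Count words per class for a multinomial naive Bayes model."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for tokens, label in labelled:
        word_counts[label].update(tokens)
        class_counts[label] += 1
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, tokens):
    """Pick the class with the higher add-one-smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in (0, 1):
        logp = math.log((class_counts[label] + 1) / (total + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # words unseen in training are ignored
                logp += math.log((word_counts[label][t] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Dictionary pass, then train on pseudo-labels, then relabel the uncertain set.
definite, uncertain = split_by_threshold(REVIEWS)
model = train_naive_bayes(definite)
uncertain_labels = [predict(model, tokens) for tokens in uncertain]
```

The point of the sketch is that the classifier can pick up words that are absent from the lexicon (here "breakfast" or "street") from the pseudo-labelled definite set, which is how the second stage can help on reviews the dictionary alone scores near zero.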

