Details
时空大数据分布式增量IMSTDCA聚类方法研究
Research on the distributed incremental IMSTDCA clustering method on spatio-temporal big data
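Document type: Journal article
Chinese title: 时空大数据分布式增量IMSTDCA聚类方法研究
English title: Research on the distributed incremental IMSTDCA clustering method on spatio-temporal big data
Authors: Li Xin (李欣)[1,2]; Meng Deyou (孟德友)[1,2]
First author: Li Xin (李欣)
Affiliations: [1] Collaborative Innovation Center of Henan Province for the Coordinated Development of the "Three Modernizations" in the Central Plains Economic Zone, Henan University of Economics and Law (河南财经政法大学中原经济区"三化"协调发展河南省协同创新中心); [2] College of Resources and Environment, Henan University of Economics and Law (河南财经政法大学资源与环境学院)
First affiliation: Henan University of Economics and Law (河南财经政法大学)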
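Year: 2017
Volume: 26
Issue: 11
Pages: 12-17
Chinese journal title: 测绘工程
Foreign-language journal title: Engineering of Surveying and Mapping
Indexed in: CSTPCD; CSCD (CSCD_E2017_2018)
Funding: National Natural Science Foundation of China (41501178); Doctoral Research Start-up Fund of Henan University of Economics and Law (800257)
Language: Chinese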
Chinese keywords: 时空数据;大数据;聚类分析;增量聚类;时空邻域
Foreign-language keywords: spatio-temporal data; big data; cluster analysis; incremental clustering; spatio-temporal neighborhood
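Abstract: Spatio-temporal cluster analysis is an effective way to make use of spatio-temporal big data. Traditional clustering algorithms currently suffer from several shortcomings: large-scale distributed data are hard to handle, massive data take a long time to process, parameters are difficult to determine, and clustering quality is poor. This paper therefore proposes a distributed incremental clustering process, DICP, which uses a distributed incremental clustering method over a wide area network to avoid transferring and copying large volumes of data and to improve clustering efficiency. For the spatio-temporal clustering algorithm within the DICP process, the paper studies IMSTDCA, a spatio-temporal data clustering algorithm for big data environments. Drawing on the idea of density-based clustering, it completes the analysis in three steps (pre-analysis of the clustering tendency of the spatio-temporal data, the spatio-temporal clustering algorithm, and evaluation of the clustering results), enabling fast and efficient information mining from spatio-temporal big data.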
Spatio-temporal clustering analysis is an effective means of exploiting spatio-temporal big data. Traditional clustering algorithms have several shortcomings: large-scale distributed data are difficult to handle, processing massive data takes a long time, parameters are hard to determine, and the quality of the clustering result is low. This paper therefore proposes a distributed incremental clustering process (DICP) based on MapReduce, which avoids transferring and copying large amounts of data and greatly improves the efficiency of the clustering operation. Within DICP, the paper studies the IMSTDCA spatio-temporal data clustering algorithm for big data. Drawing on density-based clustering, the algorithm proceeds in three steps: pre-analysis of the clustering tendency of the spatio-temporal data, the spatio-temporal clustering itself, and evaluation of the clustering result. It can extract valuable information from spatio-temporal big data quickly and efficiently.
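The record does not give the details of IMSTDCA, so the following is not the authors' algorithm. It is a minimal, generic sketch of the density-clustering idea the abstract refers to, in the style of DBSCAN with a spatio-temporal neighborhood. The function names, the `(x, y, t)` point layout, and the parameters `eps_space`, `eps_time`, and `min_pts` are all assumptions made for illustration. The distributed and incremental parts of DICP are not shown.

```python
from math import hypot

NOISE = -1  # label for points that belong to no cluster


def st_neighbors(points, i, eps_space, eps_time):
    """Indices of points within eps_space (Euclidean) and eps_time of point i.

    The point itself is included, as in standard DBSCAN neighbor counts.
    """
    xi, yi, ti = points[i]
    return [
        j
        for j, (x, y, t) in enumerate(points)
        if hypot(x - xi, y - yi) <= eps_space and abs(t - ti) <= eps_time
    ]


def st_dbscan(points, eps_space, eps_time, min_pts):
    """Label each (x, y, t) point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        seeds = st_neighbors(points, i, eps_space, eps_time)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = st_neighbors(points, j, eps_space, eps_time)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
        cluster += 1
    return labels


if __name__ == "__main__":
    # Two dense groups and one point isolated in time (hypothetical data).
    pts = [
        (0.0, 0.0, 0), (0.5, 0.0, 1), (0.0, 0.5, 2),
        (10.0, 10.0, 0), (10.5, 10.0, 1), (10.0, 10.5, 1),
        (0.0, 0.0, 100),
    ]
    print(st_dbscan(pts, eps_space=1.0, eps_time=2.0, min_pts=3))
```

The last point shares its location with the first group but lies far away in time, so a purely spatial neighborhood would absorb it while the spatio-temporal one marks it as noise. The neighbor search here is a quadratic scan, which a real big-data setting would presumably replace with an index or a partitioned computation.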