Details

Manifold Clustering and Visualization with Commute Time Distance (EI indexed)

Document type: Journal article

Chinese title: 基于通勤时间距离的流形聚类与可视化

English title: Manifold Clustering and Visualization with Commute Time Distance

Authors: Shao Chao (邵超)[1]; Zhang Xiaojian (张啸剑)[1]

First author: Shao Chao (邵超)

Affiliation: [1] School of Computer and Information Engineering, Henan University of Economics and Law

First affiliation: School of Computer and Information Engineering, Henan University of Economics and Law

Year: 2015

Volume: 52

Issue: 8

Pages: 1757-1767

Chinese journal name: 计算机研究与发展

English journal name: Journal of Computer Research and Development

Indexed in: CSTPCD; EI (accession no. 20153901305126); Scopus (accession no. 2-s2.0-84941985228); Peking University Core Journals (2014 edition); CSCD (2015-2016)

Funding: National Natural Science Foundation of China (61202285)

Language: Chinese

Chinese keywords: 流形学习; 等距映射; 聚类; 邻域大小; 通勤时间距离

English keywords: manifold learning; isometric mapping (ISOMAP); clustering; neighborhood size; commute time distance

Abstract: Existing manifold learning algorithms can effectively learn and visualize the low-dimensional nonlinear manifold structure of high-dimensional data. However, they remain sensitive to the neighborhood size parameter, which is difficult to select efficiently, and they require the data to be well sampled from a single manifold. To reduce this sensitivity and to achieve good clustering and visualization of multi-manifold data, this paper proposes a new manifold learning algorithm based on commute time distance, called CTD-ISOMAP (commute time distance isometric mapping). Compared with Euclidean distance, commute time distance probabilistically takes into account all paths connecting two points in the neighborhood graph. It is therefore more robust and also reflects the intrinsic nonlinear geometric structure of the data, which makes it suitable for identifying the shortcut ("short-circuit") edges and inter-manifold edges that may exist in the neighborhood graph. By removing shortcut edges, CTD-ISOMAP recovers the low-dimensional nonlinear manifold structure over a much wider range of neighborhood sizes; by removing inter-manifold edges, it improves the clustering of multi-manifold data obtained by spectral clustering. Experimental results confirm the effectiveness of CTD-ISOMAP.
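As an illustration of the central quantity in the abstract, the sketch below computes commute time distances on a small graph from the pseudoinverse of the graph Laplacian, a standard formulation that is assumed here rather than taken from the paper. The function name `commute_time_distances`, the toy graph, and the "flag the edge with the largest commute time" rule are all hypothetical; the abstract does not say how CTD-ISOMAP actually decides which edges to delete.

```python
import numpy as np

def commute_time_distances(W):
    """Commute time between every pair of nodes of a connected,
    undirected graph with symmetric weight matrix W.

    Uses C(i, j) = vol(G) * (L+_ii + L+_jj - 2 * L+_ij), where L+ is the
    Moore-Penrose pseudoinverse of the graph Laplacian L = D - W and
    vol(G) is the sum of all node degrees.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    laplacian = np.diag(degrees) - W
    l_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(l_pinv)
    volume = degrees.sum()
    return volume * (diag[:, None] + diag[None, :] - 2.0 * l_pinv)

# Toy neighborhood graph (an assumption for illustration): two triangles,
# standing in for two manifolds, joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
W = np.zeros((6, 6))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0

C = commute_time_distances(W)

# Hypothetical rule: treat the edge with the largest commute time as the
# candidate inter-manifold (or shortcut) edge.
edge_ctd = {edge: C[edge] for edge in edges}
suspect_edge = max(edge_ctd, key=edge_ctd.get)
```

In this toy graph the connecting edge (2, 3) is the only path between the two triangles, so its commute time is larger than that of any edge inside a triangle, where alternative paths exist. This matches the abstract's point that commute time distance accounts for all paths between two points, not only the direct one.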
