Indexing and Search of Order-Preserving Submatrix for Gene Expression Data (indexed in SCI-EXPANDED and EI)


English title: Indexing and Search of Order-Preserving Submatrix for Gene Expression Data

Authors: Jiang, Tao[1]; Chen, Bolin[2]; Li, Juntao[3]; Xu, Guoyu[1]

First author: Jiang, Tao

Corresponding author: Jiang, T[1]

Affiliations: [1] Henan Univ Econ & Law, Sch Comp & Informat Engn, Zhengzhou 450046, Peoples R China; [2] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China; [3] Henan Normal Univ, Sch Math & Informat Sci, Xinxiang 453007, Henan, Peoples R China


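Corresponding institution: [1] Henan Univ Econ & Law, Sch Comp & Informat Engn, Zhengzhou 450046, Peoples R China. | [1048412] School of Computer and Information Engineering, Henan University of Economics and Law; [10484] Henan University of Economics and Law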






Funding: This work was supported in part by the National Natural Science Foundation of China under Grant 61702161, Grant 61602386, Grant 61972320, Grant 91746115, and Grant 61602153, in part by the Key Research and Development and Promotion Program of He'nan Province of China under Grant 182102210213, Grant 182102210020, and Grant 172102210171, in part by the Key Research Fund for Higher Education of He'nan Province of China under Grant 18A520003, Grant 18A520015, Grant 19A413005, and Grant 18B510004, and in part by the Natural Science Foundation of Shaanxi Province of China under Grant 2017JQ6008.


Keywords: Gene expression data; online sharing queries; OPSM; pfTree; pIndex

Abstract: Bicluster pattern discovery plays a key role in the analysis of gene expression data. One vital model of bicluster mining is the Order-Preserving SubMatrix (OPSM), which captures a similar tendency of a subset of genes under a subset of conditions. Most OPSM discovery methods are batch mining techniques and are not suitable for low-latency data queries. To make data analysis efficient and effective, in this paper we first propose a prefix-tree-based indexing method, pfTree, and then give an optimization technique, pIndex, that employs row and column header tables to search for positive, negative, and time-delayed OPSMs. We also present an online sharing query technique to accelerate frequent searches. Finally, we conduct extensive experiments and compare our methods with existing approaches. Experimental results demonstrate the efficiency and effectiveness of the proposed methods.
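
Illustration (not from the paper): the abstract describes indexing gene rows in a prefix tree and then querying for OPSMs. The following is only a minimal sketch of that general idea, under stated assumptions. It is not the authors' pfTree or pIndex: it has no row or column header tables, handles positive OPSMs only, and breaks ties between equal expression values by column index. All names (`column_order`, `PrefixTreeNode`, `build_prefix_tree`, `search_opsm`) are hypothetical. Each row is reduced to the order of its columns by ascending expression value, the orders are stored in a prefix tree, and a row supports a queried column pattern when the pattern is a subsequence of its order.

```python
from typing import Dict, List, Sequence


def column_order(row: Sequence[float]) -> List[int]:
    """Column indices sorted by ascending expression value (ties by index)."""
    return sorted(range(len(row)), key=lambda j: row[j])


class PrefixTreeNode:
    """One node of a hypothetical prefix tree over column orderings."""

    def __init__(self) -> None:
        self.children: Dict[int, "PrefixTreeNode"] = {}
        self.rows: List[int] = []  # ids of rows whose full ordering ends here


def build_prefix_tree(matrix: Sequence[Sequence[float]]) -> PrefixTreeNode:
    """Insert the column ordering of every row; shared prefixes share nodes."""
    root = PrefixTreeNode()
    for row_id, row in enumerate(matrix):
        node = root
        for col in column_order(row):
            node = node.children.setdefault(col, PrefixTreeNode())
        node.rows.append(row_id)
    return root


def _collect_rows(node: PrefixTreeNode, out: List[int]) -> None:
    """Append every row id stored in the subtree rooted at `node`."""
    out.extend(node.rows)
    for child in node.children.values():
        _collect_rows(child, out)


def search_opsm(root: PrefixTreeNode, pattern: Sequence[int]) -> List[int]:
    """Ids of rows whose values strictly follow `pattern` in ascending order
    (up to tie-breaking), i.e. `pattern` is a subsequence of the row's order."""
    matches: List[int] = []

    def walk(node: PrefixTreeNode, matched: int) -> None:
        if matched == len(pattern):
            # The whole pattern was matched on this path, so every row
            # stored below this node supports it.
            _collect_rows(node, matches)
            return
        for col, child in node.children.items():
            walk(child, matched + 1 if col == pattern[matched] else matched)

    walk(root, 0)
    return sorted(matches)


# Toy expression matrix: rows are genes, columns are conditions.
matrix = [
    [1.0, 3.0, 2.0, 4.0],
    [2.0, 5.0, 3.0, 9.0],
    [4.0, 1.0, 3.0, 2.0],
]
tree = build_prefix_tree(matrix)
print(search_opsm(tree, [0, 1, 3]))  # rows where col0 < col1 < col3
```

In this sketch the first two rows share one path in the tree because they have the same column ordering, which is the kind of sharing a prefix-tree index exploits. A negative OPSM could be searched in the same structure by querying the reversed pattern; time-delayed OPSMs would need additional machinery not shown here.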



