详细信息
Discovering top-k patterns with differential privacy-an accurate approach ( SCI-EXPANDED收录)
文献类型:期刊文献
英文题名:Discovering top-k patterns with differential privacy-an accurate approach
作者:Zhang, Xiaojian[1,2];Meng, Xiaofeng[1]
第一作者:Zhang, Xiaojian;张啸剑
通讯作者:Zhang, XJ[1]
机构:[1]Renmin Univ China, Sch Informat, Beijing 100872, Peoples R China;[2]Henan Univ Econ & Law, Sch Comp & Informat Engn, Zhengzhou 450002, Peoples R China
第一机构:Renmin Univ China, Sch Informat, Beijing 100872, Peoples R China
通讯机构:[1]corresponding author), Renmin Univ China, Sch Informat, Beijing 100872, Peoples R China.
年份:2014
卷号:8
期号:5
起止页码:816-827
外文期刊名:FRONTIERS OF COMPUTER SCIENCE
收录:;WOS:【SCI-EXPANDED(收录号:WOS:000343707700010)】;
基金:This research was partially supported by the National Natural Science Foundation of China (Grant Nos. 61379050, 91224008), the National 863 High-tech Program (2013AA013204), Specialized Research Fund for the Doctoral Program of Higher Education(20130004130001)
语种:英文
外文关键词:frequent pattern mining; differential privacy; constrained inference
摘要:Frequent pattern mining discovers sets of items that frequently appear together in a transactional database; these can serve valuable economic and research purposes. However, if the database contains sensitive data (e. g., user behavior records, electronic health records), directly releasing the discovered frequent patterns with support counts will carry significant risk to the privacy of individuals. In this paper, we study the problem of how to accurately find the top-k frequent patterns with noisy support counts on transactional databases while satisfying differential privacy. We propose an algorithm, called differentially private frequent pattern (DFP-Growth), that integrates a Laplace mechanism and an exponential mechanism to avoid privacy leakage. We theoretically prove that the proposed method is (lambda, delta)-useful and differentially private. To boost the accuracy of the returned noisy support counts, we take consistency constraints into account to conduct constrained inference in the post-processing step. Extensive experiments, using several real datasets, confirm that our algorithm generates highly accurate noisy support counts and top-k frequent patterns.
参考文献:
正在载入数据...