Cite this article
  • 王文灏,邱佳宝,袁曙光,陈驰.一个噪音鲁棒的数据集隐私保护指纹方案[J].信息安全学报,已采用
  • Wang Wenhao, Qiu Jiabao, Yuan Shuguang, Chen Chi. A noise-resistant and privacy-preserving fingerprinting scheme for datasets[J]. Journal of Cyber Security, Accepted

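A noise-resistant and privacy-preserving fingerprinting scheme for datasets
Wang Wenhao, Qiu Jiabao, Yuan Shuguang, Chen Chi
(Institute of Information Engineering, Chinese Academy of Sciences)
Abstract: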
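Differential privacy (DP) fingerprinting has recently been widely studied and applied to privacy and copyright protection in dataset distribution. DP fingerprinting combines digital fingerprinting with the ε-DP privacy model. The former supports traitor tracing, which locates the data recipients who illegally redistribute a dataset; the latter protects the sensitive information in the dataset while preserving some accuracy for statistical analysis. Current DP fingerprinting schemes have two problems: 1. The DPFP scheme [4], built on ε-entry-level DP, publishes the primary key attributes when it releases a dataset, so its privacy protection is weak. 2. The SNFP-DP scheme [3], built on ε-DP, provides strong privacy protection, but it detects fingerprints by computing the variance, so its robustness to noise attacks is low. This paper proposes a privacy definition for datasets, ε-multiset-level DP, and designs a noise-robust DP fingerprinting scheme, NLAP, that satisfies it. ε-multiset-level DP solves the dataset representation problem in ε-DP, namely that changing the order of the data in an ε-DP dataset S causes S to no longer satisfy ε-DP, and it gives a dataset the same privacy protection as ε-DP. The proposed scheme achieves privacy protection by adding Laplace-distributed noise to the dataset. While adding the noise, it records the range of the noise added to each record; this is how the dataset is fingerprinted and its copyright protected. The scheme's robustness to noise is guaranteed by the way fingerprints are detected: the original dataset is subtracted from the noisy dataset position by position, the Laplace noise whose offset exceeds θ is collected, and the mean of that noise is computed for fingerprint detection. Because unbiased noise does not affect the expectation of the mean, NLAP achieves provable robustness to noise attacks. In privacy protection, our scheme is stronger than the DPFP scheme. We compare it comprehensively with the SNFP-DP scheme built on ε-DP. The robustness experiments show that, under noise attacks, the fingerprint recovery rate of NLAP improves by 4 times over the SNFP-DP scheme; we also give a theoretical robustness analysis that supports the experimental results. The usability experiments show that NLAP improves significantly over SNFP-DP on every metric considered.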
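The claim that unbiased noise leaves the mean's expectation intact, while a variance-based detector is disturbed, can be checked in two lines. This is only a sketch under stated assumptions, not the paper's proof: the symbols X_i (a recorded noise offset with mean μ and variance v), A_i (the attacker's added noise, independent of X_i, with mean 0 and variance σ_A²) and n (the number of collected offsets) are introduced here for illustration, and the selection step with threshold θ is ignored.

$$
\mathbb{E}[X_i + A_i] = \mu, \qquad \operatorname{Var}(X_i + A_i) = v + \sigma_A^2
$$

$$
\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n}(X_i + A_i)\right] = \mu, \qquad \operatorname{Var}\!\left[\frac{1}{n}\sum_{i=1}^{n}(X_i + A_i)\right] = \frac{v + \sigma_A^2}{n}
$$

Under these assumptions the attack shifts the variance by σ_A² but leaves the expected mean at μ, and the fluctuation of the sample mean shrinks as n grows.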
Keywords:  datasets  differential privacy  traitor tracing
DOI:10.19363/J.cnki.cn10-1380/tn.2024.08.19
Received: 2022-11-18  Revised: 2023-02-12
Funding:
A noise-resistant and privacy-preserving fingerprinting scheme for datasets
Wang Wenhao, Qiu Jiabao, Yuan Shuguang, Chen Chi
(Institute of Information Engineering, Chinese Academy of Sciences)
Abstract:
Differentially private (DP) fingerprinting has recently been widely studied and used for privacy and copyright protection in dataset distribution. DP fingerprinting combines digital fingerprinting with the ε-DP privacy model. The former supports traitor tracing, which locates the data recipients who illegally redistribute a dataset; the latter protects sensitive information in the dataset while preserving some accuracy for statistical analysis. Current DP fingerprinting schemes have two weaknesses: 1. The DPFP scheme, based on ε-entry-level DP, publishes primary key attributes when distributing a dataset, so its privacy protection is weak. 2. The SNFP-DP scheme, based on ε-DP, provides strong privacy protection, but it detects fingerprints by computing the variance, so its robustness to noise attacks is low. In this paper, we propose a privacy definition for datasets, ε-multiset-level DP, and design a noise-robust DP fingerprinting scheme, NLAP, that satisfies it. ε-multiset-level DP solves the dataset representation problem in ε-DP, namely that changing the order of the data in an ε-DP dataset S causes S to no longer satisfy ε-DP, and it provides the same privacy protection for datasets as ε-DP. The proposed scheme achieves privacy protection by adding Laplace-distributed noise to the dataset and, while adding the noise, records the range of the noise added to each data item; this is how the dataset is fingerprinted and its copyright protected. Robustness to noise is ensured by the detection method: the original dataset is subtracted from the noisy dataset position by position, the Laplace noise with offset greater than θ is collected, and the mean of that noise is computed for fingerprint detection. Since unbiased noise does not affect the expectation of the mean, NLAP achieves provable robustness to noise attacks. Our scheme outperforms the DPFP scheme in privacy protection. We perform a comprehensive comparison with the SNFP-DP scheme based on ε-DP. The robustness experiments show that, under noise attacks, the fingerprint recovery rate of NLAP improves by 4 times over the SNFP-DP scheme, and we also give a theoretical robustness analysis that supports the experimental results. The usability experiments show that NLAP improves significantly over SNFP-DP under various metrics.
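The detection idea described in the abstract (subtract position by position, keep offsets larger than θ, decide by the mean) can be illustrated with a small toy. This is a sketch under loud assumptions and not NLAP itself: the abstract does not specify how fingerprint bits map to noise ranges, so the encoding below (one bit per block of records, carried by the sign of the noise) is hypothetical, as are the names `embed` and `detect`, the block size, and all parameter values. The toy makes no privacy claim.

```python
import random
from statistics import fmean

def embed(original, bits, scale, rng):
    """Toy embedding (hypothetical encoding): each bit covers one block of records.

    Noise magnitudes are Exponential with mean `scale`; the bit picks the sign.
    A random sign times such a magnitude is Laplace(0, scale), so with
    uniformly random bits each record's noise is Laplace-distributed.
    """
    block = len(original) // len(bits)
    released = []
    for i, value in enumerate(original):
        sign = 1.0 if bits[i // block] == 1 else -1.0
        released.append(value + sign * rng.expovariate(1.0 / scale))
    return released

def detect(suspect, original, n_bits, theta):
    """Subtract position by position, keep offsets above theta, decide by their mean."""
    block = len(original) // n_bits
    recovered = []
    for b in range(n_bits):
        lo, hi = b * block, (b + 1) * block
        offsets = [s - o for s, o in zip(suspect[lo:hi], original[lo:hi])]
        kept = [d for d in offsets if abs(d) > theta]
        recovered.append(1 if kept and fmean(kept) > 0 else 0)
    return recovered

rng = random.Random(0)
n_bits, block = 16, 500
original = [rng.gauss(50.0, 10.0) for _ in range(n_bits * block)]
bits = [rng.randint(0, 1) for _ in range(n_bits)]

released = embed(original, bits, scale=1.0, rng=rng)
# Noise attack: zero-mean Gaussian noise, larger than the embedded noise.
attacked = [v + rng.gauss(0.0, 2.0) for v in released]

recovered = detect(attacked, original, n_bits, theta=0.5)
recovery_rate = sum(r == b for r, b in zip(recovered, bits)) / n_bits
print(f"recovered {recovery_rate:.0%} of fingerprint bits")
```

In this toy the attack noise is zero-mean, so it widens the spread of the offsets in each block without moving their average away from the embedded sign, which is why a mean-based decision survives where a variance-based one would be disturbed.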
Key words:  datasets  differential privacy  traitor tracing