Details
- Source site
- ArXiv CS.AI
- Authors
- Osama Zafar, Alexander Nemecek, Yiqian Zhang, Wenbiao Li, Debargha Ganguly, Vikash Singh, Vipin Chaudhary, Erman Ayday
- Article type
- NEWS
- Language
- en
- Publication date
- 2026-06-02
Abstract
arXiv:2605.17034v2 (announce type: replace-cross). Standard PII filters often miss contextual data leakage in RAG systems, such as clusters of non-regulated attributes that collectively identify an individual. We introduce a Privacy Policy Enforcement (PPE) framework that uses dual one-class density estimators over fused text embeddings, with a calibrated abstain region for out-of-distribution inputs. Using an axis-stratified, multi-LLM synthetic data pipeline across medicine, finance, and law, we find that traditional Gaussian Mixture baselines fail on borderline-safe stress tests because they focus on linguistic register rather than content. Our proposed T3+OCSVM detector, trained on safe and borderline-safe data, achieves a borderline AUROC of 0.93+ while reducing false positives by 44-55 percentage points and maintaining millisecond latency.
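The idea of two one-class scorers plus an abstain region can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two estimators are combined or calibrated, so the decision rule, the quantile-based thresholds, and every name below (`DiagonalGaussianScorer`, `calibrate_threshold`, `decide`) are hypothetical. A per-dimension Gaussian log-density stands in for the OCSVM, and hand-written 2-D points stand in for fused text embeddings.

```python
import math
from statistics import mean, pvariance


class DiagonalGaussianScorer:
    """One-class scorer: log-density of a per-dimension Gaussian fit to a single class.

    Stand-in for the OCSVM named in the abstract (an assumption made for illustration).
    """

    def __init__(self, rows, var_floor=1e-3):
        cols = list(zip(*rows))
        self.mu = [mean(c) for c in cols]
        # Floor the variance so a constant dimension cannot cause division by zero.
        self.var = [max(pvariance(c), var_floor) for c in cols]

    def score(self, x):
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mu, self.var)
        )


def calibrate_threshold(scorer, rows, quantile=0.05):
    """Return a low quantile of the in-class scores (hypothetical calibration rule)."""
    scores = sorted(scorer.score(r) for r in rows)
    return scores[int(quantile * (len(scores) - 1))]


def decide(x, safe, sensitive, safe_thr, sens_thr):
    """Return 'allow', 'block', or 'abstain' for one embedding (hypothetical rule)."""
    s_safe, s_sens = safe.score(x), sensitive.score(x)
    # Neither class model recognises the input: treat it as out-of-distribution.
    if s_safe < safe_thr and s_sens < sens_thr:
        return "abstain"
    # Otherwise pick the class whose score clears its own threshold by the larger margin.
    return "allow" if s_safe - safe_thr >= s_sens - sens_thr else "block"


# Toy 2-D "embeddings" (made up for this sketch, not data from the paper).
safe_rows = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0), (0.1, 0.2), (-0.2, -0.2)]
sensitive_rows = [(5.0, 5.1), (5.2, 4.9), (4.9, 5.0), (5.1, 5.2), (4.8, 4.8)]

safe_model = DiagonalGaussianScorer(safe_rows)
sens_model = DiagonalGaussianScorer(sensitive_rows)
safe_thr = calibrate_threshold(safe_model, safe_rows)
sens_thr = calibrate_threshold(sens_model, sensitive_rows)


def classify(x):
    return decide(x, safe_model, sens_model, safe_thr, sens_thr)
```

In this sketch, a point close to the safe cluster is allowed, a point close to the sensitive cluster is blocked, and a point far from both falls into the abstain region instead of being forced into either class.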