Forgive or forget: Understanding the context of hate in audio retrieval systems

arXiv cs.CL | 2026-06-05 | Authors: Arghya Pal, Sailaja Rajanala, Raphael C.-W. Phan, Shekhar Nayak

Abstract

arXiv:2606.05857v1 (announce type: new)

Handling toxic retrieval in text-to-audio systems is challenging due to contextual dependencies. Existing strategies (e.g., rephrasing, summarization) risk altering intent or omitting details. We propose a post hoc causal debiasing framework with a sentiment-controlled mediator to preserve semantic relevance while suppressing harmful speech. Our approach is model-agnostic and integrates seamlessly with existing retrieval pipelines. We introduce two variants: Forgive, which re-ranks and filters toxic audio via logit adjustment, and Forget, which generates counterfactual toxic prompts to mitigate harmful retrievals. Experiments show consistent toxicity reduction with minimal loss in retrieval accuracy, improving both safety and reliability.
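To make the idea of re-ranking and filtering via logit adjustment more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each retrieved clip already carries a relevance logit from the retriever and a toxicity score in [0, 1] from some external classifier. The function name `rerank_with_toxicity_penalty`, the linear penalty form, and the parameters `penalty` and `max_toxicity` are all hypothetical choices made for this example; the abstract does not specify the actual adjustment rule.

```python
from typing import List, Tuple

# Each candidate is (clip_id, relevance_logit, toxicity_score in [0, 1]).
Candidate = Tuple[str, float, float]


def rerank_with_toxicity_penalty(
    candidates: List[Candidate],
    penalty: float = 2.0,       # hypothetical weight on the toxicity score
    max_toxicity: float = 0.8,  # hypothetical hard-filter threshold
) -> List[Tuple[str, float]]:
    """Filter highly toxic clips, then re-rank the rest by adjusted logit.

    Assumption: the adjustment is a simple linear penalty,
    adjusted = relevance_logit - penalty * toxicity_score.
    """
    adjusted = []
    for clip_id, logit, toxicity in candidates:
        if toxicity > max_toxicity:
            continue  # drop clips above the filter threshold
        adjusted.append((clip_id, logit - penalty * toxicity))
    # Highest adjusted logit first.
    adjusted.sort(key=lambda item: item[1], reverse=True)
    return adjusted


if __name__ == "__main__":
    retrieved = [
        ("clip_a", 3.0, 0.9),  # most relevant, but very toxic
        ("clip_b", 2.5, 0.1),
        ("clip_c", 2.8, 0.5),
    ]
    for clip_id, score in rerank_with_toxicity_penalty(retrieved):
        print(clip_id, round(score, 2))
```

Because the adjustment runs on the retriever's outputs only, a step of this kind could sit after any existing retrieval pipeline without retraining, which is consistent with the post hoc, model-agnostic framing in the abstract.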
