Blind PRNG Hijacking: An Undetectable Integrity-Preserving Attack Against LLM Watermarking

ArXiv CS.AI | 2026-05-28 | Authors: Ziyang You, Huilong He, Xiaoke Yang, Xuxing Lu

Abstract

arXiv:2605.28632v1 (cross-listed). Cryptographic watermarking is a leading defense for attributing text generated by large language models (LLMs). Existing schemes, including KGW, Unigram, and DipMark, derive their security guarantees from the assumption that the underlying pseudo-random number generator (PRNG) is trustworthy. This work introduces SeedHijack, the first supply-chain attack on LLM watermarking that is simultaneously (i) blind, requiring no knowledge of the watermark key, detector, or model logits; (ii) integrity-preserving, amplifying rather than erasing the watermark signal; and (iii) orthogonal to detection, meaning the attack-induced bias is statistically independent of all content-side detector statistics, so amplification and evasion coexist without trade-off. Rather than perturbing generated text, SeedHijack replaces the PRNG at the supply-chain layer, biasing green-list selection without altering output tokens or degrading text quality.
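To make the PRNG trust assumption concrete, here is a minimal sketch of how a KGW-style scheme might derive its green list. It is an illustration under stated assumptions, not the paper's code and not any library's API: the names `green_list`, `green_fraction`, and `rng_factory`, the SHA-256 seeding, and the default `gamma = 0.5` are all hypothetical choices made for this example.

```python
import hashlib
import random


def green_list(prev_token, key, vocab_size, gamma=0.5, rng_factory=random.Random):
    """Return the green-list token ids for the position following prev_token.

    Assumption: the seed is a hash of (key, previous token), in the spirit
    of KGW-style schemes. rng_factory is a hypothetical hook standing in
    for "whichever PRNG the software stack provides".
    """
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = rng_factory(seed)
    # The vocabulary partition is determined entirely by the PRNG's output.
    perm = list(range(vocab_size))
    rng.shuffle(perm)
    return set(perm[: int(gamma * vocab_size)])


def green_fraction(tokens, key, vocab_size, gamma=0.5, rng_factory=random.Random):
    """Fraction of tokens that fall in the green list seeded by their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    hits = sum(
        tok in green_list(prev, key, vocab_size, gamma, rng_factory)
        for prev, tok in pairs
    )
    return hits / len(pairs)
```

In this sketch the green list is a deterministic function of the key, the previous token, and the PRNG implementation. The key and the token history are the only secrets or inputs a scheme of this shape controls, so its guarantees hold only if the PRNG behind `rng_factory` behaves as expected, which is the assumption the abstract says the attack targets.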
