The Attentional White Bear Effect in Transformer Language Models (Event)

Name: The Attentional White Bear Effect in Transformer Language Models
Start: 2026-05-28

Type: PRODUCT_LAUNCH
Date: 2026-05-28
Impact: MEDIUM

arXiv:2605.28639v1
Announce Type: new
Abstract: Instruction-based suppression is widely used to prevent language models from generating prohibited content, yet it remains unclear whether suppression reduces internal representation or merely suppresses expression. We investigate this question through representational probing, attention analysis, and behavioral semantic leakage experiments across multiple transformer models. We find

Category: Artificial Intelligence
