MENTIS: What Belief Changes Under Alignment? Measuring Multi-Scale Latent Torsion in Language Models

ArXiv CS.CL · 2026-06-02 · Authors: Partha Pratim Saha, Samarth Raina, Mayur Parvatikar, Amit Dhanda, Vinija Jain, Aman Chadha, Amitava Das

Related Technologies