"I've Seen How This Goes": Characterizing Diversity via Progressive Conditional Surprise

arXiv cs.CL | 2026-06-02 | Authors: Matthew Khoriaty, David Williams-King, Shi Feng

Abstract

arXiv:2606.01811v1 (announce type: new)

Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quantifying creative behavior in both AI and human writing. We propose a new approach to measuring diversity using in-context learning, of which the "Decan" metric, $D_{Ca_n} = C \times a_n$, is the working instance we evaluate: a per-byte score read off the per-token log-probabilities of a base model $\theta$ in a single forward pass per permutation, with no embedding model, no reference corpus, and no human labels. The approach is grounded in information theory, uses language-model in-context learning to detect a wide range of similarities among any number of inputs, and obviates the need to train a special-purpose model. The same pipeline scores both AI samples and human-written response sets, with diversity treated as a property of the triple (responses, prompt, scoring model).
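The per-byte reading of token log-probabilities can be illustrated with a minimal sketch. It is not the authors' implementation and does not compute $C$, $a_n$, or $D_{Ca_n}$, whose definitions the abstract does not give. It only assumes that a base model has already produced natural-log token log-probabilities for each response, scored in the context of the prompt and the responses placed before it in one ordering. The function names `bits_per_byte` and `progressive_surprise` and the example numbers are hypothetical.

```python
import math
from typing import List, Sequence


def bits_per_byte(token_logprobs: Sequence[float], text: str) -> float:
    """Surprisal of one response in bits per UTF-8 byte.

    token_logprobs: natural-log probabilities of the response's tokens,
    assumed to come from a base model conditioned on everything that
    precedes the response in the context window.
    """
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("response must be non-empty")
    total_nats = -sum(token_logprobs)           # negative log-likelihood in nats
    return total_nats / (math.log(2) * n_bytes)  # nats -> bits, then per byte


def progressive_surprise(
    responses: Sequence[str],
    logprobs_per_response: Sequence[Sequence[float]],
) -> List[float]:
    """Per-byte surprisal of each response, in context order.

    Entry k is the surprisal of response k given the prompt and
    responses 0..k-1 (hypothetical layout of one permutation).
    """
    if len(responses) != len(logprobs_per_response):
        raise ValueError("need one log-prob sequence per response")
    return [bits_per_byte(lp, r) for r, lp in zip(responses, logprobs_per_response)]


# Hypothetical log-probs for three responses in one ordering.
example_responses = ["a red fox", "a red fox", "the quiet sea"]
example_logprobs = [
    [-2.0, -3.0, -2.5],  # first response: nothing similar seen yet
    [-0.2, -0.1, -0.1],  # exact repeat: cheap to predict in context
    [-2.2, -2.8, -2.4],  # new content: still surprising
]
curve = progressive_surprise(example_responses, example_logprobs)
```

Under these assumptions, a curve that drops quickly as more responses enter the context suggests the responses resemble one another, while a curve that stays high suggests each response still carries new information for the scoring model.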
