Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

arXiv cs.CL | 2026-06-04 | Authors: Tianyang Zhou, Wenbo Chen, Pierre Jinghong Liang, Leman Akoglu

Abstract

arXiv:2605.29076v2 (announce type: replace)

LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label-only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization yields human-readable instructions but struggles with performance and scalability. We introduce eXTC (eXplainable Text Classifier), which has three progressive stages: (1) learning a Standard Operating Procedure (SOP, or rulebook) in natural language via a new Structured Prompt Optimization algorithm; (2) SOP-grounded reasoning distillation from a large teacher LLM into a compact LM; and (3) expanding reasoning capabilities beyond the initial SOP via reinforcement learning.
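To make the data flow between the three stages concrete, here is a toy sketch in Python. It is not the paper's algorithm: the abstract gives no implementation details, so every name here (`SOP`, `optimize_sop`, `distillation_set`, `reward`) is hypothetical. A keyword rulebook stands in for the natural-language SOP, a greedy search stands in for Structured Prompt Optimization, the rulebook itself plays the role of the teacher LLM, and a label-match function stands in for the reinforcement-learning reward.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Example = Tuple[str, str]  # (text, gold label)
Rule = Tuple[str, str]     # (keyword, label); hypothetical rule format


@dataclass
class SOP:
    """Toy rulebook: ordered keyword rules plus a default label."""
    rules: List[Rule] = field(default_factory=list)
    default: str = "other"

    def render(self) -> str:
        # Human-readable form of the rulebook (the "global" explanation).
        lines = [f"{i + 1}. If the text mentions '{k}', answer '{l}'."
                 for i, (k, l) in enumerate(self.rules)]
        lines.append(f"{len(self.rules) + 1}. Otherwise, answer '{self.default}'.")
        return "\n".join(lines)

    def classify(self, text: str) -> Tuple[str, str]:
        # Returns (label, rationale); the rationale is the "local" explanation.
        for i, (k, l) in enumerate(self.rules):
            if k in text.lower():
                return l, f"Rule {i + 1} applies ('{k}' found)."
        return self.default, "No rule matched; default applies."


def accuracy(sop: SOP, data: List[Example]) -> float:
    return sum(sop.classify(t)[0] == y for t, y in data) / len(data)


def optimize_sop(data: List[Example], candidates: List[Rule], default: str) -> SOP:
    """Stage 1 stand-in: greedily keep candidate rules that raise accuracy."""
    sop = SOP(default=default)
    best = accuracy(sop, data)
    improved = True
    while improved:
        improved = False
        for rule in candidates:
            if rule in sop.rules:
                continue
            trial = SOP(sop.rules + [rule], default)
            score = accuracy(trial, data)
            if score > best:
                sop, best, improved = trial, score, True
    return sop


def distillation_set(sop: SOP, data: List[Example]) -> List[Tuple[str, str]]:
    """Stage 2 stand-in: (input, rationale + label) pairs for a student model."""
    pairs = []
    for text, _ in data:
        label, rationale = sop.classify(text)
        pairs.append((text, f"{rationale} Label: {label}"))
    return pairs


def reward(predicted: str, gold: str) -> float:
    """Stage 3 stand-in: label-match reward an RL loop could maximize."""
    return 1.0 if predicted == gold else 0.0


if __name__ == "__main__":
    data = [("refund not received", "billing"),
            ("app crashes on login", "bug"),
            ("charged twice, need refund", "billing"),
            ("love the new design", "other")]
    candidates = [("refund", "billing"), ("crash", "bug"), ("design", "bug")]
    sop = optimize_sop(data, candidates, default="other")
    print(sop.render())
    print(distillation_set(sop, data)[0])
```

In this sketch the rendered rulebook corresponds to the global interpretability named in the title, and the per-example rationale corresponds to the local one. In the described system an LLM would write and apply the SOP, and a compact LM would be trained on the distilled traces before reinforcement learning.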
