GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering 事件

Name: GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering
Start: 2026-05-29

PRODUCT_LAUNCH2026-05-29影响: MEDIUM

GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering arXiv:2605.29584v1 Announce Type: new Abstract: Reinforcement learning (RL) is a natural fit for agentic knowledge base question answering (KBQA), where a model must issue executable actions, observe knowledge-base feedback, and eventually return an answer. However, current RL-based KBQA systems mainly optimize sparse rewards from the final answer, leaving intermediate action errors wea

人工智能

关系图谱

GAPD: Gold-Action Policy Distillation for Agentic Reinforcement Learning in Knowledge Base Question Answering 事件

相关公司查看全部 (10)

相关人物查看全部 (3)

相关产品查看全部 (10)

相关技术查看全部 (10)

相关报道查看全部 (1)