K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts (Event)

Name: K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts
Start: 2026-06-02

Type: PRODUCT_LAUNCH
Date: 2026-06-02
Impact: MEDIUM

arXiv:2606.02404v1 (Announce Type: new)

Abstract: Frontier model evaluations are shifting from foundational capabilities (e.g., instruction following and reasoning) toward compositional, agentic ones, but Korean agentic benchmarks remain scarce. We introduce K-BrowseComp, a web-browsing agent benchmark grounded in Korean contexts, consisting of 400 problems. The 300-problem K-BrowseComp-Verified subset is manually constructe

Category: Artificial Intelligence
