When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs (Event)

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2602.03554v2 Announce Type: replace-cross

Abstract: Recent progress has expanded the use of large language models (LLMs) in drug discovery, including synthesis planning. However, objective evaluation of retrosynthesis performance remains limited. Existing benchmarks and metrics typically rely on published synthetic procedures and on Top-K accuracy measured against a single ground truth, which does not capture t