ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

Event type: PRODUCT_LAUNCH | Date: 2026-06-03 | Impact: MEDIUM

arXiv:2606.03157v1 | Announce type: new

Abstract: Large language models (LLMs) have been widely adopted in healthcare, yet they still encounter significant challenges in complex clinical decision-making scenarios. Existing benchmarks primarily assess LLM performance in single-course settings and lack systematic evaluation in multi-course scenarios, where a patient's condition evolves over time. To address
