BenGER: Benchmarking LLM Systems on Subsumption-Based Legal Reasoning in German Law

Event: PRODUCT_LAUNCH | Date: 2026-05-28 | Impact: MEDIUM

arXiv:2605.28183v1 (Announce Type: new)

Abstract: We introduce the BenGER (Benchmark for German Law) dataset for evaluating LLM systems on subsumption-based legal reasoning in German law. The BenGER dataset consists of three components: 596 exam-style free-text legal case tasks across multiple levels of legal education and 531 short doctrinal reasoning tasks. We evaluate 12 contemporary LLM systems -- closed flags