BenGER Platform: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks (Event)

OPEN_SOURCE | 2026-05-28 | Impact: MEDIUM

arXiv:2604.13583v3 | Announce Type: replace

Abstract: Evaluating large language models (LLMs) for legal reasoning requires workflows that span task design, expert annotation, model execution, and metric-based evaluation. In practice, these steps are split across platforms and scripts, limiting transparency, reproducibility, and participation by non-technical legal experts. We present the BenGER (Benchm