Rethinking Layer Redundancy: Calibration Matters More Than Search in LLM Depth Pruning

Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM

arXiv:2604.24938v3. Announce Type: replace-cross.

Abstract: Depth pruning improves the inference efficiency of large language models by removing Transformer blocks. Prior work typically treats layer redundancy as an inherent structural property of pretrained networks, emphasizing importance criteria and search algorithms to identify removable layers. In this study, we empirically investigate depth pruning from