ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference

ACQUISITION 2026-05-26 Impact: HIGH

arXiv:2510.02361v2 Announce Type: replace

Abstract: Transformer-based large models excel in natural language processing and computer vision, but they suffer from severe computational inefficiency because the cost of self-attention grows quadratically with the number of input tokens. Recently, researchers have proposed a series of methods based on block selection and compression to alleviate this problem, but they either have issues with semantic inc