AgentCompile: An LLM-Guided Compiler for Direct CUDA Inference
Event: PRODUCT_LAUNCH | 2026-06-09 | Impact: MEDIUM
arXiv:2606.07665v1 Announce Type: cross
Abstract: Transformer inference increasingly depends on specialized compiler and runtime support, but real model graphs still require semantic decisions about which regions are worth specializing and which CUDA implementation families are plausible. We present AgentCompile, an LLM-guided CUDA inference compiler that uses LLM outputs only as advisory search metadata. Given compiler-derived regi