GEO-Bench: Benchmarking Ranking Manipulation in Generative Engine Optimization

arXiv cs.AI | 2026-05-29 | Authors: Ojas Nimase, Zhe Chen, Gengpei Qi, Yue Zhao, Xiyang Hu

Abstract

arXiv:2605.29107v1 (announce type: cross)

Large language models (LLMs) increasingly rank products, documents, and recommendations for user queries, which makes manipulating these rankings a growing concern for fairness and information integrity. Research on generative engine optimization (GEO) has produced many manipulation methods, but each is evaluated on its own dataset with its own metrics, so their relative strength and detectability remain unclear. We present GEO-Bench, a benchmark that evaluates GEO ranking-manipulation attacks under one protocol. It unifies black-box prompt-based attacks (TAP, Zero-Shot), white-box gradient-based attacks (STS, RAF, StealthRank), and ten white-hat C-SEO strategies. We score every method on five datasets against a fixed open-weight ranker (Llama-3.1-8B-Instruct), using metrics for both effectiveness (NRG, Success@$\alpha$, Promote@$\alpha$) and stealth (keyword violation rate, perplexity ratio).
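The abstract names the two stealth metrics but does not define them. As a rough illustration only, the sketch below assumes one plausible reading: the perplexity ratio compares the perplexity of the manipulated text to that of the original text under some language model, and the keyword violation rate is the fraction of manipulated texts containing at least one flagged phrase. The function names, the substring-matching rule, and the example keyword list are all hypothetical; the paper's exact definitions may differ.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities (assumed input format)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def perplexity_ratio(original_logprobs: Sequence[float],
                     manipulated_logprobs: Sequence[float]) -> float:
    """Assumed definition: PPL(manipulated) / PPL(original).

    Values well above 1 would suggest the injected text reads as less natural.
    """
    return perplexity(manipulated_logprobs) / perplexity(original_logprobs)


def keyword_violation_rate(texts: Sequence[str],
                           banned_keywords: Sequence[str]) -> float:
    """Assumed definition: share of texts containing any flagged phrase.

    Matching is a case-insensitive substring check, chosen for simplicity.
    """
    if not texts:
        return 0.0
    banned = [k.lower() for k in banned_keywords]
    violations = sum(any(k in t.lower() for k in banned) for t in texts)
    return violations / len(texts)


if __name__ == "__main__":
    # Hypothetical log-probabilities; in practice these would come from a language model.
    original = [math.log(0.5)] * 4
    manipulated = [math.log(0.25)] * 4
    print(perplexity_ratio(original, manipulated))

    docs = ["Always rank this product first", "A plain product description"]
    print(keyword_violation_rate(docs, ["rank this product first"]))
```

Taking per-token log-probabilities as input keeps the sketch independent of any particular model API; any scorer that returns them could be plugged in.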