Chunking Methods on Retrieval-Augmented Generation - Effectiveness Evaluation Against Computational Cost and Limitations

arXiv cs.CL, 2026-06-02
Authors: Mateusz Śmigielski, Michał Rajkowski, Mateusz Zbrocki, Michał Bernacki-Janson, Karol Kunicki, Julianna Godziszewska, Maciej Piasecki, Konrad Wojtasik
Affiliation (all authors): Department of Artificial Intelligence, Faculty of Information and Communication Technology, Wrocław University of Science and Technology, Wrocław 50-370, Poland

Abstract

arXiv:2606.00881v1

Retrieval-Augmented Generation (RAG) has demonstrated significant capabilities in enhancing the performance of Large Language Models (LLMs). One of the key tasks in RAG systems is the chunking process. Traditionally, fixed-size chunking and semantic chunking have been the standard approaches. However, interest in chunking strategies has been increasing, leading to a growing number of proposed methods that often claim improved performance over these conventional techniques. Many of these approaches are tailored to specific use cases and data types, with limited evidence of their effectiveness across diverse scenarios. As a result, it remains challenging to directly compare different techniques and assess their relative strengths.
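For readers unfamiliar with the baseline named above, fixed-size chunking can be sketched in a few lines. This is only an illustration and not the authors' implementation: the function name `fixed_size_chunks`, the character-based windows, and the `size` and `overlap` parameters are all assumptions made here (real systems often count tokens rather than characters).

```python
def fixed_size_chunks(text, size=200, overlap=50):
    """Split text into character windows of `size`, each sharing
    `overlap` characters with the previous window.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk consisting only of overlap is emitted.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in fixed_size_chunks("abcdefghij", size=4, overlap=1):
        print(chunk)
```

The overlap is a common way to reduce the chance that a sentence relevant to a query is cut in half at a chunk boundary; semantic chunking, by contrast, would choose boundaries from the content itself rather than from a fixed window.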
