TriEval: A Resource-Efficient Pipeline for LLM Bias, Toxicity, and Truthfulness Assessment 文章

ArXiv CS.AI2026-06-03NEWSen作者: Akshatha Srikantha, Manpreet Singh, Yash Jajoo, Shyamal Lakhanpal

摘要

arXiv:2606.03036v1 Announce Type: new Abstract: LLMs have evolved from basic chatbots to the backbone of the AI ecosystem, now widely used in healthcare, schools, and government services. The domain-wide adoption of LLMs necessitates continuous evaluation to ensure their safety and fairness. Common issues encountered after deploying LLMs include inconsistent outputs and hallucinations of incorrect information. Although numerous LLM evaluation tools exist, most are limited to testing a single parameter at a time or require massive computational resources that are not accessible to most researchers. TriEval addresses these challenges by evaluating LLM outputs across multiple parameters, including bias, toxicity, and truthfulness together, while minimizing computing resources. The pipeline is compatible with both open- and closed-source models and runs on a standard laptop without a GPU cluster. TriEval has been tested on four models: Llama 3 8B, Mistral 7B, Gemma 2 9B, and Claude Haiku.

TriEval: A Resource-Efficient Pipeline for LLM Bias, Toxicity, and Truthfulness Assessment 文章

摘要

相关事件

相关公司

相关人物

相关产品查看全部 (10)

相关技术查看全部 (1)