
A Comparative Analysis of Large Language Models to Evaluate Robustness and Reliability in Adversarial Conditions
  • Takeshi Goto
  • Kensuke Ono
  • Akira Morita
Corresponding Author: Takeshi Goto ([email protected])

Abstract

This study conducted a comprehensive evaluation of four prominent Large Language Models (LLMs), namely Google Gemini, Mistral 8x7B, ChatGPT-4, and Microsoft Phi-1.5, to assess their robustness and reliability under a variety of adversarial conditions. Using the Microsoft PromptBench dataset, the research investigates each model's performance against syntactic manipulations, semantic alterations, and contextually misleading cues. The findings reveal notable differences in model resilience, highlighting the distinct strengths and weaknesses of each LLM in responding to adversarial challenges. The comparative analysis underscores the need for multifaceted evaluation approaches to enhance model resilience, and it suggests future research directions involving the augmentation of training datasets with adversarial examples and the exploration of advanced natural language understanding algorithms. This study contributes to the ongoing discourse in LLM research by providing insights into model vulnerabilities and by advocating comprehensive strategies to bolster LLM robustness against an evolving landscape of adversarial threats.
Submitted to TechRxiv: 21 Mar 2024
Published in TechRxiv: 29 Mar 2024
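
As a concrete illustration of the kind of evaluation the abstract summarises, the sketch below shows one way a robustness check against adversarial prompts could be structured: a clean prompt is perturbed with a syntactic manipulation (character swaps) and a contextually misleading cue, and the model's answers on the perturbed variants are compared with its answer on the clean prompt. This is a minimal, hypothetical sketch under stated assumptions, not the authors' actual pipeline; `query_model`, `swap_chars`, and `add_misleading_cue` are illustrative stand-ins, and the dummy `query_model` would be replaced by a real API call to the model under test.

```python
import random


def query_model(prompt: str) -> str:
    """Stand-in for a real LLM API call (hypothetical; replace with a
    client for the model under test, e.g. Gemini or ChatGPT-4)."""
    # Dummy behaviour so the sketch runs end to end: answer "negative"
    # only if the cue word "terrible" survives the perturbation.
    return "negative" if "terrible" in prompt.lower() else "positive"


def swap_chars(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Syntactic manipulation: swap adjacent characters to mimic typos."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def add_misleading_cue(text: str) -> str:
    """Contextually misleading cue: prepend an irrelevant assertion.
    Semantic alterations (e.g. synonym substitution) would be added
    analogously as further perturbation functions."""
    return "Note: most reviews in this batch are sarcastic. " + text


def robustness(prompt: str, n_variants: int = 5) -> float:
    """Fraction of perturbed prompts whose answer matches the clean answer."""
    reference = query_model(prompt)
    variants = [swap_chars(prompt, seed=s) for s in range(n_variants)]
    variants.append(add_misleading_cue(prompt))
    hits = sum(query_model(v) == reference for v in variants)
    return hits / len(variants)


if __name__ == "__main__":
    prompt = "Classify the sentiment of this review: 'The food was terrible.'"
    print(f"robustness score: {robustness(prompt):.2f}")
```

A score of 1.0 would mean the model's answer was unchanged by every perturbation. The study's actual evaluation draws its adversarial prompts from the PromptBench dataset rather than from toy perturbations like these.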