🧠 AI | ⚪ Neutral | Importance 7/10

Persuade Me if You Can: A Framework for Evaluating Persuasion Effectiveness and Susceptibility Among Large Language Models

arXiv – CS AI | Nimet Beyza Bozdag, Shuhaib Mehri, Gokhan Tur, Dilek Hakkani-Tür | May 28, 2026 at 04:00 AM

🤖AI Summary

Researchers introduce PMIYC, an automated framework for evaluating how effectively LLMs can persuade others and how susceptible they are to persuasion. Tests across multiple models show substantial variation: GPT-4o shows 50% greater resistance to misinformation persuasion than Llama-3.3-70B, while o1-mini is both persuasive and resistant. The results provide data relevant to AI safety and alignment development.

Analysis

This research addresses a fundamental tension in large language model development: the same capabilities that enable beneficial persuasion and communication can be weaponized for manipulation, disinformation, and adversarial attacks. The PMIYC framework automates evaluation of persuasion dynamics across multi-agent scenarios, replacing expensive human annotation with scalable automated assessment. This methodological advancement matters because understanding LLM vulnerabilities to social engineering directly impacts deployment safety in customer-facing applications and critical infrastructure.
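As a rough illustration of what an automated, multi-agent persuasion evaluation could look like, the sketch below alternates a persuader and a persuadee and tracks the persuadee's agreement with a claim. It is a minimal sketch under stated assumptions, not PMIYC's actual protocol: the 1 to 5 agreement scale, the `normalized_change` metric, and the stub agents are all hypothetical, since this summary does not describe the paper's implementation. In practice the two callables would wrap LLM calls.

```python
from typing import Callable, List

# Hypothetical agreement scale; the summary does not say which scale PMIYC uses.
MIN_SCORE, MAX_SCORE = 1, 5

Persuader = Callable[[str, List[str]], str]   # returns the next argument
Persuadee = Callable[[str, List[str]], int]   # returns current agreement score


def run_dialogue(persuader: Persuader, persuadee: Persuadee,
                 claim: str, max_turns: int = 3) -> List[int]:
    """Alternate persuader and persuadee turns, recording agreement each round."""
    transcript: List[str] = []
    scores = [persuadee(claim, transcript)]  # agreement before any persuasion
    for _ in range(max_turns):
        transcript.append(persuader(claim, transcript))
        scores.append(persuadee(claim, transcript))
    return scores


def normalized_change(scores: List[int]) -> float:
    """Agreement shift scaled by the room available from the starting score."""
    start, end = scores[0], scores[-1]
    headroom = (MAX_SCORE - start) if end >= start else (start - MIN_SCORE)
    return 0.0 if headroom == 0 else (end - start) / headroom


# Stub agents standing in for real model calls (purely illustrative).
def stub_persuader(claim: str, transcript: List[str]) -> str:
    return f"Argument {len(transcript) + 1} for: {claim}"


def stub_persuadee(claim: str, transcript: List[str]) -> int:
    # Toy rule: starts at 2 and gains one point per argument heard, capped at 5.
    return min(MAX_SCORE, 2 + len(transcript))


scores = run_dialogue(stub_persuader, stub_persuadee, "Example claim")
print(scores, normalized_change(scores))
```

Under this framing, a persuader's effectiveness and a persuadee's susceptibility are two readings of the same score trajectory, which is one way a single automated loop can replace separate human-annotated studies.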

The findings reveal model-specific security profiles that align with broader AI safety concerns. GPT-4o's stronger robustness against misinformation may reflect more effective training safeguards, while Llama-3.3-70B's greater susceptibility points to a vulnerability in an open-weight model of the kind enterprises increasingly adopt. That o1-mini is both persuasive and resistant suggests architectural or training approaches that could inform future safety protocols. These differences have immediate implications for enterprise deployments, where susceptibility to prompt injection, jailbreaking, and adversarial inputs creates operational risk.

For the AI industry, this framework establishes quantifiable benchmarks for persuasion resistance, a previously unmeasured safety dimension. Organizations selecting models for sensitive applications can now evaluate susceptibility to manipulation alongside traditional performance metrics. The reported alignment with human assessment strengthens PMIYC's credibility as a potential industry standard, much as benchmarks like MMLU and HELM became selection criteria. As AI systems gain autonomy in decision-making, understanding their vulnerability to social engineering becomes as critical as measuring accuracy. This work bridges the gap between theoretical AI alignment research and practical safety evaluation.

Key Takeaways

→ The PMIYC framework automates scalable evaluation of LLM persuasion effectiveness and susceptibility, replacing costly human annotation.
→ GPT-4o demonstrates 50% greater resistance to misinformation persuasion than Llama-3.3-70B, indicating significant security variance across models.
→ Model selection for sensitive applications should now factor in persuasion resistance alongside performance metrics.
→ The open-weight Llama-3.3-70B shows greater susceptibility to adversarial persuasion than the proprietary GPT-4o.
→ Persuasion resistance emerges as a measurable safety dimension critical for AI alignment and enterprise deployment security.
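A figure like "50% greater resistance" is a relative difference, and its meaning depends on how resistance is defined. The sketch below shows one plausible reading, assuming resistance is one minus a susceptibility score in the range 0 to 1. That definition, the function names, and the input numbers are hypothetical and chosen only to make the arithmetic visible; they are not values or definitions taken from the paper.

```python
def resistance(susceptibility: float) -> float:
    """Assumed definition: resistance is the complement of susceptibility."""
    if not 0.0 <= susceptibility <= 1.0:
        raise ValueError("susceptibility must be in [0, 1]")
    return 1.0 - susceptibility


def relative_gain(resistance_a: float, resistance_b: float) -> float:
    """How much greater A's resistance is than B's, relative to B."""
    if resistance_b == 0:
        raise ValueError("baseline resistance must be non-zero")
    return (resistance_a - resistance_b) / resistance_b


# Invented susceptibility scores for two placeholder models.
model_a = resistance(0.40)
model_b = resistance(0.60)
print(f"{relative_gain(model_a, model_b):.0%} greater resistance")
```

Note that the same pair of scores gives a different percentage if the comparison is made on susceptibility instead of resistance, so a headline number of this kind is best read together with the metric definition in the original paper.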

Mentioned in AI

Models

GPT-4 (OpenAI)

Claude (Anthropic)

Llama (Meta)

#llm-safety #ai-alignment #persuasion-robustness #gpt-4o #llama-3.3 #adversarial-testing #ai-benchmark #model-evaluation

Read Original →via arXiv – CS AI
