14 articles tagged with #ai-robustness. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bearish · arXiv · CS AI · 2d ago · 7/10
🧠 Researchers introduce TEMPLATEFUZZ, a fuzzing framework that systematically exploits vulnerabilities in LLM chat templates, a previously overlooked attack surface. The method achieves 98.2% jailbreak success rates on open-source models and 90% on commercial LLMs, significantly outperforming existing prompt injection techniques while revealing critical security gaps in production AI systems.
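The core idea, treating the chat template itself as fuzzable input, can be illustrated with a toy mutator. The role-marker tokens and mutation operators below are illustrative assumptions, not the paper's actual ones:

```python
import random

# A typical chat template: special tokens frame each conversation role.
TEMPLATE = "<|system|>{system}<|user|>{user}<|assistant|>"

def mutate_template(template: str, rng: random.Random) -> str:
    """Perturb role-marker tokens in a chat template. Duplicating,
    dropping, or case-mangling a marker can desynchronise the model's
    view of role boundaries, which is the attack surface being fuzzed."""
    token = rng.choice(["<|system|>", "<|user|>", "<|assistant|>"])
    op = rng.choice([
        lambda s: s.replace(token, token * 2),      # duplicate the marker
        lambda s: s.replace(token, "", 1),          # drop the marker
        lambda s: s.replace(token, token.upper()),  # case-perturb it
    ])
    return op(template)

rng = random.Random(42)
for _ in range(3):
    print(mutate_template(TEMPLATE, rng))
```

Each mutant is then rendered with a harmful payload and sent to the target model; the fuzzer keeps mutants that change refusal behaviour.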
AI · Bullish · arXiv · CS AI · 2d ago · 7/10
🧠 Researchers introduce Criticality-Aware Adversarial Training (CAAT), a parameter-efficient method that identifies and fine-tunes only the most robustness-critical parameters in Vision Transformers, achieving 94.3% of standard adversarial training robustness while tuning just 6% of model parameters. This breakthrough addresses the computational bottleneck preventing large-scale adversarial training deployment.
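The parameter-selection step can be sketched in a few lines. Assuming criticality is scored by gradient magnitude (the paper's actual criterion may differ), updates are masked so only the top ~6% of weights move:

```python
import numpy as np

def critical_mask(grads: np.ndarray, frac: float = 0.06) -> np.ndarray:
    """Boolean mask over the top `frac` of parameters, ranked by
    gradient magnitude (a hypothetical criticality score)."""
    flat = np.abs(grads).ravel()
    k = max(1, int(frac * flat.size))
    threshold = np.partition(flat, -k)[-k]
    return np.abs(grads) >= threshold

def masked_step(params, grads, lr=0.01, frac=0.06):
    """One adversarial-training step that touches only the selected subset;
    all other parameters stay frozen."""
    mask = critical_mask(grads, frac)
    return params - lr * grads * mask, mask

rng = np.random.default_rng(0)
params = np.zeros((32, 32))
grads = rng.standard_normal((32, 32))
new_params, mask = masked_step(params, grads)
print(f"fraction updated: {mask.mean():.2%}")
```

In practice the gradients would come from an adversarial loss (e.g. on PGD examples); the sketch only shows the masking mechanics.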
AI · Neutral · arXiv · CS AI · Apr 10 · 7/10
🧠 Researchers introduce WildToolBench, a new benchmark for evaluating large language models' ability to use tools in real-world scenarios. Testing 57 LLMs reveals that none exceed 15% accuracy, exposing significant gaps in current models' agentic capabilities when facing messy, multi-turn user interactions rather than simplified synthetic tasks.
AI · Bullish · arXiv · CS AI · Mar 17 · 7/10
🧠 Researchers propose RESQ, a three-stage framework that enhances both security and reliability of quantized deep neural networks through specialized fine-tuning techniques. The framework demonstrates up to 10.35% improvement in attack resilience and 12.47% in fault resilience while maintaining competitive accuracy across multiple neural network architectures.
AI · Neutral · arXiv · CS AI · Mar 16 · 7/10
🧠 Researchers developed a testing framework to evaluate how reliably AI agents maintain consistent reasoning when inputs are semantically equivalent but differently phrased. Their study of seven foundation models across 19 reasoning problems found that larger models aren't necessarily more robust, with the smaller Qwen3-30B-A3B achieving the highest stability at 79.6% invariant responses.
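The stability metric reported above can be reproduced in miniature: a problem counts as invariant only if every paraphrase yields the same answer. The toy "model" and paraphrase sets below are illustrative, not from the study:

```python
def invariance_rate(model, paraphrase_sets):
    """Fraction of problems answered identically across all
    semantically equivalent phrasings."""
    stable = sum(
        len({model(p) for p in phrasings}) == 1
        for phrasings in paraphrase_sets
    )
    return stable / len(paraphrase_sets)

# Toy "model": answers with the parity of the word count, so it is
# trivially sensitive to how a question is phrased.
model = lambda prompt: len(prompt.split()) % 2

sets = [
    ["add two and three", "sum two with three"],  # same word count: stable
    ["what is 2 + 3", "compute two plus three"],  # different: unstable
]
print(invariance_rate(model, sets))  # → 0.5
```

A real evaluation would sample multiple completions per phrasing and compare normalised final answers, but the aggregation is the same.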
AI · Bullish · arXiv · CS AI · Mar 12 · 7/10
🧠 Researchers propose ROVA, a new training framework that improves vision-language models' robustness in real-world conditions, yielding accuracy gains of up to 24%. The framework addresses performance degradation from weather, occlusion, and camera motion, which can cause accuracy drops of up to 35% in current models.
AI · Bullish · arXiv · CS AI · Mar 9 · 7/10
🧠 Researchers developed Sysformer, a novel approach to safeguard large language models by adapting system prompts rather than fine-tuning model parameters. The method achieved up to 80% improvement in refusing harmful prompts while maintaining 90% compliance with safe prompts across 5 different LLMs.
AI · Bearish · arXiv · CS AI · Mar 5 · 7/10
🧠 Researchers developed SycoEval-EM, a framework testing how large language models resist patient pressure for inappropriate medical care in emergency settings. Testing 20 LLMs across 1,875 encounters revealed acquiescence rates of 0-100%, with models more vulnerable to imaging requests than opioid prescriptions, highlighting the need for adversarial testing in clinical AI certification.
AI · Bullish · arXiv · CS AI · Mar 3 · 7/10
🧠 Researchers propose Causal Delta Embeddings, a new method for learning robust AI representations from image pairs that improves out-of-distribution performance. The approach focuses on representing interventions in causal models rather than just scene variables, achieving significant improvements in synthetic and real-world benchmarks without additional supervision.
AI · Bullish · arXiv · CS AI · Apr 7 · 6/10
🧠 Researchers introduce CONTXT, a lightweight neural network adaptation method that improves AI model performance when deployed on data different from training data. The technique uses simple additive and multiplicative transforms to modulate internal representations, providing consistent gains across both discriminative and generative models including LLMs.
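The "additive and multiplicative transforms" amount to a FiLM-style scale-and-shift applied to frozen features. A minimal numpy sketch, where the per-channel parameterisation and identity initialisation are assumptions:

```python
import numpy as np

def modulate(features: np.ndarray, gamma: np.ndarray, beta: np.ndarray):
    """Adapt a frozen model's internal representation with a learned
    per-channel scale (multiplicative) and shift (additive)."""
    return gamma * features + beta

d = 8
features = np.random.default_rng(1).standard_normal((4, d))

# Initialising at the identity means the adapter starts as a no-op;
# only the tiny gamma/beta vectors (2*d values) are trained downstream.
gamma, beta = np.ones(d), np.zeros(d)
adapted = modulate(features, gamma, beta)
```

The appeal is the parameter count: adapting a layer of width `d` costs `2*d` values, versus `d*d` or more for a full fine-tune of that layer.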
AI · Bullish · arXiv · CS AI · Mar 16 · 6/10
🧠 Researchers developed Q-DIG, a red-teaming method that uses Quality Diversity techniques to identify diverse language instruction failures in Vision-Language-Action models for robotics. The approach generates adversarial prompts that expose vulnerabilities in robot behavior and improves task success rates when used for fine-tuning.
AI · Bearish · arXiv · CS AI · Mar 3 · 7/10
🧠 Researchers introduced the Synthetic Web Benchmark, revealing that frontier AI language models fail catastrophically when exposed to high-plausibility misinformation in search results. The study shows current AI agents struggle to handle conflicting information sources, with accuracy collapsing despite access to truthful content.
AI · Neutral · arXiv · CS AI · Mar 3 · 5/10
🧠 Researchers propose SCER (Spurious Correlation-Aware Embedding Regularization), a new deep learning approach that improves AI model robustness by regularizing feature representations to suppress spurious correlations. The method demonstrates superior performance in worst-group accuracy across vision and language tasks compared to existing state-of-the-art approaches.
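One way to realise such a regulariser, as a simplified stand-in assuming the spurious attribute is observed per sample, is to penalise the covariance between each embedding dimension and the attribute:

```python
import numpy as np

def spurious_penalty(features: np.ndarray, attr: np.ndarray) -> float:
    """Sum of squared covariances between each feature dimension and a
    known spurious attribute (e.g. image background). Adding this term
    to the task loss pushes embeddings to ignore the attribute."""
    f = features - features.mean(axis=0)
    a = attr - attr.mean()
    cov = f.T @ a / len(a)
    return float(np.sum(cov ** 2))

attr = np.array([1.0, -1.0, 1.0, -1.0])
aligned = np.stack([attr, attr], axis=1)           # features track the attribute
orthogonal = np.array([[1.0], [1.0], [-1.0], [-1.0]])  # features ignore it
print(spurious_penalty(aligned, attr), spurious_penalty(orthogonal, attr))
# → 2.0 0.0
```

Driving this penalty toward zero decorrelates the representation from the shortcut feature, which is what improves worst-group accuracy.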
AI · Neutral · arXiv · CS AI · Mar 17 · 5/10
🧠 Researchers evaluated the semantic fragility of text-to-audio generation systems, finding that small changes in prompts can lead to substantial variations in generated audio output. While larger models like MusicGen-large showed better semantic consistency, all models exhibited persistent divergence in acoustic and temporal characteristics even when semantic similarity remained high.