🧠

AI

21,458 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

21458 articles

AIBullisharXiv – CS AI · Mar 37/108

🧠

Exact and Asymptotically Complete Robust Verifications of Neural Networks via Quantum Optimization

Researchers have developed quantum optimization models for robust verification of deep neural networks against adversarial attacks. The approach provides exact verification for ReLU networks and asymptotically complete verification for networks with general activation functions like sigmoid and tanh.

AINeutralarXiv – CS AI · Mar 37/106

🧠

Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling

Researchers present CLBC, a new protocol to prevent AI language model agents from hiding coordination in seemingly compliant messages. The system uses verifier-bound communication where messages must pass through a small verifier with proof-bound envelopes to be admitted to transcript state.

AINeutralarXiv – CS AI · Mar 36/107

🧠

Challenges in Enabling Private Data Valuation

Researchers identify fundamental conflicts between data privacy and data valuation methods used in AI training. The study shows that differential privacy requirements often destroy the fine-grained distinctions needed for effective data valuation, particularly for rare or influential examples.

AIBullisharXiv – CS AI · Mar 37/107

🧠

MuonRec: Shifting the Optimizer Paradigm Beyond Adam in Scalable Generative Recommendation

Researchers introduce MuonRec, a new optimization framework for recommendation systems that significantly outperforms the widely-used Adam/AdamW optimizers. The framework reduces training steps by 32.4% on average while improving ranking quality by 12.6% in NDCG@10 metrics across traditional and generative recommenders.

AINeutralarXiv – CS AI · Mar 36/107

🧠

When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

Researchers fine-tuned the Llama 2 7B model using real patient-doctor interaction transcripts to improve medical query responses, but found significant discrepancies between automatic similarity metrics and GPT-4 evaluations. The study highlights the challenges in evaluating AI medical models and recommends human medical expert review for proper validation.

AIBullisharXiv – CS AI · Mar 36/108

🧠

AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution

Researchers introduced AlignVAR, a new visual autoregressive framework for image super-resolution that delivers 10x faster inference with 50% fewer parameters than leading diffusion-based approaches. The system addresses key challenges in image reconstruction through improved spatial consistency and hierarchical constraints, establishing a more efficient paradigm for high-quality image enhancement.

AIBullisharXiv – CS AI · Mar 36/108

🧠

RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment

Researchers introduced RAISE, a training-free evolutionary framework that improves text-to-image generation by adaptively refining outputs based on prompt complexity. The system achieves state-of-the-art alignment scores while reducing computational costs by 30-80% compared to existing methods.

AIBullisharXiv – CS AI · Mar 36/107

🧠

Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks

Researchers introduce Polynomial Surrogate Training (PST) to enable differentiable ternary logic gate networks, reducing parameters by 2,187x while maintaining performance. The method extends beyond binary logic gates to ternary systems with an UNKNOWN state for uncertainty handling, training 2-3x faster than binary networks.

AIBullisharXiv – CS AI · Mar 36/106

🧠

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

Researchers developed SWAP (Step-wise Adaptive Penalization), a new AI training method that makes large reasoning models more efficient by reducing unnecessary steps in chain-of-thought reasoning. The technique reduces reasoning length by 64.3% while improving accuracy by 5.7%, addressing the costly problem of AI models 'overthinking' during problem-solving.

AIBullisharXiv – CS AI · Mar 36/1012

🧠

Efficient Flow Matching for Sparse-View CT Reconstruction

Researchers developed FMCT/EFMCT, a new Flow Matching-based framework for CT medical imaging reconstruction that significantly improves computational efficiency over existing diffusion models. The method uses deterministic ordinary differential equations and velocity field reuse to reduce neural network evaluations while maintaining reconstruction quality.

AIBullisharXiv – CS AI · Mar 36/107

🧠

LiaisonAgent: An Multi-Agent Framework for Autonomous Risk Investigation and Governance

Researchers introduce LiaisonAgent, an autonomous multi-agent cybersecurity system built on the QWQ-32B reasoning model that automates risk investigation and governance for Security Operations Centers. The system achieves 97.8% success rate in tool-calling and 95% accuracy in risk judgment while reducing manual investigation overhead by 92.7%.

AINeutralarXiv – CS AI · Mar 37/109

🧠

Universal NP-Hardness of Clustering under General Utilities

Researchers prove that clustering problems in machine learning are universally NP-hard, providing theoretical explanation for why clustering algorithms often produce unstable results. The study demonstrates that major clustering methods like k-means and spectral clustering inherit fundamental computational intractability, explaining common failure modes like local optima.

AIBullisharXiv – CS AI · Mar 37/108

🧠

VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models

Researchers developed VisRef, a new framework that improves visual reasoning in large AI models by re-injecting relevant visual tokens during the reasoning process. The method avoids expensive reinforcement learning fine-tuning while achieving up to 6.4% performance improvements on visual reasoning benchmarks.

AIBullisharXiv – CS AI · Mar 36/106

🧠

Stateful Token Reduction for Long-Video Hybrid VLMs

Researchers developed a new token reduction method for hybrid vision-language models that process long videos, achieving 3.8-4.2x speedup while retaining only 25% of visual tokens. The approach uses progressive reduction and unified scoring for both attention and Mamba blocks, maintaining near-baseline accuracy on long-context video benchmarks.

$NEAR

AIBearisharXiv – CS AI · Mar 37/109

🧠

Physical Evaluation of Naturalistic Adversarial Patches for Camera-Based Traffic-Sign Detection

Researchers evaluated Naturalistic Adversarial Patches (NAPs) that can fool autonomous vehicle traffic sign detection systems in physical environments. The study used a custom dataset and YOLOv5 model to generate patches that successfully reduced STOP sign detection confidence across various real-world testing conditions.

AIBullisharXiv – CS AI · Mar 37/107

🧠

Your Inference Request Will Become a Black Box: Confidential Inference for Cloud-based Large Language Models

Researchers propose Talaria, a new confidential inference framework that protects client data privacy when using cloud-hosted Large Language Models. The system partitions LLM operations between client-controlled environments and cloud GPUs, reducing token reconstruction attacks from 97.5% to 1.34% accuracy while maintaining model performance.

AINeutralarXiv – CS AI · Mar 37/106

🧠

Formal Analysis and Supply Chain Security for Agentic AI Skills

Researchers developed SkillFortify, the first formal analysis framework for securing AI agent skill supply chains, addressing critical vulnerabilities exposed by attacks like ClawHavoc that infiltrated over 1,200 malicious skills. The framework achieved 96.95% F1 score with 100% precision and zero false positives in detecting malicious AI agent skills.

AIBullisharXiv – CS AI · Mar 36/1010

🧠

Efficient Long-Horizon GUI Agents via Training-Free KV Cache Compression

Researchers developed ST-Lite, a training-free KV cache compression framework that accelerates GUI agents by 2.45x while using only 10-20% of the cache budget. The solution addresses memory and latency constraints in Vision-Language Models for autonomous GUI interactions through specialized attention pattern optimization.

AIBullisharXiv – CS AI · Mar 36/108

🧠

RLShield: Practical Multi-Agent RL for Financial Cyber Defense with Attack-Surface MDPs and Real-Time Response Orchestration

Researchers have developed RLShield, a multi-agent reinforcement learning system designed to automate cyber defense in financial institutions. The system uses AI to coordinate real-time responses across multiple assets and services during cyberattacks, balancing containment speed with operational costs and business disruption.

AIBullisharXiv – CS AI · Mar 36/107

🧠

ThreatFormer-IDS: Robust Transformer Intrusion Detection with Zero-Day Generalization and Explainable Attribution

Researchers developed ThreatFormer-IDS, a Transformer-based intrusion detection system that achieves robust cybersecurity monitoring for IoT and industrial networks. The system demonstrates superior performance in detecting zero-day attacks while providing explainable threat attribution, achieving 99.4% AUC-ROC on benchmark tests.

AIBullisharXiv – CS AI · Mar 37/107

🧠

NNiT: Width-Agnostic Neural Network Generation with Structurally Aligned Weight Spaces

Researchers introduced Neural Network Diffusion Transformers (NNiTs), a new approach that generates neural network parameters in a width-agnostic manner by treating weight matrices as tokenized patches. The method achieves over 85% success on unseen network architectures in robotics tasks, solving key challenges in generative modeling of neural networks.

AINeutralarXiv – CS AI · Mar 37/108

🧠

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

Researchers introduce SafeSci, a comprehensive framework for evaluating safety in large language models used for scientific applications. The framework includes a 0.25M sample benchmark and 1.5M sample training dataset, revealing critical vulnerabilities in 24 advanced LLMs while demonstrating that fine-tuning can significantly improve safety alignment.

AINeutralarXiv – CS AI · Mar 37/107

🧠

SKeDA: A Generative Watermarking Framework for Text-to-video Diffusion Models

Researchers propose SKeDA, a new watermarking framework for text-to-video AI models that addresses content authenticity and copyright protection concerns. The system uses shuffle-key-based sampling and differential attention to maintain watermark robustness against video distortions while preserving generation quality.

AIBullisharXiv – CS AI · Mar 36/107

🧠

Zero-Shot and Supervised Bird Image Segmentation Using Foundation Models: A Dual-Pipeline Approach with Grounding DINO~1.5, YOLOv11, and SAM~2.1

Researchers developed a dual-pipeline framework for bird image segmentation using foundation models including Grounding DINO 1.5, YOLOv11, and SAM 2.1. The supervised pipeline achieved state-of-the-art results with 0.912 IoU on the CUB-200-2011 dataset, while the zero-shot pipeline achieved 0.831 IoU using only text prompts.

AIBullisharXiv – CS AI · Mar 36/109

🧠

Engineering FAIR Privacy-preserving Applications that Learn Histories of Disease

Researchers successfully developed a privacy-preserving healthcare AI application that runs entirely in web browsers without downloads, using ONNX and JavaScript SDK for client-side inference. The project demonstrates how generative AI models for predicting disease risk can be deployed securely while maintaining data privacy in sensitive medical applications.

← PrevPage 559 of 859Next →