🧠

AI

12,999 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

12999 articles

AIBullisharXiv – CS AI · Mar 37/108

🧠

Exact and Asymptotically Complete Robust Verifications of Neural Networks via Quantum Optimization

Researchers have developed quantum optimization models for robust verification of deep neural networks against adversarial attacks. The approach provides exact verification for ReLU networks and asymptotically complete verification for networks with general activation functions like sigmoid and tanh.

AIBullisharXiv – CS AI · Mar 36/107

🧠

Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks

Researchers introduce Polynomial Surrogate Training (PST) to enable differentiable ternary logic gate networks, reducing parameters by 2,187x while maintaining performance. The method extends beyond binary logic gates to ternary systems with an UNKNOWN state for uncertainty handling, training 2-3x faster than binary networks.

AINeutralarXiv – CS AI · Mar 36/107

🧠

When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

Researchers fine-tuned the Llama 2 7B model using real patient-doctor interaction transcripts to improve medical query responses, but found significant discrepancies between automatic similarity metrics and GPT-4 evaluations. The study highlights the challenges in evaluating AI medical models and recommends human medical expert review for proper validation.

AIBullisharXiv – CS AI · Mar 36/106

🧠

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

Researchers developed SWAP (Step-wise Adaptive Penalization), a new AI training method that makes large reasoning models more efficient by reducing unnecessary steps in chain-of-thought reasoning. The technique reduces reasoning length by 64.3% while improving accuracy by 5.7%, addressing the costly problem of AI models 'overthinking' during problem-solving.

AINeutralarXiv – CS AI · Mar 37/109

🧠

Universal NP-Hardness of Clustering under General Utilities

Researchers prove that clustering problems in machine learning are universally NP-hard, providing theoretical explanation for why clustering algorithms often produce unstable results. The study demonstrates that major clustering methods like k-means and spectral clustering inherit fundamental computational intractability, explaining common failure modes like local optima.

AINeutralarXiv – CS AI · Mar 36/107

🧠

Challenges in Enabling Private Data Valuation

Researchers identify fundamental conflicts between data privacy and data valuation methods used in AI training. The study shows that differential privacy requirements often destroy the fine-grained distinctions needed for effective data valuation, particularly for rare or influential examples.

AINeutralarXiv – CS AI · Mar 36/108

🧠

Transformers Remember First, Forget Last: Dual-Process Interference in LLMs

Research analyzing 39 large language models reveals they exhibit proactive interference (remembering early information over recent) unlike humans who typically show retroactive interference. The study found this pattern is universal across all tested LLMs, with larger models showing better resistance to retroactive interference but unchanged proactive interference patterns.

AIBearisharXiv – CS AI · Mar 37/109

🧠

Physical Evaluation of Naturalistic Adversarial Patches for Camera-Based Traffic-Sign Detection

Researchers evaluated Naturalistic Adversarial Patches (NAPs) that can fool autonomous vehicle traffic sign detection systems in physical environments. The study used a custom dataset and YOLOv5 model to generate patches that successfully reduced STOP sign detection confidence across various real-world testing conditions.

AIBullisharXiv – CS AI · Mar 36/106

🧠

Stateful Token Reduction for Long-Video Hybrid VLMs

Researchers developed a new token reduction method for hybrid vision-language models that process long videos, achieving 3.8-4.2x speedup while retaining only 25% of visual tokens. The approach uses progressive reduction and unified scoring for both attention and Mamba blocks, maintaining near-baseline accuracy on long-context video benchmarks.

$NEAR

AIBearisharXiv – CS AI · Mar 37/107

🧠

CaptionFool: Universal Image Captioning Model Attacks

Researchers have developed CaptionFool, a universal adversarial attack that can manipulate AI image captioning models by modifying just 1.2% of image patches. The attack achieves 94-96% success rates in forcing models to generate arbitrary captions, including offensive content that can bypass content moderation systems.

AIBullisharXiv – CS AI · Mar 36/107

🧠

LiaisonAgent: An Multi-Agent Framework for Autonomous Risk Investigation and Governance

Researchers introduce LiaisonAgent, an autonomous multi-agent cybersecurity system built on the QWQ-32B reasoning model that automates risk investigation and governance for Security Operations Centers. The system achieves 97.8% success rate in tool-calling and 95% accuracy in risk judgment while reducing manual investigation overhead by 92.7%.

AINeutralarXiv – CS AI · Mar 37/107

🧠

Forgetting is Competition: Rethinking Unlearning as Representation Interference in Diffusion Models

Researchers introduce SurgUn, a surgical unlearning method for text-to-image diffusion models that enables precise removal of specific visual concepts while preserving other capabilities. The approach addresses challenges in copyright compliance and content policy enforcement by applying targeted weight-space updates based on retroactive interference theory.

AIBullisharXiv – CS AI · Mar 36/1012

🧠

Efficient Flow Matching for Sparse-View CT Reconstruction

Researchers developed FMCT/EFMCT, a new Flow Matching-based framework for CT medical imaging reconstruction that significantly improves computational efficiency over existing diffusion models. The method uses deterministic ordinary differential equations and velocity field reuse to reduce neural network evaluations while maintaining reconstruction quality.

AIBullisharXiv – CS AI · Mar 37/108

🧠

VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models

Researchers developed VisRef, a new framework that improves visual reasoning in large AI models by re-injecting relevant visual tokens during the reasoning process. The method avoids expensive reinforcement learning fine-tuning while achieving up to 6.4% performance improvements on visual reasoning benchmarks.

AIBullisharXiv – CS AI · Mar 36/107

🧠

Toward Graph-Tokenizing Large Language Models with Reconstructive Graph Instruction Tuning

Researchers have developed RGLM, a new approach to improve how large language models understand and process graph data by incorporating explicit graph supervision alongside text instructions. The method addresses limitations in existing Graph-Tokenizing LLMs that rely too heavily on text supervision, leading to underutilization of graph context.

AINeutralarXiv – CS AI · Mar 37/106

🧠

Verifier-Bound Communication for LLM Agents: Certified Bounds on Covert Signaling

Researchers present CLBC, a new protocol to prevent AI language model agents from hiding coordination in seemingly compliant messages. The system uses verifier-bound communication where messages must pass through a small verifier with proof-bound envelopes to be admitted to transcript state.

AIBullisharXiv – CS AI · Mar 36/109

🧠

Engineering FAIR Privacy-preserving Applications that Learn Histories of Disease

Researchers successfully developed a privacy-preserving healthcare AI application that runs entirely in web browsers without downloads, using ONNX and JavaScript SDK for client-side inference. The project demonstrates how generative AI models for predicting disease risk can be deployed securely while maintaining data privacy in sensitive medical applications.

AIBullisharXiv – CS AI · Mar 36/107

🧠

Zero-Shot and Supervised Bird Image Segmentation Using Foundation Models: A Dual-Pipeline Approach with Grounding DINO~1.5, YOLOv11, and SAM~2.1

Researchers developed a dual-pipeline framework for bird image segmentation using foundation models including Grounding DINO 1.5, YOLOv11, and SAM 2.1. The supervised pipeline achieved state-of-the-art results with 0.912 IoU on the CUB-200-2011 dataset, while the zero-shot pipeline achieved 0.831 IoU using only text prompts.

AIBullisharXiv – CS AI · Mar 36/107

🧠

ThreatFormer-IDS: Robust Transformer Intrusion Detection with Zero-Day Generalization and Explainable Attribution

Researchers developed ThreatFormer-IDS, a Transformer-based intrusion detection system that achieves robust cybersecurity monitoring for IoT and industrial networks. The system demonstrates superior performance in detecting zero-day attacks while providing explainable threat attribution, achieving 99.4% AUC-ROC on benchmark tests.

AIBearisharXiv – CS AI · Mar 37/109

🧠

Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

Researchers have discovered MM-MEPA, a new attack method that can poison multimodal AI systems by manipulating only metadata while leaving visual content unchanged. The attack achieves up to 91% success rate in disrupting AI retrieval systems and proves resistant to current defense strategies.

AIBullisharXiv – CS AI · Mar 36/108

🧠

RLShield: Practical Multi-Agent RL for Financial Cyber Defense with Attack-Surface MDPs and Real-Time Response Orchestration

Researchers have developed RLShield, a multi-agent reinforcement learning system designed to automate cyber defense in financial institutions. The system uses AI to coordinate real-time responses across multiple assets and services during cyberattacks, balancing containment speed with operational costs and business disruption.

AINeutralarXiv – CS AI · Mar 36/106

🧠

Summer-22B: A Systematic Approach to Dataset Engineering and Training at Scale for Video Foundation Model

Researchers documented their experience training Summer-22B, a video foundation model developed from scratch using 50 million clips. The report details engineering challenges, dataset curation methods, and architectural decisions, emphasizing that dataset engineering consumed the majority of development effort.

AIBullisharXiv – CS AI · Mar 37/107

🧠

NNiT: Width-Agnostic Neural Network Generation with Structurally Aligned Weight Spaces

Researchers introduced Neural Network Diffusion Transformers (NNiTs), a new approach that generates neural network parameters in a width-agnostic manner by treating weight matrices as tokenized patches. The method achieves over 85% success on unseen network architectures in robotics tasks, solving key challenges in generative modeling of neural networks.

AIBearisharXiv – CS AI · Mar 37/107

🧠

Reverse CAPTCHA: Evaluating LLM Susceptibility to Invisible Unicode Instruction Injection

Researchers developed 'Reverse CAPTCHA,' a framework that tests how large language models respond to invisible Unicode-encoded instructions embedded in normal text. The study found that AI models can follow hidden instructions that humans cannot see, with tool use dramatically increasing compliance rates and different AI providers showing distinct preferences for encoding schemes.

AINeutralarXiv – CS AI · Mar 36/108

🧠

Exploring the AI Obedience: Why is Generating a Pure Color Image Harder than CyberPunk?

Researchers have identified a 'Paradox of Simplicity' in AI models where they excel at complex tasks but fail at simple ones like generating pure color images. A new benchmark called VIOLIN has been introduced to evaluate AI obedience and alignment with instructions across different complexity levels.

$RNDR

← PrevPage 234 of 520Next →