🧠

AI

21,455 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

21455 articles

AINeutralarXiv – CS AI · Mar 36/107

🧠

Theory of Code Space: Do Code Agents Understand Software Architecture?

Researchers introduce Theory of Code Space (ToCS), a new benchmark that evaluates AI agents' ability to understand software architecture across multi-file codebases. The study reveals significant performance gaps between frontier LLM agents and rule-based baselines, with F1 scores ranging from 0.129 to 0.646.

AIBearisharXiv – CS AI · Mar 36/106

🧠

LangGap: Diagnosing and Closing the Language Gap in Vision-Language-Action Models

Researchers reveal that state-of-the-art Vision-Language-Action (VLA) models largely ignore language instructions despite achieving 95% success on standard benchmarks. The new LangGap benchmark exposes significant language understanding deficits, with targeted data augmentation only partially addressing the fundamental challenge of diverse instruction comprehension.

AIBullisharXiv – CS AI · Mar 36/108

🧠

AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution

Researchers introduced AlignVAR, a new visual autoregressive framework for image super-resolution that delivers 10x faster inference with 50% fewer parameters than leading diffusion-based approaches. The system addresses key challenges in image reconstruction through improved spatial consistency and hierarchical constraints, establishing a more efficient paradigm for high-quality image enhancement.

AIBullisharXiv – CS AI · Mar 36/108

🧠

IdGlow: Dynamic Identity Modulation for Multi-Subject Generation

IdGlow introduces a new AI framework for generating images with multiple subjects that preserves individual identities while creating coherent scenes. The system uses a two-stage approach with Flow Matching diffusion models and addresses the challenge of maintaining identity fidelity during complex transformations like age changes.

AINeutralarXiv – CS AI · Mar 36/107

🧠

DeepAFL: Deep Analytic Federated Learning

Researchers propose DeepAFL, a new federated learning approach that uses gradient-free analytical solutions to address heterogeneity and scalability issues in traditional gradient-based FL systems. The method incorporates deep residual blocks with closed-form solutions, achieving 5.68%-8.42% performance improvements over existing baselines across benchmark datasets.

AIBullisharXiv – CS AI · Mar 37/107

🧠

Enhancing Molecular Property Predictions by Learning from Bond Modelling and Interactions

Researchers introduce DeMol, a new dual-graph framework for molecular property prediction that explicitly models both atoms and chemical bonds to achieve superior accuracy. The approach addresses limitations of conventional atom-centric models by incorporating bond-level phenomena like resonance and stereoselectivity, establishing new state-of-the-art results across multiple benchmarks.

$ATOM

AIBearisharXiv – CS AI · Mar 37/108

🧠

MIDAS: Multi-Image Dispersion and Semantic Reconstruction for Jailbreaking MLLMs

Researchers have developed MIDAS, a new jailbreaking framework that successfully bypasses safety mechanisms in Multimodal Large Language Models by dispersing harmful content across multiple images. The technique achieved an 81.46% average attack success rate against four closed-source MLLMs by extending reasoning chains and reducing security attention.

$LINK

AIBullisharXiv – CS AI · Mar 37/107

🧠

Whisper-MLA: Reducing GPU Memory Consumption of ASR Models based on MHA2MLA Conversion

Researchers introduce Whisper-MLA, a modified version of OpenAI's Whisper speech recognition model that uses Multi-Head Latent Attention to reduce GPU memory consumption by up to 87.5% while maintaining accuracy. The innovation addresses a key scalability issue with transformer-based ASR models when processing long-form audio.

AIBearisharXiv – CS AI · Mar 37/108

🧠

Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement

Research reveals that Large Language Models (LLMs) systematically fail at code review tasks, frequently misclassifying correct code as defective when matching implementations to natural language requirements. The study found that more detailed prompts actually increase misjudgment rates, raising concerns about LLM reliability in automated development workflows.

AIBullisharXiv – CS AI · Mar 36/108

🧠

IDER: IDempotent Experience Replay for Reliable Continual Learning

Researchers propose IDER (Idempotent Experience Replay), a new continual learning method that addresses catastrophic forgetting in neural networks while improving prediction reliability. The approach uses idempotent properties to help AI models retain previously learned knowledge when acquiring new tasks, with demonstrated improvements in accuracy and reduced computational overhead.

AIBearisharXiv – CS AI · Mar 37/106

🧠

Learning to Attack: A Bandit Approach to Adversarial Context Poisoning

Researchers developed AdvBandit, a new black-box adversarial attack method that can exploit neural contextual bandits by poisoning context data without requiring access to internal model parameters. The attack uses bandit theory and inverse reinforcement learning to adaptively learn victim policies and optimize perturbations, achieving higher victim regret than existing methods.

AIBearisharXiv – CS AI · Mar 37/107

🧠

CaptionFool: Universal Image Captioning Model Attacks

Researchers have developed CaptionFool, a universal adversarial attack that can manipulate AI image captioning models by modifying just 1.2% of image patches. The attack achieves 94-96% success rates in forcing models to generate arbitrary captions, including offensive content that can bypass content moderation systems.

AIBullisharXiv – CS AI · Mar 36/106

🧠

CIRCUS: Circuit Consensus under Uncertainty via Stability Ensembles

Researchers introduce CIRCUS, a new method for discovering mechanistic circuits in AI models that addresses uncertainty and brittleness issues in current approaches. The technique creates ensemble attribution graphs and extracts consensus circuits that are 40x smaller while maintaining explanatory power, validated on Gemma-2-2B and Llama-3.2-1B models.

AIBullisharXiv – CS AI · Mar 36/108

🧠

A Polynomial-Time Axiomatic Alternative to SHAP for Feature Attribution

Researchers have developed ESENSC_rev2, a polynomial-time alternative to SHAP for AI feature attribution that offers similar accuracy with significantly improved computational efficiency. The method uses cooperative game theory and provides theoretical foundations through axiomatic characterization, making it suitable for high-dimensional explainability tasks.

AIBullisharXiv – CS AI · Mar 37/108

🧠

WirelessAgent++: Automated Agentic Workflow Design and Benchmarking for Wireless Networks

Researchers propose WirelessAgent++, an automated framework for designing AI agent workflows in wireless networks using Monte Carlo Tree Search. The system achieves superior performance on wireless tasks with test scores up to 97%, outperforming existing methods by up to 31% while maintaining low computational costs under $5 per task.

AIBullisharXiv – CS AI · Mar 36/107

🧠

ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models

Researchers propose ArtiFixer, a two-stage pipeline using auto-regressive diffusion models to enhance 3D reconstruction quality. The method addresses scalability and quality issues in existing approaches by training a bidirectional generative model with opacity mixing, then distilling it into a causal auto-regressive model that generates hundreds of frames in a single pass.

AIBullisharXiv – CS AI · Mar 36/108

🧠

RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment

Researchers introduced RAISE, a training-free evolutionary framework that improves text-to-image generation by adaptively refining outputs based on prompt complexity. The system achieves state-of-the-art alignment scores while reducing computational costs by 30-80% compared to existing methods.

AIBullisharXiv – CS AI · Mar 37/107

🧠

What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models

Researchers developed EmbedLens, a tool to analyze how multimodal large language models process visual information, finding that only 60% of visual tokens carry meaningful image-specific information. The study reveals significant inefficiencies in current MLLM architectures and proposes optimizations through selective token pruning and mid-layer injection.

AIBearisharXiv – CS AI · Mar 36/108

🧠

Atomicity for Agents: Exposing, Exploiting, and Mitigating TOCTOU Vulnerabilities in Browser-Use Agents

Researchers identified widespread TOCTOU (time of check to time of use) vulnerabilities in browser-use agents, where web pages change between planning and execution phases, potentially causing unintended actions. A study of 10 popular open-source agents revealed these security flaws are common, prompting development of a lightweight mitigation strategy based on pre-execution validation.

AIBullisharXiv – CS AI · Mar 36/107

🧠

HydroShear: Hydroelastic Shear Simulation for Tactile Sim-to-Real Reinforcement Learning

HydroShear is a new tactile simulation system for robotics that enables zero-shot sim-to-real transfer of reinforcement learning policies by accurately modeling force, shear, and stick-slip transitions. The system achieved 93% success rate across four dexterous manipulation tasks, significantly outperforming existing vision-based tactile simulation methods.

AIBullisharXiv – CS AI · Mar 37/107

🧠

ROKA: Robust Knowledge Unlearning against Adversaries

Researchers introduce ROKA, a new machine unlearning method that prevents knowledge contamination and indirect attacks on AI models. The approach uses 'Neural Healing' to preserve important knowledge while forgetting targeted data, providing theoretical guarantees for knowledge preservation during unlearning.

AIBullisharXiv – CS AI · Mar 36/109

🧠

Taxonomy-Aware Representation Alignment for Hierarchical Visual Recognition with Large Multimodal Models

Researchers propose TARA (Taxonomy-Aware Representation Alignment), a new method to improve Large Multimodal Models' ability to recognize visual categories in hierarchical taxonomies. The approach aligns visual features with biology foundation models to enable better recognition of both known and novel biological categories.

AIBullisharXiv – CS AI · Mar 37/107

🧠

An Interpretable Local Editing Model for Counterfactual Medical Image Generation

Researchers developed InstructX2X, a new AI model for generating counterfactual medical images that provides interpretable explanations and prevents unintended modifications. The model achieves state-of-the-art performance in creating high-quality chest X-ray images with visual guidance maps for medical applications.

AIBullisharXiv – CS AI · Mar 37/107

🧠

FastBUS: A Fast Bayesian Framework for Unified Weakly-Supervised Learning

Researchers propose FastBUS, a new Bayesian framework for weakly-supervised machine learning that addresses computational inefficiencies in existing methods. The framework uses probabilistic transitions and belief propagation to achieve state-of-the-art results while delivering up to hundreds of times faster processing speeds than current general methods.

AIBullisharXiv – CS AI · Mar 37/107

🧠

MuonRec: Shifting the Optimizer Paradigm Beyond Adam in Scalable Generative Recommendation

Researchers introduce MuonRec, a new optimization framework for recommendation systems that significantly outperforms the widely-used Adam/AdamW optimizers. The framework reduces training steps by 32.4% on average while improving ranking quality by 12.6% in NDCG@10 metrics across traditional and generative recommenders.

← PrevPage 558 of 859Next →