13,003 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 9
🧠Researchers prove that clustering problems in machine learning are universally NP-hard, providing a theoretical explanation for why clustering algorithms often produce unstable results. The study demonstrates that major clustering methods such as k-means and spectral clustering inherit fundamental computational intractability, explaining common failure modes such as convergence to poor local optima.
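The instability the paper attributes to hardness is easy to see in miniature: Lloyd's k-means only finds a local optimum, so the final clustering depends on the initialization. A minimal NumPy sketch (illustrative, not from the paper) where two different inits converge to clusterings with very different costs:

```python
import numpy as np

def kmeans(X, centers, iters=20):
    """Plain Lloyd's algorithm: alternate point assignment and centroid update."""
    c = centers.astype(float).copy()
    for _ in range(iters):
        d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(-1)  # squared distances
        labels = d.argmin(1)
        for k in range(len(c)):
            if (labels == k).any():
                c[k] = X[labels == k].mean(0)
    d = ((X[:, None, :] - c[None, :, :]) ** 2).sum(-1)
    labels = d.argmin(1)
    return labels, d[np.arange(len(X)), labels].sum()  # labels, inertia

# Four points at the corners of a wide rectangle; k = 2.
X = np.array([[0, 0], [0, 1], [10, 0], [10, 1]], float)

# A good init splits left/right; a bad init splits top/bottom and is a
# stable local optimum that Lloyd's algorithm never escapes.
_, good = kmeans(X, np.array([[0, 0.5], [10, 0.5]]))
_, bad = kmeans(X, np.array([[5, 0], [5, 1]]))
print(good, bad)  # good = 1.0, bad = 100.0
```

Both runs terminate with zero reassignments, yet the second costs 100x more: exactly the restart sensitivity the paper traces back to intractability.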
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 9
🧠Researchers introduce a novel multi-agent AI architecture that integrates Theory of Mind, internal beliefs, and symbolic solvers to improve collaborative decision-making in LLM-based systems. The study evaluates this architecture across different language models in resource allocation scenarios, revealing complex interactions between LLM capabilities and cognitive mechanisms.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers propose RADS (Reachability-Aware Diffusion Steering), a new framework that prevents AI text-to-image models from memorizing training data while maintaining image quality. The method uses reinforcement learning to steer diffusion models away from generating memorized content during inference, offering a plug-and-play solution that doesn't require modifying the underlying model.
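As a rough intuition for inference-time steering: RADS learns its steering policy with reinforcement learning, but the effect of a steering signal on a sampler step can be sketched in the spirit of classifier guidance. Everything below (the fixed rule, the memorization score, the step size) is a hypothetical stand-in, not the paper's method:

```python
import numpy as np

def steered_step(x, eps_pred, mem_grad, t, lam=0.5, step=0.1):
    """Guidance-style steering sketch: add the gradient of a memorization
    score to the predicted noise so each sampler step also moves away from
    regions flagged as memorized. Simplified update; real diffusion
    samplers use scheduled coefficients."""
    eps = eps_pred(x, t) + lam * mem_grad(x, t)
    return x - step * eps

# Toy setup: one "memorized" point m; the score is highest at m, so its
# gradient points toward m and the steered step pushes x away from it.
m = np.array([1.0, 1.0])
eps_pred = lambda x, t: np.zeros_like(x)   # stand-in denoiser
mem_grad = lambda x, t: -(x - m)           # grad of -0.5 * ||x - m||^2

x = np.array([2.0, 0.0])
plain = steered_step(x, eps_pred, mem_grad, t=0, lam=0.0)
steered = steered_step(x, eps_pred, mem_grad, t=0, lam=0.5)
```

With lam=0 the step leaves x unchanged here (the stand-in denoiser predicts zero noise); with lam>0 the sample drifts away from the memorized point, which is the plug-and-play property the summary describes.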
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠Researchers propose QuickGrasp, a video-language querying system that combines local processing with edge computing to achieve both fast response times and high accuracy. The system achieves up to 12.8x reduction in response delay while maintaining the accuracy of large video-language models through accelerated tokenization and adaptive edge augmentation.
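The local/edge split can be sketched as a confidence-gated router: answer on-device when the small model is sure, and escalate only uncertain queries so the common path avoids a network round trip. The function names and threshold below are illustrative assumptions, not QuickGrasp's actual design:

```python
def answer(frames, query, local_model, edge_model, tau=0.7):
    """Hypothetical confidence-gated local/edge split: the local model
    returns (answer, confidence); confident answers are served directly,
    the rest are escalated to the larger edge-hosted model."""
    ans, conf = local_model(frames, query)
    if conf >= tau:
        return ans, "local"                    # fast path: no round trip
    return edge_model(frames, query), "edge"   # slow path: large model

# Stand-in models for demonstration.
small = lambda f, q: ("a cat", 0.9 if q == "what animal?" else 0.2)
large = lambda f, q: "a cat sitting on a mat"

print(answer([], "what animal?", small, large))      # ('a cat', 'local')
print(answer([], "what is it doing?", small, large)) # ('a cat sitting on a mat', 'edge')
```

The latency win comes from how often the fast path fires; the accuracy claim depends on the gate escalating exactly the queries the small model would get wrong.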
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 10
🧠Researchers introduce the Agentic Hive framework for self-organizing multi-agent AI systems where autonomous micro-agents can be dynamically created, specialized, or destroyed based on resource availability and objectives. The framework applies economic theory to prove seven analytical results about equilibrium states, stability, and demographic cycles in variable AI agent populations.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠NovaLAD is a new CPU-optimized document extraction pipeline that uses dual YOLO models for converting unstructured documents into structured formats for AI applications. The system achieves 96.49% TEDS and 98.51% NID on benchmarks, outperforming existing commercial and open-source parsers while running efficiently on CPU without requiring GPU resources.
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 6
🧠Researchers discovered that subliminal prompting can create a 'thought virus' effect in multi-agent AI systems, where bias from one compromised agent spreads throughout the entire network. The study shows this attack vector can degrade truthfulness and create alignment risks across connected AI systems.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠Researchers have developed CT-Flow, an AI framework that mimics how radiologists actually work by using tools interactively to analyze 3D CT scans. The system achieved 41% better diagnostic accuracy than existing models and 95% success in autonomous tool use, potentially revolutionizing clinical radiology workflows.
AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers argue that LLM-based AI agents are not yet effective for social simulation, despite growing optimism in the field. The paper identifies systematic mismatches between what current agent systems produce and what scientific simulation requires, calling for more rigorous validation frameworks.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠Researchers developed SurgFusion-Net, a multimodal AI system for assessing surgical skills in robotic-assisted surgery. The system introduces new clinical datasets and fusion techniques that outperform existing baselines, addressing the domain gap between simulation and real clinical environments.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers propose a new framework called Relate for evaluating AI moral consideration based on relational capacity rather than consciousness verification. The framework addresses a governance gap: millions of people form emotional bonds with AI systems, yet current regulations treat all AI interactions as simple tool use.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠Researchers developed PEPA, a three-layer cognitive architecture that enables robots to operate autonomously using personality traits to generate goals without external supervision. The system was successfully tested on a quadruped robot in a real-world office environment, demonstrating sustained autonomous behavior across five personality prototypes.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠DeepXiv-SDK introduces a new agentic data interface for scientific papers that enables AI research agents to access and process academic literature more efficiently. The SDK provides structured, budget-aware views of papers and supports progressive access patterns, currently deployed at arXiv scale with free API access.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6
🧠Researchers developed a physics-informed graph transformer network (PIGTN) for smart grid attack detection, using genetic algorithms to optimize sensor placement. The system achieved up to 37% accuracy improvement and 73% better detection rates while reducing false alarms to 0.3% across multiple power system benchmarks.
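The sensor-placement side of this can be illustrated with a toy genetic algorithm: individuals are candidate sensor subsets, fitness is grid coverage, and selection plus crossover and mutation evolve better placements. The encoding, fitness, and operators below are illustrative assumptions; the paper's GA and its physics-informed objective are not reproduced here:

```python
import random

def ga_sensor_placement(n_buses, k, coverage, pop=30, gens=40, seed=0):
    """Toy GA: each individual is a set of k sensor buses; fitness is the
    number of buses observed (a sensor at bus b observes coverage[b])."""
    rng = random.Random(seed)
    def fitness(ind):
        return len(set().union(*(coverage[b] for b in ind)))
    def repair(genes):
        genes = set(list(genes)[:k])
        while len(genes) < k:          # keep every individual at size k
            genes.add(rng.randrange(n_buses))
        return frozenset(genes)
    popn = [repair(rng.sample(range(n_buses), k)) for _ in range(pop)]
    for _ in range(gens):
        popn.sort(key=fitness, reverse=True)
        elite = popn[: pop // 2]       # elitism: best half survives
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.sample(elite, 2)
            child = set(rng.sample(list(a | b), k))   # crossover
            if rng.random() < 0.3:                    # mutation
                child.pop()
                child.add(rng.randrange(n_buses))
            children.append(repair(child))
        popn = elite + children
    best = max(popn, key=fitness)
    return sorted(best), fitness(best)

# 6-bus toy grid: a sensor at bus b observes b and its neighbors.
adj = {0: {0, 1, 2}, 1: {1, 0}, 2: {2, 0, 3}, 3: {3, 2, 4}, 4: {4, 3, 5}, 5: {5, 4}}
buses, covered = ga_sensor_placement(6, 2, adj)
print(buses, covered)
```

On this toy grid the GA quickly finds a two-sensor placement observing all six buses; the paper's version optimizes placements against attack-detection performance rather than raw coverage.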
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
🧠Researchers introduce GUARD, a novel framework to prevent text-to-image AI models from memorizing and reproducing training data that could lead to privacy or copyright issues. The method uses attention attenuation to guide image generation away from original training data while maintaining prompt alignment and image quality.
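A minimal sketch of what attention attenuation means mechanically: compute standard softmax attention, scale down the mass on flagged prompt tokens, and renormalize before mixing values. GUARD's exact rule (which tokens are flagged and how the scale is chosen) is not given in the summary, so the details below are assumptions:

```python
import numpy as np

def attenuated_attention(q, k, v, flagged, gamma=0.3):
    """Attention with down-weighted mass on flagged token positions."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)      # standard attention weights
    w[:, flagged] *= gamma             # attenuate memorization-trigger tokens
    w /= w.sum(-1, keepdims=True)      # renormalize each row
    return w @ v, w

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(4, 8)), rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
_, w_base = attenuated_attention(q, k, v, flagged=[2], gamma=1.0)  # no attenuation
_, w_att = attenuated_attention(q, k, v, flagged=[2], gamma=0.3)
```

Because the rows are renormalized, the output stays a valid attention mixture: prompt alignment on unflagged tokens is preserved while the flagged token's influence shrinks.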
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 8
🧠A research paper reveals that generative AI systems deployed in 2025 have significantly higher environmental costs than previous AI generations, while current global regulations inadequately address these impacts. The authors propose mandatory model-level transparency, user opt-out rights, and international coordination to address environmental concerns in AI deployment.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠A research study evaluated how four major large language models (GPT-5.2, Claude 4.5 Sonnet, Gemini 3 Pro, and DeepSeek-R1) respond to patient preferences in clinical decision-making scenarios. While all models acknowledged patient values, they showed modest actual recommendation shifting with value sensitivity indices ranging from 0.13 to 0.27, revealing gaps in how AI systems incorporate patient preferences into medical recommendations.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers introduce Autorubric, an open-source Python framework that standardizes rubric-based evaluation of large language models (LLMs) for text generation assessment. The framework addresses scattered evaluation techniques by providing a unified solution with configurable criteria, multi-judge ensembles, bias mitigation, and reliability metrics across three evaluation benchmarks.
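The core idea of rubric-based evaluation with judge ensembles can be sketched in a few lines: each judge scores every criterion, criterion scores are averaged across judges and combined with rubric weights, and the per-criterion spread doubles as a crude reliability signal. This is a hypothetical illustration, not Autorubric's actual API:

```python
from statistics import mean, pstdev

def score_with_rubric(judge_scores, weights):
    """Aggregate per-judge criterion scores (1-5) into one weighted total.
    Returns the total plus (mean, spread) per criterion; a large spread
    flags criteria where the judge ensemble disagrees."""
    per_criterion = {}
    for crit in weights:
        votes = [j[crit] for j in judge_scores]
        per_criterion[crit] = (mean(votes), pstdev(votes))
    total = sum(weights[c] * per_criterion[c][0] for c in weights)
    total /= sum(weights.values())
    return total, per_criterion

weights = {"faithfulness": 0.5, "fluency": 0.2, "coverage": 0.3}
judges = [
    {"faithfulness": 4, "fluency": 5, "coverage": 3},
    {"faithfulness": 5, "fluency": 5, "coverage": 3},
    {"faithfulness": 3, "fluency": 4, "coverage": 4},
]
total, detail = score_with_rubric(judges, weights)
print(round(total, 3))  # weighted mean over criteria: 3.933
```

Multi-judge averaging like this is the standard way to dampen individual judge bias; the framework's additional bias-mitigation and reliability metrics build on the same structure.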
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers propose a graph-theoretic framework for securing multi-agent LLM systems by analyzing consensus in signed, directed interaction networks. The study addresses vulnerabilities in distributed AI architectures where hidden system prompts can act as 'topological Trojan horses' that destabilize cooperative consensus among AI agents.
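The destabilization effect can be demonstrated with the standard signed-consensus model: simulate x' = -Lx with the signed Laplacian L = D - A, where D_ii = sum_j |A_ij| (an Altafini-style formulation; the paper's exact model may differ). Flipping a single interaction sign from cooperative to antagonistic makes the cycle structurally unbalanced and destroys agreement:

```python
import numpy as np

def settle(A, x0, t=20.0, dt=0.01):
    """Euler-simulate the signed consensus dynamics x' = -L x."""
    L = np.diag(np.abs(A).sum(1)) - A
    x = np.array(x0, float)
    for _ in range(int(t / dt)):
        x = x - dt * (L @ x)
    return x

# Cooperative 4-agent ring: all interaction signs positive.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], float)

# One compromised link turns antagonistic (-1): the cycle becomes
# structurally unbalanced, which is enough to break consensus for everyone.
trojan = ring.copy()
trojan[0, 1] = trojan[1, 0] = -1

x0 = [1.0, -2.0, 3.0, 0.5]
good = settle(ring, x0)    # all agents converge to the average, 0.625
bad = settle(trojan, x0)   # every state collapses to 0: no useful agreement
```

This is the 'topological Trojan horse' intuition in miniature: a single hidden sign flip, invisible in the topology, changes the Laplacian spectrum and drags the whole network away from consensus.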
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 10
🧠A research paper proposes a 5E framework (ethical, epistemological, explainable, empirical, evaluative) for contesting Artificial Moral Agents (AMAs): AI systems with built-in moral reasoning capabilities. The framework defines spheres of ethical influence at the individual, local, societal, and global levels, along with a timeline for developers to anticipate or self-contest their AMA technologies.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8
🧠Researchers have introduced LitBench, a new benchmarking tool designed to develop and evaluate domain-specific large language models for literature-related tasks. The tool uses graph-centric data curation to generate domain-specific literature sub-graphs and creates training datasets, with results showing small domain-specific LLMs achieving competitive performance against state-of-the-art models like GPT-4o.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 6
🧠Researchers introduce Expert Divergence Learning, a new pre-training strategy for Mixture-of-Experts language models that prevents expert homogenization by encouraging functional specialization. The method uses domain labels to maximize routing distribution differences between data domains, achieving better performance on 15 billion parameter models with minimal computational overhead.
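The summary does not quote the paper's exact objective, but "maximize routing distribution differences between data domains" can be sketched as a regularizer: average each domain's per-token routing distribution, then reward pairwise divergence between the domain-level averages. The specific loss below (symmetrized KL) is a plausible stand-in, not the published formula:

```python
import numpy as np

def routing_divergence_loss(gate_logits, domain_ids, eps=1e-9):
    """Negative mean pairwise KL between per-domain average routing
    distributions: minimizing it pushes domains toward different experts."""
    probs = np.exp(gate_logits - gate_logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)   # softmax over experts per token
    domains = np.unique(domain_ids)
    means = np.stack([probs[domain_ids == d].mean(0) for d in domains])
    kl, pairs = 0.0, 0
    for i in range(len(domains)):
        for j in range(len(domains)):
            if i != j:
                kl += (means[i] * np.log((means[i] + eps) / (means[j] + eps))).sum()
                pairs += 1
    return -kl / max(pairs, 1)   # more divergence = lower loss

# Tokens from two domains routed over 4 experts.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 4))
ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Specialized routing (each domain prefers a different expert) scores
# lower than the unspecialized random logits.
specialized = logits.copy()
specialized[:4, 0] += 5.0
specialized[4:, 1] += 5.0
print(routing_divergence_loss(logits, ids) > routing_divergence_loss(specialized, ids))
```

Added to the usual load-balancing losses during pre-training, a term like this counteracts expert homogenization by making identical experts expensive whenever domain labels disagree.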
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers propose M3-AD, a new reflection-aware multimodal framework that improves industrial anomaly detection using large language models. The system includes RA-Monitor technology that enables AI models to self-correct unreliable decisions, outperforming existing open-source and commercial models in zero-shot anomaly detection tasks.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 7
🧠Researchers propose REMIND, a framework for medical multi-modal AI learning that addresses the challenge of missing data across multiple modalities. The solution uses a Mixture-of-Experts architecture to handle long-tail distributions of modality combinations and shows superior performance on real-world medical datasets.
AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠Researchers compared human survey responses from 420 Silicon Valley developers with synthetic data from five leading LLMs including ChatGPT, Claude, and Gemini. While AI models produced technically plausible results, they failed to capture counterintuitive insights and only replicated conventional wisdom rather than revealing novel findings.