🧠

AI

21,451 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

21451 articles

AIBullisharXiv – CS AI · Mar 36/104

🧠

AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

Researchers introduce AIssistant, an open-source framework that combines human expertise with AI agents to streamline scientific review and perspective paper creation in data science. The system uses 15 specialized LLM-driven agents across two workflows and demonstrates 65.7% time savings while maintaining research quality through strategic human oversight.

AIBullisharXiv – CS AI · Mar 36/105

🧠

Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision

Researchers have developed Re4, a multi-agent AI framework that uses three specialized LLMs (Consultant, Reviewer, and Programmer) working collaboratively to solve scientific computing problems. The system employs a rewriting-resolution-review-revision process that significantly improves bug-free code generation and reduces non-physical solutions in mathematical and scientific reasoning tasks.

$LINK

AIBullisharXiv – CS AI · Mar 36/104

🧠

A Message Passing Realization of Expected Free Energy Minimization

Researchers developed a message passing approach for Expected Free Energy minimization that transforms complex combinatorial search problems into tractable inference problems. The method enables more efficient AI agent planning and exploration under uncertainty, outperforming conventional approaches in test environments.

AIBullisharXiv – CS AI · Mar 36/103

🧠

See, Think, Act: Teaching Multimodal Agents to Effectively Interact with GUI by Identifying Toggles

Researchers have developed State-aware Reasoning (StaR), a new multimodal AI method that significantly improves AI agents' ability to interact with graphical user interfaces, particularly with toggle controls. The method enables agents to better perceive current states and execute instructions accordingly, improving toggle execution accuracy by over 30%.

AINeutralarXiv – CS AI · Mar 35/103

🧠

Behavioral Generative Agents for Energy Operations

Researchers developed behavioral generative agents powered by large language models to simulate consumer decision-making in energy operations. The study found these AI agents can model heterogeneous customer behavior and provide insights into rare events like blackouts, offering a scalable tool for energy policy analysis.

AIBullisharXiv – CS AI · Mar 36/103

🧠

Meta-Adaptive Prompt Distillation for Few-Shot Visual Question Answering

Researchers developed a meta-learning approach for Large Multimodal Models (LMMs) that uses distilled soft prompts to improve few-shot visual question answering performance. The method outperformed traditional in-context learning by 21.2% and parameter-efficient finetuning by 7.7% on VQA tasks.

AINeutralarXiv – CS AI · Mar 36/103

🧠

Beyond RLHF and NLHF: Population-Proportional Alignment under an Axiomatic Framework

Researchers have developed a new preference learning framework that addresses bias in AI alignment by ensuring policies reflect true population distributions rather than just majority opinions. The approach uses social choice theory principles and has been validated on both recommendation tasks and large language model alignment.

AIBullisharXiv – CS AI · Mar 36/104

🧠

FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems

Researchers have developed FAuNO, a new federated reinforcement learning framework that uses asynchronous processing to optimize task distribution in edge computing networks. The system employs an actor-critic architecture where local nodes learn specific dynamics while a central critic coordinates overall system performance, demonstrating superior results in reducing latency and task loss compared to existing methods.

AIBullisharXiv – CS AI · Mar 36/104

🧠

Reason Like a Radiologist: Chain-of-Thought and Reinforcement Learning for Verifiable Report Generation

Researchers introduce BoxMed-RL, a new AI framework that uses chain-of-thought reasoning and reinforcement learning to generate spatially verifiable radiology reports. The system mimics radiologist workflows by linking visual findings to precise anatomical locations, achieving 7% improvement over existing methods in key performance metrics.

$LINK

AINeutralarXiv – CS AI · Mar 36/103

🧠

The First Impression Problem: Internal Bias Triggers Overthinking in Reasoning Models

Researchers identified 'internal bias' as a key cause of overthinking in AI reasoning models, where models form preliminary guesses that conflict with systematic reasoning. The study found that excessive attention to input questions triggers redundant reasoning steps, and current mitigation methods have proven ineffective.

AIBullisharXiv – CS AI · Mar 36/104

🧠

Endowing Embodied Agents with Spatial Reasoning Capabilities for Vision-and-Language Navigation

Researchers introduce BrainNav, a bio-inspired navigation framework that mimics biological spatial cognition to enhance Vision-and-Language Navigation in mobile robots. The system addresses spatial hallucination issues when transferring from simulation to real-world environments, demonstrating superior performance in zero-shot real-world testing.

AINeutralarXiv – CS AI · Mar 35/104

🧠

SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents

Researchers introduced SimuHome, a high-fidelity smart home simulator and benchmark with 600 episodes for testing LLM-based smart home agents. The system uses the Matter protocol standard and enables time-accelerated simulation to evaluate how AI agents handle device control, environmental monitoring, and workflow scheduling in smart homes.

AIBearisharXiv – CS AI · Mar 36/104

🧠

SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs

Researchers introduced SimpleToM, a benchmark revealing that state-of-the-art language models can infer mental states but struggle to apply that knowledge for behavior prediction and judgment. The study exposes a critical gap between explicit Theory of Mind inference and implicit application in real-world scenarios.

AIBearisharXiv – CS AI · Mar 36/104

🧠

Who Gets Cited Most? Benchmarking Long-Context Numerical Reasoning on Scientific Articles

Researchers introduced SciTrek, a new benchmark for testing large language models' ability to perform numerical reasoning across long scientific documents. The benchmark reveals significant challenges for current LLMs, with the best model achieving only 46.5% accuracy at 128K tokens, and performance declining as context length increases.

$COMP

AIBullisharXiv – CS AI · Mar 36/103

🧠

Token-Importance Guided Direct Preference Optimization

Researchers propose Token-Importance Guided Direct Preference Optimization (TI-DPO), a new framework for aligning Large Language Models with human preferences. The method uses hybrid weighting mechanisms and triplet loss to achieve more accurate and robust AI alignment compared to existing Direct Preference Optimization approaches.

AIBullisharXiv – CS AI · Mar 36/104

🧠

A Contemporary Overview: Trends and Applications of Large Language Models on Mobile Devices

Large language models (LLMs) are increasingly being deployed on mobile devices, enabling applications like voice assistants, real-time translation, and intelligent recommendations. Advancements in hardware and 5G infrastructure allow for efficient local inference while improving data privacy and reducing cloud dependency.

AINeutralarXiv – CS AI · Mar 36/103

🧠

Theoretical Foundations of Superhypergraph and Plithogenic Graph Neural Networks

Researchers have developed theoretical foundations for SuperHyperGraph Neural Networks (SHGNNs) and Plithogenic Graph Neural Networks, extending traditional graph neural networks to handle complex hierarchical structures and multi-valued attributes. These advanced frameworks aim to better model uncertainty and higher-order interactions in complex networks beyond the capabilities of standard graph neural networks.

AIBullisharXiv – CS AI · Mar 35/104

🧠

Electric Vehicle User Charging Behavior Analysis Integrating Psychological and Environmental Factors: A Statistical-Driven LLM based Agent Approach

Researchers developed a novel framework using large language models (LLMs) to analyze electric vehicle taxi driver charging behavior by integrating psychological traits and environmental factors. The study demonstrates that LLMs can reliably simulate real-world charging decisions across multiple urban environments, providing insights for optimizing charging infrastructure and energy policy.

AINeutralarXiv – CS AI · Mar 36/103

🧠

LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations

A research study evaluated six state-of-the-art large language models in geopolitical crisis simulations, comparing their decision-making to human behavior. The study found that LLMs initially mirror human decisions but diverge over time, consistently exhibiting cooperative, stability-focused strategies with limited adversarial reasoning.

AIBullisharXiv – CS AI · Mar 36/103

🧠

FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding

FluxMem is a new training-free framework for streaming video understanding that uses hierarchical memory compression to reduce computational costs. The system achieves state-of-the-art performance on video benchmarks while reducing latency by 69.9% and GPU memory usage by 34.5%.

AIBullisharXiv – CS AI · Mar 36/104

🧠

LiftAvatar: Kinematic-Space Completion for Expression-Controlled 3D Gaussian Avatar Animation

LiftAvatar is a new AI system that enhances 3D avatar animation by completing sparse monocular video observations in kinematic space using expression-controlled video diffusion Transformers. The technology addresses limitations in 3D Gaussian Splatting-based avatars by generating high-quality, temporally coherent facial expressions from single or multiple reference images.

AIBullisharXiv – CS AI · Mar 36/103

🧠

Detection-Gated Glottal Segmentation with Zero-Shot Cross-Dataset Transfer and Clinical Feature Extraction

Researchers developed a detection-gated AI pipeline combining YOLOv8 and U-Net for accurate glottal segmentation in medical videoendoscopy. The system achieved state-of-the-art performance with zero-shot transfer capabilities across different clinical datasets, enabling real-time extraction of vocal function biomarkers at 35 frames per second.

AINeutralarXiv – CS AI · Mar 36/103

🧠

Scaling Retrieval Augmented Generation with RAG Fusion: Lessons from an Industry Deployment

Research on production RAG systems reveals that retrieval fusion techniques like multi-query retrieval and reciprocal rank fusion increase raw document recall but fail to improve end-to-end performance due to re-ranking limits and context constraints. The study found fusion variants actually decreased accuracy from 0.51 to 0.48 while adding latency overhead without corresponding benefits.

AIBullisharXiv – CS AI · Mar 36/104

🧠

Cognitive Prosthetic: An AI-Enabled Multimodal System for Episodic Recall in Knowledge Work

Researchers have developed the Cognitive Prosthetic Multimodal System (CPMS), an AI-enabled proof-of-concept that helps knowledge workers recall workplace experiences by capturing speech, physiological signals, and gaze behavior into queryable episodic memories. The system processes data locally for privacy and allows natural language queries to retrieve past workplace interactions based on semantic content, time, attention, or physiological state.

AIBullisharXiv – CS AI · Mar 36/104

🧠

"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

Researchers developed CLEO, an AI system that enables real-time collaborative context awareness between humans and AI agents by interpreting concurrent user actions on shared artifacts. A study with professional designers identified key interaction patterns and decision factors for when to delegate work to AI versus collaborate directly.

← PrevPage 552 of 859Next →