12,712 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers identify a critical architectural gap in leading AI agent frameworks (CoALA and JEPA), which lack an explicit Knowledge layer with distinct persistence semantics. The paper proposes a four-layer decomposition model with fundamentally different update mechanics for knowledge, memory, wisdom, and intelligence, with working implementations demonstrating feasibility.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers propose SGH (Structured Graph Harness), a framework that replaces iterative Agent Loops with explicit directed acyclic graphs (DAGs) for LLM agent execution. The approach addresses structural weaknesses in current agent design by enforcing immutable execution plans, separating planning from recovery, and implementing strict escalation protocols, trading some flexibility for improved controllability and verifiability.
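The paper's implementation is not shown here, but the core idea — an immutable DAG plan executed in dependency order, with recovery handled by a separate escalation path rather than in-loop re-planning — can be sketched roughly as follows. All names (`Plan`, `execute`, `escalate`) are illustrative assumptions, not SGH's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)   # immutable plan: nodes and edges are fixed before execution
class Plan:
    nodes: tuple
    edges: tuple          # (upstream, downstream) pairs

def topo_order(plan):
    """Topologically sort the plan so each step runs after its dependencies."""
    deps = {n: set() for n in plan.nodes}
    for u, v in plan.edges:
        deps[v].add(u)
    order, done = [], set()
    while len(order) < len(plan.nodes):
        ready = [n for n in plan.nodes if n not in done and deps[n] <= done]
        if not ready:
            raise ValueError("cycle detected: plan is not a DAG")
        for n in ready:
            order.append(n)
            done.add(n)
    return order

def execute(plan, actions, escalate):
    """Run each node once; on failure, escalate instead of re-planning in place."""
    results = {}
    for node in topo_order(plan):
        try:
            results[node] = actions[node](results)
        except Exception as exc:
            escalate(node, exc)   # recovery is an explicit, separate path
            break
    return results
```

Because the plan is frozen, a failed node cannot silently rewrite the graph mid-run — the trade the summary describes: less flexibility, more verifiability.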
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce an interactive workflow combining Sparse Autoencoders (SAE) and activation steering to make AI explainability actionable for practitioners. Through expert interviews built around debugging tasks on CLIP, the study finds that activation steering enables hypothesis testing and intervention-based debugging, though practitioners emphasize trust in observed model behavior over explanation plausibility and identify risks such as ripple effects and limited generalization.
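At its core, activation steering means shifting a layer's activations along a feature direction (here, one an SAE would supply) and observing how behavior changes. A minimal vector sketch, with all names hypothetical:

```python
def feature_activation(activations, direction):
    """Dot product: how strongly this (SAE-derived) feature currently fires."""
    return sum(a * d for a, d in zip(activations, direction))

def steer(activations, direction, alpha):
    """Shift activations along a feature direction by strength alpha.
    alpha > 0 amplifies the feature; alpha < 0 suppresses it."""
    return [a + alpha * d for a, d in zip(activations, direction)]
```

Intervening this way and re-running the model is what turns an explanation into a testable hypothesis — the "intervention-based debugging" the study describes.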
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers propose a reactor-model-of-computation approach using the Lingua Franca framework to address nondeterminism challenges in AI-powered human-in-the-loop cyber-physical systems. The study uses an agentic driving coach as a case study to demonstrate how foundation models like LLMs can be deployed in safety-critical applications while maintaining deterministic behavior despite unpredictable human and environmental variables.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers present OIDA, a framework that adds epistemic structure to organizational knowledge systems by tracking commitment strength, contradiction status, and gaps in understanding. The framework introduces a QUESTION primitive that surfaces organizational ignorance with increasing urgency, addressing a capability absent from current retrieval-augmented generation (RAG) systems.
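The epistemic structure described — commitment strength, contradiction status, and a QUESTION primitive whose urgency grows the longer a gap goes unanswered — might look something like the sketch below. The field names and the urgency schedule are assumptions for illustration, not OIDA's published schema.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    commitment: float                      # 0.0 (speculation) .. 1.0 (firm commitment)
    contradicted_by: list = field(default_factory=list)

@dataclass
class Question:
    text: str
    age_days: int = 0                      # how long the gap has gone unanswered

    def urgency(self):
        # urgency grows with unanswered age, capped at 1.0 (illustrative schedule)
        return min(1.0, 0.1 + 0.05 * self.age_days)

def surface_gaps(claims, questions, threshold=0.3):
    """Return open questions past the urgency threshold and contradicted claims."""
    urgent = [q for q in questions if q.urgency() >= threshold]
    disputed = [c for c in claims if c.contradicted_by]
    return urgent, disputed
```

The contrast with plain RAG is that the store does not just retrieve what is known — it actively surfaces what is unknown or contested.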
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A research paper examines the paradox where professionals collaborating with AI systems to enhance their capabilities risk accelerating automation of their own expertise. The analysis proposes frameworks for professionals to preserve and transform their value while codifying tacit knowledge, with implications for education and organizational policy.
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce MCERF, a multimodal retrieval framework that combines vision-language models with LLM reasoning to improve question-answering from engineering documents. The system achieves a 41.1% relative accuracy improvement over baseline RAG systems by handling complex multimodal content like tables, diagrams, and dense technical text through adaptive routing and hybrid retrieval strategies.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠SRBench introduces a comprehensive evaluation framework for Sequential Recommendation models that combines Large Language Models with traditional neural network approaches. The benchmark addresses critical gaps in existing evaluation methodologies by incorporating fairness, stability, and efficiency metrics alongside accuracy, while establishing fair comparison mechanisms between LLM-based and neural network-based recommendation systems.
🏢 Meta
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce AEG, a bare-metal runtime framework that enables high-performance machine learning inference on heterogeneous AI accelerators without OS overhead. The system achieves 9.2× higher compute efficiency and uses 11× fewer hardware tiles than Linux-based alternatives, demonstrating significant potential for edge AI deployment optimization.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠This academic paper proposes a neuro-symbolic approach for AGI robots combining neural networks with formal logic reasoning using Belnap's 4-valued logic system. The framework enables robots to handle unknown information, inconsistencies, and paradoxes while maintaining controlled security through axiom-based logic inference.
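Belnap's 4-valued logic is concrete enough to sketch directly. The standard encoding represents each value as a pair of evidence bits (told-true, told-false), which makes "both" (a contradiction) and "neither" (unknown) first-class values rather than errors — the property that lets a robot keep reasoning over inconsistent inputs. The paper's actual inference engine is not shown in the summary; this is only the underlying connective algebra.

```python
# Belnap's four truth values as (told-true, told-false) evidence pairs:
T, F = (True, False), (False, True)    # classically true / false
B, N = (True, True), (False, False)    # Both (contradiction) / Neither (unknown)

def NOT(a):
    return (a[1], a[0])                # swap the evidence for and against

def AND(a, b):
    return (a[0] and b[0], a[1] or b[1])

def OR(a, b):
    return (a[0] or b[0], a[1] and b[1])
```

Note that negation leaves B and N fixed: denying a contradiction is still a contradiction, and denying the unknown is still unknown.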
AI · Bullish · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers fine-tuned Qwen2.5-VL-32B, a leading open-source vision-language model, to improve its ability to autonomously perform web interactions through visual input alone. Using a two-stage training approach that addresses cursor localization, instruction sensitivity, and overconfidence bias, the model's success rate on single-click web tasks improved from 86% to 94%.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers present the first systematic study of performance-energy trade-offs in multi-request LLM inference workflows, using NVIDIA A100 GPUs and vLLM/Parrot serving systems. The study identifies batch size as the most impactful optimization lever, though effectiveness varies by workload type, and reveals that workflow-aware scheduling can reduce energy consumption under power constraints.
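The batch-size lever reduces to a simple relationship: energy per token is power divided by throughput, and both change with batch size. A sketch of the selection logic, using invented profile numbers (the paper's measured A100 values are not reproduced here):

```python
def energy_per_token(power_w, tokens_per_s):
    """Joules per generated token at a given operating point."""
    return power_w / tokens_per_s

def best_batch(profile, power_cap_w=None):
    """Pick the batch size minimizing J/token, optionally under a power cap.
    profile: {batch_size: (tokens_per_s, power_w)} from offline measurement."""
    feasible = {b: v for b, v in profile.items()
                if power_cap_w is None or v[1] <= power_cap_w}
    return min(feasible,
               key=lambda b: energy_per_token(feasible[b][1], feasible[b][0]))

# Illustrative numbers only: throughput usually grows faster than power draw
# as batch size increases, so larger batches tend to win on J/token -- until
# a power cap makes them infeasible, which is where scheduling matters.
PROFILE = {1: (50, 250), 8: (300, 320), 32: (900, 400)}
```

Under a power constraint, the minimizer shifts to a smaller batch — the workflow-aware scheduling effect the study reports.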
🏢 Nvidia
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A comprehensive study evaluates four state-of-the-art LLMs (GPT-4o, Claude Sonnet 4, Qwen3-235B, Kimi K2) for use as AI tutors in Nepal's K-10 curriculum, revealing significant pedagogical gaps despite high technical accuracy. The research identifies critical failure modes including inability to simplify complex concepts for young learners and poor cultural contextualization, concluding that current LLMs require human oversight and curriculum-specific fine-tuning before classroom deployment in low-resource regions.
🧠 GPT-4 · 🧠 Claude · 🧠 Sonnet
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers propose a comprehensive framework for making AI-generated educational assessments transparent, explainable, and certifiable through self-rationalization, attribution analysis, and post-hoc verification. The framework introduces a metadata schema and traffic-light certification workflow designed to meet institutional accreditation standards, with proof-of-concept testing on 500 computer science questions demonstrating improved transparency and reduced instructor workload.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers have developed a framework to assess how well existing explainable AI (XAI) methods comply with the EU AI Act's transparency requirements. The study bridges the gap between current XAI techniques and regulatory mandates by proposing a scoring system that translates expert qualitative assessments into quantitative compliance metrics, helping practitioners navigate AI regulation in European markets.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A research study presents a readiness framework and practical deployment strategy for AI-based anomaly detection in multi-provider healthcare environments. The research combines organizational assessment criteria with machine learning performance evaluation, demonstrating that hybrid rule-based and isolation forest approaches optimize both detection coverage and alert efficiency in cross-provider EHR systems.
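A hybrid detector of this shape combines hard policy rules with a statistical anomaly score. In the sketch below a z-score stands in for the isolation forest (to stay dependency-free); the rule conditions and field names are invented examples, not the paper's actual criteria.

```python
import statistics

def rule_flags(event):
    """Hard rules: deterministic policy violations in an EHR access event."""
    flags = []
    if event["records_accessed"] > 100:
        flags.append("bulk-access")
    if event["after_hours"] and event["role"] != "on-call":
        flags.append("off-hours")
    return flags

def outlier_flag(value, history, threshold=3.0):
    """Statistical check (stand-in here for an isolation forest score):
    flag values more than `threshold` standard deviations from the mean."""
    mu = statistics.mean(history)
    sd = statistics.pstdev(history) or 1.0
    return abs(value - mu) / sd > threshold

def hybrid_alert(event, history):
    """Alert if any hard rule fires OR the access volume is anomalous."""
    return bool(rule_flags(event)) or outlier_flag(event["records_accessed"], history)
```

The rules guarantee coverage of known violations; the statistical layer catches volume anomalies the rules never anticipated — the coverage/alert-efficiency balance the study optimizes.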
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A qualitative study of 30+ industry interviews reveals that agentic AI adoption in engineering and manufacturing is progressing cautiously, with near-term value concentrated in structured, repetitive tasks and data synthesis. Adoption barriers stem primarily from fragmented data infrastructures, legacy system integration challenges, and organizational gaps rather than model capability limitations, requiring robust verification frameworks and human-in-the-loop governance before higher-order automation can scale.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠George Mason University's UNIV 182 course demonstrates that AI literacy education can achieve both technical depth and broad accessibility without prerequisites. The course uses a five-part pedagogical framework including structured problem-solving pipelines, ethics integration, peer critique sessions, cumulative portfolios, and AI tutoring agents to guide non-technical undergraduates from conceptual understanding to building functional AI systems.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers demonstrate that deliberative alignment—a method for improving LLM safety by distilling reasoning from stronger models—still allows unsafe behaviors from base models to persist despite learning safer reasoning patterns. They propose a Best-of-N sampling technique that reduces attack success rates by 28-35% across multiple benchmarks while maintaining utility.
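Best-of-N sampling as a safety filter is straightforward to sketch: draw several candidate responses and keep the one a safety scorer rates highest. The function names and the stub scorer are assumptions; the paper's scoring model and benchmarks are not reproduced here.

```python
def best_of_n(prompt, generate, safety_score, n=8):
    """Sample n candidate responses for a prompt and return the one with
    the highest safety score (ties go to the earliest candidate)."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=safety_score)
```

The appeal is that it needs no retraining: even if the distilled policy still produces unsafe continuations with some probability, selecting over N samples drives down the chance that the *returned* response is unsafe.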
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A new benchmark study (RAGSearch) evaluates whether agentic search systems can reduce the need for expensive GraphRAG pipelines by dynamically retrieving information across multiple rounds. Results show agentic search significantly improves standard RAG performance and narrows the gap to GraphRAG, though GraphRAG retains advantages for complex multi-hop reasoning tasks when preprocessing costs are considered.
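"Dynamically retrieving information across multiple rounds" amounts to a loop in which the model either answers from accumulated evidence or emits a follow-up query. A minimal control-flow sketch, with all names and the decision protocol assumed for illustration:

```python
def agentic_search(question, search, answer, max_rounds=3):
    """Iterative retrieval: each round runs a query, accumulates evidence,
    and lets the model either answer or request another search.
    `answer` returns ("answer", text) or ("search", follow_up_query)."""
    evidence, query = [], question
    for _ in range(max_rounds):
        evidence.extend(search(query))
        kind, payload = answer(question, evidence)
        if kind == "answer":
            return payload
        query = payload                      # refine and search again
    return answer(question, evidence)[1]     # best effort once budget is spent
```

The GraphRAG comparison then comes down to cost accounting: this loop pays per-query at inference time, while GraphRAG pays up front in graph construction — hence the trade-off the benchmark quantifies.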
🏢 Meta
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers discovered that large language models exhibit working memory limitations similar to humans, encoding multiple memory items in entangled representations that require interference control rather than direct retrieval. This finding reveals a shared computational constraint between biological and artificial systems, suggesting that working memory capacity may be a fundamental bottleneck in intelligent systems rather than a limitation unique to biological brains.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers present a theoretical framework comparing entropy control methods in reinforcement learning for LLMs, showing that covariance-based regularization outperforms traditional entropy regularization by avoiding policy bias and achieving asymptotic unbiasedness. This analysis addresses a critical scaling challenge in RL-based LLM training where rapid policy entropy collapse limits model performance.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠ConfigSpec introduces a profiling-based framework for optimizing distributed LLM inference across edge-cloud systems using speculative decoding. The research reveals that no single configuration can simultaneously optimize throughput, cost efficiency, and energy efficiency—requiring dynamic, device-aware configuration selection rather than fixed deployments.
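The "no single configuration wins on every axis" finding implies per-objective (or weighted) selection over profiled deployments. A sketch with invented profile numbers — the shape of the decision, not ConfigSpec's measured data:

```python
def pick_config(profiles, objective, weights=None):
    """Choose a speculative-decoding deployment per objective.
    profiles: {name: {"throughput": tok/s, "cost": $/1k tok, "energy": J/tok}}"""
    if objective == "throughput":
        return max(profiles, key=lambda c: profiles[c]["throughput"])
    if objective in ("cost", "energy"):
        return min(profiles, key=lambda c: profiles[c][objective])
    # mixed goal: reward throughput, penalize cost and energy
    w = weights or {"throughput": 1.0, "cost": 1.0, "energy": 1.0}
    def score(c):
        p = profiles[c]
        return (w["throughput"] * p["throughput"]
                - w["cost"] * p["cost"] - w["energy"] * p["energy"])
    return max(profiles, key=score)

# Hypothetical profiles: each column has a different winner, so the right
# choice depends on the objective -- motivating dynamic, device-aware selection.
PROFILES = {"edge-draft": {"throughput": 40, "cost": 0.2, "energy": 0.5},
            "cloud-big":  {"throughput": 120, "cost": 1.5, "energy": 2.0},
            "hybrid":     {"throughput": 90, "cost": 0.8, "energy": 0.9}}
```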
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A-IO addresses critical memory-bound bottlenecks in LLM deployment on NPU platforms like Ascend 910B by tackling the 'Model Scaling Paradox' and limitations of current speculative decoding techniques. The research reveals that static single-model deployment strategies and kernel synchronization overhead significantly constrain inference performance on heterogeneous accelerators.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A comprehensive review examines explainable AI methods for human activity recognition (HAR) systems across wearable, ambient, and physiological sensors. The paper addresses the critical gap between deep learning's performance improvements and the opacity that limits real-world deployment, proposing a unified framework for understanding XAI mechanisms in HAR applications.