AI × Crypto News Feed

Real-time AI-curated news from 34,840+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

UTS at PsyDefDetect: Multi-Agent Councils and Absence-Based Reasoning for Defense Mechanism Classification

Researchers from UTS achieved second place in a psychological defense mechanism classification competition using a multi-agent AI system that identifies defense patterns through absence-based reasoning rather than presence detection. The system combines Gemini 2.5 agents with fine-tuned Qwen models to achieve an F1 score of 0.406, addressing critical biases in minority class prediction through structured ensemble methods.

🧠 Gemini

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Probing Routing-Conditional Calibration in Attention-Residual Transformers

Researchers question whether routing traces in Attention-Residual transformers provide genuine evidence of improved post-hoc calibration beyond standard confidence metrics. Through rigorous statistical testing with matched controls, the study finds that routing-specific features offer minimal stable evidence of better calibration, suggesting previous claims of calibration improvements may reflect methodological artifacts rather than true model improvements.

AI · Bullish · arXiv – CS AI · 16h ago · 6/10

AI-Care: A Conversational Agentic System for Task Coordination in Alzheimer's Disease Care

AI-Care is a conversational AI system designed to help individuals with Alzheimer's disease and related dementia manage daily tasks through natural language interaction, reducing cognitive barriers to using digital tools. The system prioritizes safety through caregiver-verified records and controlled clarification flows, with preliminary pilot testing showing positive user trust and task completion outcomes.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

PDEAgent-Bench: A Multi-Metric, Multi-Library Benchmark for PDE Solver Generation

Researchers introduced PDEAgent-Bench, the first comprehensive benchmark for evaluating AI systems that generate numerical solvers from partial differential equations (PDEs). The benchmark contains 645 test cases across multiple PDE families and finite-element libraries, revealing that while current LLMs can produce runnable code, they substantially fail when accuracy and efficiency requirements are enforced.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

TIDE-Bench: Task-Aware and Diagnostic Evaluation of Tool-Integrated Reasoning

Researchers introduce TIDE-Bench, a comprehensive evaluation benchmark for tool-integrated reasoning (TIR) that assesses how well large language models leverage external tools. The benchmark addresses critical gaps in existing evaluations by combining traditional tasks with novel experimental design and interactive scenarios, measuring not just accuracy but also tool efficiency and inference cost.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Absurd World: A Simple Yet Powerful Method to Absurdify the Real-world for Probing LLM Reasoning Capabilities

Researchers introduce Absurd World, a benchmarking framework that tests large language models' logical reasoning by creating logically coherent but unrealistic scenarios derived from real-world problems. The framework reveals whether LLMs can reason independently of learned patterns by breaking down real-world models into symbols, actions, sequences, and events, then systematically altering them while preserving underlying logic.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

CodeClinic: Evaluating Automation of Coding Skills for Clinical Reasoning Agents

CodeClinic introduces a benchmark for evaluating whether large language model agents can autonomously generate clinical skills rather than relying on pre-built tool libraries. The research demonstrates that an offline autoformalization pipeline converting clinical guidelines into Python libraries improves consistency and reduces token usage by 40% compared to zero-shot code generation.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Done, But Not Sure: Disentangling World Completion from Self-Termination in Embodied Agents

Researchers introduce VIGIL, an evaluation framework that separately measures whether embodied AI agents correctly complete tasks and properly report success, rather than conflating execution failures with commitment failures. Testing across 20 models reveals significant performance gaps in terminal commitment despite similar task execution, highlighting a critical blind spot in current AI agent benchmarking.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

A Reconfigurable Multiplier Architecture for Error-Resilient Applications in RISC-V Core

Researchers have developed a reconfigurable multiplier architecture for RISC-V processors that dynamically adjusts between exact and approximate computation modes to optimize energy efficiency in neural network inference. The design achieves 44-68% power reduction depending on mode while maintaining computational performance, with demonstrated energy consumption of 1.21 pJ/instruction for matrix multiplication operations.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

PrepBench: How Far Are We from Natural-Language-Driven Data Preparation?

Researchers introduce PrepBench, a new benchmark for evaluating how well large language models can handle natural language-driven data preparation tasks. The benchmark reveals that despite recent LLM advances, current models still struggle significantly with translating user intent into executable data preparation workflows, particularly when handling ambiguous requirements and complex real-world datasets.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

MoPO: Incorporating Motion Prior for Occluded Human Mesh Recovery

Researchers introduce MoPO, a novel method for recovering human mesh models from occluded images by leveraging motion prediction from pose sequences. The approach combines spatial-temporal occlusion detection with lightweight motion prediction to estimate hidden body parts, achieving state-of-the-art results on occlusion benchmarks while reducing temporal inconsistencies.

AI · Neutral · arXiv – CS AI · 16h ago · 5/10

Trajectory Supervision for Continual Tool-Use Learning in LLMs

Researchers demonstrate that preserving API request/response trajectories during continual learning significantly improves tool-use performance in language models. When Llama 3.1 8B is fine-tuned on sequential API domains, trajectory supervision achieves 56.9% accuracy versus 39.2% without intermediate context, at the cost of a 25.1% increase in tokens.

🧠 Llama

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Core-Halo Decomposition: Decentralizing Large-Scale Fixed-Point Problems

Researchers propose Core-Halo decomposition, a novel approach to solving large-scale fixed-point problems in decentralized systems that separates write ownership from read-only evaluation context. Unlike standard strict decomposition methods that create structural bias by truncating dependencies, Core-Halo aligns with block-dependence structures to enable faithful implementation of the original fixed-point problem across distributed multi-agent systems while maintaining parallelism benefits.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Unpredictability dissociates from structured control in language agents

Researchers demonstrate that unpredictability in language agents does not equate to effective control, finding that structured decision-making mechanisms significantly outperform stochastic sampling across 74,352 test cases. The study challenges assumptions about randomness and control in AI systems, with implications for agent reliability and interpretability.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?

Researchers introduced a benchmark testing whether vision-language model (VLM) agents can recognize themselves in mirrors, a cognitive capability that emerges only in some animal species. Results show self-identification through reflection occurs mainly in stronger VLMs, while weaker models fail to extract self-relevant information despite viewing their reflections, revealing that language-based self-reference alone does not guarantee grounded self-understanding.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Value-Decomposed Reinforcement Learning Framework for Taxiway Routing with Hierarchical Conflict-Aware Observations

Researchers present CaTR, a reinforcement learning framework that optimizes real-time taxiway routing and conflict avoidance for multiple aircraft at airports. The system uses hierarchical traffic representation and value-decomposed learning to balance safety and efficiency, demonstrating superior performance compared to traditional planning and optimization methods while maintaining practical computational speed.

AI · Neutral · arXiv – CS AI · 16h ago · 5/10

Functional Stable Model Semantics and Answer Set Programming Modulo Theories

Researchers demonstrate how functional stable model semantics enhances Answer Set Programming Modulo Theories (ASPMT), enabling integration of intensional functions that derive values from other predicates rather than pre-defined sources. The framework allows tight ASPMT programs to translate into SMT instances, extending the theoretical foundations of logic programming.

AI · Neutral · arXiv – CS AI · 16h ago · 5/10

Weighted Rules under the Stable Model Semantics

Researchers introduce weighted rules under stable model semantics, combining logic programming with probabilistic methods similar to Markov Logic Networks. This advancement enables answer set programs to handle inconsistencies, rank solutions, assign probabilities, and perform statistical inference—moving beyond the deterministic limitations of traditional logic-based systems.

AI · Neutral · arXiv – CS AI · 16h ago · 5/10

Cplus2ASP: Computing Action Language C+ in Answer Set Programming

Cplus2ASP Version 2 is a new system that translates action language C+ into answer set programming, offering significant performance improvements over the Causal Calculator through modern ASP solving techniques. The tool supports incremental execution, external atoms via Lua integration, and extensible translations for other action languages, making it relevant for automated reasoning and planning applications.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

WindINR: Latent-State INR for Fast Local Wind Query and Correction in Complex Terrain

WindINR is a machine learning framework that enables fast, localized wind forecasting in complex terrain by using implicit neural representations to query wind conditions at specific user-defined locations rather than generating dense grid-based forecasts. The system achieves 2.6x speedup in corrections by updating only a compact latent state instead of retraining full networks, making it practical for real-time wind estimation applications.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Don't Click That: Teaching Web Agents to Resist Deceptive Interfaces

Researchers introduce DUDE, a framework that teaches AI web agents to resist deceptive interface elements through hybrid-reward learning and experience summarization. On the accompanying RUC benchmark, the framework reduces susceptibility to deception by 53.8% while preserving task performance, addressing a critical vulnerability in autonomous GUI interaction systems.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Strategic commitments shape collective cybersecurity under AI inequality

Researchers present a game-theoretic model showing that unequal access to AI-powered cybersecurity tools creates persistent vulnerabilities, with weak defenders unable to afford strong protection. They propose that targeted subsidies for committed defenders adopting advanced AI defenses significantly improve overall system resilience and suppress attacks more effectively than commitment alone.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

EvoPref: Multi-Objective Evolutionary Optimization Discovers Diverse LLM Alignments Beyond Gradient Descent

Researchers introduce EvoPref, a multi-objective evolutionary algorithm that optimizes LLM alignment across multiple objectives using population-based methods rather than traditional gradient descent. The approach demonstrates 18% improvement in preference coverage and 47% reduction in preference collapse while maintaining competitive alignment quality compared to gradient-based methods like ORPO.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

PPU-Bench: Real World Benchmark for Personalized Partial Unlearning in Vision Language Models

Researchers introduce PPU-Bench, a benchmark for testing personalized partial unlearning in multimodal AI models, addressing the challenge of selectively removing sensitive memorized information while preserving model utility. The study reveals significant trade-offs between forgetting target knowledge and retaining non-target facts, proposing Boundary-Aware Optimization as a solution for fine-grained factual control.

AI · Neutral · arXiv – CS AI · 16h ago · 6/10

Deterministic Decomposition of Stochastic Generative Dynamics

Researchers propose Bridge Matching, a novel framework that decomposes stochastic generative model dynamics into deterministic transport and diffusion-induced osmotic effects. This decomposition enables more interpretable and controllable generative sampling by separately parameterizing how probability mass moves versus how stochastic fluctuations affect the process.