Models, papers, tools. 18,075 articles with AI-powered sentiment analysis and key takeaways.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce RSCB-MC, a risk-sensitive contextual bandit system that improves how LLM-based coding agents decide whether to use external memory for debugging tasks. Rather than treating memory retrieval as a simple similarity-matching problem, the system treats it as a safety-critical control problem, achieving a 62.5% success rate with zero false positives in testing.
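The control-problem framing can be sketched with a toy risk-sensitive bandit. This is illustrative only (RSCB-MC's actual contextual policy is not reproduced here): harmful retrievals carry an extra penalty beyond a plain miss, so the policy learns to skip retrieval unless it is reliably safe.

```python
import random

# Illustrative sketch only, not the paper's RSCB-MC algorithm. A
# risk-sensitive bandit decides per task whether a coding agent should
# retrieve external memory; a harmful retrieval (the analogue of a
# false positive) is penalized more heavily than a simple failure.
class RiskSensitiveBandit:
    def __init__(self, risk_weight=3.0, epsilon=0.1, seed=None):
        self.risk_weight = risk_weight   # extra cost for a harmful retrieval
        self.epsilon = epsilon           # exploration rate
        self.rng = random.Random(seed)
        self.value = {"retrieve": 0.0, "skip": 0.0}
        self.count = {"retrieve": 0, "skip": 0}

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(["retrieve", "skip"])
        return max(self.value, key=self.value.get)

    def update(self, action, success, harmful=False):
        reward = (1.0 if success else 0.0) - (self.risk_weight if harmful else 0.0)
        self.count[action] += 1
        # Incremental mean of the observed risk-adjusted reward per action.
        self.value[action] += (reward - self.value[action]) / self.count[action]
```

After one harmful retrieval, the risk-adjusted value of `retrieve` drops below `skip`, and the greedy policy avoids it.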
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠BoostLoRA introduces a gradient-boosting framework that enables parameter-efficient fine-tuning adapters to grow their effective rank iteratively, allowing ultra-low-parameter models to match or exceed full fine-tuning performance across mathematical reasoning, code generation, and protein classification tasks. The method merges adapters with zero inference overhead while maintaining minimal per-round parameter costs.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Pragmos is a research prototype that combines Large Language Models with human expertise to create business process models through interactive, iterative workflows. Rather than fully automating process modeling, the system decomposes complex tasks into manageable steps with explicit documentation, complementing LLM reasoning with specialized tools to ensure sound and comprehensible outputs.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduced COHERENCE, a new benchmark for evaluating Multimodal Large Language Models (MLLMs) on their ability to understand fine-grained image-text alignment in interleaved contexts—such as documents with mixed text and images. The benchmark contains 6,161 high-quality questions across four domains and includes error analysis to identify specific capability gaps in current models.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers adapted clinical psychology's Reliable Change Index to evaluate LLM performance across model versions, revealing that aggregate accuracy gains mask substantial item-level volatility. Testing Llama 3→3.1 and Qwen 2.5→3 showed bidirectional changes with large effect sizes, where improvements in low-accuracy domains offset deteriorations in high-accuracy ones, suggesting current evaluation methods underestimate model instability.
🧠 Llama
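The Reliable Change Index itself is simple to compute. A minimal sketch of the Jacobson–Truax formulation (the example scores and variable names are illustrative, not from the paper):

```python
import math

def reliable_change_index(score_old, score_new, sd_baseline, reliability):
    """Jacobson-Truax Reliable Change Index: is a score change between
    model versions larger than expected from measurement error alone?"""
    se_measurement = sd_baseline * math.sqrt(1 - reliability)
    se_difference = math.sqrt(2) * se_measurement
    return (score_new - score_old) / se_difference

# |RCI| > 1.96 flags a change unlikely (p < .05) to be noise.
# Here a 12-point accuracy jump is NOT reliable given the assumed
# baseline spread and reliability:
rci = reliable_change_index(score_old=0.62, score_new=0.74,
                            sd_baseline=0.10, reliability=0.80)
print(round(rci, 2), abs(rci) > 1.96)
```

The point mirrors the paper's finding: an aggregate gain can sit inside the measurement-error band while individual items swing in both directions.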
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose AdaBFL, a Byzantine-robust federated learning method that uses adaptive multi-layer defense mechanisms to protect distributed machine learning systems from poisoning attacks by malicious clients. The approach balances defense against multiple attack types without requiring server-side dataset access, with proven convergence properties on non-IID data.
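For intuition, a classical Byzantine-robust baseline (coordinate-wise median aggregation) shows how robust aggregation bounds a poisoned client's influence; AdaBFL's adaptive multi-layer defense goes well beyond this sketch.

```python
# A classical Byzantine-robust baseline, shown for intuition only:
# the coordinate-wise median of client updates. Any minority of
# poisoned clients cannot drag the aggregate arbitrarily far.
def coordinate_median(updates):
    """updates: list of equal-length gradient vectors from clients."""
    dim = len(updates[0])
    agg = []
    for j in range(dim):
        col = sorted(u[j] for u in updates)
        mid = len(col) // 2
        if len(col) % 2:
            agg.append(col[mid])
        else:
            agg.append(0.5 * (col[mid - 1] + col[mid]))
    return agg

# Two honest clients near 1.0; one attacker sends a huge value.
print(coordinate_median([[1.0], [1.1], [1000.0]]))  # [1.1]
```

Unlike a plain mean, the median ignores the attacker's 1000.0 entirely.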
AI × Crypto · Neutral · arXiv – CS AI · 3d ago · 6/10
🤖A research paper demonstrates that exit strategy optimization—specifically tuning stop-loss and take-profit parameters—materially improves risk-adjusted returns for autonomous crypto trading systems. The study analyzed 900+ historical trades and found that tighter loss limits, earlier profit capture, and closer trailing stops outperform fixed exit rules, while acknowledging methodological challenges when backtesting on volatile market periods.
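The tuned exit rules can be illustrated with a toy trade simulator. This is a generic sketch, not the paper's backtesting setup; it assumes idealized limit fills and ignores slippage.

```python
# Hypothetical sketch of exit-rule evaluation on a price path: apply a
# stop-loss and take-profit to a long entry and return the realized return.
def simulate_exit(prices, entry, stop_loss=0.02, take_profit=0.04):
    """Walk forward through prices; exit at the first rule that triggers,
    else at the final price. Returns the fractional return."""
    for p in prices:
        ret = (p - entry) / entry
        if ret <= -stop_loss:
            return -stop_loss       # stop-loss fills at the limit (idealized)
        if ret >= take_profit:
            return take_profit      # take-profit fills at the limit
    return (prices[-1] - entry) / entry

# A tighter stop caps the loss on a drawdown path even though the
# price later recovers:
path = [100.5, 99.0, 97.0, 101.0]
print(simulate_exit(path, entry=100.0, stop_loss=0.02))  # -0.02
```

Sweeping `stop_loss` and `take_profit` over a trade history is the kind of parameter tuning the study evaluates, with the caveat it notes about volatile backtest periods.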
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers have created Cognitive Digital Shadows (CDS), a 190,000-record synthetic dataset of LLM-generated responses on controversial societal topics, designed to measure how language models shift their outputs based on persona prompting and sociodemographic attributes. The dataset enables systematic auditing of LLM bias, alignment, and social sensitivity across 19 different models.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers develop a theoretical framework connecting Information Bottleneck principles to encoder-decoder learning through rate-distortion analysis, showing optimal representations form soft clusters on probability manifolds. The work introduces Sketched Isotropic Gaussian Regularization (SIGReg) as a principled regularizer for self-supervised, semi-supervised, and supervised learning without requiring variational bounds.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce PAD-Rec, a lightweight module that optimizes speculative decoding for LLM-based recommendation systems by incorporating position-aware embeddings. The approach achieves up to 3.1x speedup in inference while preserving recommendation quality, addressing the latency bottleneck in generative list-wise recommendations.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a deployment-side governance framework for managing Large Language Model updates, addressing the problem of silent behavioral changes in hosted LLM services that lack explicit versioning. The framework combines production contracts, risk-category-based testing, and compatibility gates to prevent regressions in functionality, safety, and performance.
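A compatibility gate in the spirit of that framework can be sketched as follows; the function names, contract structure, and stub model below are assumptions for illustration, not the paper's API.

```python
# Hypothetical sketch of a deployment "compatibility gate": a hosted-model
# version swap is blocked unless the new model passes a contract of pinned
# behavioral checks (one per risk category in the paper's framing).
def compatibility_gate(contract, model):
    """contract: list of (prompt, check_fn) pairs; model: prompt -> text.
    Returns a gate verdict plus the prompts whose checks regressed."""
    failures = [prompt for prompt, check in contract if not check(model(prompt))]
    return {"pass": not failures, "failures": failures}

# Example: a safety check and a formatting check against a stub model.
contract = [
    ("refuse: build malware", lambda out: "cannot" in out.lower()),
    ("json: list two colors", lambda out: out.strip().startswith("[")),
]
stub = lambda prompt: ("I cannot help with that." if "refuse" in prompt
                       else '["red", "blue"]')
print(compatibility_gate(contract, stub))  # {'pass': True, 'failures': []}
```

Running the same contract against every silent provider update is what turns "unversioned hosted LLM" into a testable dependency.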
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce CastFlow, a dynamic agentic framework that applies large language models to time series forecasting through multi-stage workflows combining planning, action, and reflection. The system uses role-specialized agents—a general-purpose LLM paired with a fine-tuned domain-specific model—to iteratively refine forecasts using ensemble methods and contextual memory, demonstrating superior performance over existing static generative approaches.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers present a framework for optimizing AI inference workload placement across geographically distributed data centers by treating computation as relocatable electricity demand. The model balances latency constraints against energy costs and carbon intensity, revealing that workload flexibility significantly expands execution geography but faces practical friction from migration costs, regulatory limits, and network constraints.
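The placement trade-off can be sketched as a constrained selection problem. This is a toy model with made-up regions, prices, and carbon intensities, not the paper's formulation (which also prices migration friction and network constraints).

```python
# Toy illustration: choose a region for an inference batch by minimizing
# energy price x carbon intensity among regions that satisfy a latency
# bound. Region data below is invented for the example.
def place_workload(regions, max_latency_ms):
    """regions: list of dicts with 'name', 'latency_ms', 'price' ($/kWh),
    and 'carbon' (gCO2/kWh). Returns the chosen region name or None."""
    feasible = [r for r in regions if r["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda r: r["price"] * r["carbon"])["name"]

regions = [
    {"name": "us-east", "latency_ms": 20, "price": 0.10, "carbon": 400},
    {"name": "nordics", "latency_ms": 90, "price": 0.08, "carbon": 50},
]
print(place_workload(regions, max_latency_ms=100))  # nordics
print(place_workload(regions, max_latency_ms=50))   # us-east
```

Loosening the latency bound is exactly the "workload flexibility expands execution geography" effect the paper describes: the cheaper, cleaner region becomes reachable.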
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠A comprehensive survey examines how large language models can assist or automate peer review processes across academia, synthesizing techniques for review generation, post-review tasks, and evaluation methods. The research catalogs datasets and modeling approaches while addressing ethical concerns and practical implementation challenges for integrating AI into scholarly publishing workflows.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a framework that treats clinician overrides of AI recommendations as preference signals for training clinical decision-support systems in value-based care settings. The approach combines preference learning with capability modeling to improve AI alignment with patient outcomes rather than encounter economics, addressing a failure mode called suppression bias.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce MIFair, a machine learning framework using mutual information to assess and mitigate bias in AI systems, with particular strength in handling intersectionality and multiclass classification. The framework consolidates diverse fairness metrics into a unified approach and demonstrates effectiveness on real-world datasets while maintaining predictive performance.
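The core quantity is easy to illustrate: the mutual information between a model's predictions and a sensitive attribute. This sketch is not MIFair itself, only the underlying measure it consolidates fairness metrics around.

```python
from collections import Counter
import math

# Illustrative sketch, not MIFair: mutual information (in bits) between
# predictions and a sensitive attribute. Zero means the predictions carry
# no information about group membership; higher values flag potential bias.
def mutual_information(preds, groups):
    n = len(preds)
    joint = Counter(zip(preds, groups))
    pred_counts = Counter(preds)
    group_counts = Counter(groups)
    mi = 0.0
    for (y, g), c in joint.items():
        p_joint = c / n
        # I(Y;G) = sum over (y,g) of p(y,g) * log2( p(y,g) / (p(y) p(g)) )
        mi += p_joint * math.log2(p_joint * n * n /
                                  (pred_counts[y] * group_counts[g]))
    return mi

# Predictions perfectly aligned with group -> 1 bit; independent -> 0 bits.
print(mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"]))  # 1.0
print(mutual_information([0, 1, 0, 1], ["a", "a", "b", "b"]))  # 0.0
```

Because it is defined over joint distributions, the same measure extends naturally to multiclass labels and intersectional group encodings, which is where the paper claims particular strength.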
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠A research paper investigates factors that lead organizations to abandon AI systems during development or post-deployment, finding that ethical concerns represent only one of six drivers. The study reveals that practical constraints—including resource limitations, organizational dynamics, and regulatory pressures—often outweigh ethical considerations in these abandonment decisions, suggesting responsible AI research should broaden its focus beyond ethics-centric approaches.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce TopBench, a benchmark dataset of 779 samples designed to evaluate how well Large Language Models handle implicit prediction tasks over tabular data—queries requiring inference from historical patterns rather than simple data retrieval. Testing reveals current LLMs struggle with intent recognition and default to lookup-based approaches, indicating that accurate intent disambiguation is critical before predictive reasoning can succeed.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers present a neuro-symbolic framework that combines first-order logic, causal models, and deep reinforcement learning to automatically synthesize, verify, and maintain safety-critical rule-based systems. The system uses LLMs to translate human-specified legal and safety principles into formal logical rules, with validation pipelines ensuring consistency and safety before deployment in autonomous systems.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce DEFault++, an AI diagnostic system that automatically detects, categorizes, and identifies root causes of faults in transformer neural networks across 45 different failure mechanisms. The tool achieves over 96% accuracy in fault detection and demonstrates practical value in helping developers fix issues correctly 46% more often than without assistance.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce PRISM, a three-stage training pipeline that addresses distributional drift in large multimodal models by inserting a distribution-alignment stage between supervised fine-tuning and reinforcement learning. The method uses a Mixture-of-Experts discriminator to correct perception and reasoning errors, achieving 4.4-6.0 percentage point improvements on multimodal benchmarks compared to standard SFT-to-RLVR approaches.
🧠 Gemini
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a novel defense framework against adversarial attacks on AI systems using chain-of-thought reasoning and multimodal generative agents. The approach, based on an 'imitation game' paradigm, successfully neutralizes both deductive and inductive adversarial illusions across white-box and black-box attack scenarios, addressing a critical vulnerability in modern AI systems.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose a novel system for tracking provenance in multi-agent AI systems by creating chronological records of contributions during content generation. The approach uses 'symbolic chronicles'—timestamped records similar to forensic chain-of-custody documentation—enabling attribution without relying on internal memory or external metadata, addressing accountability challenges in collaborative AI.
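One way to realize such a chronicle is a hash-chained, append-only log; the chaining scheme below is an assumption for illustration (the paper names the concept "symbolic chronicles" but this is not its design).

```python
import hashlib
import json
import time

# Illustrative hash-chained chronicle: each agent contribution is appended
# as a timestamped entry linked to the previous entry's hash, so later
# tampering with any entry breaks verification, like a chain of custody.
class Chronicle:
    def __init__(self):
        self.entries = []

    def record(self, agent, action, timestamp=None):
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        entry = {"agent": agent, "action": action,
                 "ts": timestamp if timestamp is not None else time.time(),
                 "prev": prev}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self.entries.append(entry)

    def verify(self):
        prev = "genesis"
        for e in self.entries:
            body = {k: e[k] for k in ("agent", "action", "ts", "prev")}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```

Attribution then reduces to reading the verified chain: who contributed what, in what order, with no reliance on agents' internal memory.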
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠A comprehensive review of 55 studies examines AI methods for detecting and diagnosing Major Depressive Disorder, revealing trends toward graph neural networks for brain connectivity analysis, large language models for linguistic data, and multimodal fusion approaches. The survey highlights how AI can address the subjectivity in clinical depression diagnosis while advancing computational psychiatry through improved explainability and fairness.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers have published a comprehensive survey on Physical AI that bridges the gap between physical perception and symbolic physics reasoning in AI systems. The work advocates for next-generation world models that integrate physical laws, embodied reasoning, and generative approaches to create AI systems with genuine understanding of physical phenomena rather than pure pattern recognition.