AI · Bullish · arXiv – CS AI · Jun 10 · 7/10
🧠Researchers propose Dropout-GRPO, a method that addresses a fundamental limitation in training latent-reasoning language models by introducing structured stochasticity through dropout masks. The technique enables Group Relative Policy Optimization to work effectively with continuous hidden states rather than discrete tokens, improving performance on mathematical reasoning tasks.
AI · Bullish · arXiv – CS AI · Jun 5 · 7/10
🧠Researchers propose Agentic Monte Carlo (AMC), a novel method for optimizing black-box LLM agents without API access by using Sequential Monte Carlo sampling to steer agents toward optimal behavior. The technique bridges the gap between reinforcement learning and Bayesian inference, demonstrating competitive performance against RL baselines while maintaining the black-box model architecture.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers introduce BaLoRA, a Bayesian extension of Low-Rank Adaptation that improves fine-tuning of large AI models by adding uncertainty quantification while narrowing the accuracy gap with full fine-tuning. The method uses input-adaptive parameterization with minimal computational overhead and demonstrates stronger performance across language, vision, and materials science tasks.
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠Researchers introduce a novel training strategy for neural posterior estimation that decouples representation learning from posterior modeling, enabling amortized inference on large observation sets by training only on pairs of examples. The approach dramatically reduces computational requirements while maintaining or improving performance across diverse benchmarks, making scalable Bayesian inference practical for real-world applications.
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠Researchers propose a novel uncertainty quantification method for Prior-Data Fitted Networks (PFNs), emerging foundation models for tabular data prediction, using martingale posteriors to provide calibrated confidence estimates. The technique is tuning-free, computationally efficient, and mathematically proven to converge, addressing a significant limitation in PFNs' practical applicability.
AI · Bullish · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers demonstrate that variational Bayesian methods significantly improve Vision Language Models' reliability for Visual Question Answering tasks by enabling selective prediction with reduced hallucinations and overconfidence. The proposed Variational VQA approach shows particular strength at low error tolerances and offers a practical path to making large multimodal models safer without proportional computational costs.
AI · Bullish · arXiv – CS AI · Apr 13 · 7/10
🧠Researchers introduce a hybrid framework combining probabilistic models with large language models to improve social reasoning in AI agents, achieving a 67% win rate against human players in the game Avalon—a breakthrough in AI's ability to infer beliefs and intentions from incomplete information.
AI · Bullish · arXiv – CS AI · Mar 11 · 7/10
🧠Researchers have developed Variational Mixture-of-Experts Routing (VMoER), a Bayesian framework that enables uncertainty quantification in large-scale AI models while adding less than 1% computational overhead. The method improves routing stability by 38%, reduces calibration error by 94%, and increases out-of-distribution detection by 12%.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers have identified flaws in existing test-time guidance methods for diffusion models that prevent proper Bayesian posterior sampling. They propose new estimators that enable calibrated inference, significantly outperforming previous methods on Bayesian tasks and matching state-of-the-art results in black hole image reconstruction.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers establish theoretical connections between Random Network Distillation (RND), deep ensembles, and Bayesian inference for uncertainty quantification in deep learning models. The study proves that RND's uncertainty signals are equivalent to deep ensemble predictive variance and can mirror Bayesian posterior distributions, providing a unified theoretical framework for efficient uncertainty quantification methods.
AI · Neutral · arXiv – CS AI · Jun 10 · 6/10
🧠Researchers demonstrate that latent diffusion models (LDMs) can efficiently parameterize subsurface geological models for data assimilation, but reveal a critical trade-off: ensemble Kalman methods preserve geological realism poorly while Monte Carlo sampling methods achieve better uncertainty quantification at higher computational cost, with fast surrogate models enabling practical implementation.
AI · Neutral · arXiv – CS AI · Jun 10 · 5/10
🧠Researchers present a novel stochastic filtering methodology called factored conditional filters for tracking states and estimating parameters in high-dimensional systems. The approach decomposes complex state spaces into lower-dimensional subspaces, enabling efficient computation while maintaining approximation accuracy. Applications include epidemic tracking and parameter estimation in large contact networks.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠A research paper presents quantitative approaches to Promise Theory applied to autonomous agent systems, integrating Bayesian probability and Active Inference frameworks. The work explores how Promise Theory can address computational coordination challenges and enable agent alignment at scale, with applications across software, machine learning, biology, and engineering domains.
AI · Neutral · arXiv – CS AI · Jun 9 · 5/10
🧠Researchers propose Bayesian Selective Latent Inference (BSLI), a machine learning method that uses wastewater surveillance data to monitor influenza spread in communities before clinical cases are reported. The system intelligently decides whether additional data sources are needed or if abstention is appropriate, improving disease monitoring accuracy while managing data acquisition costs.
AI · Bullish · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers introduce HARP (Hierarchical Active Region Pruning), a novel training-efficient method for selecting optimal data when finetuning large language models. The approach reduces computational costs by 7x while maintaining or improving model performance by using hierarchical organization and Bayesian inference to evaluate representative subsets rather than exhaustively training on all data.
AI · Neutral · arXiv – CS AI · Jun 9 · 6/10
🧠Researchers present a cost-aware method for optimizing speculative execution in LLM-agent workflows, addressing the challenge of reducing idle time while managing per-token billing costs. The approach combines five design decisions—including predictive execution, dual-rate pricing, Bayesian probability estimation, and a configurable latency-cost tradeoff—with safeguards ensuring only side-effect-free operations proceed speculatively.
AI · Neutral · arXiv – CS AI · Jun 5 · 6/10
🧠Researchers present the 2-Step Agent framework to model how decision makers learn from ML-based decision support systems. The study reveals that even when ML models are well-specified and agents behave rationally, misaligned prior beliefs can cause ML-DS to produce worse outcomes than no support at all, highlighting critical risks in deploying AI for high-stakes decisions.
AI · Neutral · arXiv – CS AI · Jun 5 · 6/10
🧠Researchers propose a Bayesian hierarchical model with embedding-space clustering to correct fundamental flaws in LLM benchmarking methodology. The approach addresses two critical issues, insufficient evaluation samples and non-independent test prompts, and reduces the mean absolute error of performance metrics by 4-73%, which is particularly relevant for adversarial robustness evaluation.
AI · Neutral · arXiv – CS AI · Jun 2 · 5/10
🧠This academic article examines the historical evolution of probability theory as a reflection of changing human rationality, tracing its development from games of chance to modern Bayesian inference. It argues that contemporary scientific reasoning requires integrating probability with fuzzy logic and deep learning to address uncertainty, vagueness, and inference beyond what probability alone can formalize.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers developed a Bayesian machine learning framework to model malaria dynamics in Ghana using health facility data from 2014-2023, achieving 99.58% accuracy in capturing non-linear, age-specific disease patterns. The model forecasts a gradual resurgence in malaria cases through 2026, with projections ranging from 137,000-149,000 cases in children under five and 348,000-375,000 in older populations, enabling data-driven public health decision-making.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce CASSM, a Bayesian framework that combines Kalman filtering with model selection to improve neural dynamics modeling on modern datasets. The method addresses computational complexity and uncertainty calibration challenges, offering competitive performance with deep networks while maintaining better uncertainty quantification, particularly for datasets with fewer trials than recorded neurons.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers compared how human children and large language models approach inductive reasoning tasks under uncertainty, finding both similarities and critical differences in their information-seeking strategies. While LLMs replicate children's adaptive responses to environmental structure, they exhibit distinct biases toward over-observation and instruction compliance, suggesting fundamentally different underlying computational principles govern their decision-making.
AI · Bullish · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose a Bayesian stopping strategy that reduces LLM inference costs by up to 50% while maintaining answer accuracy. The method samples multiple LLM responses and stops once sufficient consistency is detected, using an efficient L-aggregated policy that tracks only the top 3 answer frequencies and achieves theoretical optimality.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose Bayesian Non-Negative Reward Model (BNRM), a framework that addresses reward hacking vulnerabilities in reinforcement learning from human feedback (RLHF) systems used to align large language models. The approach combines non-negative factor analysis with preference modeling to create more robust, interpretable reward systems resistant to biases and distribution shifts.
AI · Neutral · arXiv – CS AI · May 28 · 6/10
🧠Researchers propose Sequential Bayesian Belief Tracking (SBBT), a framework for estimating the reliability of long reasoning chains in large language models before final answers are known. The study finds that probability calibration and ranking performance respond differently to various evidence types: scalar scores improve calibration metrics, while structural observations are needed for ranking tasks.