AI · Bullish · arXiv – CS AI · 2d ago · 7/10
🧠Researchers propose EELMA, an algorithm that uses information-theoretic empowerment to evaluate language model agents at scale without manual benchmarking. The method measures an agent's ability to influence future states through its actions and demonstrates strong correlation with task performance across text-based, web, and tool-use environments.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers propose a self-captioning workflow with a Multimodal Interaction Gate to improve vision language models by amplifying redundant information between vision and text modalities. The approach addresses hallucination and robustness issues by converting unique modal interactions into shared redundancies, reducing visual-induced errors by 38.3% and improving consistency by 16.8%.
AI · Bearish · arXiv – CS AI · May 9 · 7/10
🧠Researchers propose a unified dynamical systems model of human-AI co-evolution, showing that increased reliance on LLMs creates feedback loops between human cognition, data quality, and model capability. The analysis identifies three regimes including a 'degenerative convergence' where over-reliance on AI leads to reduced diversity and an information bottleneck, suggesting AI trajectory depends as much on human behavioral dynamics as on model design.
AI · Bearish · arXiv – CS AI · May 9 · 7/10
🧠Researchers have identified a critical architectural flaw in large vision-language models: attention mechanisms are largely redundant and misallocate computational resources, with random attention weights performing comparably to learned ones. This finding challenges fundamental assumptions about Transformer design and suggests current LVLMs inefficiently process visual information despite their scale.
AI · Neutral · arXiv – CS AI · May 7 · 7/10
🧠Researchers identify the 'Reasoning Trap,' a fundamental information-theoretic limitation where multi-agent language model debates preserve answer accuracy while degrading reasoning quality. The study introduces the Supported Faithfulness Score metric and Evidence-Grounded Socratic Reasoning framework, demonstrating that closed-system reasoning protocols following standard multi-agent debate structures inevitably lose information fidelity according to the Data Processing Inequality.
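The Data Processing Inequality cited in this summary can be illustrated with a toy Markov chain X → Y → Z. The sketch below is not the paper's setup: the binary symmetric channels, their flip probabilities, and the helper names (`mutual_information`, `bsc`, `push`) are all assumptions chosen for illustration. It only shows that a second processing step cannot increase the information Z carries about X.

```python
import math

def mutual_information(joint):
    """Mutual information I(A;B) in bits, given joint[a][b] = P(A=a, B=b)."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (pa[a] * pb[b]))
    return mi

def bsc(flip):
    """Binary symmetric channel as a matrix channel[input][output]."""
    return [[1 - flip, flip], [flip, 1 - flip]]

def push(joint, channel):
    """Joint of (A, C) from the joint of (A, B) and a channel P(C | B).

    Valid when C depends on A only through B (the Markov assumption).
    """
    n_out = len(channel[0])
    return [
        [sum(row[b] * channel[b][c] for b in range(len(row))) for c in range(n_out)]
        for row in joint
    ]

# Hypothetical chain: uniform bit X, then two noisy processing steps.
prior = [0.5, 0.5]
first_step = bsc(0.1)   # assumed noise level of step 1
second_step = bsc(0.2)  # assumed noise level of step 2

joint_xy = [[prior[x] * first_step[x][y] for y in range(2)] for x in range(2)]
joint_xz = push(joint_xy, second_step)

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
print(f"I(X;Y) = {i_xy:.3f} bits, I(X;Z) = {i_xz:.3f} bits")
```

In this toy chain the second step can only discard information about X, which is the sense in which a closed sequence of processing steps cannot recover what an earlier step lost.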
AI · Neutral · arXiv – CS AI · Apr 10 · 7/10
🧠Researchers introduce the Informational Buildup Framework (IBF), a new approach to continual learning that eliminates catastrophic forgetting by treating information as structural alignment rather than stored parameters. The framework demonstrates superior performance across multiple domains including chess and image classification, achieving near-zero forgetting without requiring raw data replay.
AI · Bearish · arXiv – CS AI · Apr 7 · 7/10
🧠Researchers prove a fundamental theoretical limit in AI safety verification using Kolmogorov complexity theory. They demonstrate that no finite formal verifier can certify all policy-compliant AI instances of arbitrarily high complexity, revealing intrinsic information-theoretic barriers beyond computational constraints.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠This research review examines methodologies for addressing AI systems' challenges with limited training data through uncertainty quantification and synthetic data augmentation. The paper presents formal approaches including Bayesian learning frameworks, information-theoretic bounds, and conformal prediction methods to improve AI performance in data-scarce environments like robotics and healthcare.
AI · Bullish · arXiv – CS AI · Mar 16 · 7/10
🧠Researchers introduce a novel optimization framework that integrates the Minimum Description Length (MDL) principle directly into deep neural network training dynamics. The method uses geometrically-grounded cognitive manifolds with coupled Ricci flow to create autonomous model simplification while maintaining data fidelity, with theoretical guarantees for convergence and practical O(N log N) complexity.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers developed an information-theoretic framework to measure when multi-agent AI systems exhibit coordinated behavior that goes beyond what the individual agents show on their own. The study found that specific prompt designs can turn collections of AI agents into coordinated collectives that mirror principles of human group intelligence.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 6
🧠Researchers identified a fundamental limitation in multimodal LLMs where decoders trained on text cannot effectively utilize non-text information like speaker identity or visual textures, despite this information being preserved through all model layers. The study demonstrates this 'modality collapse' is due to decoder design rather than encoding failures, with experiments showing targeted training can improve specific modality accessibility.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers provide the first rigorous theoretical analysis of temperature scaling, a widely-used technique for controlling uncertainty in machine learning models. The study reveals that while temperature scaling reliably increases entropy in classifiers, it does not necessarily increase diversity in large language models as commonly claimed, and establishes temperature scaling as the unique linear calibration method that preserves hard predictions.
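The two classifier-side claims in this summary (higher temperature raises entropy, and the top prediction is unchanged) can be checked on a toy example. This is a minimal sketch with hypothetical logits and helper names (`softmax`, `entropy`), not code from the paper, and it says nothing about the LLM diversity result.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.1]  # hypothetical classifier logits
temperatures = (0.5, 1.0, 2.0, 4.0)

results = {}
for t in temperatures:
    probs = softmax(logits, t)
    # Store the entropy and the index of the hard (argmax) prediction.
    results[t] = (entropy(probs), probs.index(max(probs)))
    print(f"T={t}: entropy={results[t][0]:.3f} nats, argmax={results[t][1]}")
```

Dividing all logits by the same positive constant preserves their ordering, so the argmax stays fixed while the distribution flattens as the temperature grows.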
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠Researchers introduce InfoNoise, an adaptive noise scheduling method for diffusion model training that dynamically reallocates computational resources toward the most informative denoising levels. By estimating conditional-entropy-rate profiles during training, the approach matches or exceeds fixed schedules on image benchmarks while achieving up to 3x computational efficiency gains on diverse tasks including DNA and language generation.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose a novel emergent communication framework for 6G agentic AI networks that enables autonomous agents to learn their own communication protocols while accounting for physical networking constraints. The framework applies information-theoretic principles to quantify trade-offs between task-relevant information and computational complexity, with experimental validation showing improved generalization performance.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers discover that neural networks across different modalities (vision, point clouds, language) converge toward shared representations, with non-language modalities systematically moving toward language's neighborhood structure rather than vice versa. Using directional analysis, they attribute this asymmetry to language representations occupying more compact feature space, proposing that language serves as the asymptotic attractor in multimodal representation learning.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers present Neural Information Causality (Neural-IC), a theoretical framework that formalizes how neural network representations function as communication channels under query-separated computation. The work establishes operational bounds on information leakage through bottlenecks and demonstrates that quantum advantages in specific architectures depend on fair query-conditioned access rather than total information capacity.
🏢 Meta
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose Information Density as a quantitative framework for optimizing IoT sensor networks by enabling virtual sensing through AI. Using spatial, temporal, and cross-modal correlations, the system can replace physical sensors with computational models while maintaining sub-4% error margins, demonstrated via Madrid's smart city infrastructure.
AI · Bullish · arXiv – CS AI · May 9 · 6/10
🧠Researchers introduced BALAR, a Bayesian algorithm that enables large language models to engage in structured multi-turn dialogue by actively reasoning about missing information and strategically asking clarifying questions. The system demonstrated significant performance improvements across three diverse benchmarks—14.6% to 38.5% higher accuracy—without requiring fine-tuning, suggesting a more principled approach to interactive AI reasoning.
AI · Neutral · arXiv – CS AI · May 9 · 6/10
🧠SANEmerg is a new multi-agent emergent communication framework designed to optimize networking in AI-native systems by enabling autonomous agents to develop task-specific communication protocols. The framework addresses bandwidth and computational constraints through intelligent message prioritization and complexity regularization, demonstrating significant performance improvements over existing solutions.
AI · Bullish · arXiv – CS AI · May 9 · 6/10
🧠Researchers propose WARDEN, an information-theoretic adversarial training framework that improves Large Language Model robustness against prompt attacks by dynamically reweighting adversarial examples using f-divergence principles. The method achieves comparable computational efficiency to existing approaches while substantially reducing attack success rates, advancing the scalability of AI safety mechanisms.
AI · Neutral · arXiv – CS AI · May 7 · 6/10
🧠Researchers reveal that large language models develop distinct hierarchical processing stages (Local, Intermediate, Global) determined by architecture family rather than model size. Using information theory, they demonstrate that Llama and Qwen models show dramatically different brittleness patterns across layers, with architectural design — not scaling — as the primary driver of model behavior.
🧠 Llama
AI · Neutral · arXiv – CS AI · May 1 · 6/10
🧠Researchers develop a theoretical framework connecting Information Bottleneck principles to encoder-decoder learning through rate-distortion analysis, showing optimal representations form soft clusters on probability manifolds. The work introduces Sketched Isotropic Gaussian Regularization (SIGReg) as a principled regularizer for self-supervised, semi-supervised, and supervised learning without requiring variational bounds.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers develop a new information-theoretic framework that handles heavy-tailed data distributions, addressing limitations in classical generalization bounds used in machine learning. The work applies specifically to reinforcement learning from human feedback (RLHF) and stochastic gradient optimization, where traditional KL-divergence tools fail due to non-existent moment generating functions.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers reveal that unified multimodal models (UMMs) combining language and vision capabilities fail to achieve genuine synergy, exhibiting divergent information patterns that undermine reasoning transfer to image synthesis. An information-theoretic framework analyzing ten models shows pseudo-unification stems from asymmetric encoding and conflicting response patterns, with only models implementing contextual prediction achieving stronger text-to-image reasoning.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers introduce R-EMID, an information-theoretic metric to diagnose how distribution shifts degrade role-playing model performance in real-world deployments. The framework reveals that user shifts pose the greatest generalization risk, while co-evolving reinforcement learning provides the most effective mitigation strategy.