AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers demonstrate that physics-informed machine learning can predict fluid flows in industrial stirred tanks with significantly less training data than purely data-driven approaches. The study reveals diminishing returns in accuracy beyond moderate dataset sizes, with physics-based constraints proving most valuable in low-data regimes.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers developed PSP-HDC, a graph-structured hyperdimensional computing framework for predicting material properties in 3D microstructure fabrication with sparse, heterogeneous data. The approach achieves 91% accuracy while providing inherent explainability, a critical advantage over conventional machine learning models that struggle with limited datasets and poor generalization.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠TopoPrune introduces a topology-based framework for data pruning that addresses instability issues in geometric methods by leveraging intrinsic data structure rather than extrinsic geometry. The approach combines manifold approximation with persistent homology to achieve high accuracy at extreme pruning rates (90%) while maintaining robustness across architectures and noise conditions.
AI · Neutral · arXiv – CS AI · Apr 20 · 6/10
🧠Researchers introduced Distribution Shift Alignment (DSA), a novel fine-tuning method that enables large language models to more accurately simulate human survey responses by learning distribution patterns rather than memorizing training data. DSA outperforms existing methods across five public datasets and reduces required real-world data by 53-69%, offering significant cost savings for large-scale survey research.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6
🧠Researchers developed VisNec, a framework that identifies which training samples truly require visual reasoning for multimodal AI instruction tuning. The method achieves equivalent performance using only 15% of training data by filtering out visually redundant samples, potentially making multimodal AI training more efficient.
AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14
🧠Researchers propose a data-efficient framework to convert generative Multimodal Large Language Models into universal embedding models without extensive pre-training. The method uses hierarchical embedding prompts and Self-aware Hard Negative Sampling to achieve competitive performance on embedding benchmarks using minimal training data.
AI · Bullish · arXiv – CS AI · Mar 2 · 7/10 · 14
🧠Researchers propose MetaAPO, a new framework for aligning large language models with human preferences that dynamically balances online and offline training data. The method uses a meta-learner to evaluate when on-policy sampling is beneficial, resulting in better performance while reducing online annotation costs by 42%.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers introduced NoRD (No Reasoning for Driving), a Vision-Language-Action model for autonomous driving that achieves competitive performance using 60% less training data and no reasoning annotations. The model incorporates the Dr. GRPO algorithm to overcome difficulty bias in reinforcement learning, and it performs well on the Waymo and NAVSIM benchmarks.
AI · Bullish · arXiv – CS AI · Apr 6 · 5/10
🧠Researchers propose a new framework that uses Large Language Models for causal graph discovery and requires only a linear number of queries rather than a quadratic one, making it more efficient for larger datasets. The method uses breadth-first search and can incorporate observational data, achieving state-of-the-art results on real-world causal graphs.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
🧠Researchers propose a dispatcher/executor principle for multi-task Reinforcement Learning that partitions controllers into task-understanding and device-specific components connected by a regularized communication channel. This structural approach aims to improve generalization and data efficiency as an alternative to simply scaling large neural networks with vast datasets.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 9
🧠Researchers propose a new framework called Operator Learning with Domain Decomposition to solve partial differential equations (PDEs) on arbitrary geometries using neural operators. The approach addresses data efficiency and geometry generalization challenges by breaking complex domains into smaller subdomains that can be solved locally and then combined into global solutions.