y0news

#data-efficiency News & Analysis

36 articles tagged with #data-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Accelerated and data-efficient flow prediction in stirred tanks via physics-informed learning

Researchers demonstrate that physics-informed machine learning can predict fluid flows in industrial stirred tanks with significantly less training data than purely data-driven approaches. The study reveals diminishing returns in accuracy beyond moderate dataset sizes, with physics-based constraints proving most valuable in low-data regimes.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Graph-Structured Hyperdimensional Computing for Data-Efficient and Explainable Process-Structure-Property Prediction

Researchers developed PSP-HDC, a graph-structured hyperdimensional computing framework for predicting material properties in 3D microstructure fabrication from sparse, heterogeneous data. The approach achieves 91% accuracy while providing inherent explainability, an advantage over conventional machine learning models, which struggle with limited datasets and generalize poorly.

AI · Neutral · arXiv – CS AI · May 11 · 6/10

TopoPrune: Robust Data Pruning via Unified Latent Space Topology

TopoPrune introduces a topology-based framework for data pruning that addresses instability issues in geometric methods by leveraging intrinsic data structure rather than extrinsic geometry. The approach combines manifold approximation with persistent homology to achieve high accuracy at extreme pruning rates (90%) while maintaining robustness across architectures and noise conditions.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

Distribution Shift Alignment Helps LLMs Simulate Survey Response Distributions

Researchers introduced Distribution Shift Alignment (DSA), a novel fine-tuning method that enables large language models to more accurately simulate human survey responses by learning distribution patterns rather than memorizing training data. DSA outperforms existing methods across five public datasets and reduces required real-world data by 53-69%, offering significant cost savings for large-scale survey research.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 6

VisNec: Measuring and Leveraging Visual Necessity for Multimodal Instruction Tuning

Researchers developed VisNec, a framework that identifies which training samples truly require visual reasoning for multimodal AI instruction tuning. The method achieves equivalent performance using only 15% of training data by filtering out visually redundant samples, potentially making multimodal AI training more efficient.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 14

From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model

Researchers propose a data-efficient framework to convert generative Multimodal Large Language Models into universal embedding models without extensive pre-training. The method uses hierarchical embedding prompts and Self-aware Hard Negative Sampling to achieve competitive performance on embedding benchmarks using minimal training data.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5

NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning

Researchers introduced NoRD (No Reasoning for Driving), a Vision-Language-Action model for autonomous driving that achieves competitive performance using 60% less training data and no reasoning annotations. The model incorporates the Dr. GRPO algorithm to overcome difficulty bias in reinforcement learning and is evaluated on the Waymo and NAVSIM benchmarks.

AI · Bullish · arXiv – CS AI · Apr 6 · 5/10

Efficient Causal Graph Discovery Using Large Language Models

Researchers propose a framework that uses Large Language Models for causal graph discovery and needs a number of queries that grows linearly, rather than quadratically, with the number of variables, which makes it practical for larger causal graphs. The method explores the graph with breadth-first search, can incorporate observational data when available, and achieves state-of-the-art results on real-world causal graphs.
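
To make the linear-versus-quadratic point concrete, here is a minimal Python sketch of a breadth-first expansion in which each variable is expanded with a single query. It is an illustration under stated assumptions, not the authors' implementation: `find_roots` and `find_children` are hypothetical stand-ins for LLM prompts, the mock oracle answers from a hard-coded toy graph, and cycle checks and the use of observational data are omitted.

```python
from collections import deque
from typing import Callable, Dict, List, Sequence, Tuple

# Toy ground-truth graph, used only by the mock oracle below (an assumption
# for illustration; a real system would send prompts to an LLM instead).
TRUTH: Dict[str, List[str]] = {
    "rain": ["wet_road"],
    "wet_road": ["accident"],
    "accident": ["traffic_jam"],
    "traffic_jam": [],
}


def mock_find_roots(variables: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for 'which variables have no causes?'."""
    has_parent = {child for children in TRUTH.values() for child in children}
    return [v for v in variables if v not in has_parent]


def mock_find_children(node: str, variables: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for 'which variables does `node` cause?'."""
    return TRUTH[node]


def bfs_causal_discovery(
    variables: Sequence[str],
    find_roots: Callable[[Sequence[str]], List[str]],
    find_children: Callable[[str, Sequence[str]], List[str]],
) -> Tuple[Dict[str, List[str]], int]:
    """Build an adjacency list with one expansion query per visited variable."""
    graph: Dict[str, List[str]] = {v: [] for v in variables}
    roots = find_roots(variables)
    n_queries = 1  # the single root-finding query
    queue = deque(roots)
    visited = set(roots)
    while queue:
        node = queue.popleft()
        n_queries += 1  # one query expands this node
        for child in find_children(node, variables):
            if child == node or child in graph[node]:
                continue  # skip self-loops and duplicate edges
            graph[node].append(child)
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return graph, n_queries


variables = list(TRUTH)
graph, n_queries = bfs_causal_discovery(variables, mock_find_roots, mock_find_children)
pairwise_queries = len(variables) * (len(variables) - 1) // 2
print(graph)
print(f"BFS queries: {n_queries}, pairwise queries: {pairwise_queries}")
```

In this sketch the query count is at most n + 1 for n variables, whereas asking about every unordered pair takes n(n - 1)/2 queries, so the gap widens quickly as the graph grows.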

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6

Less is more -- the Dispatcher/Executor principle for multi-task Reinforcement Learning

Researchers propose a dispatcher/executor principle for multi-task Reinforcement Learning that partitions controllers into task-understanding and device-specific components connected by a regularized communication channel. This structural approach aims to improve generalization and data efficiency as an alternative to simply scaling large neural networks with vast datasets.

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 9

Operator Learning with Domain Decomposition for Geometry Generalization in PDE Solving

Researchers propose a new framework called Operator Learning with Domain Decomposition to solve partial differential equations (PDEs) on arbitrary geometries using neural operators. The approach addresses data efficiency and geometry generalization challenges by breaking complex domains into smaller subdomains that can be solved locally and then combined into global solutions.
