y0news

#computational-efficiency News & Analysis

133 articles tagged with #computational-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

Researchers introduce AdaptVision, a new Vision-Language Model that reduces computational overhead by adaptively determining the minimum visual tokens needed per sample. The model uses a coarse-to-fine approach with reinforcement learning to balance accuracy and efficiency, achieving superior performance while consuming fewer visual tokens than existing methods.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Contribution-aware Token Compression for Efficient Video Understanding via Reinforcement Learning

Researchers developed CaCoVID, a reinforcement learning-based algorithm that compresses video tokens for large language models by selecting tokens based on their actual contribution to correct predictions rather than attention scores. The method uses combinatorial policy optimization to reduce computational overhead while maintaining video understanding performance.
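The contribution-based selection idea can be shown in a few lines. This is an illustrative simplification with made-up names and scores: CaCoVID learns which tokens to keep via reinforcement learning, whereas here the per-token contribution scores are assumed to be given.

```python
import numpy as np

def select_tokens_by_contribution(contributions, budget):
    """Keep the `budget` tokens whose removal would hurt the
    prediction most; `contributions` holds one score per token."""
    contributions = np.asarray(contributions)
    keep = np.argsort(contributions)[-budget:]   # indices of top-`budget` scores
    return np.sort(keep)                         # restore temporal order

# Toy example: 8 video tokens, keep the 3 most informative.
scores = [0.05, 0.90, 0.10, 0.70, 0.02, 0.40, 0.85, 0.01]
print(select_tokens_by_contribution(scores, 3))  # -> [1 3 6]
```

The key contrast with attention-based pruning is the scoring signal: here a token's score reflects how much the final prediction degrades without it, not how much attention it receives.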

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding

Researchers have developed EDT-Former, an Entropy-guided Dynamic Token Transformer that improves how Large Language Models understand molecular graphs. The system achieves state-of-the-art results on molecular understanding benchmarks while being computationally efficient by avoiding costly LLM backbone fine-tuning.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Learning from Complexity: Exploring Dynamic Sample Pruning of Spatio-Temporal Training

Researchers have developed ST-Prune, a dynamic sample pruning technique that accelerates training of deep learning models for spatio-temporal forecasting by intelligently selecting the most informative data samples. The method significantly improves training efficiency while maintaining or enhancing model performance on real-world datasets from transportation, climate science, and urban planning domains.

AI · Bullish · arXiv – CS AI · Mar 2 · 7/10

ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference

Researchers propose ODAR-Expert, an adaptive routing framework for large language models that optimizes accuracy-efficiency trade-offs by dynamically routing queries between fast and slow processing agents. The system achieved 98.2% accuracy on MATH benchmarks while reducing computational costs by 82%, suggesting that optimal AI scaling requires adaptive resource allocation rather than simply increasing test-time compute.
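At its core, fast/slow routing is a dispatch rule keyed on estimated difficulty. A minimal sketch, assuming a hypothetical `difficulty_estimator` returning a score in [0, 1]; ODAR's actual router is derived from active inference, not a fixed threshold:

```python
def route(query, difficulty_estimator, fast_model, slow_model, threshold=0.5):
    """Send easy queries to the cheap model, hard ones to the
    expensive reasoner."""
    if difficulty_estimator(query) < threshold:
        return fast_model(query)
    return slow_model(query)

# Toy stand-ins: word count as a crude difficulty proxy.
difficulty = lambda q: min(len(q.split()) / 20, 1.0)
fast = lambda q: f"fast:{q}"
slow = lambda q: f"slow:{q}"

print(route("2+2?", difficulty, fast, slow))  # short query -> fast path
```

The efficiency win comes from the query mix: if most traffic is easy, the expensive model runs only on the minority of queries that need it.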

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Quant Experts: Token-aware Adaptive Error Reconstruction with Mixture of Experts for Large Vision-Language Models Quantization

Researchers introduce Quant Experts (QE), a new post-training quantization technique for Vision-Language Models that uses adaptive error compensation with mixture-of-experts architecture. The method addresses computational and memory overhead issues by intelligently handling token-dependent and token-independent channels, maintaining performance comparable to full-precision models across 2B to 70B parameter scales.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10

Memory Caching: RNNs with Growing Memory

Researchers introduce Memory Caching (MC), a technique that enhances recurrent neural networks by allowing their memory capacity to grow with sequence length, bridging the gap between fixed-memory RNNs and growing-memory Transformers. The approach offers four variants and shows competitive performance with Transformers on language modeling and long-context tasks while maintaining better computational efficiency.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Researchers introduce MITS (Mutual Information Tree Search), a new framework that improves reasoning capabilities in large language models using information-theoretic principles. The method uses pointwise mutual information for step-wise evaluation and achieves better performance while being more computationally efficient than existing tree search methods like Tree-of-Thought.
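Pointwise mutual information for a candidate step is just the gap between its conditional and unconditional log-probabilities, both cheaply available from the model itself. A toy illustration (the function name is ours, not part of MITS):

```python
import math

def pmi_score(logp_step_given_context, logp_step_prior):
    """Pointwise mutual information of a candidate reasoning step:
    how much more likely the step is given the problem context
    than under the model's unconditional prior."""
    return logp_step_given_context - logp_step_prior

# A step the context makes 10x more likely scores log(10) ≈ 2.30 nats.
score = pmi_score(math.log(0.5), math.log(0.05))
print(round(score, 2))  # -> 2.3
```

Scoring steps this way avoids the extra model calls that self-evaluation prompts require, which is where the computational savings over Tree-of-Thought-style search come from.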

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Researchers introduce SAGE (Self-Aware Guided Efficient Reasoning), a novel sampling paradigm that improves AI reasoning efficiency by helping large reasoning models know when to stop thinking. The approach addresses the problem of redundant, lengthy reasoning chains that don't improve accuracy while reducing computational costs and response times.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Stop Unnecessary Reflection: Training LRMs for Efficient Reasoning with Adaptive Reflection and Length Coordinated Penalty

Researchers developed ARLCP, a reinforcement learning framework that reduces unnecessary reflection in Large Reasoning Models, achieving 53% shorter responses while improving accuracy by 5.8% on smaller models. The method addresses computational inefficiencies in AI reasoning by dynamically balancing efficiency and accuracy through adaptive penalties.
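A generic length-coordinated penalty, purely for illustration (ARLCP's adaptive penalty is more involved than this fixed rule), might shape RL rewards like so:

```python
def shaped_reward(correct, n_tokens, target_len, penalty=0.5):
    """Illustrative length-coordinated reward: full credit for a correct
    answer, minus a penalty that grows once the response exceeds the
    per-problem target length (easier problems get shorter targets)."""
    base = 1.0 if correct else 0.0
    overshoot = max(0, n_tokens - target_len) / target_len
    return base - penalty * overshoot

print(shaped_reward(True, 300, 200))  # 50% over budget -> 1.0 - 0.25 = 0.75
print(shaped_reward(True, 150, 200))  # under budget -> full reward 1.0
```

Because the penalty only activates past the target, the model is never pushed to truncate reasoning that a hard problem genuinely needs.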

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10

Efficient Ensemble Conditional Independence Test Framework for Causal Discovery

Researchers introduce E-CIT (Ensemble Conditional Independence Test), a new framework that significantly reduces computational costs in causal discovery by partitioning data into subsets and aggregating results. The method achieves linear computational complexity while maintaining competitive performance, particularly on real-world datasets.
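The ensemble recipe is: split, test, combine. A minimal illustration assuming a caller-supplied base CI test; the combination step here uses the simple, conservative "twice the mean p-value" rule, which is not necessarily the paper's aggregation scheme:

```python
import numpy as np

def ensemble_ci_test(x, y, z, ci_test, n_subsets=5, rng=None):
    """Split the data into subsets, run the base CI test on each,
    and combine the per-subset p-values."""
    rng = np.random.default_rng(rng)
    idx = rng.permutation(len(x))
    pvals = [ci_test(x[part], y[part], z[part])
             for part in np.array_split(idx, n_subsets)]
    return min(1.0, 2.0 * float(np.mean(pvals)))  # valid under dependence

# Stub base test, just to show the plumbing.
fake_test = lambda x, y, z: 0.04
x = y = z = np.zeros(100)
print(ensemble_ci_test(x, y, z, fake_test))  # -> 0.08
```

The linear complexity follows from the partitioning: a base test that is superlinear in sample size runs on subsets of size n/k, so total cost grows roughly linearly in n for fixed k.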

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation

Researchers developed Speculative Verdict (SV), a training-free framework that improves large Vision-Language Models' ability to reason over information-dense images by combining multiple small draft models with a larger verdict model. The approach achieves better accuracy on visual question answering benchmarks while reducing computational costs compared to large proprietary models.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

ECHO: Encoding Communities via High-order Operators

Researchers introduce ECHO, a new Graph Neural Network architecture that solves community detection in large networks by overcoming computational bottlenecks and memory constraints. The system can process networks with over 1.6 million nodes and 30 million edges in minutes, achieving throughputs exceeding 2,800 nodes per second.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Stable Adaptive Thinking via Advantage Shaping and Length-Aware Gradient Regulation

Researchers developed a two-stage framework to optimize large reasoning models, reducing overthinking on simple queries while maintaining accuracy on complex problems. The approach achieved up to 3.7 accuracy point improvements while reducing token generation by over 40% through hybrid fine-tuning and adaptive reinforcement learning techniques.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Efficient Encoder-Free Fourier-based 3D Large Multimodal Model

Researchers introduce Fase3D, the first encoder-free 3D Large Multimodal Model that uses Fast Fourier Transform to process point cloud data efficiently. The model achieves comparable performance to encoder-based systems while being significantly more computationally efficient through novel tokenization and space-filling curve serialization.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Large Language Model Compression with Global Rank and Sparsity Optimization

Researchers propose a novel two-stage compression method for Large Language Models that uses global rank and sparsity optimization to significantly reduce model size. The approach combines low-rank and sparse matrix decomposition with probabilistic global allocation to automatically detect redundancy across different layers and manage component interactions.
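The low-rank-plus-sparse decomposition at the heart of such methods can be illustrated directly. This is a generic sketch of W ≈ L + S via truncated SVD and magnitude thresholding, not the paper's probabilistic global allocation:

```python
import numpy as np

def lowrank_plus_sparse(W, rank, sparsity):
    """Approximate a weight matrix as W ≈ L + S: L from a truncated
    SVD, S keeping only the largest-magnitude residual entries."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = U[:, :rank] * s[:rank] @ Vt[:rank]        # low-rank part
    R = W - L                                     # residual
    k = int(sparsity * R.size)                    # residual entries to keep
    thresh = np.sort(np.abs(R), axis=None)[-k] if k else np.inf
    S = np.where(np.abs(R) >= thresh, R, 0.0)     # sparse correction
    return L, S

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
L, S = lowrank_plus_sparse(W, rank=8, sparsity=0.1)
err = np.linalg.norm(W - L - S) / np.linalg.norm(W)
print(err < 1.0)  # -> True: decomposition recovers part of W
```

The global-allocation question the paper tackles is which (rank, sparsity) budget each layer should get, since redundancy varies across layers; the fixed per-matrix budget above sidesteps that.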

AI · Neutral · arXiv – CS AI · Feb 27 · 5/10

Scaling Laws for Precision in High-Dimensional Linear Regression

Researchers developed theoretical scaling laws for low-precision AI model training, analyzing how quantization affects model performance in high-dimensional linear regression. The study reveals that multiplicative and additive quantization noise affect the model differently: multiplicative noise leaves the effective model size intact, while additive noise reduces it.

AI · Bullish · Hugging Face Blog · Feb 26 · 6/10

Mixture of Experts (MoEs) in Transformers

The article discusses Mixture of Experts (MoEs) architecture in transformer models, which allows for scaling model capacity while maintaining computational efficiency. This approach enables larger, more capable AI models by activating only relevant expert networks for specific inputs.
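Top-k gating is the mechanism that keeps MoE compute roughly constant as expert count grows: the router scores every expert, but only the chosen few run. A miniature sketch (shapes and names are illustrative):

```python
import numpy as np

def moe_layer(x, gate_w, experts, top_k=2):
    """Sparse MoE forward pass: route input x to its top-k experts
    and mix their outputs with renormalised softmax weights."""
    logits = x @ gate_w                            # router scores, one per expert
    top = np.argsort(logits)[-top_k:]              # chosen expert indices
    weights = np.exp(logits[top])
    weights /= weights.sum()                       # softmax over chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
gate_w = rng.standard_normal((16, 8))              # router for 8 experts
experts = [lambda v, W=rng.standard_normal((16, 16)): v @ W for _ in range(8)]
y = moe_layer(x, gate_w, experts)
print(y.shape)  # -> (16,)
```

With top_k=2 of 8 experts, only a quarter of the expert parameters are touched per token, which is how total capacity can scale far beyond per-token compute.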

AI · Bullish · MIT News – AI · Dec 4 · 6/10

A smarter way for large language models to think about hard problems

Researchers have developed a new technique that allows large language models to dynamically adjust their computational resources based on problem difficulty. This adaptive reasoning approach enables LLMs to allocate more processing power to complex questions while using less for simpler ones.

AI · Bullish · Hugging Face Blog · Nov 19 · 6/10

Apriel-H1: The Surprising Key to Distilling Efficient Reasoning Models

The article discusses Apriel-H1, a methodology or framework for creating more efficient reasoning models in AI. This approach appears to focus on distillation techniques to improve model performance while reducing computational requirements.

AI · Bullish · Hugging Face Blog · Jun 3 · 6/10

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

The article discusses optimizing GPU efficiency by co-locating the vLLM inference engine with training in TRL (Hugging Face's Transformer Reinforcement Learning library). This approach aims to maximize GPU utilization and reduce computational waste during reinforcement-learning fine-tuning of language models.

AI · Bullish · Hugging Face Blog · Sep 26 · 6/10

SetFit: Efficient Few-Shot Learning Without Prompts

SetFit is a new machine learning framework that enables efficient few-shot learning without requiring prompts. This approach could significantly reduce the computational resources and data requirements for training AI models in various applications.

AI · Bullish · OpenAI News · Nov 9 · 6/10

RL²: Fast reinforcement learning via slow reinforcement learning

The article presents RL², a meta-learning approach that uses slow reinforcement learning to enable fast adaptation to new tasks. This method allows AI agents to quickly learn new behaviors by leveraging prior training experience across multiple related tasks.

Page 5 of 6