y0news

#computational-efficiency News & Analysis

133 articles tagged with #computational-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 1d ago · 7/10

Efficient Adversarial Training via Criticality-Aware Fine-Tuning

Researchers introduce Criticality-Aware Adversarial Training (CAAT), a parameter-efficient method that identifies and fine-tunes only the most robustness-critical parameters in Vision Transformers, achieving 94.3% of standard adversarial training robustness while tuning just 6% of model parameters. This breakthrough addresses the computational bottleneck preventing large-scale adversarial training deployment.

AI · Bullish · arXiv – CS AI · 1d ago · 7/10

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models

Researchers present Chain-of-Models Pre-Training (CoM-PT), a novel method that accelerates vision foundation model training by up to 7.09X through sequential knowledge transfer from smaller to larger models in a unified pipeline, rather than training each model independently. The approach maintains or improves performance while significantly reducing computational costs, with efficiency gains increasing as more models are added to the training sequence.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading

Researchers introduce Deep Optimizer States, a technique that reduces GPU memory constraints during large language model training by dynamically offloading optimizer state between host and GPU memory during computation cycles. The method achieves 2.5× faster iterations compared to existing approaches by better managing the memory fluctuations inherent in transformer training pipelines.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers

Researchers introduce RL^V, a reinforcement learning method that unifies LLM reasoners with generative verifiers to improve test-time compute scaling. The approach achieves over 20% accuracy gains on MATH benchmarks and enables 8-32x more efficient test-time scaling compared to existing RL methods by preserving and leveraging learned value functions.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

MoEITS: A Green AI approach for simplifying MoE-LLMs

Researchers present MoEITS, a novel algorithm for simplifying Mixture-of-Experts large language models while maintaining performance and reducing computational costs. The method outperforms existing pruning techniques across multiple benchmark models including Mixtral 8×7B and DeepSeek-V2-Lite, addressing the energy and resource efficiency challenges of deploying advanced LLMs.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

PnP-CM: Consistency Models as Plug-and-Play Priors for Inverse Problems

Researchers introduce PnP-CM, a new method that reformulates consistency models as proximal operators within plug-and-play frameworks for solving inverse problems. The approach achieves high-quality image reconstructions with minimal neural function evaluations (4 NFEs), demonstrating practical efficiency gains over existing consistency model solvers and marking the first application of CMs to MRI data.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models

Researchers propose a cost-effective proxy model framework that uses smaller, efficient models to approximate the interpretability explanations of expensive Large Language Models (LLMs), achieving over 90% fidelity at just 11% of computational cost. The framework includes verification mechanisms and demonstrates practical applications in prompt compression and data cleaning, making interpretability tools viable for real-world LLM development.

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

Do We Need Distinct Representations for Every Speech Token? Unveiling and Exploiting Redundancy in Large Speech Language Models

Researchers demonstrate that large speech language models contain significant redundancy in their token representations, particularly in deeper layers. By introducing Affinity Pooling, a training-free token merging technique, they achieve 27.48% reduction in prefilling FLOPs and up to 1.7× memory savings while maintaining semantic accuracy, challenging the necessity of fully distinct tokens for acoustic processing.

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models

Q-Zoom is a new framework that improves the efficiency of multimodal large language models by intelligently processing high-resolution visual inputs. Using adaptive query-aware perception, the system achieves 2.5-4.4x faster inference speeds on document and high-resolution tasks while maintaining or exceeding baseline accuracy across multiple MLLM architectures.

AI · Bullish · arXiv – CS AI · 6d ago · 7/10

SPICE: Submodular Penalized Information-Conflict Selection for Efficient Large Language Model Training

Researchers introduce SPICE, a data selection algorithm that reduces large language model training data requirements by 90% while maintaining performance by identifying and minimizing gradient conflicts between training samples. The method combines information-theoretic principles with practical efficiency improvements, enabling effective model tuning on just 10% of typical datasets across multiple benchmarks.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

k-Maximum Inner Product Attention for Graph Transformers and the Expressive Power of GraphGPS

Researchers introduce k-Maximum Inner Product (k-MIP) attention for graph transformers, enabling linear memory complexity and up to 10x speedups while maintaining full expressive power. The innovation allows processing of graphs with over 500k nodes on a single GPU and demonstrates top performance on benchmark datasets.
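The selection rule suggested by the name can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the function name `k_mip_attention` is hypothetical, the softmax is taken over the selected keys only, and the usual scaling by the square root of the dimension is omitted. The sketch still scores every key for every query, so it shows which keys are kept rather than how the reported memory and speed gains are obtained.

```python
import math


def k_mip_attention(queries, keys, values, k):
    """For each query, attend only to the k keys with the largest inner product.

    Hypothetical helper for illustration; inputs are lists of equal-length
    float vectors, and k is assumed to satisfy 1 <= k <= len(keys).
    """
    outputs = []
    dim = len(values[0])
    for q in queries:
        # Inner product of this query with every key (brute force).
        scores = [sum(qi * ki for qi, ki in zip(q, key)) for key in keys]
        # Indices of the k highest-scoring keys.
        top = sorted(range(len(keys)), key=lambda j: scores[j], reverse=True)[:k]
        # Numerically stable softmax restricted to the selected keys.
        peak = max(scores[j] for j in top)
        weights = [math.exp(scores[j] - peak) for j in top]
        total = sum(weights)
        # Weighted average of the selected value vectors.
        out = [
            sum(w * values[j][d] for w, j in zip(weights, top)) / total
            for d in range(dim)
        ]
        outputs.append(out)
    return outputs
```

With `k` equal to the number of keys this reduces to ordinary (unscaled) softmax attention; smaller `k` discards the low-scoring keys before normalization.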

AI · Bullish · arXiv – CS AI · Mar 27 · 7/10

Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model

Researchers propose HIVE, a new framework for training large language models more efficiently in reinforcement learning by selecting high-utility prompts before rollout. The method uses historical reward data and prompt entropy to identify the 'learning edge' where models learn most effectively, significantly reducing computational overhead without performance loss.

AI · Neutral · arXiv – CS AI · Mar 26 · 7/10

Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding

Researchers propose DIG, a training-free framework that improves long-form video understanding by adapting frame selection strategies based on query types. The system uses uniform sampling for global queries and specialized selection for localized queries, achieving better performance than existing methods while scaling to 256 input frames.
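The uniform-sampling branch used for global queries can be illustrated with a short Python sketch. It assumes a fixed frame budget (256 in the summary) and picks the centre of each equal-width segment; the function name `uniform_frame_indices` is hypothetical, and the query-type classification and the localized selection step are not described in the summary, so they are not shown.

```python
def uniform_frame_indices(num_frames, budget):
    """Pick up to `budget` evenly spaced frame indices from `num_frames` frames.

    Hypothetical helper for illustration, not taken from the paper.
    """
    if num_frames <= 0 or budget <= 0:
        return []
    if budget >= num_frames:
        # Short video: every frame fits within the budget.
        return list(range(num_frames))
    # Split the video into `budget` equal-width segments and take
    # the frame nearest the centre of each segment.
    step = num_frames / budget
    return [int(step * i + step / 2) for i in range(budget)]
```

For example, `uniform_frame_indices(10, 5)` selects frames 1, 3, 5, 7 and 9.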

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

EcoAlign: An Economically Rational Framework for Efficient LVLM Alignment

Researchers introduce EcoAlign, a new framework for aligning Large Vision-Language Models that treats alignment as an economic optimization problem. The method balances safety, utility, and computational costs while preventing harmful reasoning disguised with benign justifications, showing superior performance across multiple models and datasets.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Why Inference in Large Models Becomes Decomposable After Training

Researchers have discovered that large AI models develop decomposable internal structures during training, with many parameter dependencies remaining statistically unchanged from initialization. They propose a post-training method to identify and remove unsupported dependencies, enabling parallel inference without modifying model functionality.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Reducing Cost of LLM Agents with Trajectory Reduction

Researchers introduce AgentDiet, a trajectory reduction technique that cuts computational costs for LLM-based agents by 39.9%-59.7% in input tokens and 21.1%-35.9% in total costs while maintaining performance. The approach removes redundant and expired information from agent execution trajectories during inference time.

AI · Neutral · arXiv – CS AI · Mar 17 · 7/10

Accelerating Suffix Jailbreak attacks with Prefix-Shared KV-cache

Researchers developed Prefix-Shared KV Cache (PSKV), a new technique that accelerates jailbreak attacks on Large Language Models by 40% while reducing memory usage by 50%. The method optimizes the red-teaming process by sharing cached prefixes across multiple attack attempts, enabling more efficient parallel inference without compromising attack success rates.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Mixture-of-Depths Attention

Researchers introduce Mixture-of-Depths Attention (MoDA), a new mechanism for large language models that allows attention heads to access key-value pairs from both current and preceding layers to combat signal degradation in deeper models. Testing on 1.5B-parameter models shows MoDA improves perplexity by 0.2 and downstream task performance by 2.11% with only 3.7% computational overhead while maintaining 97.3% of FlashAttention-2's efficiency.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

AutoTool: Automatic Scaling of Tool-Use Capabilities in RL via Decoupled Entropy Constraints

Researchers introduce AutoTool, a new reinforcement learning approach that enables AI agents to automatically scale their reasoning capabilities for tool use. The method uses entropy-based optimization and supervised fine-tuning to help models efficiently determine appropriate thinking lengths for simple versus complex problems, achieving 9.8% accuracy improvements while reducing computational overhead by 81%.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents

Researchers propose Budget-Aware Value Tree (BAVT), a training-free framework that improves LLM agent efficiency by intelligently managing computational resources during multi-hop reasoning tasks. The system outperforms traditional approaches while using 4x fewer resources, demonstrating that smart budget management beats brute-force compute scaling.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Efficient Reasoning with Balanced Thinking

Researchers propose ReBalance, a training-free framework that optimizes Large Reasoning Models by addressing overthinking and underthinking issues through confidence-based guidance. The solution dynamically adjusts reasoning trajectories without requiring model retraining, showing improved accuracy across multiple AI benchmarks.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework

Researchers propose SEER (Self-Enhancing Efficient Reasoning), a framework that compresses Chain-of-Thought reasoning in Large Language Models while maintaining accuracy. The study found that longer reasoning chains don't always improve performance and can increase latency by up to 5x. SEER reduces CoT length by 42.1% while improving accuracy.
