133 articles tagged with #computational-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · Mar 12 · 6/10
Researchers conducted the first comprehensive evaluation of parameter-efficient fine-tuning (PEFT) for multi-task code analysis, showing that a single PEFT module can match full fine-tuning performance while reducing computational costs by up to 85%. The study found that even 1B-parameter models with multi-task PEFT outperform large general-purpose LLMs like DeepSeek and CodeLlama on code analysis tasks.
AI · Bullish · arXiv – CS AI · Mar 12 · 6/10
Researchers propose Dynamics-Predictive Sampling (DPS), a method that improves reinforcement learning fine-tuning of large language models by predicting which training prompts will be most informative without running expensive rollouts. The technique models each prompt's learning progress as a dynamical system and uses Bayesian inference to select better training data, reducing computational overhead while achieving superior reasoning performance.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
Facebook Research introduces the Latent Speech-Text Transformer (LST), which aggregates speech tokens into higher-level patches to improve computational efficiency and cross-modal alignment. The model achieves an absolute gain of up to 6.5% on speech HellaSwag benchmarks while maintaining text performance and reducing inference costs for ASR and TTS tasks.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
Researchers developed A-3PO, an optimization technique for training large language models that eliminates computational overhead in reinforcement learning algorithms. The approach achieves a 1.8x training speedup while maintaining comparable performance by approximating the proximal policy through interpolation rather than computing it explicitly.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
Researchers introduce Dynamic Chunking Diffusion Transformer (DC-DiT), a new AI model that adaptively processes images by allocating more computational resources to detail-rich regions and fewer to uniform backgrounds. The system improves image generation quality while reducing computational costs by up to 16x compared to traditional diffusion transformers.
AI · Bullish · arXiv – CS AI · Mar 9 · 6/10
Researchers present CASA, a new approach using cross-attention over self-attention for vision-language models that maintains competitive performance while significantly reducing memory and compute costs. The method shows particular advantages for real-time applications like video captioning by avoiding expensive token insertion into language model streams.
AI · Bearish · arXiv – CS AI · Mar 9 · 6/10
Research reveals that speech LLMs do not perform significantly better than traditional ASR→LLM pipelines in most deployed scenarios. The study shows speech LLMs essentially function as expensive cascades that perform worse under noisy conditions, with advantages reversing by up to 7.6% at a 0 dB noise level.
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10
Researchers developed a new variance-reduced EXP4-based algorithm for optimizing routing policies in multi-layer hierarchical inference systems. The solution addresses the challenge of sparse, policy-dependent feedback in AI systems where prediction errors are only revealed at terminal layers, improving stability and performance over standard importance-weighted approaches.
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10
Researchers have developed MeanFlowSE, a new generative AI model for speech enhancement that performs inference in a single step rather than over multiple sampling steps. The method achieves strong audio quality at substantially lower computational cost, making it suitable for real-time applications without needing knowledge distillation or external teachers.
AI · Bullish · arXiv – CS AI · Mar 5 · 5/10
Researchers propose JPmHC (Jacobian-spectrum Preserving manifold-constrained Hyper-Connections), a new deep learning framework that improves upon existing Hyper-Connections by replacing identity skips with trainable linear mixers while controlling gradient conditioning. The framework addresses training instability and memory overhead issues in current deep learning architectures through constrained optimization on specific mathematical manifolds.
AI · Neutral · arXiv – CS AI · Mar 5 · 5/10
Researchers propose Local Shapley, a new method that dramatically reduces computational complexity in data valuation by focusing only on training data points that actually influence specific predictions. The approach achieves substantial speedups while maintaining accuracy by leveraging model-induced locality properties.
AI · Neutral · arXiv – CS AI · Mar 4 · 5/10 · 2
Researchers developed a method to extract numerical prediction distributions from Large Language Models without costly autoregressive sampling by training probes on internal representations. The approach can predict statistical functionals like mean and quantiles directly from LLM embeddings, potentially offering a more efficient alternative for uncertainty-aware numerical predictions.
AI · Neutral · arXiv – CS AI · Mar 4 · 5/10 · 3
Researchers introduce MELODI, a framework for monitoring energy consumption during large language model inference, revealing substantial disparities in energy efficiency across different deployment scenarios. The study creates a comprehensive dataset analyzing how prompt attributes like length and complexity correlate with energy expenditure, highlighting significant opportunities for optimization in LLM deployment.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8
Researchers introduce DenoiseFlow, a framework that addresses reliability issues in AI agent workflows by managing uncertainty through adaptive computation allocation and error correction. The system achieves 83.3% average accuracy across benchmarks while reducing computational costs by 40-56% through intelligent branching decisions.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 7
Researchers found that AI agents perform better when their training data matches their deployment environment, specifically regarding interpreter state persistence. Models trained with persistent state but deployed in stateless environments trigger errors in 80% of cases, while models trained without persistent state but deployed in stateful environments waste 3.5x more tokens on redundant computations.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 8
Researchers introduce Coupled Discrete Diffusion (CoDD), a breakthrough framework that solves the "factorization barrier" in diffusion language models by enabling parallel token generation without sacrificing coherence. The approach uses a lightweight probabilistic inference layer to model complex joint dependencies while maintaining computational efficiency.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 12
Researchers developed FMCT/EFMCT, a new Flow Matching-based framework for CT medical imaging reconstruction that significantly improves computational efficiency over existing diffusion models. The method uses deterministic ordinary differential equations and velocity field reuse to reduce neural network evaluations while maintaining reconstruction quality.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
Researchers have developed ESENSC_rev2, a polynomial-time alternative to SHAP for AI feature attribution that offers similar accuracy with significantly improved computational efficiency. The method uses cooperative game theory and provides theoretical foundations through axiomatic characterization, making it suitable for high-dimensional explainability tasks.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7
Researchers propose FastBUS, a new Bayesian framework for weakly-supervised machine learning that addresses computational inefficiencies in existing methods. The framework uses probabilistic transitions and belief propagation to achieve state-of-the-art results while running up to hundreds of times faster than current general methods.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
Researchers propose FAST-DIPS, a new training-free diffusion prior method for solving inverse problems that achieves up to 19.5x speedup while maintaining competitive image quality metrics. The method replaces computationally expensive inner optimization loops with closed-form projections and analytic step sizes, significantly reducing the number of required denoiser evaluations.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
Researchers propose FreeAct, a new quantization framework for Large Language Models that improves efficiency by using dynamic transformation matrices for different token types. The method achieves up to 5.3% performance improvement over existing approaches by addressing the memory and computational overhead challenges in LLMs.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 3
Researchers introduce MatRIS, a new machine learning interatomic potential model for materials science that achieves comparable accuracy to leading equivariant models while being significantly more computationally efficient. The model uses attention-based three-body interactions with linear O(N) complexity, demonstrating strong performance on benchmarks like Matbench-Discovery with an F1 score of 0.847.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 3
Researchers identified 'internal bias' as a key cause of overthinking in AI reasoning models, where models form preliminary guesses that conflict with systematic reasoning. The study found that excessive attention to input questions triggers redundant reasoning steps, and current mitigation methods have proven ineffective.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 4
Researchers present a survey of adaptive reasoning in large language models, addressing the problem that current LLMs use uniform reasoning strategies regardless of task complexity. The survey formalizes adaptive reasoning as a control-augmented policy optimization problem and proposes a taxonomy of training-based and training-free approaches to achieve more efficient reasoning allocation.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4
Researchers have developed FMIP, a new generative AI framework that models both integer and continuous variables simultaneously to solve Mixed-Integer Linear Programming problems more efficiently. The approach reduces the primal gap by 41.34% on average compared to existing baselines and is compatible with various downstream solvers.