#parameter-efficiency News & Analysis

71 articles tagged with #parameter-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

CADRE: Stable, Parameter Efficient Adaptation of Medical Vision Language Models with Bounded Forgetting and Prior Drift

Researchers present CADRE, a parameter-efficient adaptation framework for medical vision-language models that addresses catastrophic forgetting and model drift when updating deployed systems. By combining low-rank adaptation with elastic weight consolidation and prior-anchoring penalties, CADRE reduces forgetting sevenfold while training only 0.23% of parameters, demonstrating improved stability across different medical imaging modalities.
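To give a feel for how a trainable fraction well below 1% arises from low-rank adaptation alone, here is a minimal sketch of the parameter arithmetic. The layer shapes and rank are hypothetical and are not taken from the CADRE paper; the elastic weight consolidation and prior-anchoring penalties mentioned in the summary are not modeled.

```python
def lora_trainable_fraction(layer_shapes, rank):
    """Fraction of all parameters that are trainable when every frozen
    (d_out, d_in) weight matrix gets a rank-`rank` LoRA pair (B: d_out x r, A: r x d_in)."""
    frozen = sum(d_out * d_in for d_out, d_in in layer_shapes)
    adapters = sum(rank * (d_out + d_in) for d_out, d_in in layer_shapes)
    return adapters / (frozen + adapters)


# Hypothetical model: 24 square projection matrices of width 4096, adapted at rank 8.
shapes = [(4096, 4096)] * 24
fraction = lora_trainable_fraction(shapes, rank=8)
print(f"trainable fraction: {fraction:.4%}")
```

Because the adapter cost grows with `rank * (d_out + d_in)` while the frozen cost grows with `d_out * d_in`, wider layers and smaller ranks push the fraction down.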

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

QC-GAN: A Parameter-Efficient Quaternion Conformer GAN for High-Fidelity Speech Enhancement

Researchers introduce QC-GAN, a parameter-efficient speech enhancement model combining Quaternion Conformer architecture with MetricGAN training. The framework achieves state-of-the-art speech quality scores while using less than half the parameters of comparable models, with a 35K-parameter variant demonstrating viable ultra-lightweight performance.

AI · Bullish · arXiv – CS AI · Jun 19 · 6/10

SoftSkill: Behavioral Compression for Contextual Adaptation

SoftSkill introduces a method to compress natural-language AI agent skills into compact continuous context objects that improve task performance without retraining frozen language models. By replacing lengthy Markdown skill files with 32-token soft prefixes, the approach demonstrates significant accuracy gains across multiple benchmarks while reducing computational overhead.

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Zero-Inflated Gaussian Distributions Enable Parameter-Space Sparsity in Estimation-of-Distribution Algorithms

Researchers introduce zero-inflated Gaussian (ZIG) distributions for estimation-of-distribution algorithms (EDAs) to optimize sparse parameter spaces where most solution coefficients are zero. This approach eliminates the need for hand-crafted sparsity operators and outperforms existing sparse optimization methods on benchmarks.
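As a rough illustration of the sampling step only, the sketch below draws each coefficient as exactly zero with some probability and from a Gaussian otherwise. The function names and the per-coordinate parameters `pi_zero`, `mu`, and `sigma` are assumptions made for this example; the paper's parameter-update rule is not reproduced here.

```python
import random


def sample_zig(pi_zero, mu, sigma, rng):
    """Draw one coefficient: exactly 0.0 with probability pi_zero,
    otherwise a sample from a Gaussian with mean mu and std sigma."""
    if rng.random() < pi_zero:
        return 0.0
    return rng.gauss(mu, sigma)


def sample_solution(pi_zeros, mus, sigmas, rng):
    """Draw one candidate solution, one independent ZIG draw per coordinate."""
    return [sample_zig(p, m, s, rng) for p, m, s in zip(pi_zeros, mus, sigmas)]


rng = random.Random(0)  # fixed seed so the run is repeatable
dim = 1000
solution = sample_solution([0.9] * dim, [0.0] * dim, [1.0] * dim, rng)
sparsity = sum(1 for x in solution if x == 0.0) / dim
print(f"fraction of exact zeros: {sparsity:.3f}")
```

Since zeros come straight out of the sampling distribution, candidates are sparse without a separate thresholding or pruning operator, which is the property the summary highlights.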

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

LOKI: Memory-Free Null-Space Constrained Lifelong Knowledge Editing

LOKI is a new method for lifelong knowledge editing in language models that dynamically selects which layers to update and avoids catastrophic forgetting without requiring access to previous training data. The approach achieves up to 14% improvement in accuracy over existing methods by using the Hilbert-Schmidt Independence Criterion and null-space projection techniques.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

PermDoRA -- Understanding Adapter Interference in Language Models: Limits of Parameter-Space Geometry

Researchers challenge the conventional wisdom that adapter interference in language models stems from parameter-space geometry by testing whether orthogonal or directionally independent updates reduce cross-domain interference. Their findings using DoRA-RBAC on multiple LLMs show geometry-aware merging provides no consistent advantage, suggesting interference mechanisms operate in shared nonlinear representations rather than linear parameter space.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

Soft-Prompt Tuning for Fair and Efficient LLM Benchmark Evaluation

Researchers propose soft-prompt tuning, a parameter-efficient method that adapts large language models to benchmark formatting requirements by optimizing only 0.0006% of model parameters. This technique reveals that benchmark scores often underestimate base model knowledge due to formatting constraints, enabling fairer evaluation across different model architectures and pre-training approaches.

🏢 Meta

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

GRID: Scaling Task-Agnostic Inference in Continual Prompt Tuning

Researchers introduce GRID, a framework addressing scalability and task-agnostic inference challenges in continual prompt tuning for large language models. The method combines output-aware decoding with gradient-guided prompt selection to improve backward transfer while reducing memory consumption across multiple LLM architectures.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Revisiting Training Scale: An Empirical Study of Token Count, Power Consumption, and Parameter Efficiency

A new empirical study challenges the assumption that scaling training token counts linearly improves large language model performance, revealing instead that increased token counts lead to strictly declining training efficiency when energy consumption and execution duration are measured alongside traditional metrics.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Seq103: A Unified Neuroevolution Framework for Compact Sequence Architecture Discovery

Seq103 is a unified neuroevolution framework that automatically discovers compact neural network architectures for sequence tasks, reaching 81-87% of baseline accuracy with 11x to 3,200x fewer parameters. The framework applies the same evolutionary search pipeline to both recurrent and feedforward sequence classification, offering significant efficiency gains for resource-constrained deployments.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

VFEM: Visual Feature Empowered Multivariate Time Series Forecasting with Cross-Modal Fusion

Researchers present VFEM, a cross-modal forecasting model that combines pre-trained vision models with time series data to improve multivariate forecasting by capturing cross-channel dependencies. The approach transforms time series into visual representations and uses cross-modal attention fusion, achieving competitive performance while training only 7.45% of total parameters.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

Researchers introduce Code2LoRA, a hypernetwork framework that generates repository-specific LoRA adapters for code language models, eliminating the need for expensive fine-tuning or lengthy context injection. The approach achieves competitive performance with lower computational overhead and introduces RepoPeftBench, a 604-repository benchmark for evaluating code model adaptation techniques.

🏢 Hugging Face

AI · Bullish · arXiv – CS AI · Jun 4 · 6/10

Signed Dual Attention: Capturing Signed Dependencies in Time Series Forecasting

Researchers introduce Signed Dual Attention, a novel transformer attention mechanism that captures both positive and negative dependencies in time series data without requiring additional parameters. By using a dual message-passing approach inspired by correlation structures, this technique achieves greater expressiveness while maintaining parameter efficiency, potentially improving forecasting accuracy in applications requiring signed relational modeling.

AI · Neutral · arXiv – CS AI · Jun 2 · 5/10

Finer Parameter Steps for Low-Rank PEFT: A Controlled Study with CP Tensor Adapters

Researchers compare canonical polyadic (CP) tensor adapters with LoRA for low-rank parameter-efficient fine-tuning. They find that finer parameter increments make budget sensitivity easier to diagnose but do not guarantee a better accuracy-budget trade-off on every task.

AI · Bullish · arXiv – CS AI · Jun 2 · 6/10

Logit Distillation on Manifolds: Mapping by Learning

Researchers introduce a layer-wise projection mapping technique for knowledge distillation that enables efficient model compression, reducing trainable parameters to under 1% of the teacher model while maintaining performance improvements. Combined with LoRA injection, this approach significantly outperforms traditional distillation methods in word error rate metrics and enables rapid parallel training without the computational overhead of mixture-of-experts models.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

Collaborative and Efficient Fine-tuning: Leveraging Task Similarity

Researchers propose CoLoRA (Collaborative Low-Rank Adaptation), a novel fine-tuning method that improves foundation model adaptation by leveraging task similarity across multiple users. The approach combines shared adapters capturing common task patterns with personalized adapters for user-specific needs, demonstrating significant performance gains when similar tasks are trained together.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Graph-Conditioned Mixture of Graph Neural Network Experts for Traffic Forecasting

Researchers propose GC-MoE, a graph-conditioned mixture of experts framework that improves traffic forecasting by assigning specialized neural network experts to different road segments based on graph topology. The approach trains only 17K parameters while leveraging 1.5M frozen expert weights, achieving competitive results across four standard traffic prediction benchmarks.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Beyond Classification: Dynamic Adapter Routing for Continual Multimodal Retrieval

Researchers introduce Dynamic Adapter Routing (DAR), a novel approach to continual multimodal retrieval that moves beyond traditional class-incremental learning methods. The study presents a new evaluation framework for vision-language models that better captures real-world retrieval dynamics, with DAR demonstrating superior performance and strong generalization capabilities.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Researchers propose EKSFT, a novel fine-tuning method that selectively masks high-entropy and high-KL divergence tokens during supervised fine-tuning of large language models. The approach aims to preserve pre-trained model distributions while efficiently activating task-relevant capabilities in low-data regimes, demonstrating improved performance on mathematical reasoning benchmarks.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Context Distillation as Latent Memory Management

Researchers propose a novel approach to context distillation that treats compressed contextual information as a latent memory management problem, using modular LoRA adapters with intelligent retrieval and self-gating mechanisms to improve efficiency and robustness in machine learning systems.

AI · Bullish · arXiv – CS AI · May 29 · 6/10

DynSess: Dynamic Session-Level Evaluation and Optimization Framework for Role-Playing Agents

Researchers introduce DynSess, a framework that evaluates and optimizes role-playing agents at the session level rather than individual turns, enabling LLMs to maintain character consistency across extended conversations. The framework includes improved evaluation metrics, optimized training methods (DSPO and GSRPO), and demonstrates performance matching larger models with fewer parameters.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Energy-Structured Low-Rank Adaptation for Continual Learning

Researchers propose E²-LoRA, a continual learning method that addresses task interference by concentrating knowledge into low-rank representations rather than spreading it across multiple basis vectors. The authors prove that preserving parameters along principal drift directions minimizes reconstruction error while freeing model capacity for future tasks.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

UniMaia: Steering Chess Policies with Language for Human-like Play

UniMaia is a new AI framework that uses natural language prompts to control chess-playing policy networks, enabling semantic control over gameplay elements like opening selection and player strength without requiring large-scale multimodal training. The system combines a frozen Lc0 chess engine with a parameter-efficient text encoder and demonstrates competitive performance on prompt-conditioned benchmarks while maintaining domain-specific expertise.

AI · Bullish · arXiv – CS AI · May 28 · 6/10

Extracting Small Translation Specialists from LLMs by Aggressively Pruning Experts

Researchers present a method for aggressively pruning expert modules from mixture-of-experts large language models to create specialized translation systems. The approach removes up to 90% of experts with minimal performance degradation, demonstrating that translation tasks require only a fraction of a full LLM's parameters, enabling substantial model compression.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Agentic Performance at the Edge: Insights from Benchmarking

Researchers benchmark agentic AI performance on edge devices constrained to 8 billion parameters or smaller, finding that model quality loss isn't simply proportional to parameter reduction. The study reveals that optimal edge-agent deployment requires joint optimization of model selection and tool workflows, with distinct failure patterns across model families guiding practical deployment strategies.
